curl --request POST \
--url https://api.crustdata.com/dataset/web/fetch \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--header 'x-api-version: <x-api-version>' \
--data '
{
"urls": [
"https://example.com"
]
}
'

[
{
"success": true,
"url": "https://example.com",
"timestamp": 1774446519,
"pageTitle": "Example Domain",
"content": "<html lang=\"en\"><head><title>Example Domain</title></head><body><div><h1>Example Domain</h1><p>This domain is for use in documentation examples without needing permission.</p></div></body></html>"
}
]

Fetches the web content of one or more URLs and returns the scraped content including page title and HTML body. Supports up to 10 URLs per request and optional Cloudflare bypass.
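The curl request above can also be sketched in Python using only the standard library. The endpoint, headers, and `urls` body field come from the example; the names `build_fetch_request` and `fetch_pages` are hypothetical helpers introduced here for illustration, and error handling is left out for brevity.

```python
import json
import urllib.request

API_URL = "https://api.crustdata.com/dataset/web/fetch"


def build_fetch_request(urls, api_key, api_version):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = json.dumps({"urls": list(urls)}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "x-api-version": api_version,
        },
    )


def fetch_pages(urls, api_key, api_version):
    """Send the request and decode the JSON array of per-URL results."""
    request = build_fetch_request(urls, api_key, api_version)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending makes it possible to inspect the headers and body before any network call is made.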
Bearer token authentication. Pass your API key as Authorization: Bearer <your_api_key>.
API version to use for request routing and response shape.
2025-11-01
The list of URLs to fetch content from.
Request payload for /dataset/web/fetch. Provide one or more URLs (up to 10) to scrape web page content.
List of URLs to fetch web content from. Minimum 1, maximum 10 URLs per request.
1 - 10 elements
2000
["https://example.com"]
Whether to attempt bypassing Cloudflare protection. Increases latency when enabled.
false
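Because each request accepts between 1 and 10 URLs, a longer list has to be split across several requests. The following is a minimal sketch of that batching; `chunk_urls` and `MAX_URLS_PER_REQUEST` are hypothetical names, not part of the API.

```python
MAX_URLS_PER_REQUEST = 10  # documented limit: 1 - 10 URLs per request


def chunk_urls(urls, size=MAX_URLS_PER_REQUEST):
    """Split a list of URLs into batches that respect the per-request limit."""
    urls = list(urls)
    if not urls:
        raise ValueError("at least one URL is required")
    if size < 1:
        raise ValueError("batch size must be at least 1")
    # Step through the list in strides of `size`; the last batch may be shorter.
    return [urls[i:i + size] for i in range(0, len(urls), size)]
```

Each returned batch can then be sent as the `urls` array of a separate request.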
Successfully fetched web content.
Whether the page content was fetched successfully.
true
The URL that was fetched (may differ from the requested URL due to redirects).
"https://example.com"
Unix timestamp (seconds) of when the content was fetched.
1774446519
The title of the fetched web page extracted from the HTML title tag.
"Example Domain"
The full HTML content of the fetched web page.
"<html lang=\"en\"><head><title>Example Domain</title></head><body><div><h1>Example Domain</h1></div></body></html>"
[
{
"success": true,
"url": "https://example.com",
"timestamp": 1774446519,
"pageTitle": "Example Domain",
"content": "<html lang=\"en\"><head><title>Example Domain</title></head><body><div><h1>Example Domain</h1></div></body></html>"
}
]
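A response array like the one above could be processed as in the following sketch. It relies only on the fields shown in the example (`success`, `url`, `timestamp`, `pageTitle`, `content`); the shape of an unsuccessful entry is not documented here, so the handling of `success: false` is an assumption, and `summarize_results` is a hypothetical helper.

```python
from datetime import datetime, timezone


def summarize_results(results):
    """Separate fetched pages from failed ones and decode the Unix timestamp."""
    fetched, failed = [], []
    for item in results:
        if item.get("success"):
            fetched.append({
                "url": item["url"],  # may differ from the requested URL after redirects
                "title": item.get("pageTitle"),
                "fetched_at": datetime.fromtimestamp(item["timestamp"], tz=timezone.utc),
                "html": item.get("content", ""),
            })
        else:
            # Assumption: failed entries still carry the URL they refer to.
            failed.append(item.get("url"))
    return fetched, failed
```

Converting the timestamp to a timezone-aware `datetime` avoids ambiguity about local time when the results are stored or compared later.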