POST https://api.crustdata.com/dataset/web/fetch
curl --request POST \
  --url https://api.crustdata.com/dataset/web/fetch \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --header 'x-api-version: <x-api-version>' \
  --data '
{
  "urls": [
    "https://example.com"
  ]
}
'
[
  {
    "success": true,
    "url": "https://example.com",
    "timestamp": 1774446519,
    "pageTitle": "Example Domain",
    "content": "<html lang=\"en\"><head><title>Example Domain</title></head><body><div><h1>Example Domain</h1><p>This domain is for use in documentation examples without needing permission.</p></div></body></html>"
  }
]

Authorizations

Authorization
string
header
required

Bearer token authentication. Pass your API key as Authorization: Bearer <your_api_key>.

Headers

x-api-version
enum<string>
required

API version to use for request routing and response shape.

Available options:
2025-11-01

Body

application/json

Request payload for /dataset/web/fetch. Provide between 1 and 10 URLs whose web page content should be fetched.

urls
string[]
required

List of URLs to fetch web content from. Minimum 1, maximum 10 URLs per request.

Required array length: 1 - 10 elements
Maximum string length: 2000
Example:

["https://example.com"]

solveCloudflare
boolean
default:false

Whether to attempt bypassing Cloudflare protection. Increases latency when enabled.

Example:

false
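
Because a request accepts at most 10 URLs, a longer list has to be split across several requests. The sketch below shows one way to validate and batch URLs into JSON bodies. The function name `build_payloads` and the constants are illustrative assumptions, and it assumes the 2000-character limit applies to each individual URL. It only builds request bodies and does not send them.

```python
import json

MAX_URLS_PER_REQUEST = 10  # "Required array length: 1 - 10 elements"
MAX_URL_LENGTH = 2000      # assumed to apply to each URL string


def build_payloads(urls: list[str], solve_cloudflare: bool = False) -> list[str]:
    """Split urls into JSON request bodies of at most 10 URLs each."""
    if not urls:
        raise ValueError("at least one URL is required")
    for url in urls:
        if len(url) > MAX_URL_LENGTH:
            raise ValueError(f"URL exceeds {MAX_URL_LENGTH} characters: {url[:50]}")

    payloads = []
    for start in range(0, len(urls), MAX_URLS_PER_REQUEST):
        payload = {"urls": urls[start:start + MAX_URLS_PER_REQUEST]}
        # solveCloudflare defaults to false, so it is only sent when enabled.
        if solve_cloudflare:
            payload["solveCloudflare"] = True
        payloads.append(json.dumps(payload))
    return payloads


# 23 URLs are split into batches of 10, 10 and 3.
bodies = build_payloads([f"https://example.com/{i}" for i in range(23)])
print(len(bodies))
```

Since solveCloudflare increases latency, one reasonable design is to send the first attempt without it and rebuild a payload with `solve_cloudflare=True` only for URLs that came back unsuccessful.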

Response

Successfully fetched web content. The response is a JSON array of result objects, each with the fields below.

success
boolean

Whether the page content was fetched successfully.

Example:

true

url
string

The URL that was fetched (may differ from the requested URL due to redirects).

Example:

"https://example.com"

timestamp
integer

Unix timestamp (seconds) of when the content was fetched.

Example:

1774446519

pageTitle
string

The title of the fetched web page, extracted from the HTML title tag.

Example:

"Example Domain"

content
string

The full HTML content of the fetched web page.

Example:

"<html lang=\"en\"><head><title>Example Domain</title></head><body><div><h1>Example Domain</h1></div></body></html>"

Example:

[
  {
    "success": true,
    "url": "https://example.com",
    "timestamp": 1774446519,
    "pageTitle": "Example Domain",
    "content": "<html lang=\"en\"><head><title>Example Domain</title></head><body><div><h1>Example Domain</h1></div></body></html>"
  }
]
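
A response body of this shape can be parsed into typed records. The following Python sketch is illustrative only: `FetchResult` and `parse_fetch_response` are hypothetical names, and because this page does not show what an unsuccessful item looks like, every field other than `success` is read defensively and may be absent.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class FetchResult:
    success: bool
    url: str
    fetched_at: Optional[datetime]  # derived from the Unix "timestamp" (seconds)
    page_title: Optional[str]
    content: Optional[str]


def parse_fetch_response(body: str) -> list[FetchResult]:
    """Parse the JSON array returned by /dataset/web/fetch."""
    results = []
    for item in json.loads(body):
        ts = item.get("timestamp")
        results.append(FetchResult(
            success=bool(item.get("success", False)),
            url=item.get("url", ""),
            # The timestamp is in seconds, so it maps directly to a UTC datetime.
            fetched_at=datetime.fromtimestamp(ts, tz=timezone.utc) if ts is not None else None,
            page_title=item.get("pageTitle"),
            content=item.get("content"),
        ))
    return results


# Sample body mirroring the example above.
sample = json.dumps([{
    "success": True,
    "url": "https://example.com",
    "timestamp": 1774446519,
    "pageTitle": "Example Domain",
    "content": '<html lang="en"><head><title>Example Domain</title></head>'
               "<body><div><h1>Example Domain</h1></div></body></html>",
}])

for result in parse_fetch_response(sample):
    print(result.url, result.success, result.page_title, result.fetched_at)
```

Since the returned `url` may differ from the requested one after redirects, matching results back to requests by exact URL string may not be reliable.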