
# Web Fetch

> Fetch the HTML content of public webpages by URL for content extraction, data collection, and monitoring.

**Use this when** you have specific URLs and need to retrieve their full
HTML content — for content extraction, data collection, SEO analysis, or
change tracking.

The Web Fetch API accepts a list of URLs and returns the page title and
full HTML content for each. You can fetch up to 10 URLs in a single
request.

Every request goes to the same endpoint:

```
POST https://api.crustdata.com/web/enrich/live
```

<Note>
  The endpoint path is `/web/enrich/live` (not `/web/fetch/live`) because it
  follows the Crustdata convention where "enrich" means adding data to a known
  identifier — in this case, enriching a URL with its page content.
</Note>

<Snippet file="web-auth-headers.mdx" />

<Callout icon="lock" color="#f59e0b">
  <strong>Pricing:</strong> <code>1 credit per page</code>.
</Callout>

<Note>
  The default rate limit is 10 requests per minute. To discuss higher limits
  for your use case, email [gtm@crustdata.co](mailto:gtm@crustdata.co).
</Note>
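
To stay under the default limit from the client side, one option is to space calls evenly. The sketch below is a hypothetical helper, not part of any Crustdata SDK. It assumes that leaving at least 6 seconds between requests (60 seconds / 10 requests) is acceptable for your workload. How the server measures its one-minute window is not documented here, so even spacing is a conservative choice rather than a requirement.

```python
import time


class Throttle:
    """Hypothetical client-side limiter: keeps calls at least 60/per_minute seconds apart."""

    def __init__(self, per_minute=10, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Call `throttle.wait()` immediately before each request. The `clock` and `sleep` parameters exist only so the behavior can be checked without real delays.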

## Request body

| Parameter    | Type      | Required | Default | Description                                                                                       |
| ------------ | --------- | -------- | ------- | ------------------------------------------------------------------------------------------------- |
| `urls`       | string\[] | Yes      | —       | URLs to fetch. Min: 1, max: 10. Must include `http://` or `https://`.                             |
| `human_mode` | boolean   | No       | `false` | Attempt a browser-like fetch path when a site is protected by Cloudflare or similar bot controls. |
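
The constraints in the table can be checked before any request is sent. The following is a minimal sketch, assuming you have more URLs than fit in one request and want to split them into valid request bodies. `build_payloads` is a hypothetical helper name, not part of the API.

```python
from urllib.parse import urlparse

MAX_URLS_PER_REQUEST = 10  # documented maximum for `urls`


def build_payloads(urls, human_mode=False):
    """Validate URLs and split them into request bodies of at most 10 URLs each."""
    if not urls:
        raise ValueError("at least one URL is required")
    for url in urls:
        parsed = urlparse(url)
        # The API requires an explicit http:// or https:// prefix.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"URL must start with http:// or https://: {url!r}")
    payloads = []
    for start in range(0, len(urls), MAX_URLS_PER_REQUEST):
        body = {"urls": urls[start:start + MAX_URLS_PER_REQUEST]}
        if human_mode:
            # Only sent when enabled; the documented default is false.
            body["human_mode"] = True
        payloads.append(body)
    return payloads
```

Each returned dictionary can be serialized as the JSON body of one request.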

## Response body

The response is an **array** (not an object) — one entry per URL in your
request.

| Field       | Type     | Description                                                   |
| ----------- | -------- | ------------------------------------------------------------- |
| `success`   | boolean  | Whether this URL was fetched successfully.                    |
| `url`       | string?  | The URL that was fetched. `null` if the fetch failed.         |
| `timestamp` | integer? | Unix timestamp (**seconds**) when fetched. `null` on failure. |
| `title`     | string?  | The `<title>` tag content. `null` on failure.                 |
| `content`   | string?  | Full HTML content of the page. `null` on failure.             |
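
Because a failed entry has `url` set to `null`, the entry alone does not say which URL failed. The sketch below assumes that entries come back in the same order as the requested URLs. That ordering is an assumption, not something this page states, so verify it before relying on it. `split_results` is a hypothetical helper name.

```python
def split_results(requested_urls, results):
    """Separate successful entries from the requested URLs that failed.

    Assumes `results` is in request order (unconfirmed), so a failed entry
    can be matched to its requested URL by position.
    """
    if len(requested_urls) != len(results):
        raise ValueError("expected one result per requested URL")
    pages, failed = [], []
    for requested, entry in zip(requested_urls, results):
        if entry.get("success"):
            pages.append(entry)
        else:
            failed.append(requested)
    return pages, failed
```

The `failed` list can then be retried, for example with `human_mode` enabled.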

<Note>
  **Timestamps:** Fetch timestamps are in **seconds**. Search timestamps are
  in **milliseconds**. Account for this when comparing timestamps across
  endpoints.
</Note>
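
One way to avoid unit mistakes is to convert both kinds of timestamp to timezone-aware `datetime` values as soon as they are read. The following is a minimal sketch with hypothetical helper names. It assumes the timestamps are Unix epoch values in UTC.

```python
from datetime import datetime, timezone


def fetch_ts_to_datetime(seconds):
    """Convert a Web Fetch timestamp (seconds) to a UTC datetime; None on failed fetches."""
    if seconds is None:
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def search_ts_to_datetime(milliseconds):
    """Convert a Web Search timestamp (milliseconds) to a UTC datetime."""
    if milliseconds is None:
        return None
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)
```

Once converted, values from both endpoints can be compared directly.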

<CardGroup cols={2}>
  <Card title="Examples" icon="list-filter" href="/web-docs/fetch/examples">
    Fetch multiple URLs, handle partial failures, bypass bot protection with
    human mode, and process fetched HTML content.
  </Card>

  <Card title="Reference" icon="book" href="/web-docs/fetch/reference">
    Request parameters, response body details, error handling, and common
    gotchas.
  </Card>
</CardGroup>

***

## Fetch a single URL

The simplest request fetches one URL and returns its HTML content.

<CodeGroup>
  ```bash Request theme={"theme":"vitesse-black"}
  curl --request POST \
    --url https://api.crustdata.com/web/enrich/live \
    --header 'authorization: Bearer YOUR_API_KEY' \
    --header 'content-type: application/json' \
    --header 'x-api-version: 2025-11-01' \
    --data '{
      "urls": ["https://example.com"]
    }'
  ```

  ```json Response theme={"theme":"vitesse-black"}
  [
      {
          "success": true,
          "url": "https://example.com",
          "timestamp": 1775193366,
          "title": "Example Domain",
          "content": "<html lang=\"en\"><head><title>Example Domain</title>...</head><body><div><h1>Example Domain</h1><p>This domain is for use in documentation examples.</p></div></body></html>"
      }
  ]
  ```
</CodeGroup>

<Note>
  The `content` value is truncated here for readability. A real response
  contains the full HTML of the fetched page.
</Note>
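
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl example above, with the same endpoint, headers, and body. The helper names and the 60-second timeout are assumptions chosen for illustration.

```python
import json
import urllib.request

API_URL = "https://api.crustdata.com/web/enrich/live"


def build_request(api_key, urls, human_mode=False):
    """Build the POST request without sending it."""
    body = {"urls": urls}
    if human_mode:
        body["human_mode"] = True
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-Api-Version": "2025-11-01",
        },
        method="POST",
    )


def fetch_pages(api_key, urls, human_mode=False, timeout=60):
    """Send the request and return the parsed response array."""
    request = build_request(api_key, urls, human_mode)
    # The timeout value is an assumption; tune it for the pages you fetch.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_pages("YOUR_API_KEY", ["https://example.com"])` returns a list with one entry per URL. `urlopen` raises `urllib.error.HTTPError` on non-2xx responses, so wrap the call if you need to handle those separately.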

**Extract:** Parse `content` using an HTML parser (BeautifulSoup for
Python, Cheerio for Node.js) to extract specific elements like text,
links, or metadata.
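
As a dependency-free illustration of this step, the sketch below uses Python's built-in `html.parser` instead of BeautifulSoup to collect link targets and visible text from a `content` string. The class and function names are hypothetical, and BeautifulSoup or Cheerio is usually more convenient for anything beyond simple cases.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collect href values from <a> tags and text outside <script>/<style>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not script or style source.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract(html):
    """Return the links and whitespace-joined text found in an HTML string."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return {"links": parser.links, "text": " ".join(parser.text_parts)}
```

Pass the `content` value of a successful entry to `extract`. Entries with `success` set to `false` have `content` set to `null` and should be skipped first.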

***

## What to do next

* **Try more patterns** — see [Examples](/web-docs/fetch/examples) for multi-URL fetches, partial failures, human mode, and content processing.
* **Look up request/response details** — see [Reference](/web-docs/fetch/reference).
* **Search then fetch** — see [Web Search examples](/web-docs/search/examples) for search-then-fetch workflow patterns.
* **Find URLs to fetch** — use [Web Search](/web-docs/search/introduction) to find URLs for downstream fetching.
