# Web Fetch examples

> Web Fetch examples: fetch multiple URLs, handle partial failures, bypass bot protection with human mode, and process fetched HTML content.

Ready-to-copy patterns for [Web Fetch](/web-docs/fetch/introduction). Each example shows
a real request, the response, and what to extract.

<Snippet file="web-auth-headers.mdx" />

***

## Fetch multiple URLs

Pass up to 10 URLs in the `urls` array to fetch their content in parallel.

<CodeGroup>
  ```bash Request theme={"theme":"vitesse-black"}
  curl --request POST \
    --url https://api.crustdata.com/web/enrich/live \
    --header 'authorization: Bearer YOUR_API_KEY' \
    --header 'content-type: application/json' \
    --header 'x-api-version: 2025-11-01' \
    --data '{
      "urls": [
        "https://example.com",
        "https://example.org",
        "https://www.crustdata.com"
      ]
    }'
  ```

  ```json Response theme={"theme":"vitesse-black"}
  [
      {
          "success": true,
          "url": "https://example.org",
          "timestamp": 1775193386,
          "title": "Example Domain",
          "content": "<html lang=\"en\"><head><title>Example Domain</title></head><body>...</body></html>"
      },
      {
          "success": true,
          "url": "https://example.com",
          "timestamp": 1775193386,
          "title": "Example Domain",
          "content": "<html lang=\"en\"><head><title>Example Domain</title></head><body>...</body></html>"
      },
      {
          "success": true,
          "url": "https://www.crustdata.com",
          "timestamp": 1775193387,
          "title": "Crustdata: Real-Time B2B Data Broker",
          "content": "<html>...</html>"
      }
  ]
  ```
</CodeGroup>
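
If you have more than 10 URLs, one option is to split the list into batches before sending. The sketch below assumes only the 10-URL limit stated above. `chunkUrls` is a hypothetical helper name, not part of the API.

```javascript
// Hypothetical helper: split a list of URLs into batches that fit one request.
// The default batch size of 10 comes from the per-request limit stated above.
function chunkUrls(urls, batchSize = 10) {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError("batchSize must be a positive integer");
    }
    const batches = [];
    for (let i = 0; i < urls.length; i += batchSize) {
        batches.push(urls.slice(i, i + batchSize));
    }
    return batches;
}

// 23 illustrative URLs become batches of 10, 10, and 3.
const manyUrls = Array.from(
    { length: 23 },
    (_, i) => `https://example.com/page-${i}`,
);
const batches = chunkUrls(manyUrls);
console.log(batches.map((b) => b.length)); // [ 10, 10, 3 ]
```

Each batch can then be sent as the `urls` array of its own request.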

<Note>
  **Current platform behavior:** The response array order may differ from the
  request order. Match successful results by their `url` field, not by array
  index.
</Note>
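
One way to follow this advice is to build a lookup keyed by `url`, as sketched below. `indexByUrl` is a hypothetical helper, and the sketch assumes that the `url` in each successful result is the same string you sent, as in the sample response.

```javascript
// Hypothetical helper: build a lookup from URL to result so that callers
// never depend on the order of the response array.
function indexByUrl(results) {
    const byUrl = new Map();
    for (const entry of results) {
        // Only successful entries carry a usable `url` to key on.
        if (entry.success && entry.url !== null) {
            byUrl.set(entry.url, entry);
        }
    }
    return byUrl;
}

// Sample data shaped like the response above (content shortened).
const sampleResponse = [
    { success: true, url: "https://example.org", timestamp: 1775193386, title: "Example Domain", content: "<html>...</html>" },
    { success: true, url: "https://example.com", timestamp: 1775193386, title: "Example Domain", content: "<html>...</html>" },
];

// Walk the URLs in request order and look each one up by value.
const requested = ["https://example.com", "https://example.org"];
const byUrl = indexByUrl(sampleResponse);
for (const url of requested) {
    const entry = byUrl.get(url);
    console.log(url, "->", entry ? entry.title : "no successful result");
}
```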

***

## Handle partial failures

When some URLs succeed and others fail, the request still returns HTTP `200`.
Each failed URL produces an entry with `success: false` and every other field,
including `url`, set to `null`.

<CodeGroup>
  ```bash Request theme={"theme":"vitesse-black"}
  curl --request POST \
    --url https://api.crustdata.com/web/enrich/live \
    --header 'authorization: Bearer YOUR_API_KEY' \
    --header 'content-type: application/json' \
    --header 'x-api-version: 2025-11-01' \
    --data '{
      "urls": [
        "https://example.com",
        "https://this-domain-does-not-exist-xyz.com"
      ]
    }'
  ```

  ```json Response theme={"theme":"vitesse-black"}
  [
      {
          "success": true,
          "url": "https://example.com",
          "timestamp": 1775193366,
          "title": "Example Domain",
          "content": "<html lang=\"en\"><head><title>Example Domain</title></head><body>...</body></html>"
      },
      {
          "success": false,
          "url": null,
          "timestamp": null,
          "title": null,
          "content": null
      }
  ]
  ```
</CodeGroup>

### Correlating failures to input URLs

Failed entries have `url: null`, so you cannot directly identify which
input URL failed. To correlate failures:

1. Track the URLs you sent.
2. Collect the `url` values from all successful entries.
3. Treat every input URL that is missing from the successful set as failed.

```javascript theme={"theme":"vitesse-black"}
const requestedUrls = [
    "https://example.com",
    "https://this-domain-does-not-exist-xyz.com",
];
// fetchResponse is the parsed JSON array returned by the request above.
const successfulUrls = new Set(
    fetchResponse.filter((r) => r.success).map((r) => r.url),
);
const failedUrls = requestedUrls.filter((url) => !successfulUrls.has(url));
// failedUrls = ["https://this-domain-does-not-exist-xyz.com"]
```

<Tip>
  Always check the `success` field for each entry in the response array. Build
  your parsing logic to handle both successful and failed entries gracefully.
</Tip>

***

## Bypass Cloudflare protection with human mode

Some websites use Cloudflare to block automated requests. Set
`human_mode: true` to attempt a browser-like fetch path for these pages.

```bash theme={"theme":"vitesse-black"}
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "urls": ["https://example.com"],
    "human_mode": true
  }'
```

<Note>
  **Current platform behavior:** Cloudflare bypass is not guaranteed. Some
  sites have additional protections that may still block the request.
</Note>
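
One possible pattern is to fetch normally first and retry only the failed URLs with `human_mode: true`. The sketch below uses the endpoint and headers from the request above. `postFetch` and `fetchWithHumanModeFallback` are hypothetical helper names, and the retry strategy itself is an assumption rather than documented platform guidance.

```javascript
// Hypothetical helper: POST a batch of URLs to the endpoint shown above.
// `fetchImpl` defaults to the global `fetch` (Node 18+ and browsers).
async function postFetch(urls, { apiKey, humanMode = false, fetchImpl = fetch }) {
    const body = humanMode ? { urls, human_mode: true } : { urls };
    const response = await fetchImpl("https://api.crustdata.com/web/enrich/live", {
        method: "POST",
        headers: {
            authorization: `Bearer ${apiKey}`,
            "content-type": "application/json",
            "x-api-version": "2025-11-01",
        },
        body: JSON.stringify(body),
    });
    if (!response.ok) {
        throw new Error(`Request failed with HTTP ${response.status}`);
    }
    return response.json();
}

// Fetch normally first, then retry only the URLs that did not come back
// as successful, this time with `human_mode: true`.
async function fetchWithHumanModeFallback(urls, options) {
    const firstPass = (await postFetch(urls, options)).filter((r) => r.success);
    const done = new Set(firstPass.map((r) => r.url));
    const retryUrls = urls.filter((url) => !done.has(url));
    if (retryUrls.length === 0) {
        return { pages: firstPass, failedUrls: [] };
    }

    const secondPass = (
        await postFetch(retryUrls, { ...options, humanMode: true })
    ).filter((r) => r.success);
    for (const r of secondPass) {
        done.add(r.url);
    }

    return {
        pages: [...firstPass, ...secondPass],
        // Anything still missing failed in both passes.
        failedUrls: urls.filter((url) => !done.has(url)),
    };
}

// Usage (performs real network calls, so it is left commented out):
// const { pages, failedUrls } = await fetchWithHumanModeFallback(
//     ["https://example.com"],
//     { apiKey: "YOUR_API_KEY" },
// );
```

Because bypass is not guaranteed, `failedUrls` can still be non-empty after the second pass.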

***

## Processing fetched content

The `content` field contains the page's raw HTML. Here are common next steps:

| Task                  | Approach                                                       |
| --------------------- | -------------------------------------------------------------- |
| Extract text          | Parse HTML and strip tags (BeautifulSoup, Cheerio, etc.)       |
| Extract links         | Find all `<a>` tags and their `href` attributes                |
| Extract metadata      | Parse `<meta>` tags for SEO data (description, og:title, etc.) |
| Detect changes        | Fetch periodically and diff the `content` or `title` fields    |
| Resolve relative URLs | Combine relative paths with the base `url` from the response   |
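
As a rough illustration of the "Extract links" and "Resolve relative URLs" rows, the sketch below pulls `href` values out of `content` and resolves them against the entry's `url` with the standard `URL` constructor. `extractLinks` is a hypothetical helper and the sample page is made up. The regular expression is a simplification that handles only quoted `href` values, so a real HTML parser such as Cheerio is the safer choice for production use.

```javascript
// Rough sketch: collect absolute link targets from a fetched page.
// The regex only matches quoted href values on <a> tags; prefer a real
// HTML parser for anything beyond simple pages.
function extractLinks(entry) {
    const links = [];
    const hrefPattern = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
    for (const match of entry.content.matchAll(hrefPattern)) {
        try {
            // Resolve relative paths against the page's own `url`.
            links.push(new URL(match[2], entry.url).href);
        } catch {
            // Skip values that cannot be parsed as a URL.
        }
    }
    return links;
}

// Made-up entry shaped like a successful result.
const page = {
    success: true,
    url: "https://example.com/docs/",
    content:
        '<html><body><a href="/about">About</a> ' +
        '<a href="guide.html">Guide</a> ' +
        '<a href="https://example.org/">Other</a></body></html>',
};

const links = extractLinks(page);
console.log(links);
// links[0] is "https://example.com/about"
// links[1] is "https://example.com/docs/guide.html"
// links[2] is "https://example.org/"
```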

***

## Next steps

* [Web Fetch](/web-docs/fetch/introduction) — back to the main Fetch page.
* [Reference](/web-docs/fetch/reference) — request parameters, error handling, and common gotchas.
* [Web Search examples](/web-docs/search/examples) — search-then-fetch workflow patterns.
