This page collects ready-to-copy patterns for the Web APIs. Each example shows a real request, the response, and what to extract.

Company & profile discovery

Find domains, LinkedIn, GitHub profiles

Research & analysis

Scholar articles, authors, AI, news, social, mixed-source

Search-then-fetch workflows

End-to-end competitive intelligence

Bulk fetch

Batch-fetch up to 10 URLs

Company and profile discovery

Find a company’s website domain

Search for a company by name followed by “website”. The first result URL is typically the company’s website.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "ADAMSBROWN, LLC website",
    "sources": ["web"]
  }'
Extract: results[0].url → https://www.adamsbrowncpa.com/
Do not wrap the company name in quotes — this lets the search engine match partial name variations. If the company name is common, add city and state: "ADAMSBROWN, LLC WICHITA KS website".
Bridge to Company API: Extract the domain from the URL, then pass it to Company Enrich for the full company profile:
// Extract domain from results[0].url
const url = new URL("https://www.adamsbrowncpa.com/");
const domain = url.hostname.replace(/^www\./, ""); // "adamsbrowncpa.com" — anchored so only a leading "www." is stripped
Company Enrich request body
{ "domains": ["adamsbrowncpa.com"] }
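The same domain extraction in Python, for use in a script like the full pipeline later on this page (a sketch; `domain_from_url` is an illustrative helper name, not part of the API):

```python
from urllib.parse import urlparse

def domain_from_url(url: str) -> str:
    """Extract a bare domain for Company Enrich, dropping a leading www."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

domain_from_url("https://www.adamsbrowncpa.com/")  # "adamsbrowncpa.com"
```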

Find a company’s LinkedIn URL

Use the site parameter with linkedin.com/company to restrict results to LinkedIn company pages.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "ADAMSBROWN, LLC",
    "sources": ["web"],
    "site": "linkedin.com/company"
  }'
Extract: results[0].url → https://www.linkedin.com/company/adams-brown-cpa
Bridge to Company API: Pass the LinkedIn URL to Company Identify:
{ "professional_network_profile_urls": ["https://www.linkedin.com/company/adams-brown-cpa"] }

Find a person’s LinkedIn URL

Use the site parameter with linkedin.com/in to find a person’s LinkedIn profile, then enrich via the Person API.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "Jeff Dean Google",
    "sources": ["web"],
    "site": "linkedin.com/in"
  }'
Extract: results[0].url → https://www.linkedin.com/in/jeff-dean-8b212555
Bridge to Person API: Pass the LinkedIn URL to Person Enrich:
{ "professional_network_profile_urls": ["https://www.linkedin.com/in/jeff-dean-8b212555"] }

Find a GitHub profile

Use site: "github.com" to search for developer profiles on GitHub.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "Tyler Lambe",
    "sources": ["web"],
    "site": "github.com",
    "geolocation": "US"
  }'
Extract: results[0].url → https://github.com/tylambe

Research and analysis

Search for academic research on a topic

Search Google Scholar for articles with date filtering to find papers with citation data and PDF links.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "deep learning",
    "geolocation": "US",
    "sources": ["scholar-articles"],
    "startDate": 1672531200,
    "endDate": 1704067200
  }'
Extract:
  • citations — citation count to gauge impact.
  • pdf_url — direct PDF download link (when available).
  • authors[].profile_url — Google Scholar author profile link.
  • metadata — citation string: "Author - Year - Publisher".
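The extraction above can be sketched as a small Python helper (assumes the documented scholar-articles fields; `summarize_article` is an illustrative name, and optional fields default safely when absent):

```python
def summarize_article(article: dict) -> dict:
    """Pull the documented high-value fields from a scholar-articles result."""
    return {
        "citations": article.get("citations", 0),
        "pdf_url": article.get("pdf_url"),  # direct PDF link, may be absent
        "author_profiles": [
            a["profile_url"]
            for a in article.get("authors", [])
            if a.get("profile_url")  # profile link is not always populated
        ],
        "citation_string": article.get("metadata"),  # "Author - Year - Publisher"
    }
```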

Look up an academic researcher’s profile

Search for a researcher by name to get their full Google Scholar profile with h-index, citation metrics, and top publications.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "jeff dean",
    "geolocation": "US",
    "sources": ["scholar-author"]
  }'
Extract: citations.all for total impact, h_index.all for research quality, articles[] for top publications.

Get an AI-generated overview of a topic

Use AI mode for a synthesized answer with source references.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "uv vs pip",
    "geolocation": "US",
    "sources": ["ai"]
  }'
Extract: content for the overview text, references[].url for source verification.

Search news with date filtering

Filter news results to a specific date range by providing startDate and endDate as Unix timestamps in seconds.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "artificial intelligence developments",
    "geolocation": "US",
    "sources": ["news"],
    "startDate": 1728259200,
    "endDate": 1730937600
  }'
Response trimmed for clarity. Results are filtered to the October 7 – November 7, 2024 date range.
startDate and endDate are Unix timestamps in seconds. October 7, 2024 = 1728259200. November 7, 2024 = 1730937600.
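The timestamps above can be computed from calendar dates with the Python standard library (midnight-UTC boundaries):

```python
from datetime import datetime, timezone

def to_unix_seconds(year: int, month: int, day: int) -> int:
    """Unix timestamp (seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

start_date = to_unix_seconds(2024, 10, 7)   # 1728259200
end_date = to_unix_seconds(2024, 11, 7)     # 1730937600
```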

Search social media posts

Search for recent social media mentions of a topic or person.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "crustdata AI agents",
    "geolocation": "US",
    "sources": ["social"]
  }'
Current platform behavior: Social media results may return an empty results array for some queries depending on availability. Always check results.length before processing.
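A minimal guard for the empty-results case (a sketch; `social_urls` is an illustrative helper that tolerates both a missing and an empty results array):

```python
def social_urls(response: dict) -> list:
    """Return post URLs from a social search response, tolerating empty results."""
    results = response.get("results") or []
    return [r["url"] for r in results if r.get("url")]
```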

Search with enriched scholar articles

Use scholar-articles-enriched to get the same result shape as scholar-articles, but with richer author profile data populated.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "large language models",
    "geolocation": "US",
    "sources": ["scholar-articles-enriched"],
    "startDate": 1672531200,
    "endDate": 1704067200
  }'
The result shape is the same as scholar-articles, but authors[].profile_url and authors[].profile_id are more likely to be populated. Use scholar-articles-enriched when you need to follow up on author profiles.

Mixed-source search with safe parsing

When searching multiple sources, the results[] array contains items with different shapes. Always branch on result.source.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "machine learning infrastructure",
    "geolocation": "US",
    "sources": ["web", "news", "scholar-articles"]
  }'
Safe parsing logic:
const fetchableUrls = [];

for (const result of response.results) {
  switch (result.source) {
    case 'web':
    case 'news':
    case 'social':
      // Standard results — URL is fetchable
      fetchableUrls.push(result.url);
      break;
    case 'scholar-articles':
    case 'scholar-articles-enriched':
      // Fetch the article page URL for HTML content
      // Note: pdf_url is a direct PDF download link — handle separately outside Web Fetch
      fetchableUrls.push(result.url);
      break;
    case 'scholar-author':
      // Author profile — not a fetchable content page
      console.log(`Author: ${result.name} (${result.affiliation})`);
      break;
    case 'ai':
      // AI overview — content is inline, references have URLs
      console.log(`AI Overview: ${result.content}`);
      result.references?.forEach(ref => fetchableUrls.push(ref.url));
      break;
  }
}

// Pass fetchableUrls to the Fetch endpoint
Not every search result should go to Fetch. Scholar-author results are profiles, not content pages. AI results provide content inline and use references[].url for source URLs instead of a top-level url.

Search-then-fetch workflows

End-to-end competitive intelligence

Search for competitor news, then fetch the full article content for analysis.
Step 1: Search for competitor news

curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "OpenAI funding 2026",
    "geolocation": "US",
    "sources": ["news", "web"]
  }'
Step 2: Select URLs and fetch content

Extract URLs from the search results and pass them to the Fetch endpoint.
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "urls": [
      "https://www.reuters.com/technology/openai-funding-2026/",
      "https://techcrunch.com/2026/01/15/openai-funding/"
    ]
  }'
Step 3: Parse results and handle failures

Check success for each entry. Parse HTML from successful fetches. To identify which URLs failed, compare requested URLs against successful url values.
const requestedUrls = [
  "https://www.reuters.com/technology/openai-funding-2026/",
  "https://techcrunch.com/2026/01/15/openai-funding/"
];
const successfulUrls = new Set(
  fetchResponse.filter(r => r.success).map(r => r.url)
);

for (const result of fetchResponse) {
  if (result.success) {
    const text = extractText(result.content); // extractText: your own HTML-to-text helper
    console.log(`${result.pageTitle}: ${text.substring(0, 200)}...`);
  }
}

const failedUrls = requestedUrls.filter(u => !successfulUrls.has(u));
console.log('Failed URLs:', failedUrls);
Failed entries have url: null, so correlate failures by comparing successful URLs to your input list. See Fetch: correlating failures.

Full pipeline in Python: search → fetch → parse

A complete Python example that searches, filters fetchable URLs by source, fetches content, and handles failures.
import requests

API_KEY = "YOUR_API_KEY"
HEADERS = {
    "authorization": f"Bearer {API_KEY}",
    "content-type": "application/json",
    "x-api-version": "2025-11-01",
}

# Step 1: Search
search_resp = requests.post(
    "https://api.crustdata.com/web/search/live",
    headers=HEADERS,
    json={"query": "OpenAI funding 2026", "sources": ["web", "news"]},
).json()

# Step 2: Extract fetchable URLs (web, news, social, scholar-articles have url)
fetchable_urls = []
for result in search_resp["results"]:
    if result["source"] in ("web", "news", "social", "scholar-articles", "scholar-articles-enriched"):
        fetchable_urls.append(result["url"])
    elif result["source"] == "ai":
        # AI results: fetch reference URLs instead
        for ref in result.get("references", []):
            fetchable_urls.append(ref["url"])
    # scholar-author: no content URL to fetch

# Step 3: Fetch (max 10 URLs per request)
fetch_resp = requests.post(
    "https://api.crustdata.com/web/enrich/live",
    headers=HEADERS,
    json={"urls": fetchable_urls[:10]},
).json()

# Step 4: Process results and correlate failures
successful_urls = set()
for item in fetch_resp:
    if item["success"]:
        successful_urls.add(item["url"])
        print(f"Fetched: {item['pageTitle']} ({len(item['content'])} chars)")
    else:
        print("One URL failed to fetch")

failed_urls = [u for u in fetchable_urls[:10] if u not in successful_urls]
if failed_urls:
    print(f"Failed URLs: {failed_urls}")

AI overview → fetch source references

When using AI mode, the overview content is inline. To get the full source articles, fetch the URLs from references[].
Step 1: Search with AI mode

curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{"query": "uv vs pip", "sources": ["ai"], "geolocation": "US"}'
Extract: results[0].references[].url — the source article URLs.
Step 2: Fetch the reference URLs

curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{"urls": ["https://realpython.com/uv-vs-pip/"]}'
Parse the content from successful entries to read the full source articles.
AI results do not have a top-level url field. Always use references[].url for fetch targets.
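Building the Fetch request body from an AI result can be sketched as (assumes the AI result shape shown above; `fetch_payload_from_ai` is an illustrative name):

```python
def fetch_payload_from_ai(ai_result: dict, limit: int = 10) -> dict:
    """Collect reference URLs from an AI overview result into a Fetch request body.

    AI results have no top-level url; source URLs live under references[].url.
    """
    urls = [ref["url"] for ref in ai_result.get("references", []) if ref.get("url")]
    return {"urls": urls[:limit]}  # Fetch accepts at most 10 URLs per request
```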

Bulk fetch

Batch-fetch multiple webpages

Fetch up to 10 URLs in a single request for bulk content extraction.
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "urls": [
      "https://example.com",
      "https://example.org",
      "https://example.net",
      "https://www.crustdata.com",
      "https://docs.crustdata.com"
    ]
  }'
Processing tips:
  • Match results by the url field — the response order may differ from the request.
  • Check success for each entry before processing content.
  • Use pageTitle for quick identification without parsing HTML.
  • For larger batches (>10 URLs), split into multiple requests of 10 each.
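Splitting a larger URL list into requests of 10, per the tips above (a sketch; `post_fetch` stands in for your HTTP call to the Fetch endpoint):

```python
def chunk(items, size=10):
    """Yield successive fixed-size slices of a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def fetch_all(urls, post_fetch):
    """Fetch every URL in batches of 10, merging results keyed by url.

    post_fetch is assumed to take {"urls": [...]} and return the Fetch
    endpoint's list of per-URL result objects.
    """
    by_url = {}
    for batch in chunk(urls):
        for item in post_fetch({"urls": batch}):
            if item.get("url"):  # failed entries have url: null — skip them here
                by_url[item["url"]] = item
    return by_url
```

Keying by `url` rather than by position handles the fact that response order may differ from request order.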

Next steps

  • Web Search reference — full request/response contract, all parameters, result shapes by source, field-presence matrix.
  • Web Fetch reference — full request/response contract, partial failure handling, content processing guidance.
  • Web APIs Quickstart — overview, common workflows, and getting started.