This page collects ready-to-copy patterns for the Web APIs. Each example shows a real request, the response, and what to extract.
Company & profile discovery: Find domains, LinkedIn, GitHub profiles
Research & analysis: Scholar articles, authors, AI, news, social, mixed-source
Search-then-fetch workflows: End-to-end competitive intelligence
Bulk fetch: Batch-fetch up to 10 URLs
Company and profile discovery
Find a company’s website domain
Search for a company by name followed by “website”. The first result URL is typically the company’s website.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "ADAMSBROWN, LLC website",
    "sources": ["web"]
  }'
Extract: results[0].url → https://www.adamsbrowncpa.com/
Do not wrap the company name in quotes — this lets the search engine
match partial name variations. If the company name is common, add city
and state: "ADAMSBROWN, LLC WICHITA KS website".
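The query convention above can be sketched as a small Python helper. `website_query` and `search_payload` are hypothetical names introduced here for illustration; only the `query` and `sources` fields come from the request shown above.

```python
def website_query(name: str, city: str = "", state: str = "") -> str:
    """Build an unquoted '<name> [city] [state] website' search query."""
    parts = [name.strip(), city.strip(), state.strip(), "website"]
    # Skip empty parts so a missing city or state leaves no double spaces.
    return " ".join(part for part in parts if part)


def search_payload(name: str, city: str = "", state: str = "") -> dict:
    """Request body for the search endpoint, restricted to web results."""
    return {"query": website_query(name, city, state), "sources": ["web"]}


print(search_payload("ADAMSBROWN, LLC", "WICHITA", "KS"))
```

Leaving the name unquoted is deliberate: the helper never adds quote characters, so partial name variations can still match.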
Bridge to Company API: Extract the domain from the URL, then pass it to Company Enrich for the full company profile:
// Extract domain from results[0].url
const url = new URL("https://www.adamsbrowncpa.com/");
// Strip only a leading "www." so other parts of the hostname are left alone
const domain = url.hostname.replace(/^www\./, ""); // "adamsbrowncpa.com"
Company Enrich request body
{"domains": ["adamsbrowncpa.com"]}
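The same bridge can be sketched in Python with only the standard library. `domain_from_url` and `company_enrich_body` are hypothetical helper names; the body shape is the `domains` array shown above.

```python
from urllib.parse import urlparse


def domain_from_url(url: str) -> str:
    """Return the hostname of a URL without a leading 'www.' label."""
    host = urlparse(url.strip()).hostname or ""
    return host[4:] if host.startswith("www.") else host


def company_enrich_body(url: str) -> dict:
    """Build the Company Enrich request body from a search result URL."""
    return {"domains": [domain_from_url(url)]}


print(company_enrich_body("https://www.adamsbrowncpa.com/"))
```

Only a leading `www.` is removed, so subdomains such as `docs.` are kept as part of the domain.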
Find a company’s LinkedIn URL
Use the site parameter with linkedin.com/company to restrict results to LinkedIn company pages.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "ADAMSBROWN, LLC",
    "sources": ["web"],
    "site": "linkedin.com/company"
  }'
Extract: results[0].url → https://www.linkedin.com/company/adams-brown-cpa
Bridge to Company API: Pass the LinkedIn URL to Company Identify:
{"professional_network_profile_urls": ["https://www.linkedin.com/company/adams-brown-cpa"]}
Find a person’s LinkedIn URL
Use the site parameter with linkedin.com/in to find a person’s LinkedIn profile, then enrich via the Person API.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "Jeff Dean Google",
    "sources": ["web"],
    "site": "linkedin.com/in"
  }'
Extract: results[0].url → https://www.linkedin.com/in/jeff-dean-8b212555
Bridge to Person API: Pass the LinkedIn URL to Person Enrich:
{"professional_network_profile_urls": ["https://www.linkedin.com/in/jeff-dean-8b212555"]}
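Because the first result is not guaranteed to be a profile page, it can help to check the URL path before bridging. The sketch below assumes each result carries a `url` field as in the Extract lines above; `first_linkedin_url` and `enrich_body` are hypothetical helpers, and the sample results are made up.

```python
from urllib.parse import urlparse


def first_linkedin_url(results, kind):
    """Return the first LinkedIn URL under /<kind>/ ('in' or 'company'), else None."""
    prefix = f"/{kind}/"
    for item in results:
        parsed = urlparse(item.get("url", ""))
        host = parsed.hostname or ""
        on_linkedin = host == "linkedin.com" or host.endswith(".linkedin.com")
        if on_linkedin and parsed.path.startswith(prefix):
            return item["url"]
    return None


def enrich_body(profile_url):
    """Request body shape used by the Identify and Enrich bridges above."""
    return {"professional_network_profile_urls": [profile_url]}


# Hypothetical search results for illustration only.
sample = [
    {"url": "https://www.linkedin.com/pulse/some-post"},
    {"url": "https://www.linkedin.com/in/jeff-dean-8b212555"},
]
match = first_linkedin_url(sample, "in")
if match:
    print(enrich_body(match))
```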
Find a GitHub profile
Use site: "github.com" to search for developer profiles on GitHub.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "Tyler Lambe",
    "sources": ["web"],
    "site": "github.com",
    "geolocation": "US"
  }'
Extract: results[0].url → https://github.com/tylambe
Research and analysis
Search for academic research on a topic
Search Google Scholar for articles with date filtering to find papers with citation data and PDF links.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "deep learning",
    "geolocation": "US",
    "sources": ["scholar-articles"],
    "startDate": 1672531200,
    "endDate": 1704067200
  }'
Extract:
citations — citation count to gauge impact.
pdf_url — direct PDF download link (when available).
authors[].profile_url — Google Scholar author profile link.
metadata — citation string: "Author - Year - Publisher".
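The fields listed above can be pulled into a compact, citation-ranked summary. This is a sketch under assumptions: `citations` is treated as an integer count, `title` is an assumed field name not confirmed on this page, and the sample items are invented placeholders.

```python
def summarize_articles(results):
    """Collect the fields listed above and rank articles by citation count."""
    rows = []
    for item in results:
        authors = item.get("authors") or []
        rows.append({
            "title": item.get("title"),  # assumed field name
            "citations": item.get("citations") or 0,  # assumed to be an int
            "pdf_url": item.get("pdf_url"),  # may be absent
            "author_profiles": [a["profile_url"] for a in authors if a.get("profile_url")],
            "metadata": item.get("metadata"),
        })
    return sorted(rows, key=lambda row: row["citations"], reverse=True)


# Placeholder items for illustration; not real API output.
sample = [
    {"title": "Paper A", "citations": 12, "authors": [{"profile_url": None}]},
    {"title": "Paper B", "citations": 340, "pdf_url": "https://example.org/b.pdf",
     "authors": [{"profile_url": "https://example.org/author-1"}]},
]
for row in summarize_articles(sample):
    print(row["citations"], row["title"], row["pdf_url"])
```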
Look up an academic researcher’s profile
Search for a researcher by name to get their full Google Scholar profile with h-index, citation metrics, and top publications.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "jeff dean",
    "geolocation": "US",
    "sources": ["scholar-author"]
  }'
Extract: citations.all for total impact, h_index.all for research quality, articles[] for top publications.
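A defensive way to read these nested fields is sketched below. `author_metrics` is a hypothetical helper, the `name` field is the one used in the parsing example elsewhere on this page, and the sample values are placeholders rather than real metrics.

```python
def author_metrics(result, top_n=3):
    """Read citations.all, h_index.all and the first top_n articles, tolerating gaps."""
    citations = result.get("citations") or {}
    h_index = result.get("h_index") or {}
    return {
        "name": result.get("name"),
        "total_citations": citations.get("all"),
        "h_index": h_index.get("all"),
        "top_articles": (result.get("articles") or [])[:top_n],
    }


# Placeholder profile for illustration; the numbers are invented.
sample = {
    "name": "Example Author",
    "citations": {"all": 1000},
    "h_index": {"all": 20},
    "articles": [{"title": "A"}, {"title": "B"}, {"title": "C"}, {"title": "D"}],
}
print(author_metrics(sample))
```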
Get an AI-generated overview of a topic
Use AI mode for a synthesized answer with source references.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "uv vs pip",
    "geolocation": "US",
    "sources": ["ai"]
  }'
Extract: content for the overview text, references[].url for source verification.
Search news with date filtering
Filter news results to a specific date range by providing startDate and endDate as Unix timestamps in seconds.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "artificial intelligence developments",
    "geolocation": "US",
    "sources": ["news"],
    "startDate": 1728259200,
    "endDate": 1730937600
  }'
Results are filtered to the October 7 to November 7, 2024 date range.
startDate and endDate are Unix timestamps in seconds. October 7, 2024 = 1728259200 and November 7, 2024 = 1730937600 (both at 00:00 UTC).
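The timestamps above correspond to midnight UTC on each date, so they can be computed rather than hard-coded. A minimal sketch using the standard library; `to_unix_seconds` and `news_payload` are hypothetical helper names.

```python
from datetime import datetime, timezone


def to_unix_seconds(year: int, month: int, day: int) -> int:
    """Unix timestamp in seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def news_payload(query: str, start: int, end: int) -> dict:
    """Request body for a date-filtered news search, as in the example above."""
    return {
        "query": query,
        "geolocation": "US",
        "sources": ["news"],
        "startDate": start,
        "endDate": end,
    }


start = to_unix_seconds(2024, 10, 7)
end = to_unix_seconds(2024, 11, 7)
print(news_payload("artificial intelligence developments", start, end))
```

Passing an explicit `timezone.utc` avoids the local-time offset that a naive `datetime` would introduce.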
Search social media posts
Search for recent social media mentions of a topic or person.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "crustdata AI agents",
    "geolocation": "US",
    "sources": ["social"]
  }'
Current platform behavior: Social media results may return an empty
results array for some queries depending on availability. Always check
results.length before processing.
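In Python, the equivalent of the `results.length` check might look like the sketch below. `safe_results` is a hypothetical helper; it also treats a missing or null `results` key as empty, which is an extra precaution rather than documented behavior.

```python
def safe_results(response: dict) -> list:
    """Return the results list, or an empty list when it is missing or not a list."""
    results = response.get("results")
    return results if isinstance(results, list) else []


# An empty social response simply skips the loop body.
for post in safe_results({"results": []}):
    print(post.get("url"))
```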
Search with enriched scholar articles
Use scholar-articles-enriched to get the same result shape as scholar-articles, but with richer author profile data populated.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "large language models",
    "geolocation": "US",
    "sources": ["scholar-articles-enriched"],
    "startDate": 1672531200,
    "endDate": 1704067200
  }'
The result shape is the same as scholar-articles, but authors[].profile_url and
authors[].profile_id are more likely to be populated. Use scholar-articles-enriched
when you need to follow up on author profiles.
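To follow up on author profiles, the populated `profile_id` and `profile_url` pairs can be collected once per author. A sketch with a hypothetical helper name and invented sample data:

```python
def author_profiles(results):
    """Map each populated authors[].profile_id to its profile_url, first seen wins."""
    profiles = {}
    for item in results:
        for author in item.get("authors") or []:
            profile_id = author.get("profile_id")
            if profile_id and profile_id not in profiles:
                profiles[profile_id] = author.get("profile_url")
    return profiles


# Placeholder results for illustration; not real API output.
sample = [
    {"authors": [{"profile_id": "abc", "profile_url": "https://example.org/abc"},
                 {"profile_id": None, "profile_url": None}]},
    {"authors": [{"profile_id": "abc", "profile_url": "https://example.org/abc"},
                 {"profile_id": "xyz", "profile_url": "https://example.org/xyz"}]},
]
print(author_profiles(sample))
```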
Mixed-source search with safe parsing
When searching multiple sources, the results[] array contains items with different shapes. Always branch on result.source.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "machine learning infrastructure",
    "geolocation": "US",
    "sources": ["web", "news", "scholar-articles"]
  }'
Safe parsing logic:
const fetchableUrls = [];
for (const result of response.results) {
  switch (result.source) {
    case 'web':
    case 'news':
    case 'social':
      // Standard results: URL is fetchable
      fetchableUrls.push(result.url);
      break;
    case 'scholar-articles':
    case 'scholar-articles-enriched':
      // Fetch the article page URL for HTML content
      // Note: pdf_url is a direct PDF download link; handle it separately, outside Web Fetch
      fetchableUrls.push(result.url);
      break;
    case 'scholar-author':
      // Author profile: not a fetchable content page
      console.log(`Author: ${result.name} (${result.affiliation})`);
      break;
    case 'ai':
      // AI overview: content is inline, references have URLs
      console.log(`AI Overview: ${result.content}`);
      result.references?.forEach(ref => fetchableUrls.push(ref.url));
      break;
  }
}
// Pass fetchableUrls to the Fetch endpoint
Not every search result should go to Fetch. Scholar-author results are
profiles, not content pages. AI results provide content inline and use
references[].url for source URLs instead of a top-level url.
Search-then-fetch workflows
End-to-end competitive intelligence
Search for competitor news, then fetch the full article content for analysis.
Search for competitor news
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "OpenAI funding 2026",
    "geolocation": "US",
    "sources": ["news", "web"]
  }'
Select URLs and fetch content
Extract URLs from the search results and pass them to the Fetch endpoint.
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "urls": [
      "https://www.reuters.com/technology/openai-funding-2026/",
      "https://techcrunch.com/2026/01/15/openai-funding/"
    ]
  }'
Parse results and handle failures
Check success for each entry. Parse HTML from successful fetches. To identify which URLs failed, compare requested URLs against successful url values.
const requestedUrls = [
  "https://www.reuters.com/technology/openai-funding-2026/",
  "https://techcrunch.com/2026/01/15/openai-funding/"
];
// fetchResponse is the parsed JSON array returned by the Fetch endpoint
const successfulUrls = new Set(
  fetchResponse.filter(r => r.success).map(r => r.url)
);
for (const result of fetchResponse) {
  if (result.success) {
    // extractText is a placeholder for your own HTML-to-text helper
    const text = extractText(result.content);
    console.log(`${result.pageTitle}: ${text.substring(0, 200)}...`);
  }
}
const failedUrls = requestedUrls.filter(u => !successfulUrls.has(u));
console.log('Failed URLs:', failedUrls);
Full pipeline in Python: search → fetch → parse
A complete Python example that searches, filters fetchable URLs by source, fetches content, and handles failures.
import requests

API_KEY = "YOUR_API_KEY"
HEADERS = {
    "authorization": f"Bearer {API_KEY}",
    "content-type": "application/json",
    "x-api-version": "2025-11-01",
}

# Step 1: Search
search_resp = requests.post(
    "https://api.crustdata.com/web/search/live",
    headers=HEADERS,
    json={"query": "OpenAI funding 2026", "sources": ["web", "news"]},
).json()

# Step 2: Extract fetchable URLs (web, news, social, scholar-articles have url)
fetchable_urls = []
for result in search_resp["results"]:
    if result["source"] in ("web", "news", "social", "scholar-articles", "scholar-articles-enriched"):
        fetchable_urls.append(result["url"])
    elif result["source"] == "ai":
        # AI results: fetch reference URLs instead
        for ref in result.get("references", []):
            fetchable_urls.append(ref["url"])
    # scholar-author: no content URL to fetch

# Step 3: Fetch (max 10 URLs per request)
fetch_resp = requests.post(
    "https://api.crustdata.com/web/enrich/live",
    headers=HEADERS,
    json={"urls": fetchable_urls[:10]},
).json()

# Step 4: Process results and correlate failures
successful_urls = set()
for item in fetch_resp:
    if item["success"]:
        successful_urls.add(item["url"])
        print(f"Fetched: {item['pageTitle']} ({len(item['content'])} chars)")
    else:
        print("One URL failed to fetch")

failed_urls = [u for u in fetchable_urls[:10] if u not in successful_urls]
if failed_urls:
    print(f"Failed URLs: {failed_urls}")
AI overview → fetch source references
When using AI mode, the overview content is inline. To get the full source articles, fetch the URLs from references[].
Search with AI mode
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{"query": "uv vs pip", "sources": ["ai"], "geolocation": "US"}'
Extract: results[0].references[].url — the source article URLs.
Fetch the reference URLs
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{"urls": ["https://realpython.com/uv-vs-pip/"]}'
Parse the content from successful entries to read the full source articles.
AI results do not have a top-level url field. Always use references[].url
for fetch targets.
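Gathering the fetch targets from an AI result can be sketched as follows. `reference_urls` is a hypothetical helper; it deduplicates URLs and caps the list at the 10-URL Fetch limit mentioned on this page. The sample result is a placeholder.

```python
def reference_urls(results, limit=10):
    """Collect unique references[].url values from 'ai' results, in order."""
    urls = []
    for item in results:
        if item.get("source") != "ai":
            continue  # other sources expose a top-level url instead
        for ref in item.get("references") or []:
            url = ref.get("url")
            if url and url not in urls:
                urls.append(url)
    return urls[:limit]


# Placeholder AI result for illustration only.
sample = [{
    "source": "ai",
    "content": "overview text",
    "references": [
        {"url": "https://realpython.com/uv-vs-pip/"},
        {"url": "https://realpython.com/uv-vs-pip/"},
        {"url": "https://example.org/other"},
    ],
}]
print({"urls": reference_urls(sample)})
```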
Bulk fetch
Batch-fetch multiple webpages
Fetch up to 10 URLs in a single request for bulk content extraction.
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "urls": [
      "https://example.com",
      "https://example.org",
      "https://example.net",
      "https://www.crustdata.com",
      "https://docs.crustdata.com"
    ]
  }'
Processing tips:
Match results by the url field — the response order may differ from the request.
Check success for each entry before processing content.
Use pageTitle for quick identification without parsing HTML.
For larger batches (>10 URLs), split into multiple requests of 10 each.
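The batching and URL-matching tips above can be sketched as two small helpers. `chunked` and `index_successes` are hypothetical names; the response entries are assumed to carry `url` and `success` as described in the tips, and the sample response is a placeholder.

```python
def chunked(urls, size=10):
    """Split a URL list into consecutive batches of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def index_successes(fetch_response):
    """Key successful entries by url so lookups do not depend on response order."""
    return {
        item["url"]: item
        for item in fetch_response
        if item.get("success") and item.get("url")
    }


urls = [f"https://example.com/page-{n}" for n in range(25)]
print([len(batch) for batch in chunked(urls)])  # [10, 10, 5]

# Placeholder response for illustration; order differs from the request.
response = [
    {"url": "https://example.org", "success": True, "pageTitle": "Org"},
    {"url": "https://example.com", "success": False},
]
print(sorted(index_successes(response)))
```

Each batch would then be sent as its own request to the Fetch endpoint, with the per-batch indexes merged afterwards.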
Next steps
Web Search reference — full request/response contract, all parameters, result shapes by source, field-presence matrix.
Web Fetch reference — full request/response contract, partial failure handling, content processing guidance.
Web APIs Quickstart — overview, common workflows, and getting started.