Ready-to-copy patterns for Web Search. Each example shows
a real request, the response, and what to extract.
Restrict results to a specific site
Use the site parameter to limit results to a single domain. Useful for
finding company pages on LinkedIn, profiles on GitHub, or content on a
specific website.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "ADAMSBROWN, LLC",
    "sources": ["web"],
    "site": "linkedin.com/company"
  }'
Extract: The first result URL is typically the best match. For company
profile URLs, pass the result to
Company Identify for a full profile.
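The extraction step can be sketched in Python as follows. This is a minimal, defensive sketch rather than an official client: it assumes the response has already been parsed into a dict whose results items carry a url field, and the helper name first_url_under is hypothetical. The sample URLs are made up for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def first_url_under(results: list, site: str) -> Optional[str]:
    """Return the first result URL whose host and path fall under `site`."""
    host, _, path = site.lower().partition("/")
    prefix = "/" + path.strip("/") if path else ""
    for result in results:
        url = result.get("url")
        if not url:
            continue
        parsed = urlparse(url)
        netloc = (parsed.hostname or "").lower()
        # Accept the bare host and any subdomain such as "www.".
        host_ok = netloc == host or netloc.endswith("." + host)
        path_ok = (
            not prefix
            or parsed.path == prefix
            or parsed.path.startswith(prefix + "/")
        )
        if host_ok and path_ok:
            return url
    return None


# Illustrative sample data, not a real API response.
sample_results = [
    {"url": "https://example.com/company/example-co"},
    {"url": "https://www.linkedin.com/company/example-co"},
]
best_match = first_url_under(sample_results, "linkedin.com/company")
```

Re-checking the host and path is optional when the site parameter is set, but it guards against passing an unexpected URL downstream.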
Search with date filtering
Use start_date and end_date (Unix timestamps in seconds) to limit
results to a specific time range.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "distributed systems",
    "location": "US",
    "sources": ["web", "news"],
    "site": "example.com",
    "start_date": 1728259200,
    "end_date": 1730937600
  }'
Convert dates to Unix timestamps in seconds: October 7, 2024 (00:00 UTC) =
1728259200 and November 7, 2024 (00:00 UTC) = 1730937600. Any Unix
timestamp converter tool works.
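If you prefer to compute the values in code, the conversion can be sketched with the Python standard library. The timestamps used on this page correspond to midnight UTC; the helper name to_unix_seconds is hypothetical.

```python
from datetime import datetime, timezone


def to_unix_seconds(year: int, month: int, day: int) -> int:
    """Unix timestamp in seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


start_date = to_unix_seconds(2024, 10, 7)  # 1728259200
end_date = to_unix_seconds(2024, 11, 7)    # 1730937600
```

Passing tzinfo explicitly keeps the result independent of the machine's local time zone.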
Multi-page search
Use page to request multiple result pages in a single response. The
metadata object tells you which pages succeeded.
Request
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "artificial intelligence startups",
    "sources": ["web"],
    "location": "US",
    "page": 3
  }'
Response (with page 2 failure)
Response trimmed for clarity. Page 1 succeeded, page 2 failed, and page 3 was
empty.
The response aggregates results across all successful pages into a single
results[] array. Check metadata to understand page-level outcomes:
metadata.totalResults — total results available across all sources and pages.
metadata.failedPages — page numbers that returned errors. Retry the request with a smaller page value if needed.
metadata.emptyPages — page numbers that returned no results. You have reached the end of available results — do not retry.
if (response.metadata.failedPages.length > 0) {
  console.log("Failed pages:", response.metadata.failedPages);
}
if (response.metadata.emptyPages.length > 0) {
  console.log(
    "Reached end of results at page",
    Math.min(...response.metadata.emptyPages),
  );
}
Current platform behavior (not guaranteed by the OpenAPI contract): Each
page returns approximately 10 results. If metadata.emptyPages contains
page numbers, you have reached the end of available results.
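The same page-level checks can be sketched in Python. This assumes the response has been parsed into a dict and that metadata carries failedPages and emptyPages lists as described above; the helper name page_outcomes and the sample values are illustrative, not part of the API.

```python
def page_outcomes(metadata: dict) -> dict:
    """Summarize which pages failed and where the results ran out."""
    failed = sorted(metadata.get("failedPages") or [])
    empty = sorted(metadata.get("emptyPages") or [])
    return {
        "failed_pages": failed,
        # The lowest empty page marks the end of available results.
        "end_of_results_at": empty[0] if empty else None,
    }


# Illustrative metadata matching the scenario above (page 2 failed, page 3 empty).
sample_metadata = {"totalResults": 10, "failedPages": [2], "emptyPages": [3]}
outcome = page_outcomes(sample_metadata)
```

Failed pages are candidates for a retry, while an end_of_results_at value means further pages are unlikely to return anything.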
Company and profile discovery
Find a company’s website domain
Search for a company by name followed by “website”. The first result URL is
typically the company’s website.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "ADAMSBROWN, LLC website",
    "sources": ["web"]
  }'
Extract: results[0].url → https://www.adamsbrowncpa.com/
Do not wrap the company name in quotes — this lets the search engine
match partial name variations. If the company name is common, add city and
state: "ADAMSBROWN, LLC WICHITA KS website".
Bridge to Company API: Extract the domain from the URL, then pass it to
Company Enrich for the full company profile:
const url = new URL("https://www.adamsbrowncpa.com/");
// Strip only a leading "www." so the rest of the hostname is left untouched.
const domain = url.hostname.replace(/^www\./, ""); // "adamsbrowncpa.com"
Company Enrich request body
{ "domains": ["adamsbrowncpa.com"] }
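For Python callers, the domain extraction and request body can be sketched with the standard library. The helper name domain_from_url is hypothetical; the domains key comes from the Company Enrich body shown above.

```python
from urllib.parse import urlparse


def domain_from_url(url: str) -> str:
    """Return the hostname of `url` without a leading "www." prefix."""
    hostname = (urlparse(url).hostname or "").lower()
    return hostname[4:] if hostname.startswith("www.") else hostname


domain = domain_from_url("https://www.adamsbrowncpa.com/")
enrich_body = {"domains": [domain]}
```

Only a leading "www." is removed, so other subdomains are preserved as returned by search.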
Find a person’s profile URL
Use the site parameter with the profile host (e.g. linkedin.com/in) to
find a person’s profile, then enrich via the Person API.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "Jeff Dean Google",
    "sources": ["web"],
    "site": "linkedin.com/in"
  }'
Bridge to Person API: Pass the profile URL to
Person Enrich:
{
  "professional_network_profile_urls": [
    "https://www.linkedin.com/in/jeff-dean-8b212555"
  ]
}
Find a developer profile
Use site: "github.com" to search for developer profiles on GitHub.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "Tyler Lambe",
    "sources": ["web"],
    "site": "github.com",
    "location": "US"
  }'
Research and analysis
Search for academic research on a topic
Search for academic articles with date filtering to find papers with
citation data and PDF links.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "deep learning",
    "location": "US",
    "sources": ["scholar-articles"],
    "start_date": 1672531200,
    "end_date": 1704067200
  }'
Extract:
citations — citation count to gauge impact.
pdf_url — direct PDF download link (when available).
authors[].profile_url — author profile link.
metadata — citation string: "Author - Year - Publisher".
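One way to use these fields is to rank articles by citation count and collect the PDF links that are present. The sketch below assumes citations is an integer and that pdf_url may be missing or null; the title field, the helper names, and the sample entries are illustrative assumptions rather than documented API output.

```python
def rank_by_citations(results: list) -> list:
    """Sort article results by citation count, highest first."""
    return sorted(results, key=lambda r: r.get("citations") or 0, reverse=True)


def pdf_links(results: list) -> list:
    """Collect pdf_url values, skipping results that have none."""
    return [r["pdf_url"] for r in results if r.get("pdf_url")]


# Made-up sample entries; "title" is an assumed field used only for display.
sample_articles = [
    {"title": "A", "citations": 12, "pdf_url": None},
    {"title": "B", "citations": 340, "pdf_url": "https://example.org/b.pdf"},
    {"title": "C"},
]
ranked_titles = [a["title"] for a in rank_by_citations(sample_articles)]
links = pdf_links(sample_articles)
```

Treating a missing citation count as zero keeps incomplete entries in the list rather than dropping them.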
Look up an academic researcher’s profile
Search for a researcher by name to get their full academic profile with
h-index, citation metrics, and top publications.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "jeff dean",
    "location": "US",
    "sources": ["scholar-author"]
  }'
Extract: citations.all for total impact, h_index.all for research
quality, articles[] for top publications.
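A minimal Python sketch of that extraction, assuming the author result is a dict with nested citations.all and h_index.all values and an articles list. The name field appears in the parsing example elsewhere on this page; the helper name summarize_author and the sample numbers are made up for illustration.

```python
def summarize_author(result: dict) -> dict:
    """Pull the headline metrics out of a scholar-author result."""
    return {
        "name": result.get("name"),
        "total_citations": (result.get("citations") or {}).get("all"),
        "h_index": (result.get("h_index") or {}).get("all"),
        "article_count": len(result.get("articles") or []),
    }


# Made-up sample values, not real metrics.
sample_author = {
    "source": "scholar-author",
    "name": "Example Researcher",
    "citations": {"all": 1000},
    "h_index": {"all": 20},
    "articles": [{}, {}, {}],
}
summary = summarize_author(sample_author)
```

Using .get throughout means a partially populated profile yields None values instead of raising KeyError.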
Get an AI-generated overview of a topic
Use deep research mode for a synthesized answer with source references.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "uv vs pip",
    "location": "US",
    "sources": ["ai"]
  }'
Extract: content for the overview text, references[].url for source
verification.
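In Python, pulling out the overview text and its source URLs might look like the sketch below. It assumes an ai result is a dict with a content string and a references list of objects with a url field, as described above; the helper name overview_and_sources and the sample content text are illustrative.

```python
def overview_and_sources(result: dict) -> tuple:
    """Return (overview text, list of reference URLs) for an ai result."""
    content = result.get("content") or ""
    urls = [
        ref["url"]
        for ref in result.get("references") or []
        if ref.get("url")
    ]
    return content, urls


# Illustrative result; the content string is a placeholder.
sample_ai_result = {
    "source": "ai",
    "content": "Placeholder overview text.",
    "references": [{"url": "https://realpython.com/uv-vs-pip/"}, {}],
}
overview, source_urls = overview_and_sources(sample_ai_result)
```

References without a url are skipped so the returned list can be passed on for fetching without further filtering.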
Search news with date filtering
Filter news results to a specific date range by providing start_date and
end_date as Unix timestamps in seconds.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "artificial intelligence developments",
    "location": "US",
    "sources": ["news"],
    "start_date": 1728259200,
    "end_date": 1730937600
  }'
start_date and end_date are Unix timestamps in seconds. October 7,
2024 = 1728259200. November 7, 2024 = 1730937600.
Search social media posts
Search for recent social media mentions of a topic or person.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "crustdata AI agents",
    "location": "US",
    "sources": ["social"]
  }'
Current platform behavior: Social media results may return an empty
results array for some queries depending on availability. Always check
results.length before processing.
Mixed-source search with safe parsing
When searching multiple sources, the results[] array contains items with
different shapes. Always branch on result.source.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "machine learning infrastructure",
    "location": "US",
    "sources": ["web", "news", "scholar-articles"]
  }'
Safe parsing logic:
const fetchableUrls = [];
for (const result of response.results) {
  switch (result.source) {
    case "web":
    case "news":
    case "social":
      fetchableUrls.push(result.url);
      break;
    case "scholar-articles":
    case "scholar-articles-enriched":
      fetchableUrls.push(result.url);
      break;
    case "scholar-author":
      console.log(`Author: ${result.name} (${result.affiliation})`);
      break;
    case "ai":
      console.log(`AI Overview: ${result.content}`);
      result.references?.forEach((ref) => fetchableUrls.push(ref.url));
      break;
  }
}
Not every search result should go to Fetch. Academic author results are
profiles, not content pages. Deep research results provide content inline
and use references[].url for source URLs instead of a top-level url.
Search-then-fetch workflows
End-to-end competitive intelligence
Search for competitor news, then fetch the full article content for
analysis.
Search for competitor news
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "OpenAI funding 2026",
    "location": "US",
    "sources": ["news", "web"]
  }'
Extract URLs from results[].url.
Select URLs and fetch content
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "urls": [
      "https://www.reuters.com/technology/openai-funding-2026/",
      "https://techcrunch.com/2026/01/15/openai-funding/"
    ]
  }'
Parse results and handle failures
Check success for each entry. Parse HTML from successful fetches. To
identify which URLs failed, compare requested URLs against successful
url values.
const successfulUrls = new Set(
  fetchResponse.filter((r) => r.success).map((r) => r.url),
);
const failedUrls = requestedUrls.filter((u) => !successfulUrls.has(u));
Full pipeline in Python: search → fetch → parse
A complete Python example that searches, filters fetchable URLs by source,
fetches content, and handles failures.
import requests

API_KEY = "YOUR_API_KEY"
HEADERS = {
    "authorization": f"Bearer {API_KEY}",
    "content-type": "application/json",
    "x-api-version": "2025-11-01",
}

# Step 1: Search
search_resp = requests.post(
    "https://api.crustdata.com/web/search/live",
    headers=HEADERS,
    json={"query": "OpenAI funding 2026", "sources": ["web", "news"]},
).json()

# Step 2: Extract fetchable URLs
fetchable_urls = []
for result in search_resp["results"]:
    if result["source"] in ("web", "news", "social", "scholar-articles", "scholar-articles-enriched"):
        fetchable_urls.append(result["url"])
    elif result["source"] == "ai":
        for ref in result.get("references", []):
            fetchable_urls.append(ref["url"])

# Step 3: Fetch (max 10 URLs per request)
requested_urls = fetchable_urls[:10]
fetch_resp = requests.post(
    "https://api.crustdata.com/web/enrich/live",
    headers=HEADERS,
    json={"urls": requested_urls},
).json()

# Step 4: Process results and correlate failures
successful_urls = set()
for item in fetch_resp:
    if item["success"]:
        successful_urls.add(item["url"])
        print(f"Fetched: {item['title']} ({len(item['content'])} chars)")

failed_urls = [u for u in requested_urls if u not in successful_urls]
if failed_urls:
    print(f"Failed URLs: {failed_urls}")
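The pipeline above fetches only the first 10 URLs. When a search yields more, one option is to split the list into batches and send one fetch request per batch. The sketch below takes the limit of 10 from the pipeline comment; the helper name batched_urls and the example.com URLs are illustrative, and removing duplicates first is a design choice rather than an API requirement.

```python
from typing import Iterator, List

MAX_URLS_PER_REQUEST = 10  # limit taken from the pipeline comment above


def batched_urls(
    urls: List[str], size: int = MAX_URLS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield de-duplicated URLs in order, in chunks of at most `size`."""
    unique = list(dict.fromkeys(urls))  # preserves first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


# Illustrative input: 23 distinct URLs plus one duplicate.
urls = [f"https://example.com/{i}" for i in range(23)] + ["https://example.com/0"]
batch_sizes = [len(batch) for batch in batched_urls(urls)]
```

Each yielded batch can be passed as the urls value of a separate fetch request, with failures correlated per batch as in Step 4.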
AI overview → fetch source references
When using deep research mode, the overview content is inline. To get the
full source articles, fetch the URLs from references[].
Search with deep research mode
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{"query": "uv vs pip", "sources": ["ai"], "location": "US"}'
Extract: results[0].references[].url — the source article URLs.
Fetch the reference URLs
curl --request POST \
  --url https://api.crustdata.com/web/enrich/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{"urls": ["https://realpython.com/uv-vs-pip/"]}'
Parse the content from successful entries to read the full source
articles.
AI results do not have a top-level url field. Always use
references[].url for fetch targets.
Next steps
Source types — full result shapes and field presence for each source.
Web Search — main Search page with request/response and errors.
Web Fetch — fetch the HTML content of URLs returned by search results.