Use this when you want to find web pages, news articles, academic papers, author profiles, AI-generated overviews, or social media posts matching a search query. The Web Search API accepts a query and returns results from one or more source types. The result shape varies by source — always specify sources explicitly when you need predictable parsing. Every request goes to the same endpoint:
POST https://api.crustdata.com/web/search/live

Request body

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | | Search query text. Max 5,000 characters. Supports search operators like site: and filetype:. |
| geolocation | string | No | | ISO 3166-1 alpha-2 country code for region-specific results (e.g., "US", "GB", "JP"). |
| sources | string[] | No | | Sources to query: web, news, scholar-articles, scholar-articles-enriched, scholar-author, ai, social. Current platform behavior: omitting this field searches all sources. |
| site | string | No | | Restrict results to a domain (e.g., "linkedin.com/company", "github.com"). Max 500 characters. |
| startDate | integer | No | | Unix timestamp (seconds). Only results after this date. |
| endDate | integer | No | | Unix timestamp (seconds). Only results before this date. Must be greater than startDate. |
| numPages | integer | No | 1 | Number of result pages to return. Minimum: 1. |
| solveCloudflare | boolean | No | false | Current platform behavior: Attempt to bypass Cloudflare protection when fetching result page content. Affects content retrieval, not search discovery itself. Not guaranteed to succeed. |
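The limits in the table can be checked on the client before sending a request. The following is a minimal sketch, not part of the API: validateSearchRequest and KNOWN_SOURCES are hypothetical names, and the checks only mirror the constraints listed above.

```javascript
// Source names copied from the sources row of the request body table.
const KNOWN_SOURCES = [
  'web', 'news', 'scholar-articles', 'scholar-articles-enriched',
  'scholar-author', 'ai', 'social',
];

// Hypothetical helper: returns a list of problems found in a request body.
// An empty list means none of the documented limits were violated.
function validateSearchRequest(body) {
  const problems = [];
  if (typeof body.query !== 'string' || body.query.length === 0) {
    problems.push('query is required');
  } else if (body.query.length > 5000) {
    problems.push('query exceeds 5,000 characters');
  }
  if (body.site !== undefined && body.site.length > 500) {
    problems.push('site exceeds 500 characters');
  }
  for (const source of body.sources ?? []) {
    if (!KNOWN_SOURCES.includes(source)) problems.push(`unknown source: ${source}`);
  }
  if (body.startDate !== undefined && body.endDate !== undefined
      && body.endDate <= body.startDate) {
    problems.push('endDate must be greater than startDate');
  }
  if (body.numPages !== undefined
      && (!Number.isInteger(body.numPages) || body.numPages < 1)) {
    problems.push('numPages must be an integer >= 1');
  }
  return problems;
}
```

Calling validateSearchRequest({ query: "crustdata", sources: ["web"] }) returns an empty array; the server remains the authority on what is valid.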

Source capabilities

Current platform behavior — not guaranteed by the OpenAPI contract. Parameter applicability varies by source. This table reflects observed behavior.
| Source | Best use case | Fetchable url? | site effective? | Date filters effective? |
| --- | --- | --- | --- | --- |
| web | General web search | Yes | Yes | Yes |
| news | News articles | Yes | Yes | Yes |
| scholar-articles | Academic papers | Yes | No | Yes |
| scholar-articles-enriched | Papers + author profiles | Yes | No | Yes |
| scholar-author | Researcher profiles | No | No | No |
| ai | AI-generated summaries | No | No | No |
| social | Social media mentions | Yes | No | No |
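One way to use this table is to warn when a request combines a filter with a source that is observed to ignore it. The sketch below encodes the table as a lookup; SOURCE_CAPABILITIES and ineffectiveFilters are hypothetical names, and the values reflect observed behavior only, so they may drift from the platform.

```javascript
// Observed behavior copied from the source capabilities table;
// not guaranteed by the OpenAPI contract.
const SOURCE_CAPABILITIES = {
  'web': { fetchableUrl: true, site: true, dates: true },
  'news': { fetchableUrl: true, site: true, dates: true },
  'scholar-articles': { fetchableUrl: true, site: false, dates: true },
  'scholar-articles-enriched': { fetchableUrl: true, site: false, dates: true },
  'scholar-author': { fetchableUrl: false, site: false, dates: false },
  'ai': { fetchableUrl: false, site: false, dates: false },
  'social': { fetchableUrl: true, site: false, dates: false },
};

// Hypothetical helper: lists filters in a request body that the
// requested sources are expected to ignore.
function ineffectiveFilters(body) {
  const warnings = [];
  const usesDates = body.startDate !== undefined || body.endDate !== undefined;
  for (const source of body.sources ?? []) {
    const caps = SOURCE_CAPABILITIES[source];
    if (!caps) continue; // unknown source: nothing to report here
    if (body.site !== undefined && !caps.site) {
      warnings.push(`${source}: site has no observed effect`);
    }
    if (usesDates && !caps.dates) {
      warnings.push(`${source}: date filters have no observed effect`);
    }
  }
  return warnings;
}
```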

Response body

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Whether the search executed successfully. |
| query | string | The query as interpreted by the API (includes site: prefix if site was set). |
| timestamp | integer | Unix timestamp in milliseconds when the search was performed. |
| results | array | Search results. Shape varies by source; see Result shapes by source. |
| metadata.totalResults | integer | Total number of results available across all pages (may exceed the number in the results array if you requested fewer pages). |
| metadata.failedPages | array | Page numbers that failed to return results. |
| metadata.emptyPages | array | Page numbers that returned no results. |
Timestamps: Search timestamp is in milliseconds. Fetch timestamp is in seconds. Divide Search timestamps by 1000 when comparing across endpoints.
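The unit conversion can be wrapped in a small helper so comparisons always happen in seconds. This is an illustrative sketch; both function names are hypothetical.

```javascript
// Search timestamps are in milliseconds; Fetch timestamps are in seconds.
function searchTimestampToSeconds(searchTimestampMs) {
  return Math.floor(searchTimestampMs / 1000);
}

// Hypothetical helper: true when a fetch happened at or after the
// search it follows, compared in whole seconds.
function fetchedAfterSearch(searchTimestampMs, fetchTimestampSec) {
  return fetchTimestampSec >= searchTimestampToSeconds(searchTimestampMs);
}
```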

The simplest search uses a query with an explicit sources array. Always specify sources for predictable result parsing.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "crustdata",
    "sources": ["web"],
    "geolocation": "US"
  }'
Response trimmed for clarity.
Extract: Each result in results[] contains source, title, url, snippet, and position. Use position for ranking and url for follow-up fetching.
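The same request can be sent from JavaScript. The sketch below assumes a runtime with a global fetch() (for example, a recent Node.js or a browser); buildSearchRequest, searchWeb, and extractLinks are hypothetical helper names, and the headers simply mirror the curl example.

```javascript
const SEARCH_URL = 'https://api.crustdata.com/web/search/live';

// Hypothetical helper: builds the arguments for fetch() from a
// request body and an API key.
function buildSearchRequest(body, apiKey) {
  return {
    url: SEARCH_URL,
    init: {
      method: 'POST',
      headers: {
        'authorization': `Bearer ${apiKey}`,
        'content-type': 'application/json',
        'x-api-version': '2025-11-01',
      },
      body: JSON.stringify(body),
    },
  };
}

// Sends the search and returns the parsed JSON response body.
async function searchWeb(body, apiKey) {
  const { url, init } = buildSearchRequest(body, apiKey);
  const response = await fetch(url, init);
  return response.json();
}

// Keeps position, title, and url from each result, ordered by position.
function extractLinks(searchResponse) {
  return [...searchResponse.results]
    .sort((a, b) => a.position - b.position)
    .map(({ position, title, url }) => ({ position, title, url }));
}
```

A call such as searchWeb({ query: "crustdata", sources: ["web"], geolocation: "US" }, apiKey) resolves to the response body, which can then be passed to extractLinks.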

Restrict results to a specific site

Use the site parameter to limit results to a single domain. Useful for finding company pages on LinkedIn, profiles on GitHub, or content on a specific website.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "ADAMSBROWN, LLC",
    "sources": ["web"],
    "site": "linkedin.com/company"
  }'
Extract: The first result URL is typically the best match. For company LinkedIn URLs, pass the result to the Company Identify API for a full profile.

Search with date filtering

Use startDate and endDate (Unix timestamps in seconds) to limit results to a specific time range.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "distributed systems",
    "geolocation": "US",
    "sources": ["web", "news"],
    "site": "example.com",
    "startDate": 1728259200,
    "endDate": 1730937600
  }'
Convert dates to Unix timestamps in seconds: October 7, 2024 at 00:00 UTC is 1728259200, and November 7, 2024 at 00:00 UTC is 1730937600. Any Unix timestamp converter works.
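The conversion can also be done in code. A minimal sketch using the standard Date.UTC function, which takes a zero-based month and returns milliseconds; toUnixSeconds is a hypothetical helper name, and it treats the date as midnight UTC.

```javascript
// Converts a calendar date (midnight UTC) to a Unix timestamp in seconds.
function toUnixSeconds(year, month, day) {
  // Date.UTC expects a zero-based month and returns milliseconds.
  return Date.UTC(year, month - 1, day) / 1000;
}

const startDate = toUnixSeconds(2024, 10, 7); // 1728259200
const endDate = toUnixSeconds(2024, 11, 7);   // 1730937600
```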

Result shapes by source

The results[] array shape depends on the source field of each result. Use this reference when parsing multi-source responses.
Standard web and news results share the same shape.
| Field | Type | Description |
| --- | --- | --- |
| source | string | "web" or "news". |
| title | string | Page title. |
| url | string | Page URL. |
| snippet | string | Text excerpt. |
| position | integer | Result position (1-based). |
{
    "source": "web",
    "title": "Crustdata: Real-Time B2B Data Broker via API or Data Feed",
    "url": "https://crustdata.com/",
    "snippet": "Crustdata is a B2B data provider offering real-time company & people datasets.",
    "position": 1
}

Result ordering and ranking

Current platform behavior: When querying a single source, position reflects the source’s natural ranking order. When querying multiple sources, results from different sources are interleaved and position may reflect a per-source rank rather than a global rank. metadata.totalResults is the total count across all requested sources and pages.
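Because position may be a per-source rank in multi-source responses, one cautious approach is to group results by source before ranking them. The sketch below is illustrative; groupBySource is a hypothetical name, and treating a missing position as 0 is an assumption made only so that ai and scholar-author results keep their original order.

```javascript
// Groups results by their source field and orders each group by position.
function groupBySource(results) {
  const groups = new Map();
  for (const result of results) {
    if (!groups.has(result.source)) groups.set(result.source, []);
    groups.get(result.source).push(result);
  }
  for (const group of groups.values()) {
    // Results without a position compare as equal, so the stable sort
    // leaves them in the order they arrived.
    group.sort((a, b) => (a.position ?? 0) - (b.position ?? 0));
  }
  return groups;
}
```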

Parsing multi-source responses

When you query multiple sources at once (or omit sources), the results[] array can contain items with different shapes. Always check the source field of each result to determine which fields are available:
for (const result of response.results) {
  switch (result.source) {
    case 'web':
    case 'news':
    case 'social':
      // Standard: title, url, snippet, position
      console.log(result.title, result.url);
      break;
    case 'scholar-articles':
    case 'scholar-articles-enriched':
      // Academic: standard fields + authors, citations, pdf_url, metadata
      console.log(result.title, result.citations, result.authors);
      break;
    case 'scholar-author':
      // Author profile: name, affiliation, h_index, articles[]
      console.log(result.name, result.affiliation, result.h_index);
      break;
    case 'ai':
      // AI overview: content, references[]
      console.log(result.content, result.references);
      break;
  }
}

Use numPages to request multiple pages of results. The metadata object tells you which pages succeeded.
curl --request POST \
  --url https://api.crustdata.com/web/search/live \
  --header 'authorization: Bearer YOUR_API_KEY' \
  --header 'content-type: application/json' \
  --header 'x-api-version: 2025-11-01' \
  --data '{
    "query": "artificial intelligence startups",
    "sources": ["web"],
    "geolocation": "US",
    "numPages": 3
  }'
Response trimmed for clarity. Page 1 succeeded, page 2 failed, and page 3 was empty.
The response aggregates results across all successful pages into a single results[] array. Check metadata to understand page-level outcomes:
  • metadata.totalResults: total results available across all sources and pages.
  • metadata.failedPages: page numbers that returned errors. The request body documented above has no page or offset field, so retry the full request or reduce numPages.
  • metadata.emptyPages: page numbers that returned no results. You have reached the end of available results, so do not retry.
Handling page outcomes:
if (response.metadata.failedPages.length > 0) {
  // Some pages failed — retry the full request or reduce numPages
  console.log('Failed pages:', response.metadata.failedPages);
}

if (response.metadata.emptyPages.length > 0) {
  // No more results available — do not request more pages
  console.log('Reached end of results at page', Math.min(...response.metadata.emptyPages));
}
Current platform behavior (not guaranteed by the OpenAPI contract): Each page returns approximately 10 results. If metadata.emptyPages contains page numbers, you have reached the end of available results.
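Since metadata lists only failed and empty pages, the pages that succeeded have to be derived. The sketch below assumes pages are numbered 1 through numPages, which matches the example above but is not stated by the contract; pageOutcomes is a hypothetical helper name.

```javascript
// Hypothetical helper: classifies each requested page using response.metadata.
// Assumes page numbers run from 1 to numPages.
function pageOutcomes(numPages, metadata) {
  const failed = new Set(metadata.failedPages ?? []);
  const empty = new Set(metadata.emptyPages ?? []);
  const outcomes = { succeeded: [], failed: [], empty: [] };
  for (let page = 1; page <= numPages; page++) {
    if (failed.has(page)) outcomes.failed.push(page);
    else if (empty.has(page)) outcomes.empty.push(page);
    else outcomes.succeeded.push(page);
  }
  return outcomes;
}
```

For the three-page example, pageOutcomes(3, { failedPages: [2], emptyPages: [3] }) reports page 1 as succeeded, page 2 as failed, and page 3 as empty.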

Field presence by source

Use this reference to determine which fields are present for each source type.
Naming note: The API uses metadata in two different contexts. The response-level metadata is an object with totalResults, failedPages, and emptyPages. The per-result metadata field (scholar-articles and scholar-articles-enriched only) is a citation string like "Author - Year - Publisher". Always use the full path (response.metadata vs result.metadata) to avoid confusion.
Standard fields (present in web, news, social, scholar-articles, and scholar-articles-enriched):
| Field | Sources with this field | Notes |
| --- | --- | --- |
| source | All sources | Always present |
| title | web, news, social, scholar-articles*, ai | AI: always "AI Overview" |
| url | web, news, social, scholar-articles*, scholar-author | Scholar-author: profile link |
| snippet | web, news, social, scholar-articles* | Absent in ai, scholar-author |
| position | web, news, social, scholar-articles* | Absent in ai, scholar-author |

scholar-articles* covers both scholar-articles and scholar-articles-enriched.
Scholar article fields (scholar-articles and scholar-articles-enriched only):
| Field | Type | Notes |
| --- | --- | --- |
| metadata | string | Citation string: "Author - Year - Publisher" |
| pdf_url | string? | Direct PDF download link; handle outside Web Fetch |
| authors | array | [{ name, profile_url, profile_id }] |
| citations | integer | Total citation count |
Scholar author fields (scholar-author only):
| Field | Type | Notes |
| --- | --- | --- |
| name | string | Author full name |
| affiliation | string | Institutional affiliation |
| website | string? | Personal or institutional website |
| interests | array | [{ title, link }] |
| thumbnail | string? | Profile photo URL |
| citations | object | { all, since_2020 }; different type than in scholar-articles |
| h_index | object | { all, since_2020 } |
| i10_index | object | { all, since_2020 } |
| articles | array | [{ title, url, year, citations, authors, publication }] |
AI mode fields (ai only):
| Field | Type | Notes |
| --- | --- | --- |
| content | string | AI-generated overview text |
| references | array | [{ title, url, snippet }]; fetch these URLs |
| images | array | [{ url, alt, width, height }] |
For full request/response examples of each source type, see the Web API Examples page.
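The citations field is the easiest one to misread, because it is an integer on scholar article results and an object on scholar-author results. The sketch below normalizes it to a single number; totalCitations is a hypothetical helper name, and returning null for other sources is a choice made here, not API behavior.

```javascript
// Hypothetical helper: returns one citation count regardless of source.
function totalCitations(result) {
  switch (result.source) {
    case 'scholar-articles':
    case 'scholar-articles-enriched':
      return result.citations; // integer: total citation count
    case 'scholar-author':
      return result.citations.all; // object: { all, since_2020 }
    default:
      return null; // other sources carry no citations field
  }
}
```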

Error handling

Search returns 400 for invalid requests and 401 for auth failures.
{
    "error": {
        "type": "invalid_request",
        "message": "query: This field is required.",
        "metadata": []
    }
}
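A client can turn this envelope into a regular Error. The sketch below assumes that 401 responses use the same error envelope as the 400 example, which the page does not confirm, so it falls back to placeholder text when fields are missing; toSearchError is a hypothetical helper name.

```javascript
// Hypothetical helper: converts a non-2xx status and its parsed JSON body
// into an Error carrying the status and the error type.
function toSearchError(status, body) {
  // Optional chaining guards against bodies that lack the error envelope.
  const type = body?.error?.type ?? 'unknown';
  const message = body?.error?.message ?? 'No error message in response body';
  const err = new Error(`${status} ${type}: ${message}`);
  err.status = status;
  err.type = type;
  return err;
}
```

A caller would typically check response.ok after fetch() and throw toSearchError(response.status, await response.json()) when it is false.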

Common gotchas

| Mistake | Fix |
| --- | --- |
| Omitting sources and expecting uniform results | Different sources return different fields. Specify sources explicitly for predictable parsing. |
| Using site with scholar-author or ai sources | site only applies to web and news sources. It has no effect on Scholar or AI searches. |
| Expecting snippet in AI mode results | AI mode returns content and references instead of snippet and position. |
| Expecting position in scholar-author results | Scholar author results don't have position; they have name, affiliation, citations, etc. |
| Using startDate >= endDate | startDate must be strictly less than endDate. |

Next steps

  • Web Fetch — fetch the HTML content of URLs returned by search results.
  • Web API Examples — ready-to-copy patterns for common workflows.