Reference material for Search Jobs: filter grammar and operators, common indexed fields, the full Job catalog, id semantics, aggregation bucket metadata, null behavior, and errors.
For worked examples, see Examples. For sorting,
pagination, field selection, and aggregations, see
Pagination & sorting.
Jobs ID cheat sheet. The Jobs APIs use three id concepts; keep them straight:
- crustdata_job_id: the Crustdata job identifier. Returned on every Job. Use it as your dedupe key.
- company.basic_info.crustdata_company_id: the Crustdata company identifier returned on every Job.
- company.basic_info.company_id (filter alias): the dot-path used in filters and aggregations.column for indexed Search Jobs. It points to the same integer as company.basic_info.crustdata_company_id. This alias is not sortable; for deterministic pagination, sort on metadata.date_added instead.
When you group_by on company.basic_info.company_id, each bucket also returns metadata.company_name, metadata.company_website_domain, and metadata.linkedin_id for labeling.
Filter grammar
Every filter describes which individual job rows to keep. The API checks each job listing against your filter independently; it never groups or combines rows before filtering. There are two building blocks:
| Building block | What it does |
|---|---|
SearchCondition (leaf) | Tests one field on one job row — e.g. title = "Software Engineer". |
SearchConditionGroup (node) | Combines conditions with and or or. Groups can nest inside other groups. |
All-words operators ((.)) work fine in AND. Because (.) checks for individual words, not a contiguous substring, a query like (title (.) "Software Development") AND (title (.) "Software Engineer") matches any title containing all three words “Software”, “Development”, and “Engineer” (e.g. “Software Development Engineer”).
Single condition
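A single condition might be built as sketched below. The field, type, and value keys follow the SearchCondition description on this page; the helper name condition is hypothetical, and nothing here sends a request.

```python
import json

def condition(field, op_type, value):
    """Build one SearchCondition leaf: one field, one operator, one value."""
    return {"field": field, "type": op_type, "value": value}

# All-words match on the job title.
title_filter = condition("job_details.title", "(.)", "Software Engineer")

# The leaf would sit under the top-level "filters" key of the request body.
print(json.dumps({"filters": title_filter}, indent=2))
```

Check the OpenAPI reference for the authoritative key names before relying on this shape.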
AND / OR group
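A nested group could look like the sketch below. The key names op and conditions are assumptions (this page only says that groups combine conditions with and or or), and the helpers condition and group are hypothetical.

```python
def condition(field, op_type, value):
    """Build one SearchCondition leaf."""
    return {"field": field, "type": op_type, "value": value}

def group(op, conditions):
    """Build a SearchConditionGroup node; `op` is "and" or "or" (assumed key names)."""
    if op not in ("and", "or"):
        raise ValueError("op must be 'and' or 'or'")
    return {"op": op, "conditions": list(conditions)}

# Engineering jobs that are either Remote or Hybrid: an OR group nested in an AND group.
remote_friendly_engineering = group("and", [
    condition("job_details.category", "=", "Engineering"),
    group("or", [
        condition("job_details.workplace_type", "=", "Remote"),
        condition("job_details.workplace_type", "=", "Hybrid"),
    ]),
])
```

The two workplace_type conditions could likely be collapsed into one in condition; the nested form is shown only to illustrate grouping.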
Array-field filters and grouping
Filtering on array fields
When you filter on a string-array field like company.basic_info.industries, the condition is satisfied if any element of the array matches. For example, an = condition with the value "Technology, Information and Internet" matches any company whose industries array contains that exact string. Use (.) to match words within any element.
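The any-element rule can be modeled locally as a rough sketch. The condition below reuses the field, type, and value keys described on this page; array_equals_any is a hypothetical helper, and it assumes plain string equality because server-side case handling for = is not stated here.

```python
# A condition on a string-array field.
industries_filter = {
    "field": "company.basic_info.industries",
    "type": "=",
    "value": "Technology, Information and Internet",
}

def array_equals_any(values, target):
    """Local model of `=` on an array field: true if any element equals the target."""
    return any(value == target for value in (values or []))

# Illustrative row values, not real API output.
row_industries = ["Financial Services", "Technology, Information and Internet"]
print(array_equals_any(row_industries, industries_filter["value"]))
```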
Grouping by array fields
When you group_by on an array field, each array element becomes its own bucket key. A company in two industries contributes one count to each of the two industry buckets, so the sum of bucket counts can exceed total_count for array columns.
Filter operators
Use the table below to pick the right type for each condition. Every operator works on indexed fields only.
| Operator | value shape | Meaning |
|---|---|---|
= | scalar (string/number/boolean) | Exact match. |
!= | scalar | Not equal. |
< | scalar (numeric or ISO date) | Less than. |
=< | scalar (numeric or ISO date) | Less than or equal. Not <=. |
> | scalar (numeric or ISO date) | Greater than. |
=> | scalar (numeric or ISO date) | Greater than or equal. Not >=. |
in | array of scalars | Field value is any entry in the array. |
not_in | array of scalars | Field value is none of the entries in the array. |
(.) | string | Case-insensitive all-words match. Every word in the query must appear somewhere in the field, but not necessarily next to each other or in the same order. "Software Engineer" matches "Software Engineer", "Software Development Engineer", and "Engineer, Software Systems". A single word like "engineer" also matches "Engineering Manager". Great for broad keyword hunting in job_details.title or content.description. |
[.] | string | Case-insensitive exact-phrase match. The words must appear contiguously and in order. "Software Engineer" matches "Senior Software Engineer" but not "Software Development Engineer" (extra word in between) and not "Engineer Software" (wrong order). Use [.] when you need precision over recall. |
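To make the difference between (.) and [.] concrete, here is a local approximation in Python. It is only a sketch: the server-side tokenization is not specified on this page, so substring matching is assumed for (.) because that reproduces the "engineer" matching "Engineering Manager" example. The function names are hypothetical.

```python
import re

def _words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def all_words(query, field_value):
    """Approximate (.): every query word occurs somewhere in the field, in any order."""
    words = _words(query)
    haystack = field_value.lower()
    return bool(words) and all(word in haystack for word in words)

def exact_phrase(query, field_value):
    """Approximate [.]: the query words occur contiguously and in order."""
    wanted = _words(query)
    found = _words(field_value)
    n = len(wanted)
    if n == 0:
        return False
    return any(found[i:i + n] == wanted for i in range(len(found) - n + 1))

for title in ["Senior Software Engineer", "Software Development Engineer", "Engineer Software"]:
    print(title, all_words("Software Engineer", title), exact_phrase("Software Engineer", title))
```

A check like this can be handy for verifying client-side why a row did or did not come back, but the API's own matching remains authoritative.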
Common indexed fields
These are the indexed fields most often used in filters, sorts, and
aggregations.column. This table is a summary of the most common paths,
not an authoritative catalog. For the deeper field catalog — including id
semantics, null handling, and bucket metadata — see the full
Field reference below.
Company id filter alias. The filterable field path uses the short alias
company.basic_info.company_id, but the response shape returns the same
integer at company.basic_info.crustdata_company_id. They point to the same
value. See Jobs IDs: a quick map.
Job details
| Field | Example |
|---|---|
job_details.title | "Software Engineer" |
job_details.category | "Engineering", "Sales", "Operations", "Others" |
job_details.workplace_type | "Remote", "Hybrid", "On-site", "" |
job_details.reposted_job | true / false |
job_details.url | "https://www.linkedin.com/jobs/view/4398377738" |
Sending a filter on a non-indexed field returns
400 with Unsupported columns in conditions: ['...']. Sending an unsupported group_by column
returns a similar error listing every supported aggregation column.
Field reference
This section covers the return shape, id semantics, aggregation bucket metadata, and the most important indexed field catalogs in one place.
Annotated full Job example
The code fence below uses jsonc because it includes inline // comments for annotation. Strip the comments before sending it to a strict JSON parser.
Jobs IDs: a quick map
| ID | Lives on | Purpose |
|---|---|---|
crustdata_job_id | Top-level on each Job | Crustdata job identifier. Use it as your dedupe key in your own store. |
job_details.job_id | Inside Job.job_details | Secondary job identifier. It currently mirrors crustdata_job_id and is kept for backwards compatibility. |
company.basic_info.crustdata_company_id | Inside Job.company.basic_info | Crustdata company identifier returned on each row. |
company.basic_info.company_id | Search filter / aggregation path | Indexed alias for the same company identifier. Use this in filters.field and aggregations.column. |
Aggregation bucket metadata
When you group_by on company.basic_info.company_id, each bucket carries
a metadata object whose keys use bucket-specific names rather than the
Job response dot-paths:
| Bucket metadata key | Equivalent Job value | Notes |
|---|---|---|
company_name | company.basic_info.name | Plain company name. |
company_website_domain | company.basic_info.primary_domain | Primary website domain. |
linkedin_id | company.basic_info.professional_network_id | Public-profile identifier returned only inside aggregation buckets. |
crustdata_company_id | company.basic_info.crustdata_company_id | Crustdata company id. Defined in the spec as nullable; the bucket key already carries this value. |
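The mapping in the table above can be kept as a small lookup when labeling buckets. This is a sketch: label_bucket is a hypothetical helper, and the sample bucket (including its count) is illustrative, not real API output.

```python
# Bucket metadata key -> equivalent Job response dot-path (from the table above).
BUCKET_KEY_TO_JOB_PATH = {
    "company_name": "company.basic_info.name",
    "company_website_domain": "company.basic_info.primary_domain",
    "linkedin_id": "company.basic_info.professional_network_id",
    "crustdata_company_id": "company.basic_info.crustdata_company_id",
}

def label_bucket(bucket):
    """Return a readable label, falling back to the bucket key if the name is absent."""
    meta = bucket.get("metadata") or {}
    name = meta.get("company_name") or f"company {bucket['key']}"
    domain = meta.get("company_website_domain")
    suffix = f" ({domain})" if domain else ""
    return f"{name}{suffix}: {bucket['count']} jobs"

sample_bucket = {
    "key": 631394,
    "count": 42,  # illustrative value
    "metadata": {"company_name": "Stripe", "company_website_domain": "stripe.com"},
}
print(label_bucket(sample_bucket))
```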
Job identifiers
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
crustdata_job_id | integer | ✅ | — | — | ✅ | 41053563 |
job_details.job_id | integer | — | — | — | ✅ | 41053563 |
Job details (job_details.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
job_details.title | string | ✅ | — | ✅ | ✅ | "Software Engineer" |
job_details.category | string | ✅ | — | ✅ | ✅ | "Engineering" |
job_details.workplace_type | string | ✅ | — | ✅ | ✅ | "Remote", "Hybrid", "On-site", "" |
job_details.reposted_job | boolean | ✅ | — | — | ✅ | false |
job_details.url | string | ✅ | — | — | ✅ | "https://www.linkedin.com/jobs/view/4398377738" |
job_details.number_of_openings | integer | — | — | — | ✅ | 1 |
Company basic info (company.basic_info.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
company.basic_info.company_id | integer | ✅ | — | ✅ | — | 631394 |
company.basic_info.crustdata_company_id | integer | — | — | — | ✅ | 631394 |
company.basic_info.name | string | ✅ | — | — | ✅ | "Stripe" |
company.basic_info.primary_domain | string | ✅ | — | ✅ | ✅ | "stripe.com" |
company.basic_info.website | string | — | — | — | ✅ | "https://stripe.com" |
company.basic_info.professional_network_id | string | ✅ | — | — | ✅ | "2135371" |
company.basic_info.industries | string[] | ✅ | — | ✅ | ✅ | ["Technology, Information and Internet"] |
company.basic_info.company_id and
company.basic_info.crustdata_company_id refer to the same integer. Use the
short alias in filters and aggregations.column. The response shape
writes the value under crustdata_company_id.
Company firmographics
Headcount (company.headcount.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
company.headcount.total | integer | ✅ | ✅ | — | ✅ | 14522 |
company.headcount.range | string | ✅ | — | ✅ | ✅ | "5001-10000" |
company.headcount.largest_headcount_country | string | — | — | — | ✅ | "USA" |
Followers (company.followers.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
company.followers.count | integer | ✅ | ✅ | — | ✅ | 1335688 |
Revenue (company.revenue.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
company.revenue.estimated.lower_bound_usd | integer | ✅ | ✅ | — | ✅ | 500000000 |
company.revenue.estimated.upper_bound_usd | integer | ✅ | ✅ | — | ✅ | 1000000000 |
company.revenue.acquisition_status | string | ✅ | — | — | ✅ | "" |
company.revenue.public_markets.stock_symbols | string[] | — | — | — | ✅ | ["STRIPE"] |
company.revenue.public_markets.fiscal_year_end | string | — | — | — | ✅ | "" |
Funding (company.funding.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
company.funding.total_investment_usd | number | ✅ | ✅ | — | ✅ | 9440247725.0 |
company.funding.valuation_usd | number | ✅ | ✅ | — | ✅ | 50000000000.0 |
company.funding.last_fundraise_date | string (ISO 8601) | ✅ | ✅ | — | ✅ | "2026-03-09T00:00:00" |
company.funding.last_round_type | string | ✅ | — | ✅ | ✅ | "secondary_market" |
company.funding.num_funding_rounds | integer | ✅ | ✅ | — | ✅ | 23 |
company.funding.investors | string[] | — | — | — | ✅ | ["Sequoia Capital"] |
Competitors and company locations
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
company.competitors.websites | string[] | — | — | — | ✅ | ["https://plaid.com"] |
company.locations.country | string | ✅ | — | ✅ | ✅ | "USA" |
company.locations.state | string | — | — | — | ✅ | "California" |
company.locations.city | string | — | — | — | ✅ | "South San Francisco" |
company.locations.street_address | string | — | — | — | ✅ | "354 Oyster Point Blvd, ..." |
Job location (location.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
location.raw | string | ✅ | — | — | ✅ | "Melbourne, Victoria, Australia" |
location.city | string | ✅ | — | — | ✅ | "Melbourne" |
location.district | string | ✅ | — | — | ✅ | "Southbank" |
location.state | string | ✅ | — | — | ✅ | "Victoria" |
location.country | string | ✅ | — | ✅ | ✅ | "Australia" |
location.pincode | string | ✅ | — | — | ✅ | "3006" |
Country value normalization. location.country can appear as full names ("United States"), ISO-style short forms ("USA"), and occasional variants ("United States of America"). When filtering by country, either match multiple variants with in or pre-discover the exact indexed values by running a group_by on location.country.
Content (content.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
content.description | string | ✅ | — | — | ✅ | "Stripe is a financial infrastructure..." |
Metadata (metadata.*)
| Path | Type | Filter | Sort | Group | Return | Example |
|---|---|---|---|---|---|---|
metadata.date_added | string (ISO 8601) | ✅ | ✅ | — | ✅ | "2026-04-07T11:37:29" |
metadata.date_updated | string (ISO 8601) | ✅ | ✅ | — | ✅ | "2026-04-08T00:00:00" |
Null, blank, and sparse field behavior
Most Job fields are nullable in the spec and can legitimately be absent or empty.
- Null or missing: the field is not present on a given Job.
- Blank string "": the field was present but had no indexable value (common for job_details.workplace_type). Treat blank as “unspecified”, not as the same thing as null.
- Sparse nested objects: company.funding, company.revenue, and company.competitors are often missing for smaller or private companies.
- is_null / is_not_null operators are currently not implemented; request the field via fields and filter for null presence client-side.
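Since null checks have to happen client-side, a small classifier along these lines may help. It is a sketch over plain dictionaries; get_path and classify are hypothetical helpers, and the sample job is illustrative.

```python
_MISSING = object()  # sentinel so a missing key can be told apart from an explicit null

def get_path(job, dot_path):
    """Walk a dot-path through nested dicts; return _MISSING if any step is absent."""
    current = job
    for part in dot_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current

def classify(job, dot_path):
    """Return "missing", "null", "blank", or "present" for one field of one Job."""
    value = get_path(job, dot_path)
    if value is _MISSING:
        return "missing"
    if value is None:
        return "null"
    if value == "":
        return "blank"
    return "present"

sample_job = {
    "job_details": {"title": "Software Engineer", "workplace_type": ""},
    "company": {"funding": None},
}
for path in ["job_details.title", "job_details.workplace_type",
             "company.funding", "company.revenue.acquisition_status"]:
    print(path, classify(sample_job, path))
```

Keeping "blank" separate from "null" and "missing" mirrors the guidance above to treat a blank string as “unspecified”.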
Errors
| Status | Envelope shape | Meaning |
|---|---|---|
400 | { "error": { "type", "message", "metadata" } } | Invalid request — unsupported filter column, unsupported aggregation column, limit out of range, or malformed body. error.type is invalid_request for validation failures and internal_error for unsupported-column checks. |
401 | { "message": "..." } (flat — not the error envelope) | Unauthorized — the Authorization header is missing, malformed, or contains an invalid API key. |
500 | { "error": { "type", "message", "metadata" } } | Internal server error — retry after a short delay. |
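Because 401 responses are flat while 400 and 500 use the nested envelope, client code needs to read the message from two places. A minimal sketch, assuming the body has already been parsed into a dictionary; describe_error and the sample messages are hypothetical.

```python
def describe_error(status, body):
    """Return (error_type, message, should_retry) for a parsed error body.

    "unauthorized" is a local label for the flat 401 shape, not a value
    returned by the API.
    """
    if status == 401:
        return ("unauthorized", body.get("message", ""), False)
    error = body.get("error") or {}
    # Only 500 is described above as worth retrying after a short delay.
    return (error.get("type", "unknown"), error.get("message", ""), status == 500)

print(describe_error(401, {"message": "example: invalid API key"}))
print(describe_error(400, {"error": {"type": "invalid_request",
                                     "message": "example: limit out of range",
                                     "metadata": {}}}))
```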
Pagination & sorting
How to paginate, sort, select fields, and aggregate results in Search Jobs. For worked examples, see Examples. For filter grammar, operators, and the full field catalog, see Reference.
Replace YOUR_API_KEY in each example with your actual API key. All requests require the x-api-version: 2025-11-01 header.
Sorting
sorts is an ordered array. Each item has a field and order ("asc"
or "desc"). Sorts apply in array order — the first sort is the primary
key, the second breaks ties, and so on.
Sortable fields
The following indexed fields are verified sortable:
- metadata.date_added
- metadata.date_updated
- company.headcount.total
- company.followers.count
- company.revenue.estimated.lower_bound_usd
- company.revenue.estimated.upper_bound_usd
- company.funding.total_investment_usd
- company.funding.valuation_usd
- company.funding.last_fundraise_date
- company.funding.num_funding_rounds
Typical sort entries:
- Newest postings first: { "field": "metadata.date_added", "order": "desc" }
- Biggest companies first: { "field": "company.headcount.total", "order": "desc" }
- Most followed companies first: { "field": "company.followers.count", "order": "desc" }
- Highest-funded companies first: { "field": "company.funding.total_investment_usd", "order": "desc" }
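A sorts array with a tie-breaker can be assembled and checked against the sortable list before sending. This is a sketch; build_sorts is a hypothetical helper, and the allowlist is copied from this page rather than fetched from the API.

```python
SORTABLE_FIELDS = {
    "metadata.date_added",
    "metadata.date_updated",
    "company.headcount.total",
    "company.followers.count",
    "company.revenue.estimated.lower_bound_usd",
    "company.revenue.estimated.upper_bound_usd",
    "company.funding.total_investment_usd",
    "company.funding.valuation_usd",
    "company.funding.last_fundraise_date",
    "company.funding.num_funding_rounds",
}

def build_sorts(*pairs):
    """Turn (field, order) pairs into a sorts array, preserving priority order."""
    sorts = []
    for field, order in pairs:
        if field not in SORTABLE_FIELDS:
            raise ValueError(f"not sortable: {field}")
        if order not in ("asc", "desc"):
            raise ValueError(f"order must be 'asc' or 'desc', got {order!r}")
        sorts.append({"field": field, "order": order})
    return sorts

# Biggest companies first; newest postings break ties.
sorts = build_sorts(("company.headcount.total", "desc"), ("metadata.date_added", "desc"))
print(sorts)
```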
Pagination
Pagination is cursor-based. Each response returns a next_cursor (or
null when you reach the end). To fetch the next page, resend the original
request body with cursor set to the previous next_cursor.
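The cursor loop can be sketched as a generator that takes the transport as a parameter, so no HTTP client or endpoint is assumed. iter_pages and fake_fetch are hypothetical, as is the rows key in the fake pages; only cursor and next_cursor come from this page.

```python
def iter_pages(fetch_page, body):
    """Yield each response page, resending the same body with the latest cursor."""
    cursor = None
    while True:
        request_body = dict(body)  # copy so the caller's body is never mutated
        if cursor is not None:
            request_body["cursor"] = cursor
        page = fetch_page(request_body)
        yield page
        cursor = page.get("next_cursor")
        if cursor is None:  # null next_cursor means the end was reached
            break

def fake_fetch(request_body):
    """Stand-in for the real HTTP call; returns two canned pages."""
    canned = {
        None: {"rows": [1, 2], "next_cursor": "c1"},
        "c1": {"rows": [3], "next_cursor": None},
    }
    return canned[request_body.get("cursor")]

first_body = {"filters": {"field": "job_details.category", "type": "=", "value": "Engineering"}}
pages = list(iter_pages(fake_fetch, first_body))
print(len(pages))
```

In real use, fetch_page would wrap whatever HTTP client you already have and return the parsed response.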
Walk forward
Take next_cursor from the response and pass it back as cursor in the next request. Keep filters, sorts, and fields identical; if you change them, the cursor becomes meaningless.
Consistency between pages
Current platform behavior: best-effort, not strict snapshot. A cursor is consistent with respect to the filter, sort, and field selection you sent on the first page, so the same query will keep paging forward over a coherent result stream. However, because the underlying indexed dataset is continuously updated, new jobs indexed between page requests can cause minor drift in total_count and in the exact position of individual rows. Treat pagination as best-effort, not a strict snapshot.
For bulk exports where every row matters:
- Constrain your filter to a bounded date window (for example metadata.date_added => 2025-01-01 AND < 2025-07-01) so newly indexed jobs outside the window do not affect the walk, and
- Re-run the full walk periodically and diff against the prior snapshot using crustdata_job_id as the dedupe key.
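Both export recommendations can be sketched in a few lines. The group keys op and conditions are assumptions, and date_window and diff_snapshots are hypothetical helpers; the snapshot rows only need to carry crustdata_job_id.

```python
def date_window(field, start, end):
    """Bounded window: start is inclusive (=>), end is exclusive (<)."""
    return {
        "op": "and",  # assumed key names for a condition group
        "conditions": [
            {"field": field, "type": "=>", "value": start},
            {"field": field, "type": "<", "value": end},
        ],
    }

def diff_snapshots(previous_rows, current_rows):
    """Compare two walks by crustdata_job_id; return (added_ids, removed_ids)."""
    previous = {row["crustdata_job_id"]: row for row in previous_rows}
    current = {row["crustdata_job_id"]: row for row in current_rows}  # also dedupes
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    return added, removed

window = date_window("metadata.date_added", "2025-01-01", "2025-07-01")
added, removed = diff_snapshots(
    [{"crustdata_job_id": 1}, {"crustdata_job_id": 2}],
    [{"crustdata_job_id": 2}, {"crustdata_job_id": 3}, {"crustdata_job_id": 3}],
)
print(added, removed)
```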
Dataset freshness and lifecycle
What the indexed Jobs dataset represents. The Search Jobs dataset is a rolling index of job listings discovered from the web, refreshed on an ongoing basis. Each row has:
- metadata.date_added: when Crustdata first saw the listing.
- metadata.date_updated: most recent refresh.
To focus on recent postings, filter on a metadata.date_added or metadata.date_updated window (for example, within the last 30 days) and pair it with the hiring company’s firmographics. For alerting or repeated exports, keep your date windows bounded and dedupe rows with crustdata_job_id.
Date filter semantics
Dates and timezones. When you pass a date-only value like "2025-01-01", the backend interprets it as 2025-01-01T00:00:00 in UTC. Ranges using => are inclusive of the boundary and < is exclusive, so "metadata.date_added" => "2025-01-01" AND < "2025-07-01" covers every listing indexed between Jan 1 (inclusive) and Jul 1 (exclusive) in UTC. Pass full timestamps like "2025-01-01T08:00:00" when you need finer precision.
Fetch page 2
Field selection
Use fields to return only the dot-paths you need. The top-level groups are crustdata_job_id, job_details, company, location, content, and metadata. You can request:
- A whole group: "company" returns every company.* sub-object.
- A sub-object: "company.basic_info" returns only the basic info block.
- A single field: "company.basic_info.name" returns just the name.
Aggregations
Aggregations let you roll up results without returning individual job rows. Set limit: 0 when you only want aggregation output. Two types are supported:
- count: returns the total number of jobs matching filters.
- group_by: buckets the results by column and returns per-bucket counts.
AggregationRequest schema
| Field | Type | Required | Description |
|---|---|---|---|
type | string (enum) | Yes | "count" for a simple total, "group_by" to bucket by column. |
column | string | Required for group_by | Dot-path to group by. Must be in the Groupable fields allowlist. |
agg | string (enum) | Required for group_by | Sub-aggregation inside each bucket. Currently only "count" is supported. |
size | integer | No (default 100) | Maximum number of buckets to return. Min 1, max 1000. |
AggregationResponseItem echoes type and column, then carries:
- value (integer): populated for count aggregations. The total match count.
- buckets (array): populated for group_by aggregations. Each bucket has a key, count, and a metadata object whose keys depend on the grouped column. See Aggregation bucket metadata.
Results are returned in aggregations[] in the same order you sent them.
Count all Engineering jobs
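A count-only request might look like the sketch below, written as a Python dictionary so it can be serialized with any HTTP client. The body keys come from this page; read_count is a hypothetical helper, and the sample response value is illustrative rather than real output.

```python
import json

count_request = {
    "filters": {"field": "job_details.category", "type": "=", "value": "Engineering"},
    "limit": 0,  # aggregation output only, no job rows
    "aggregations": [{"type": "count"}],
}

def read_count(response):
    """Pull the total from the first aggregation item of a parsed response."""
    return response["aggregations"][0]["value"]

# Illustrative response shape based on AggregationResponseItem as described above.
sample_response = {"aggregations": [{"type": "count", "column": None, "value": 1234}]}

print(json.dumps(count_request))
print(read_count(sample_response))
```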
Top companies indexing “Software Engineer” listings (bounded window)
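A group_by request over a bounded window could be sketched as follows. The group keys op and conditions are assumptions, top_buckets is a hypothetical helper, and the sample response is illustrative rather than real output.

```python
top_companies_request = {
    "filters": {
        "op": "and",  # assumed key names for a condition group
        "conditions": [
            {"field": "job_details.title", "type": "(.)", "value": "Software Engineer"},
            {"field": "metadata.date_added", "type": "=>", "value": "2025-01-01"},
            {"field": "metadata.date_added", "type": "<", "value": "2025-07-01"},
        ],
    },
    "limit": 0,  # aggregation output only
    "aggregations": [{
        "type": "group_by",
        "column": "company.basic_info.company_id",
        "agg": "count",
        "size": 10,
    }],
}

def top_buckets(response):
    """Return (company_name, count) pairs from the first group_by item."""
    item = response["aggregations"][0]
    return [((b.get("metadata") or {}).get("company_name"), b["count"])
            for b in item.get("buckets", [])]

# Illustrative response; counts are made up for the example.
sample_response = {"aggregations": [{
    "type": "group_by",
    "column": "company.basic_info.company_id",
    "buckets": [{"key": 631394, "count": 12, "metadata": {"company_name": "Stripe"}}],
}]}
print(top_buckets(sample_response))
```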
Groupable fields
group_by.column is restricted to the following indexed fields:
- company.basic_info.company_id
- company.basic_info.industries
- company.basic_info.primary_domain
- company.funding.last_round_type
- company.headcount.range
- company.locations.country
- job_details.category
- job_details.title
- job_details.workplace_type
- location.country
Any other column returns 400 with Unsupported aggregation column: '...'. Supported: ....
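To catch these 400s before sending, an aggregation can be checked locally against the schema table and the groupable list. This is a sketch; validate_aggregation is a hypothetical helper, and the allowlist is copied from this page, so it may drift from the live API.

```python
GROUPABLE_FIELDS = {
    "company.basic_info.company_id",
    "company.basic_info.industries",
    "company.basic_info.primary_domain",
    "company.funding.last_round_type",
    "company.headcount.range",
    "company.locations.country",
    "job_details.category",
    "job_details.title",
    "job_details.workplace_type",
    "location.country",
}

def validate_aggregation(agg):
    """Return a list of problems found in one aggregation request (empty if none)."""
    problems = []
    agg_type = agg.get("type")
    if agg_type not in ("count", "group_by"):
        problems.append(f"unknown type: {agg_type!r}")
    if agg_type == "group_by":
        if agg.get("column") not in GROUPABLE_FIELDS:
            problems.append(f"column is not groupable: {agg.get('column')!r}")
        if agg.get("agg") != "count":
            problems.append('agg must be "count"')
    size = agg.get("size", 100)  # default per the schema table
    if not isinstance(size, int) or not 1 <= size <= 1000:
        problems.append("size must be an integer from 1 to 1000")
    return problems

print(validate_aggregation({"type": "group_by", "column": "location.city", "agg": "count"}))
```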
What’s next
- Search Jobs — back to the main Search page.
- Examples — SDR/BDR keyword hunting, mid-market filtering, funding-triggered queries, and aggregations.
- Pagination & sorting — sorting, pagination, field selection, and aggregations.
- OpenAPI reference — the formal schema for every request, response, and error.

