DataExport API update - new endpoints and filters

March 24th, 2026

This release adds new search capabilities: boolean logic between keyword groups, wildcard and exclusion matching, proximity search, and filtering by user profile metadata and location. All features work with both /api/v1/preview and /api/v1/export.

🆕 New Features

Boolean logic across keyword groups

Parameter: group_operator · string · optional · default: "AND"

You can now combine keyword groups with AND or OR logic. Set group_operator: "OR" to match posts that satisfy any one group instead of requiring every group to match.

Example: Match (ukraine AND war) OR crimea by creating two groups with group_operator: "OR":

{ "keyword_groups": [ {"terms": ["ukraine", "war"], "operator": "AND"}, {"terms": ["crimea"], "operator": "OR"} ], "group_operator": "OR" }

All other filters (dates, domains, languages) apply to both branches.
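The combination rules above can be modelled locally to sanity-check a query before sending it. This is a minimal sketch, not the server's implementation: `group_matches` and `post_matches` are hypothetical helper names, and whole-word, case-insensitive matching is an assumption made only for illustration.

```python
import re

def group_matches(text, group):
    """Evaluate one keyword group against a post's text."""
    # Assumption: terms match whole words, case-insensitively.
    words = set(re.findall(r"\w+", text.lower()))
    hits = [term.lower() in words for term in group["terms"]]
    return all(hits) if group["operator"] == "AND" else any(hits)

def post_matches(text, keyword_groups, group_operator="AND"):
    """Combine the per-group results with the top-level group_operator."""
    results = [group_matches(text, g) for g in keyword_groups]
    return all(results) if group_operator == "AND" else any(results)

groups = [
    {"terms": ["ukraine", "war"], "operator": "AND"},
    {"terms": ["crimea"], "operator": "OR"},
]

print(post_matches("News from Crimea today", groups, "OR"))   # True
print(post_matches("News from Crimea today", groups, "AND"))  # False
```

With group_operator "OR" the second group alone is enough; with the default "AND" the same post is rejected because the first group does not match.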

Wildcard & prefix matching

Syntax: append * to any term

Terms ending with * trigger substring matching, regardless of full_string_scan. The * is stripped and the remaining stem is searched as a substring.

regulat* → matches regulation, regulatory, regulate, etc.
sustainab* → matches sustainable, sustainability, etc.
Minimum stem length: 1 character
Works in both keyword_groups and exclude_keyword_groups

{ "keyword_groups": [ {"terms": ["regulat*", "legislat*"], "operator": "OR"} ] }
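The stem rule described above can be sketched in a few lines. This is a local illustration only: `wildcard_stem` and `wildcard_matches` are hypothetical names, and case-insensitive comparison is an assumption. It also shows a consequence of substring matching: the stem can match in the middle of a word, not only at its start.

```python
def wildcard_stem(term):
    """Return the stem of a wildcard term, or None if the term has no trailing *."""
    if not term.endswith("*"):
        return None
    stem = term[:-1]
    # A bare "*" leaves an empty stem, below the 1-character minimum.
    if len(stem) < 1:
        raise ValueError("wildcard stem must be at least 1 character")
    return stem

def wildcard_matches(term, text):
    """Search the stem of a wildcard term as a substring of the text."""
    stem = wildcard_stem(term)
    if stem is None:
        raise ValueError("not a wildcard term")
    # Assumption: comparison is case-insensitive.
    return stem.lower() in text.lower()

print(wildcard_matches("regulat*", "New regulatory framework"))  # True
print(wildcard_matches("regulat*", "Calls for deregulation"))    # True (substring, not strict prefix)
print(wildcard_matches("regulat*", "Rules and laws"))            # False
```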

Exclusion keyword groups

Parameter: exclude_keyword_groups · object[] · optional · max 3 groups, 20 terms each

Filter out noise by excluding posts that match specific terms — applied after inclusion matching. Same structure as keyword_groups.

{ "keyword_groups": [ {"terms": ["crypto*"], "operator": "OR"} ], "exclude_keyword_groups": [ {"terms": ["sponsored", "ad", "promo*"], "operator": "OR"} ] }

In our tests, exclusion alone cut irrelevant results by over 99% on noisy queries.
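Because the parameter has hard limits, it can help to check them client-side before building a request. The sketch below is one way to do that; `validate_exclude_groups` is a hypothetical helper, and restricting operator to "AND" or "OR" is an assumption based on the examples in these notes.

```python
def validate_exclude_groups(groups):
    """Check the documented limits: at most 3 groups, at most 20 terms each."""
    if len(groups) > 3:
        raise ValueError("exclude_keyword_groups allows at most 3 groups")
    for i, group in enumerate(groups):
        if len(group["terms"]) > 20:
            raise ValueError(f"group {i} has more than 20 terms")
        # Assumption: only the two operators shown in the examples are valid.
        if group["operator"] not in ("AND", "OR"):
            raise ValueError(f"group {i} has an invalid operator")
    return groups

payload = {
    "keyword_groups": [{"terms": ["crypto*"], "operator": "OR"}],
    "exclude_keyword_groups": validate_exclude_groups(
        [{"terms": ["sponsored", "ad", "promo*"], "operator": "OR"}]
    ),
}
print(len(payload["exclude_keyword_groups"]))  # 1
```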

Proximity search (NEAR/N)

Parameter: proximity_groups · object[] · optional · max 3 groups

Find posts where two terms appear within N words of each other (bidirectional, case-insensitive). Distance range: 1–10 words.

Field	Type	Description
`term_a`	string	First proximity term
`term_b`	string	Second proximity term
`distance`	integer	Max words between terms (1–10)

{ "keyword_groups": [ {"terms": ["inflation", "rate"], "operator": "OR"} ], "proximity_groups": [ {"term_a": "inflation", "term_b": "rate", "distance": 5} ] }

This query found 20,000+ posts for which exact phrase matching returned zero results, because the words were close to each other but not adjacent.

⚠️ Requires keyword_groups — the API returns 400 without it.
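To make the NEAR/N behavior concrete, here is a small local model. It is a sketch under stated assumptions, not the API's implementation: `near` is a hypothetical function, words are taken to be runs of letters and digits, and distance is counted as the difference in word positions (adjacent words are 1 apart).

```python
import re

def near(text, term_a, term_b, distance):
    """Return True if term_a and term_b occur within `distance` words of each other."""
    if not 1 <= distance <= 10:
        raise ValueError("distance must be between 1 and 10")
    # Assumption: simple word tokenization; matching is case-insensitive.
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    # Bidirectional: only the absolute gap between positions matters.
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

print(near("Inflation pushed the base rate higher", "inflation", "rate", 5))  # True
print(near("The rate of inflation fell", "inflation", "rate", 5))             # True
print(near("Inflation pushed the base rate higher", "inflation", "rate", 2))  # False
```

The second call shows the bidirectional case: "rate" precedes "inflation" and the pair still matches.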

Location filtering

Parameter: locations · string[] · optional · max 20 items, 500 chars each

Filter by user-declared profile location. Case-insensitive substring matching with OR logic.

{ "locations": ["Poland", "Warszawa", "Warsaw"] }

~25% of posts carry location data (primarily x.com)
Whitespace is trimmed and empty strings are ignored automatically
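The matching rules for locations can be approximated locally as follows. This is an illustrative sketch: `clean_locations` and `location_matches` are hypothetical names, applying the limits after trimming is an assumption, and so is treating posts without location data as non-matching.

```python
def clean_locations(locations):
    """Trim whitespace and drop empty entries, then enforce the documented limits."""
    cleaned = [loc.strip() for loc in locations if loc.strip()]
    # Assumption: limits are checked on the cleaned list.
    if len(cleaned) > 20:
        raise ValueError("locations allows at most 20 items")
    for loc in cleaned:
        if len(loc) > 500:
            raise ValueError("each location is limited to 500 characters")
    return cleaned

def location_matches(profile_location, locations):
    """Case-insensitive substring match; any one filter value is enough (OR)."""
    if not profile_location:
        return False  # assumption: posts without location data never match
    haystack = profile_location.lower()
    return any(loc.lower() in haystack for loc in clean_locations(locations))

print(location_matches("Warszawa, Polska", ["Poland", "Warszawa", "Warsaw"]))  # True
print(location_matches("Kraków", ["Poland", " ", ""]))                          # False
```

The first call illustrates why listing spelling variants matters: a profile that says "Warszawa, Polska" matches neither "Poland" nor "Warsaw".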

Profile & user metadata filtering

Parameter: profile_filters · object · optional · max 5 fields, 10 values per field

Filter on Twitter/X author profile attributes from the summary JSON column. Currently populated for x.com data only.

Field	Match type
`user_description`	Substring (case-insensitive)
`profile_image_url`	Substring (case-insensitive)
`user_followers_count`	Exact string match
`user_following_count`	Exact string match
`user_created_at`	Exact string match
`user_verified`	Exact (`"true"` / `"false"`)
`user_blue_verified`	Exact (`"true"` / `"false"`)

{ "keyword_groups": [ {"terms": ["bitcoin"], "operator": "OR"} ], "profile_filters": { "user_blue_verified": ["true"], "user_description": ["journalist", "reporter"] }, "domains": ["x.com"] }

Multiple fields are AND-combined; multiple values within a field are OR-combined. Unknown field names are silently ignored.
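The combination rules (AND across fields, OR within a field, unknown fields ignored) can be expressed as a short local model. This is a sketch, not the server's logic: `profile_matches` is a hypothetical helper, and profile values are assumed to be available as strings.

```python
SUBSTRING_FIELDS = {"user_description", "profile_image_url"}
EXACT_FIELDS = {
    "user_followers_count", "user_following_count", "user_created_at",
    "user_verified", "user_blue_verified",
}

def profile_matches(profile, profile_filters):
    """Apply profile_filters to one author profile (a dict of string values)."""
    for field, values in profile_filters.items():
        if field not in SUBSTRING_FIELDS | EXACT_FIELDS:
            continue  # unknown field names are silently ignored
        actual = str(profile.get(field, ""))
        if field in SUBSTRING_FIELDS:
            ok = any(v.lower() in actual.lower() for v in values)
        else:
            ok = any(v == actual for v in values)
        if not ok:
            return False  # fields are AND-combined
    return True

profile = {
    "user_blue_verified": "true",
    "user_description": "Freelance Journalist covering markets",
}
filters = {
    "user_blue_verified": ["true"],
    "user_description": ["journalist", "reporter"],
}
print(profile_matches(profile, filters))                              # True
print(profile_matches(profile, {"user_followers_count": ["1000"]}))  # False
```

Note that the count fields use exact string matching, so "1000" matches only a profile whose stored value is exactly "1000"; range queries are not expressible this way.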

🔧 Improved

Username matching

Parameter: case_sensitive_usernames · boolean · optional · default: false

Username matching is now case-insensitive by default (e.g. @RuchNarodowy and @ruchnarodowy both match). Set case_sensitive_usernames: true to enforce exact-case matching (legacy behavior).
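The effect of the flag can be shown with a small comparison function. This is only an illustration of the documented behavior; `username_matches` is a hypothetical name, and `str.casefold` is one reasonable way to compare without regard to case.

```python
def username_matches(candidate, wanted, case_sensitive_usernames=False):
    """Compare two usernames, honoring the case_sensitive_usernames flag."""
    if case_sensitive_usernames:
        return candidate == wanted
    return candidate.casefold() == wanted.casefold()

print(username_matches("@RuchNarodowy", "@ruchnarodowy"))        # True
print(username_matches("@RuchNarodowy", "@ruchnarodowy", True))  # False
```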

Exact phrase matching with `full_string_scan`

Parameter: full_string_scan · boolean · default: false

Now documented with clearer guidance: when enabled, you get true substring/phrase matching — useful for character-level precision (e.g. distinguishing "17 year-old" from "17-year-old"). Without it, fast mode uses token-based index acceleration (Bloom filters), which handles 99% of use cases.
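The difference between the two modes can be illustrated with a toy comparison. This is a deliberately simplified sketch: the real fast mode uses Bloom-filter index acceleration, which is not modelled here, and the tokenizer below (lowercase runs of letters and digits) is an assumption. `token_match` and `substring_match` are hypothetical names.

```python
import re

def tokens(text):
    # Assumption: tokens are lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())

def token_match(phrase, text):
    """Fast-mode stand-in: every token of the phrase appears somewhere in the text."""
    have = set(tokens(text))
    return all(t in have for t in tokens(phrase))

def substring_match(phrase, text):
    """full_string_scan stand-in: the exact character sequence must appear."""
    return phrase in text

post = "A 17 year-old was interviewed"
print(token_match("17-year-old", post))      # True
print(substring_match("17-year-old", post))  # False
print(substring_match("17 year-old", post))  # True
```

Once punctuation is tokenized away, "17 year-old" and "17-year-old" look identical, which is why character-level precision needs the full scan.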

📋 New Parameters Summary

Parameter	Type	Default	Scope
`group_operator`	`string`	`"AND"`	How keyword groups combine with each other
`exclude_keyword_groups`	`object[]`	`null`	Up to 3 groups of terms to exclude
`proximity_groups`	`object[]`	`null`	NEAR/N term proximity matching
`locations`	`string[]`	`null`	User-declared location substring filter
`profile_filters`	`object`	`null`	x.com profile metadata filtering
`case_sensitive_usernames`	`boolean`	`false`	Toggle case-sensitive username matching

⚠️ No Breaking Changes

All new parameters are optional with sensible defaults. Existing queries continue to work unchanged.

📖 Full docs: exordelabs.com/developer-docs/data-export

Exorde