March 24th, 2026

DataExport API update - new endpoints and filters

This release adds powerful new search capabilities — boolean logic between keyword groups, wildcard & exclusion matching, proximity search, and filtering by user profile metadata and location. All features work with both /api/v1/preview and /api/v1/export.

🆕 New Features

Boolean logic across keyword groups

Parameter: group_operator · string · optional · default: "AND"

You can now combine keyword groups with AND or OR logic. Set group_operator: "OR" to match posts hitting any group instead of requiring all groups.

Example: Match (ukraine AND war) OR crimea by creating two groups with group_operator: "OR":

{ "keyword_groups": [ {"terms": ["ukraine", "war"], "operator": "AND"}, {"terms": ["crimea"], "operator": "OR"} ], "group_operator": "OR" } 

All other filters (dates, domains, languages) apply to both branches.
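To make the AND/OR semantics concrete, here is a minimal local sketch of how two levels of operators combine. It is an illustration only, not the API's implementation: the helper names group_matches and post_matches are hypothetical, and plain case-insensitive substring tests stand in for the server-side matching.

```python
def group_matches(text: str, group: dict) -> bool:
    """Evaluate one keyword group against a post.

    Assumption: case-insensitive substring tests stand in for the
    real (server-side) term matching.
    """
    haystack = text.lower()
    hits = [term.lower() in haystack for term in group["terms"]]
    return all(hits) if group["operator"] == "AND" else any(hits)


def post_matches(text: str, keyword_groups: list[dict],
                 group_operator: str = "AND") -> bool:
    """Combine the per-group results with group_operator (default "AND")."""
    results = [group_matches(text, group) for group in keyword_groups]
    return any(results) if group_operator == "OR" else all(results)


groups = [
    {"terms": ["ukraine", "war"], "operator": "AND"},
    {"terms": ["crimea"], "operator": "OR"},
]

# Only the second group hits, so the post matches under "OR" but not "AND".
print(post_matches("Crimea bridge reopened", groups, "OR"))
print(post_matches("Crimea bridge reopened", groups, "AND"))
```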


Wildcard & prefix matching

Syntax: append * to any term

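Terms ending with * trigger substring matching, regardless of full_string_scan. The * is stripped and the remaining stem is searched as a substring, so a stem can also match in the middle of a word (for example, regulat* also matches deregulation).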

  • regulat* → matches regulation, regulatory, regulate, etc.

  • sustainab* → matches sustainable, sustainability, etc.

  • Minimum stem length: 1 character

  • Works in both keyword_groups and exclude_keyword_groups

{ "keyword_groups": [ {"terms": ["regulat*", "legislat*"], "operator": "OR"} ] } 
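The wildcard rule can be sketched locally as follows. This is a rough model under stated assumptions, not the API's code: term_matches is a hypothetical helper, matching is assumed to be case-insensitive, and non-wildcard terms are compared against whole words as a stand-in for the default token-based mode.

```python
import re


def term_matches(term: str, text: str) -> bool:
    """Model the wildcard rule: strip the trailing * and search the stem as a substring."""
    haystack = text.lower()  # assumption: matching ignores case
    if term.endswith("*"):
        stem = term[:-1].lower()
        if len(stem) < 1:
            raise ValueError("wildcard stem must be at least 1 character")
        return stem in haystack
    # Assumption: a plain term must equal a whole word (stand-in for token mode).
    return term.lower() in re.findall(r"\w+", haystack)


print(term_matches("regulat*", "New regulatory framework"))
print(term_matches("regulat*", "Deregulation debate"))
print(term_matches("regulat", "New regulatory framework"))
```

Because the stem is searched as a substring, the second call also matches; the third does not, since the term has no wildcard.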

Exclusion keyword groups

Parameter: exclude_keyword_groups · object[] · optional · max 3 groups, 20 terms each

Filter out noise by excluding posts that match specific terms — applied after inclusion matching. Same structure as keyword_groups.

{ "keyword_groups": [ {"terms": ["crypto*"], "operator": "OR"} ], "exclude_keyword_groups": [ {"terms": ["sponsored", "ad", "promo*"], "operator": "OR"} ] } 

In our tests, exclusion alone cut irrelevant results by over 99% on noisy queries.
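The order of operations (inclusion first, then exclusion) can be sketched as a two-stage filter. The helper names term_hits and apply_filters are hypothetical, the groups are reduced to flat OR lists, and the word-level matching is an assumption made for illustration.

```python
import re


def term_hits(term: str, text: str) -> bool:
    """Wildcard terms match as substrings; plain terms match whole words (assumption)."""
    lowered = text.lower()
    if term.endswith("*"):
        return term[:-1].lower() in lowered
    return term.lower() in re.findall(r"\w+", lowered)


def apply_filters(posts: list[str], include_terms: list[str],
                  exclude_terms: list[str]) -> list[str]:
    """Keep posts that hit an inclusion term, then drop those hitting an exclusion term."""
    kept = []
    for post in posts:
        if not any(term_hits(term, post) for term in include_terms):
            continue  # failed inclusion matching
        if any(term_hits(term, post) for term in exclude_terms):
            continue  # removed by the exclusion group
        kept.append(post)
    return kept


posts = [
    "Crypto markets rally",
    "Sponsored: crypto promotion",
    "Weather update",
]
print(apply_filters(posts, ["crypto*"], ["sponsored", "ad", "promo*"]))
```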


Proximity search (NEAR/N)

Parameter: proximity_groups · object[] · optional · max 3 groups

Find posts where two terms appear within N words of each other (bidirectional, case-insensitive). Distance range: 1–10 words.

Field · Type · Description
term_a · string · First proximity term
term_b · string · Second proximity term
distance · integer · Max words between terms (1–10)

{ "keyword_groups": [ {"terms": ["inflation", "rate"], "operator": "OR"} ], "proximity_groups": [ {"term_a": "inflation", "term_b": "rate", "distance": 5} ] } 

This query returned 20,000+ posts, whereas exact phrase matching returned none, because the two words appear close together but not adjacent.

⚠️ Requires keyword_groups — the API returns 400 without it.
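A minimal sketch of what a NEAR/N check might look like locally. The function near is hypothetical, the tokenizer (runs of word characters) is an assumption, and distance is interpreted as the difference in word positions, so adjacent words are at distance 1; the API's exact counting may differ.

```python
import re


def near(text: str, term_a: str, term_b: str, distance: int) -> bool:
    """Return True if term_a and term_b occur within `distance` words, in either order."""
    if not 1 <= distance <= 10:
        raise ValueError("distance must be between 1 and 10")
    tokens = re.findall(r"\w+", text.lower())  # assumption: simple word tokenizer
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    # abs() makes the check bidirectional; i != j guards against term_a == term_b.
    return any(i != j and abs(i - j) <= distance for i in pos_a for j in pos_b)


text = "Inflation pushed the central bank rate higher"
print(near(text, "inflation", "rate", 5))
print(near(text, "inflation", "rate", 3))
```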


Location filtering

Parameter: locations · string[] · optional · max 20 items, 500 chars each

Filter by user-declared profile location. Case-insensitive substring matching with OR logic.

{ "locations": ["Poland", "Warszawa", "Warsaw"] } 
  • ~25% of posts carry location data (primarily x.com)

  • Empty strings and whitespace are automatically trimmed/ignored
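The rules above can be sketched as a client-side helper. Both function names are hypothetical, and applying the limits after trimming is an assumption; the API may check them in a different order.

```python
def normalize_locations(locations: list[str]) -> list[str]:
    """Trim whitespace, drop empty entries, then check the documented limits."""
    cleaned = [loc.strip() for loc in locations if loc.strip()]
    if len(cleaned) > 20:
        raise ValueError("locations accepts at most 20 items")
    if any(len(loc) > 500 for loc in cleaned):
        raise ValueError("each location is limited to 500 characters")
    return cleaned


def location_matches(profile_location: str, locations: list[str]) -> bool:
    """Case-insensitive substring match; any single hit is enough (OR logic)."""
    haystack = profile_location.lower()
    return any(loc.lower() in haystack for loc in locations)


wanted = normalize_locations(["Poland", "  ", "", " Warsaw ", "Warszawa"])
print(wanted)
print(location_matches("Warszawa, Polska", wanted))
```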


Profile & user metadata filtering

Parameter: profile_filters · object · optional · max 5 fields, 10 values per field

Filter on Twitter/X author profile attributes from the summary JSON column. Currently populated for x.com data only.

Field · Match type
user_description · Substring (case-insensitive)
profile_image_url · Substring (case-insensitive)
user_followers_count · Exact string match
user_following_count · Exact string match
user_created_at · Exact string match
user_verified · Exact ("true" / "false")
user_blue_verified · Exact ("true" / "false")

{ "keyword_groups": [ {"terms": ["bitcoin"], "operator": "OR"} ], "profile_filters": { "user_blue_verified": ["true"], "user_description": ["journalist", "reporter"] }, "domains": ["x.com"] } 

Multiple fields are AND-combined; multiple values within a field are OR-combined. Unknown field names are silently ignored.
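The combination rules (AND across fields, OR within a field, unknown fields ignored) can be modelled like this. profile_matches is a hypothetical helper, and the sketch assumes profile values are available as strings, which mirrors the exact string matching described above.

```python
SUBSTRING_FIELDS = {"user_description", "profile_image_url"}
EXACT_FIELDS = {
    "user_followers_count",
    "user_following_count",
    "user_created_at",
    "user_verified",
    "user_blue_verified",
}


def profile_matches(profile: dict, profile_filters: dict) -> bool:
    """Every known field must match (AND); any listed value may match (OR)."""
    for field, wanted in profile_filters.items():
        actual = str(profile.get(field, ""))  # assumption: values compared as strings
        if field in SUBSTRING_FIELDS:
            matched = any(value.lower() in actual.lower() for value in wanted)
        elif field in EXACT_FIELDS:
            matched = actual in wanted
        else:
            continue  # unknown field names are silently ignored
        if not matched:
            return False
    return True


profile = {
    "user_description": "Freelance journalist covering markets",
    "user_blue_verified": "true",
}
filters = {
    "user_blue_verified": ["true"],
    "user_description": ["journalist", "reporter"],
}
print(profile_matches(profile, filters))
```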


🔧 Improved

Username matching

Parameter: case_sensitive_usernames · boolean · optional · default: false

Username matching is now case-insensitive by default (e.g. @RuchNarodowy and @ruchnarodowy both match). Set case_sensitive_usernames: true to enforce exact-case matching (legacy behavior).
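The toggle can be pictured with a small comparison helper. username_matches is hypothetical, and both stripping the leading @ and using casefold() for the case-insensitive comparison are assumptions about how handles are normalized.

```python
def username_matches(candidate: str, wanted: str,
                     case_sensitive_usernames: bool = False) -> bool:
    """Compare two handles, ignoring case unless case_sensitive_usernames is set."""
    left, right = candidate.lstrip("@"), wanted.lstrip("@")
    if case_sensitive_usernames:
        return left == right  # legacy exact-case behavior
    return left.casefold() == right.casefold()


print(username_matches("@RuchNarodowy", "@ruchnarodowy"))
print(username_matches("@RuchNarodowy", "@ruchnarodowy", case_sensitive_usernames=True))
```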


Exact phrase matching with full_string_scan

Parameter: full_string_scan · boolean · default: false

full_string_scan is now documented with clearer guidance. When enabled, it performs true substring/phrase matching, which is useful for character-level precision (e.g. distinguishing "17 year-old" from "17-year-old"). When disabled, fast mode uses token-based index acceleration (Bloom filters), which handles 99% of use cases.
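The difference between the two modes can be illustrated with a toy comparison. This is not how the index works internally (no Bloom filter is modelled); phrase_match is a hypothetical helper, the tokenizer is assumed to split on non-word characters, and both modes are assumed to ignore case.

```python
import re


def tokens(value: str) -> list[str]:
    """Assumed tokenizer: lowercase runs of word characters."""
    return re.findall(r"\w+", value.lower())


def phrase_match(phrase: str, text: str, full_string_scan: bool = False) -> bool:
    """Token-sequence match in fast mode, character-level substring match otherwise."""
    if full_string_scan:
        return phrase.lower() in text.lower()
    wanted, found = tokens(phrase), tokens(text)
    return any(found[i:i + len(wanted)] == wanted
               for i in range(len(found) - len(wanted) + 1))


text = "a 17-year-old student"
print(phrase_match("17 year-old", text))                         # token mode
print(phrase_match("17 year-old", text, full_string_scan=True))  # substring mode
```

Under these assumptions, token mode treats the two spellings as the same word sequence, while the full scan tells them apart.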


📋 New Parameters Summary

Parameter · Type · Default · Scope
group_operator · string · "AND" · How keyword groups combine with each other
exclude_keyword_groups · object[] · null · Up to 3 groups of terms to exclude
proximity_groups · object[] · null · NEAR/N term proximity matching
locations · string[] · null · User-declared location substring filter
profile_filters · object · null · x.com profile metadata filtering
case_sensitive_usernames · boolean · false · Toggle case-sensitive username matching
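The limits stated in this release can be checked on the client before sending a request. The sketch below is a hypothetical pre-flight helper (validate_payload is not part of the API), and the error messages are illustrative; the server's own validation remains authoritative.

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found against the limits documented in this release."""
    errors = []
    if payload.get("group_operator", "AND") not in ("AND", "OR"):
        errors.append("group_operator must be AND or OR")

    exclude = payload.get("exclude_keyword_groups") or []
    if len(exclude) > 3:
        errors.append("exclude_keyword_groups allows at most 3 groups")
    if any(len(group.get("terms", [])) > 20 for group in exclude):
        errors.append("exclude_keyword_groups allows at most 20 terms per group")

    proximity = payload.get("proximity_groups") or []
    if proximity and not payload.get("keyword_groups"):
        errors.append("proximity_groups requires keyword_groups")
    if len(proximity) > 3:
        errors.append("proximity_groups allows at most 3 groups")
    if any(not 1 <= group.get("distance", 0) <= 10 for group in proximity):
        errors.append("proximity distance must be between 1 and 10")

    locations = payload.get("locations") or []
    if len(locations) > 20 or any(len(loc) > 500 for loc in locations):
        errors.append("locations allows at most 20 items of 500 characters each")

    profile = payload.get("profile_filters") or {}
    if len(profile) > 5 or any(len(values) > 10 for values in profile.values()):
        errors.append("profile_filters allows at most 5 fields with 10 values each")
    return errors


bad = {"proximity_groups": [{"term_a": "inflation", "term_b": "rate", "distance": 5}]}
print(validate_payload(bad))
```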


⚠️ No Breaking Changes

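All new parameters are optional with sensible defaults. Existing queries continue to work without modification. Note that username matching is now case-insensitive by default; set case_sensitive_usernames: true to keep the legacy exact-case behavior.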


📖 Full docs: exordelabs.com/developer-docs/data-export