Prefix Autocomplete with Edge-Ngrams

The single most common way to break edge-ngram autocomplete is to apply the same analyzer at index and search time. When you do, the user’s three-character input is itself exploded into prefix tokens, the shortest of those tokens matches a large share of the corpus, latency climbs, and the dropdown fills with noise. This guide implements prefix autocomplete correctly in Elasticsearch with a custom edge_ngram token filter, an asymmetric index-versus-search analyzer split, and disciplined min_gram/max_gram tuning. It is the analyzer-level recipe behind the broader query autocomplete & suggestions guide and the Search Frontend & UX Patterns area, and it is the right choice when you need term-prefix matching inside text fields with normal scoring and filtering rather than a curated, in-memory suggestion list.

Prerequisites

Elasticsearch 8.x or OpenSearch 2.x at localhost:9200.
A new index you can create from scratch (analyzers are immutable on existing fields; changing them requires a reindex).
curl and jq.

Diagnosis / Context

Edge-ngrams generate, for each term, every prefix from min_gram to max_gram characters and store them as ordinary tokens in the inverted index. Indexing search with min_gram: 2, max_gram: 20 writes the tokens se, sea, sear, searc, search. A user typing sea then matches because sea is a literal token. That part is correct and desirable.
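
The token generation described above can be sketched in a few lines of Python. This is a simplified stand-in, not Elasticsearch's implementation: it assumes whitespace splitting in place of the standard tokenizer, and the function names edge_ngrams and analyze_index are illustrative only.

```python
def edge_ngrams(term: str, min_gram: int = 2, max_gram: int = 20) -> list[str]:
    """Return every prefix of `term` from min_gram up to max_gram characters.

    Terms shorter than min_gram yield no tokens at all.
    """
    longest = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, longest + 1)]


def analyze_index(text: str, min_gram: int = 2, max_gram: int = 20) -> list[str]:
    """Rough approximation of: tokenize -> lowercase -> edge_ngram."""
    tokens: list[str] = []
    for word in text.lower().split():  # assumption: whitespace split, not the real tokenizer
        tokens.extend(edge_ngrams(word, min_gram, max_gram))
    return tokens


print(edge_ngrams("search"))
# ['se', 'sea', 'sear', 'searc', 'search']
print(analyze_index("Search Engine"))
# ['se', 'sea', 'sear', 'searc', 'search', 'en', 'eng', 'engi', 'engin', 'engine']
```

Every one of those strings becomes an ordinary entry in the inverted index, which is why a literal lookup of sea is enough to find the document.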

The failure is on the search side. If the field’s search_analyzer is the same edge-ngram analyzer, Elasticsearch tokenizes the query sea into se, sea — and the low-signal se token matches seattle, season, secret, and thousands of unrelated documents. Worse, because match defaults to an OR over the query tokens, any of those expanded prefixes is enough to pull a document into the result set, and each one is scored, sorted, and paginated. You see this as a slow query returning far too many hits:

GET /products/_search { "query": { "match": { "title": "sea" } } }
# took: 412ms, hits.total.value: 38194   <-- a three-letter prefix should not match 38k docs

The fix is asymmetry: edge-ngram only at index time, a plain lowercase analyzer at search time. Then the query sea stays the single token sea and matches only documents that contain the indexed prefix token sea — which is exactly the set of documents with a word beginning sea. This is why edge-ngrams are the right tool when you want term-prefix matching inside text bodies with ordinary BM25 scoring and filter clauses, and the wrong tool when you want a curated, weighted suggestion list (use the completion suggester for that, as the parent guide explains).
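
To make the difference concrete, here is a toy in-memory model of the two configurations. It is a sketch under simplifying assumptions (whitespace tokenization, OR semantics for match, no scoring), and the helper names and the four sample titles are hypothetical rather than part of any Elasticsearch API.

```python
from collections import defaultdict

MIN_GRAM, MAX_GRAM = 2, 20


def edge_ngrams(term, min_gram=MIN_GRAM, max_gram=MAX_GRAM):
    return [term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)]


def index_analyzer(text):
    # lowercase, split on whitespace, then explode each word into prefixes
    return [gram for word in text.lower().split() for gram in edge_ngrams(word)]


def plain_analyzer(text):
    # lowercase and split only: the query stays one token per word
    return text.lower().split()


def build_index(docs):
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in index_analyzer(text):
            postings[token].add(doc_id)
    return postings


def match(postings, query, search_analyzer):
    # OR over the analyzed query tokens, mirroring match's default operator
    hits = set()
    for token in search_analyzer(query):
        hits |= postings.get(token, set())
    return sorted(hits)


docs = ["search engine", "seattle mariners cap", "seasonal jacket", "secret santa mug"]
postings = build_index(docs)

symmetric = match(postings, "sear", index_analyzer)   # query becomes se, sea, sear
asymmetric = match(postings, "sear", plain_analyzer)  # query stays sear

print([docs[i] for i in symmetric])
# ['search engine', 'seattle mariners cap', 'seasonal jacket', 'secret santa mug']
print([docs[i] for i in asymmetric])
# ['search engine']
```

With the symmetric setup the se token alone drags in every title, while the asymmetric setup returns only the document that actually contains a word starting with sear.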

A second, subtler reason the split matters is cost. Edge-ngrams already inflate the posting lists severalfold: each term contributes up to (max_gram − min_gram + 1) tokens instead of one, and fewer when the term is shorter than max_gram. If you also edge-ngram the query, you pay that explosion twice: once on disk and once per request as the query analyzer fans a short input into many clauses. Keeping the search analyzer plain caps query-side work at one token per word regardless of how aggressive your index-time ngramming is, which decouples the two tuning decisions cleanly.
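
The arithmetic is easy to check by hand. The sketch below counts tokens per term and query clauses per request under the same simplified model (whitespace-split words, no stopwords); both function names are illustrative.

```python
def tokens_per_term(term_len: int, min_gram: int = 2, max_gram: int = 20) -> int:
    """Number of edge-ngram tokens one term of the given length produces."""
    if term_len < min_gram:
        return 0  # too short to yield any prefix token
    return min(term_len, max_gram) - min_gram + 1


def query_clause_count(query: str, min_gram: int = 2, max_gram: int = 20,
                       ngram_at_search: bool = False) -> int:
    """How many tokens the query analyzer hands to the match query."""
    words = query.lower().split()
    if not ngram_at_search:
        return len(words)  # plain search analyzer: one token per word
    return sum(tokens_per_term(len(w), min_gram, max_gram) for w in words)


print(tokens_per_term(len("search")))                                  # 5
print(query_clause_count("red running shoes", ngram_at_search=True))   # 12
print(query_clause_count("red running shoes"))                         # 3
```

Three typed words become twelve OR-ed clauses when the query is ngrammed and stay at three when it is not, independent of what max_gram is set to on the index side.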

Solution Steps

1. Create the index with split analyzers

PUT /products
{
  "settings": {
    "index": { "max_ngram_diff": 20 },           // optional: this limit applies to ngram, not edge_ngram
    "analysis": {
      "filter": {
        "edge_ngram_filter": {
          "type": "edge_ngram",
          "min_gram": 2,                          // skip 1-char prefixes: too low-signal, huge posting lists
          "max_gram": 20                          // cover the longest term you expect to prefix-match
        }
      },
      "analyzer": {
        "autocomplete_index": {                   // INDEX time: explode terms into prefixes
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "edge_ngram_filter"]
        },
        "autocomplete_search": {                  // SEARCH time: plain, no ngramming
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
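      "title": {
        "type": "text",
        "analyzer": "autocomplete_index",
        "search_analyzer": "autocomplete_search", // the load-bearing line
        "fields": {
          "keyword": { "type": "keyword" }        // exact-value subfield, used for the boost in step 4
        }
      },
      "in_stock": { "type": "boolean" }           // filterable flag, used in step 4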
    }
  }
}

2. Index sample documents

curl -s -X POST 'localhost:9200/products/_bulk?refresh' -H 'Content-Type: application/json' --data-binary '
{"index":{}}
{"title":"search engine","in_stock":true}
{"index":{}}
{"title":"seattle mariners cap","in_stock":true}
{"index":{}}
{"title":"seasonal jacket","in_stock":false}
'

3. Query with the user’s raw input

Send the partial input straight through a match query. Because the field’s search_analyzer is plain, the input is not re-ngrammed; it matches the indexed prefix tokens directly. There is no special query type to learn — the asymmetry in the mapping does all the work.

curl -s 'localhost:9200/products/_search' -H 'Content-Type: application/json' -d '{
  "query": { "match": { "title": "sear" } }
}' | jq '.hits.hits[]._source.title'
# => "search engine"   (only the document whose term starts with "sear")

4. Constrain and rank like any other field

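Because the matches land in the ordinary inverted index, you keep the full query DSL. Put exact-match constraints in a filter clause so they run in filter context, which skips scoring and lets Elasticsearch cache frequently used filters, and add a bool should clause on an exact-keyword subfield so a document whose whole title equals the input outranks a mere prefix match. The query below assumes the mapping defines a title.keyword subfield and that documents carry an in_stock boolean; without them the should clause never matches and the filter excludes every document. This is the payoff of the edge-ngram approach over an in-memory suggester: prefix completion composes with filtering, boosting, and pagination without any extra machinery.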

curl -s 'localhost:9200/products/_search' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must":   [{ "match": { "title": "sear" } }],
      "filter": [{ "term": { "in_stock": true } }],
      "should": [{ "match": { "title.keyword": "search engine" } }]
    }
  }
}' | jq '.hits.hits[] | {title: ._source.title, score: ._score}'

Verification

Prove the asymmetry directly with the _analyze API — the same text must tokenize differently under each analyzer.

# INDEX analyzer explodes into prefixes:
curl -s 'localhost:9200/products/_analyze' -H 'Content-Type: application/json' \
  -d '{"analyzer":"autocomplete_index","text":"search"}' | jq -c '[.tokens[].token]'
# => ["se","sea","sear","searc","search"]

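# SEARCH analyzer leaves the query intact:
curl -s 'localhost:9200/products/_analyze' -H 'Content-Type: application/json' \
  -d '{"analyzer":"autocomplete_search","text":"sea"}' | jq -c '[.tokens[].token]'
# => ["sea"]

# The field must actually reference it (_analyze with "field" resolves the INDEX analyzer, so read the mapping):
curl -s 'localhost:9200/products/_mapping' | jq '.products.mappings.properties.title.search_analyzer'
# => "autocomplete_search"

If the second command returns ["se","sea"], the autocomplete_search analyzer itself is misdefined. If the third prints null or "autocomplete_index", the search_analyzer is not wired up and you have the match-explosion bug. Expected healthy state: the index analyzer emits prefixes, the search analyzer emits exactly one token per query word, and the title field names autocomplete_search as its search_analyzer.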

As a regression guard, capture the hit count for a known short prefix and assert it stays bounded. A healthy sea query should return on the order of the documents that genuinely contain a word starting with sea, not a double-digit percentage of the corpus.

curl -s 'localhost:9200/products/_search?filter_path=hits.total' -H 'Content-Type: application/json' \
  -d '{"size":0,"track_total_hits":true,"query":{"match":{"title":"sea"}}}' | jq '.hits.total.value'
# Healthy: a small, explainable number. If this is in the tens of thousands, the search analyzer regressed.

Run this assertion in CI against a seeded fixture index so an accidental mapping change — someone dropping search_analyzer during a refactor — fails the build instead of shipping a latency regression to production.
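
One way such a CI check could look is sketched below. It is an assumption-laden example rather than a prescribed harness: the 5% ceiling, the function names, and the default index and field are all hypothetical, and count_hits_http simply replays the request shown above through Python's standard library. The guard takes the counting function as a parameter so it can be exercised without a cluster.

```python
import json
import urllib.request


def count_hits_http(prefix, base_url="http://localhost:9200", index="products", field="title"):
    """Return hits.total.value for a match query on `prefix` (needs a running cluster)."""
    body = json.dumps({
        "size": 0,
        "track_total_hits": True,
        "query": {"match": {field: prefix}},
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["hits"]["total"]["value"]


def assert_prefix_hits_bounded(count_hits, prefix, corpus_size, max_fraction=0.05):
    """Fail if `prefix` matches more than max_fraction of the corpus.

    max_fraction is an illustrative threshold; derive a real one from your fixture data.
    """
    hits = count_hits(prefix)
    limit = corpus_size * max_fraction
    if hits > limit:
        raise AssertionError(
            f"prefix {prefix!r} matched {hits} docs (limit {limit:.0f}); "
            "check that search_analyzer is still set on the field"
        )
    return hits
```

In a test suite you would call assert_prefix_hits_bounded(count_hits_http, "sea", corpus_size=...) against the seeded fixture index, with corpus_size taken from the fixture itself.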

5. Tune min_gram and max_gram against your corpus

The two gram bounds are the main knobs with real index-size and recall consequences, and they trade off directly. min_gram sets the shortest prefix you index: lower it and a user gets matches after fewer keystrokes, but the posting lists for one- and two-character prefixes balloon and their selectivity drops. max_gram sets the longest prefix you index: a query term longer than max_gram characters matches nothing on this field, because no token that long was ever written.

Pick min_gram to equal the minimum input length your frontend will send a request for: if the UI waits for two characters, min_gram: 2 wastes nothing. Pick max_gram to cover your longest realistically-prefixed term; for product titles 20 is generous, for short SKUs 10 may suffice. Note that index.max_ngram_diff (default 1) caps the min/max span only for the ngram tokenizer and token filter, not for edge_ngram, so the value set in the create call is optional here and matters only if you later add a plain ngram filter to the same index.

# Inspect the index's primary store size in bytes; a sudden jump after lowering
# min_gram signals prefix-token bloat rather than new content.
curl -s 'localhost:9200/products/_stats/store,docs?filter_path=indices.*.primaries.store.size_in_bytes' \
  -H 'Content-Type: application/json' | jq .

A practical default for English title autocomplete is min_gram: 2, max_gram: 20. Measure store size before and after any change; if dropping min_gram from 3 to 2 doubles the index’s footprint for marginal UX gain, keep it at 3 and have the frontend hold the request one keystroke longer.
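
Before reindexing, a rough offline estimate can show how much each min_gram candidate multiplies the token count. The sketch below is only a proxy for store size, under the same assumptions as before (whitespace-split words, lowercase only), and indexed_token_count is a hypothetical helper name.

```python
def indexed_token_count(titles, min_gram, max_gram):
    """Total edge-ngram tokens the given titles would produce."""
    total = 0
    for title in titles:
        for word in title.lower().split():
            if len(word) >= min_gram:
                total += min(len(word), max_gram) - min_gram + 1
    return total


sample = ["search engine", "seattle mariners cap", "seasonal jacket"]
for min_gram in (1, 2, 3):
    print(min_gram, indexed_token_count(sample, min_gram, 20))
# 1 44
# 2 37
# 3 30
```

Run it over a representative sample of real titles; the ratio between two settings is more informative than the absolute numbers, and the store-size measurement above remains the final arbiter.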

Common Pitfalls

min_gram set to 1 floods the index and the dropdown

A min_gram of 1 emits a single-character token for every term, so one-letter input matches essentially the whole corpus and the posting lists for a, e, s become enormous. Set min_gram: 2 or 3 and let the frontend withhold the request until the user has typed that many characters. Verify the index’s store size did not explode with GET /products/_stats/store.

Queries longer than max_gram silently return nothing

Edge-ngrams only exist up to max_gram characters. With max_gram: 10, indexing internationalization produces no token longer than internatio, so a user typing international (13 chars) matches nothing even though the document exists. Set max_gram to your longest realistic term. The span is not limited by index.max_ngram_diff (default 1): that setting constrains only the ngram tokenizer and token filter, so a wide edge_ngram range is accepted with or without it.
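
The cut-off is easy to reproduce in a small model. The sketch below assumes a plain search analyzer (the query stays a single lowercase token), and prefix_matches is an illustrative helper, not an Elasticsearch call.

```python
def edge_ngrams(term, min_gram, max_gram):
    return [term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)]


def prefix_matches(indexed_term, query, min_gram, max_gram):
    """True if the un-ngrammed query equals one of the indexed prefix tokens."""
    return query.lower() in edge_ngrams(indexed_term.lower(), min_gram, max_gram)


term = "internationalization"
print(prefix_matches(term, "internatio", 2, 10))      # True: exactly max_gram characters
print(prefix_matches(term, "international", 2, 10))   # False: 13 characters, no such token
print(prefix_matches(term, "international", 2, 20))   # True once max_gram covers it
```

The second call is the silent failure: the document is indexed, but no token as long as the query was ever written.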

Multi-word phrase order is lost

A plain match on an edge-ngram field treats words independently and ORs them by default, so red shoe and shoe red score alike, and every word, not only the last, is matched as a prefix. When phrase order and last-token prefixing matter, switch the query to match_phrase_prefix, or move to the search_as_you_type field type described in the parent guide. Edge-ngrams are strongest for single-term prefix matching, not ordered phrase completion. Stopwords and synonyms also shift which prefixes resolve, so keep them consistent across both analyzers, as covered in synonym & stopword management.

Query autocomplete & suggestions — the parent guide comparing edge-ngrams against the completion suggester and search_as_you_type.
Search-as-you-type interfaces — the field type to reach for when phrase order and filtering matter more than term prefixes.
Elasticsearch fundamentals for engineers — analyzer chains, tokenizers, and the inverted index this recipe depends on.