1. Requirements Clarification
Functional Requirements
- Return top-5 search suggestions for any given prefix
- Suggestions ranked by query frequency (most-searched first)
- Support incremental updates as new queries arrive
- Filter profanity and banned terms from suggestions
- Support multiple languages and character sets
Non-Functional Requirements
- Latency: <100ms end-to-end; <20ms server-side processing
- Scale: 10 billion searches per day on Google-scale; even mid-tier apps handle 100M queries/day
- Availability: 99.99% uptime — autocomplete failure degrades UX but is not a P0
- Data freshness: Suggestions updated within 1 hour of trending changes
2. Trie Data Structure
A Trie (prefix tree) is a tree data structure where each edge represents one character. Traversing from root to a node spells out a prefix. All words sharing a common prefix share the same path through the tree.
Words: "search", "searchable", "seal", "season"
root
│
s
│
e
/ \
a a
│ │
r l (seal)
│ │
c s (seals)
│
h (search)
│
a
│
b
│
l
│
e (searchable)
Each node also stores: top-5 suggestions for this prefix
Node "sea": [{seal:5000}, {search:3000}, {season:2000}, {seabird:800}, {seafood:700}]
Node "sear": [{search:3000}, {searchable:1200}, {searcher:400}]
Storing top-K suggestions at each node is the key optimization. Without it, every query would need to traverse the entire subtree to collect and sort all matching words — O(number of words with this prefix). With pre-computed top-K at each node, lookup is O(prefix_length).
3. Frequency Scoring and Updates
Suggestions are ranked by query frequency. Two update strategies exist:
| Strategy | How It Works | Latency to Update | Complexity | Recommended For |
|---|---|---|---|---|
| Batch Rebuild | Aggregate query logs daily; rebuild full Trie from scratch | Up to 24 hours | Low | Most production systems |
| Real-Time Update | Update Trie node counts on every search query | <1 second | High (locking, consistency) | Trending/news apps |
| Micro-Batch | Aggregate last 10 minutes of logs; update Trie every 10 min | 10 minutes | Medium | Balance of freshness and complexity |
For most systems, batch rebuild (daily) is sufficient. The Trie is rebuilt offline from aggregate query logs (stored in a data warehouse or Hadoop) and then swapped in atomically. The live Trie continues serving traffic while the new one is being built.
4. Distributed Trie — Sharding Strategy
A single-server Trie for billions of queries won't scale. Shard by the first two characters of the prefix so all queries with the same two-character prefix go to the same server.
Client: user types "sea..."
│
▼
Routing Layer
maps first 2 chars "se" → Shard 3
│
▼
Shard 3 (handles: sa, sb, ..., se, sf, ..., sz)
contains the full Trie subtree for all words starting with 's'
│
▼
Returns top-5 for "sea": [seasonal, search, seattle, seabird, seal]
With 10 shards and 676 two-character prefixes (a-z × a-z), each shard handles ~67 prefixes. Some shards will be hotter than others (words starting with "th", "in", "co" are more frequent in English) — use weighted sharding or move hot prefixes to dedicated servers.
Interview Tip — Shard by First Character First
Start simple: shard by first character (26 shards for a-z). Only add two-character sharding if you prove a single letter's data exceeds one server's memory. Premature over-engineering is a red flag in interviews — start with the simplest design that meets requirements, then scale.
5. Caching Strategy
Autocomplete queries follow a Zipf distribution — a small number of prefixes account for the vast majority of traffic. Cache aggressively at multiple layers.
- Browser cache: Cache last 100 queries per user in localStorage. Instant suggestions for repeated queries without any network call.
- CDN cache: Cache responses for common prefixes at CDN edge nodes (TTL 1 hour). Works for non-personalized suggestions.
- Redis cache: Cache top-K suggestions per prefix in Redis. TTL = 10 minutes. Key:
autocomplete:prefix:{prefix}. A 1GB Redis cache holds ~500K unique prefix results. - In-process cache: Each Trie server keeps a local LRU cache of the last 10K queries. Eliminates Trie traversal for hot prefixes.
6. Query Pipeline — End to End
7. Trie vs Elasticsearch for Autocomplete
| Feature | Custom Trie | Elasticsearch Completion Suggester |
|---|---|---|
| Latency | ~1–5ms | ~5–15ms |
| Fuzzy matching (typos) | No (exact prefix only) | Yes (fuzziness parameter) |
| Filtering by category | Complex custom logic | Built-in with context suggesters |
| Engineering effort | High (build, shard, monitor) | Low (managed service) |
| Memory usage | Efficient (only top-K per node) | Higher (FST-based structure) |
| Real-time updates | Requires custom pipeline | Near real-time index updates |
| Best for | Google-scale, custom scoring | Most production applications |
How We Research and Update This Guide
We test the underlying formula or workflow, compare outputs with reliable references, and revise examples whenever the page content changes.
- The workflow or formula is tested directly in the tool and compared against independent reference examples.
- Examples are kept practical so readers can verify the result without hidden assumptions.
- Pages are revised whenever the interface, calculation flow, or surrounding guidance materially changes.
Frequently Asked Questions — Search Autocomplete System Design
A Trie (prefix tree) is a tree where each node represents one character. The path from root to a node spells out a prefix. Searching for suggestions for prefix "sea" traverses: root → s → e → a, then collects all words in the subtree below the "a" node. Each node stores the top-K suggestions for its prefix (pre-computed) to avoid traversing the entire subtree on every query. This gives O(P + K) lookup time where P = prefix length and K = number of suggestions returned.
Frequency scoring is the baseline: rank by how often each term was searched. Score = log(query_count) * recency_boost * personalization_boost. Recency boost decays older queries: boost = e^(-λ × days_since_last_search). Personalization: promote terms the user has searched before. Each Trie node stores only the top-K (e.g. top-5) suggestions by score, pruned at build time. During online updates, use a sliding window counter (last 7 days) rather than all-time count to keep suggestions fresh.
Users expect suggestions to appear in under 100ms after each keystroke — ideally under 50ms. This constrains every layer: the Trie lookup must be in-memory (not a database query), the result must be cached aggressively, and the network round-trip must be minimized. In practice, teams add a 300ms debounce on the frontend (don't send a request on every keystroke — wait 300ms of inactivity) which gives the backend more slack. Even so, the server-side target is typically <20ms.
Shard by the first two characters of the prefix. All queries starting with "se" go to shard 1, "sh" to shard 2, etc. With 26 × 26 = 676 possible two-character prefixes and, say, 10 shards, each shard handles ~67 prefixes. A routing layer maps prefix → shard using a static hash table (not consistent hashing — prefix mapping is deterministic). When rebuilding the Trie (daily batch job), rebuild only the affected shards for changed prefixes.
Maintain a blocklist of prohibited terms (profanity, hate speech, competitor names). Filter at two points: (1) at Trie build time — exclude blocked terms from the frequency counters so they never enter the Trie; (2) at query time — after retrieving top-K suggestions, apply a fast regex or hash-set check against the blocklist. The blocklist is small enough to keep in memory. For spam queries (artificially inflated by bots), apply rate limiting and anomaly detection on the query logging pipeline before counts feed into the Trie.
Elasticsearch's completion suggester is a production-ready alternative to a custom Trie. Use Elasticsearch when: you need fuzzy matching (handle typos), multi-language support, or complex filtering (e.g. autocomplete for products within a category). A custom Trie gives lower latency (~1ms vs ~5ms for ES) and full control over scoring, but requires significant engineering effort to build and operate. For most companies, Elasticsearch is the right default. Build a custom Trie only when you have proven scale requirements and dedicated search engineering resources.