How Meituan Cut Elasticsearch Search Latency by 84% with an RLE‑Based Inverted Index
This article details Meituan's search‑engine team optimization of Elasticsearch for a high‑traffic LBS scenario, describing the performance bottlenecks in term‑posting retrieval and merging, the design of a run‑length‑encoding (RLE) inverted index, its integration as a plugin, extensive benchmarking, and the resulting 84% reduction in TP99 query latency.
Meituan's food‑delivery search team faced growing latency and CPU load as the number of searchable merchants rose to tens of thousands per query. The primary hotspots were the inverted‑list (posting) retrieval and merge phases of Elasticsearch's term queries, which became CPU‑intensive at scale.
Problem Analysis
In the LBS‑driven workflow, a search request translates to a POST food/_search query with a massive terms filter containing up to hundreds of thousands of merchant IDs. While each term lookup is fast (<0.001 ms), the sheer volume (10 000+ terms) makes the total retrieval cost 5‑10 ms per request, and the subsequent merge of millions of postings dominates CPU usage.
Technical Exploration
The team examined Elasticsearch/Lucene's design: term dictionaries are stored as FSTs, posting lists are read from disk, decoded, and merged using a DocIdSetBuilder that iterates over each doc ID. This process has linear complexity O(K·M) where M is the number of terms and K the average posting length.
To address the two CPU hotspots—posting retrieval and posting merge—the following solutions were evaluated:
Replace FST lookup with a hash‑map (e.g., LongObjectHashMap) for O(1) term lookup.
Adopt RoaringBitmap as the underlying structure for posting sets.
Leverage Index Sorting to make doc IDs contiguous, turning sparse merges into dense sequential operations.
Design a custom Run‑Length Encoding (RLE) posting format that stores each posting list as [start, length] when IDs are contiguous.
Benchmarks showed that LongObjectHashMap offered the fastest term lookup, while an RLE‑based dense RoaringBitmap reduced merge time dramatically (TP99 from 13 ms to 0.5 ms for 10 000 postings of length 100).
RLE‑Based Posting Format
After applying index sorting on city_id, geohash, and poi_id, each merchant’s postings become a strictly increasing integer sequence. Encoding this sequence with RLE compresses it to two integers ( start and length), cutting memory from several kilobytes to a few bytes.
class SparseRoaringDocIdSet {
int[] index; // bucket indices that contain data
Container[] denseSets; // tightly packed posting containers
}The SparseRoaringDocIdSet records only non‑empty buckets, eliminating the overhead of empty containers during merges.
Integration into Elasticsearch
The solution was packaged as a non‑intrusive plugin:
Hook into the internal engine to load existing segments and replace their posting formats with the RLE‑based format.
Generate RLE posting tables for configured fields and associate them with segments.
Provide a new query builder that falls back to the original path when RLE data is unavailable, ensuring safe degradation.
Performance Gains
After deploying the plugin to production, the full‑stack query latency (TP99) dropped from 100 ms to 16 ms—a reduction of 84%. CPU consumption on data nodes also decreased significantly, as shown in the following charts:
Conclusion and Outlook
The RLE‑based inverted index, combined with hash‑map term lookup, dense RoaringBitmap storage, and index sorting, eliminated the dominant CPU hotspots in Meituan’s food‑delivery search service, achieving an 84% latency reduction and lower CPU usage. The approach is applicable to other large‑scale Elasticsearch deployments, especially where term posting lists become dense after sorting. Future work includes automatic container selection (bitmap/array/RLE) based on data distribution and persisting RLE postings at index‑write time to avoid runtime recompression.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
