In-Memory Inverted Index Compression Algorithms: Overview and MILC Optimization for High‑Performance Search
This article reviews major in‑memory inverted index compression techniques such as PForDelta, PEF, and MILC, explains their principles and trade‑offs, and details practical optimizations applied at 58.com to achieve query performance comparable to uncompressed indexes while reducing memory usage by about 35 percent.
In search engines, inverted indexes are a core data structure for efficient retrieval, and their size becomes a bottleneck on large data sets, prompting the development of compression techniques that reduce disk I/O, memory consumption, and CPU‑cache traffic.
Research on inverted index compression spans nearly five decades; early work focused on maximizing compression ratio to improve disk I/O, while modern systems, which keep index data resident in memory, prioritize decoding speed and query performance.
In the current 58.com search environment, hot index data typically resides in memory, making memory usage the primary resource constraint; therefore, compression aims to lower memory footprint and balance resource utilization.
This paper first surveys several representative compression algorithms suitable for memory‑resident scenarios, then concentrates on one algorithm—MILC—detailing its design, implementation, and optimizations for the 58.com search workload.
Overview of Inverted Index Compression Algorithms
An inverted index maps each term to an ordered list of document IDs (posting list). Since posting lists dominate index size, compression essentially reduces ordered integer sequences.
Beyond compression ratio, a technique must support fast retrieval operations such as intersection, union, and difference of posting lists; in 58.com, intersection is the most common operation, requiring efficient location of a given document ID within a list.
We surveyed industry algorithms including the classic PForDelta and its derivatives (NewPFD, OptPFD), the SIGIR‑2014 best‑paper Partitioned Elias‑Fano (PEF), and the 2017 MILC algorithm.
PForDelta
PForDelta (2005) compresses posting lists by dividing them into fixed‑size chunks (e.g., 128 IDs), storing the first ID and the deltas of subsequent IDs. Deltas are encoded with a fixed bit‑width; outlier deltas are stored separately as exceptions.
While simple, PForDelta’s delta‑based decoding requires scanning all previous deltas, leading to roughly double the query time compared to an uncompressed list.
PEF: Partitioned Elias‑Fano
PEF builds on the 1974 Elias‑Fano encoding, separating high bits (bucket index) and low bits (offset). It stores bucket sizes using unary codes and concatenates low bits, enabling O(1) bucket location and fast linear search within a bucket. PEF achieves better compression and intersection speed than OptPFD.
The authors provide an open‑source implementation ( https://github.com/ot/partitioned_elias_fano ) and presentation slides.
Experimental results show PEF reduces posting‑list space by ~65 % while increasing query time by only ~36 %.
MILC: Inverted List Compression in Memory
MILC targets high query performance. Like PForDelta it uses chunking, but encodes each chunk with offset‑based values (difference from the chunk’s first element) stored at a uniform bit‑width, avoiding delta‑based accumulation.
Decoding a value involves a binary search in the metadata (jump) table to locate the chunk, then a binary search within the chunk using the offset as the key.
For example, decoding the first offset of a chunk stored in a 32‑bit array A[0] & 0x03FF; a value spanning two words is decoded as (A[0] >> 30) | ((A[1] & 0x00FF) << 2).
Because MILC does not depend on preceding values, it supports random access and binary search, unlike PForDelta which must scan sequentially.
To improve compression, MILC introduces dynamic chunking and a second‑level chunking scheme.
Dynamic Chunking
Fixed‑size chunks can produce large offsets for unevenly distributed IDs. Dynamic chunking groups numerically close IDs, reducing the maximum offset and thus the required bit‑width.
Figure 2 (from the original paper) illustrates how dynamic chunking reduces total bits from 2432 bits to 2088 bits for a sample list.
Designing dynamic chunks balances the number of chunks against bit‑width; the jump table size (λ) limits the maximum chunk size to 2^λ. MILC’s metadata is 80 bits, yielding a maximum of 160 IDs per chunk.
The algorithm uses dynamic programming: an auxiliary array records the optimal chunking cost for each position, evaluates 160 possible chunk sizes, and back‑tracks to obtain the optimal partition in O(n) time.
Second‑Level Chunking
After the first level, each chunk’s offset sequence can be further subdivided, creating a hierarchy of jump tables and data blocks, simplifying decoding at the cost of additional metadata.
Practical Optimizations and Deployment at 58.com
Initially a simplified MILC with dynamic chunking (no second‑level chunking) was implemented. Performance analysis revealed that decoding offsets dominated query time due to multiple bit‑wise operations.
Optimizations included storing offsets as 8‑, 16‑, or 32‑bit integers to eliminate complex bit‑extractions and enable SIMD‑accelerated linear search, which, despite higher average complexity than binary search, is faster for the small chunk sizes typical in practice.
The jump‑table binary search was replaced with Galloping (exponential) search, which quickly narrows the search interval when the target key lies near the start of the list.
After these changes, intersection query latency became comparable to the uncompressed baseline, while memory usage dropped by roughly 35 %.
Note that these optimizations benefit intersection operations; union operations still require full scans and remain slower than uncompressed lists.
Conclusion
We surveyed several modern inverted‑list compression schemes, highlighted their trade‑offs between compression ratio and decoding speed, and presented MILC in detail along with engineering optimizations that allow 58.com’s main‑site search to retain query performance while saving significant memory.
Future work may explore other index structures such as bitmap indexes and compression of term dictionaries and forward indexes.
References: [1] M. Zukowski et al., “Super‑scalar RAM‑CPU cache compression,” ICDE 2006. [2] J. Zhang et al., “Performance of compressed inverted list caching in search engines,” WWW 2008. [3] H. Yan et al., “Inverted index compression and query processing with optimized document ordering,” WWW 2009. [4] G. Ottaviano and R. Venturini, “Partitioned Elias‑Fano indexes,” SIGIR 2014. [5] J. Wang, C. Lin et al., “MILC: Inverted List Compression in Memory.” [6] https://en.wikipedia.org/wiki/Exponential_search [7] http://www.di.unipi.it/~ottavian/files/partitioned_elias_fano_sigir14.pptx
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
