Databases 20 min read

SlimTrie: A Space‑Efficient Trie‑Based Index for Large‑Scale Storage Systems

This article presents SlimTrie, a trie‑based indexing structure that dramatically reduces memory consumption while maintaining fast query speeds, detailing its design, compression techniques, implementation steps, memory analysis, and performance comparisons with map and B‑Tree structures for large‑scale storage systems.

High Availability Architecture

Mar 14, 2019

SlimTrie: A Space‑Efficient Trie‑Based Index for Large‑Scale Storage Systems

Recent discussions on the SlimTrie data structure have led to a detailed technical exposition that explains its motivation, design goals, and practical benefits for cloud storage indexing.

Background and requirements : Traditional index structures such as hash maps and tree‑based indexes (B‑Tree, RBTree, SkipList, LSM Tree) suffer from memory consumption that grows with both the number of keys (n) and key length (k), making them unsuitable for indexing billions of small files in memory‑constrained environments.

Design goals of SlimTrie :

Correctly locate existing keys (no guarantee for non‑existent keys).

Support range queries.

Memory overhead depends only on the number of keys, not on key length.

Provide fast query performance.

Index only static data and limit key size to 16 KB.

Generation steps :

Build a standard Trie for all keys.

Compress the Trie by pruning single‑branch nodes (LeafParent and Inner nodes), reducing the space complexity from O(n·k) to O(n).

Store the compressed structure using three compacted arrays (inners, leaf‑parents, steps) that replace pointer‑heavy representations with 16‑bit IDs and 4‑bit words.

Memory model : By replacing pointers with 16‑bit IDs and using bitmap‑based branch representation, the total memory consumption is roughly 6 bytes per key (plus a small constant overhead), independent of key length. The compacted arrays add about 0.15 × n + 8 bytes of overhead.

Performance evaluation : Experiments in Go compare SlimTrie with Go's built‑in map and Google’s B‑Tree implementation. Results show that SlimTrie uses significantly less memory than both alternatives and achieves query speeds about 2.6× faster than B‑Tree while being comparable to sorted array lookups. Memory usage remains stable (≈6–7 bytes per key) as the number of keys or key length varies.

Use case scenario : With 32 GB of RAM, SlimTrie can index roughly 3.2 billion files, demonstrating its suitability for petabyte‑scale storage systems.

Conclusion : SlimTrie offers a highly space‑efficient, fast‑query index that overcomes the limitations of classic data structures for massive key‑value workloads, making it a compelling choice for modern cloud storage architectures.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Indexing Memory optimization Go storage Trie SlimTrie

Written by

High Availability Architecture

Official account for High Availability Architecture.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.