Databases 20 min read

SlimTrie: A Space‑Efficient Trie‑Based Index for Large‑Scale Storage Systems

This article presents SlimTrie, a trie‑based indexing structure that dramatically reduces memory consumption while maintaining fast query speeds, detailing its design, compression techniques, implementation steps, memory analysis, and performance comparisons with map and B‑Tree structures for large‑scale storage systems.

High Availability Architecture
High Availability Architecture
High Availability Architecture
SlimTrie: A Space‑Efficient Trie‑Based Index for Large‑Scale Storage Systems

Recent discussions on the SlimTrie data structure have led to a detailed technical exposition that explains its motivation, design goals, and practical benefits for cloud storage indexing.

Background and requirements : Traditional index structures such as hash maps and tree‑based indexes (B‑Tree, RBTree, SkipList, LSM Tree) suffer from memory consumption that grows with both the number of keys (n) and key length (k), making them unsuitable for indexing billions of small files in memory‑constrained environments.

Design goals of SlimTrie : Correctly locate existing keys (no guarantee for non‑existent keys). Support range queries. Memory overhead depends only on the number of keys, not on key length. Provide fast query performance. Index only static data and limit key size to 16 KB.

Generation steps : Build a standard Trie for all keys. Compress the Trie by pruning single‑branch nodes (LeafParent and Inner nodes), reducing the space complexity from O(n·k) to O(n). Store the compressed structure using three compacted arrays (inners, leaf‑parents, steps) that replace pointer‑heavy representations with 16‑bit IDs and 4‑bit words.

Memory model : By replacing pointers with 16‑bit IDs and using bitmap‑based branch representation, the total memory consumption is roughly 6 bytes per key (plus a small constant overhead), independent of key length. The compacted arrays add about 0.15 × n + 8 bytes of overhead.

Performance evaluation : Experiments in Go compare SlimTrie with Go's built‑in map and Google’s B‑Tree implementation. Results show that SlimTrie uses significantly less memory than both alternatives and achieves query speeds about 2.6× faster than B‑Tree while being comparable to sorted array lookups. Memory usage remains stable (≈6–7 bytes per key) as the number of keys or key length varies.

Use case scenario : With 32 GB of RAM, SlimTrie can index roughly 3.2 billion files, demonstrating its suitability for petabyte‑scale storage systems.

Conclusion : SlimTrie offers a highly space‑efficient, fast‑query index that overcomes the limitations of classic data structures for massive key‑value workloads, making it a compelling choice for modern cloud storage architectures.

IndexingMemory OptimizationGoStorageTrieSlimTrie
High Availability Architecture
Written by

High Availability Architecture

Official account for High Availability Architecture.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.