Databases 7 min read

Understanding LSM-Tree Architecture and Its Applications in Big Data Systems

The article explains the Log-Structured Merge-Tree (LSM) architecture, its core components, advantages and disadvantages, and demonstrates how it is employed in big‑data platforms such as HBase and Apache Druid to achieve high‑throughput writes and scalable query processing.

Top Architecture Tech Stack

Jul 16, 2024

Understanding LSM-Tree Architecture and Its Applications in Big Data Systems

LSM Architecture

Overview

LSM ( Log-Structured Merge-Tree) is a write‑optimized data structure used in key‑value stores and data systems; all write operations (insert, update, delete) are first recorded in memory and later flushed to disk in batches, reducing I/O and improving write performance.

Components

The main components of an LSM‑Tree are MemTable (in‑memory table), immutable SSTable (disk table), the merge process, compaction, Bloom filter, and the Write‑Ahead Log ( WAL) which ensures durability.

MemTable: temporarily stores newly written data in memory.

SSTable: when the MemTable is full, data are flushed as immutable disk tables.

Merge: periodically combines multiple SSTables to keep their number and size manageable.

Compaction: compresses data during merge to optimize storage space.

Bloom Filter: quickly checks whether a key may exist in an SSTable, reducing unnecessary disk reads.

WAL: logs writes before they reach the MemTable, enabling recovery after crashes.

Advantages and Disadvantages

Advantages include high write throughput, batch writes, and delayed merge/compaction, making LSM‑Tree ideal for write‑intensive workloads. Disadvantages are slower reads, space amplification, and write amplification.

LSM in the Big Data Domain

HBase Application of LSM‑Tree

In HBase, data are first written to the HLog (commit log), then to the MemStore (equivalent to a MemTable), and finally flushed to disk as a storeFile (equivalent to an SSTable). This LSM‑Tree‑based flow enables fast ingestion, though reads may need to query both the MemStore and multiple SSTables.

To mitigate read latency, HBase performs periodic compaction of small SSTables into larger ones and employs Bloom filters for faster key existence checks.

Apache Druid Application of LSM‑Tree

Druid adopts an LSM‑like architecture for its real‑time nodes. Incoming data are first loaded into an in‑memory buffer (similar to a MemTable). When conditions are met, the buffer is flushed to disk as a segment, which is immediately loaded into off‑heap memory for fast querying.

Real‑time nodes periodically merge segments from the same time window into larger segments (segment merge), analogous to LSM‑Tree compaction, and upload the merged segments to DeepStorage.

The coordinator node assigns historical nodes to download and load these segments, after which query nodes retrieve data from both real‑time and historical nodes and merge the results for the user.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

compaction LSM‑Tree databases storage

Written by

Top Architecture Tech Stack

Sharing Java and Python tech insights, with occasional practical development tool tips.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.