RocksDB Basics: Architecture, Features, and Performance

This article gives an overview of RocksDB: its origin, design goals, and core architecture; key features such as its APIs, compression options, durability mechanisms, and backup and replication support; and its tooling, testing, and performance characteristics.

High Availability Architecture

RocksDB originated as a high‑performance key‑value store developed at Facebook, implemented as a C++ library that supports atomic reads and writes on various storage media including RAM, flash, disk, and HDFS.

Its design emphasizes making full use of fast storage, configurable compression, and tooling for production and debugging environments. RocksDB builds on the LevelDB codebase and incorporates ideas from Apache HBase.

Design Goals

Performance is a primary focus, aiming to fully exploit flash or RAM bandwidth for point lookups, range scans, and mixed random read/write workloads while allowing extensive parameter tuning for read, write, and space amplification scenarios.
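The three amplification factors are commonly expressed as simple ratios. The sketch below uses one such set of definitions; the function name, parameter names, and sample numbers are illustrative assumptions, not RocksDB statistics or measured results.

```python
def amplification(user_bytes_written, storage_bytes_written,
                  logical_bytes, storage_bytes_used,
                  bytes_returned, storage_bytes_read):
    """Return (write, space, read) amplification as plain ratios.

    Assumed definitions (one common convention, not the only one):
      write amp = bytes written to storage / bytes written by the application
      space amp = bytes occupied on storage / logical size of the data
      read amp  = bytes read from storage / bytes returned to the caller
    """
    write_amp = storage_bytes_written / user_bytes_written
    space_amp = storage_bytes_used / logical_bytes
    read_amp = storage_bytes_read / bytes_returned
    return write_amp, space_amp, read_amp


# Made-up numbers purely to show the arithmetic.
write_amp, space_amp, read_amp = amplification(
    user_bytes_written=100, storage_bytes_written=1000,
    logical_bytes=100, storage_bytes_used=150,
    bytes_returned=4, storage_bytes_read=40,
)
```

Tuning usually trades these against each other: settings that lower write amplification tend to raise read or space amplification, and vice versa.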

Production‑ready support includes extensive configurability, backward compatibility across versions, and built‑in tools for deployment and debugging.

Architecture Overview

RocksDB is an embedded KV store with operations Get, Put, Delete, and Scan. Its core structures are the memtable (in‑memory write buffer), SST files (sorted immutable files on storage), and log files (write‑ahead logs).
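To make the roles of the three structures concrete, here is a minimal toy model in Python. It is a sketch under simplifying assumptions (everything lives in memory, and the class name ToyLSM and its methods are hypothetical); it is not the RocksDB API.

```python
TOMBSTONE = object()  # marker recording that a key was deleted


class ToyLSM:
    """Toy model of the memtable / SST / log layout; not the RocksDB API."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}    # in-memory write buffer
        self.sst_files = []   # immutable sorted runs, newest first
        self.log = []         # stand-in for the write-ahead log

    def put(self, key, value):
        self.log.append((key, value))  # record in the log first, then apply
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just a write of a tombstone.
        self.put(key, TOMBSTONE)

    def flush(self):
        # Freeze the memtable into a sorted, immutable run.
        if self.memtable:
            run = tuple(sorted(self.memtable.items(), key=lambda kv: kv[0]))
            self.sst_files.insert(0, run)
            self.memtable = {}
            self.log = []  # flushed data no longer needs the log

    def get(self, key):
        # Newest data wins: memtable first, then runs from newest to oldest.
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for run in self.sst_files:
            for k, v in run:
                if k == key:
                    return None if v is TOMBSTONE else v
        return None

    def scan(self, start, end):
        # Apply sources oldest first so newer values overwrite older ones.
        merged = {}
        for run in reversed(self.sst_files):
            merged.update(run)
        merged.update(self.memtable)
        return [(k, v) for k, v in sorted(merged.items(), key=lambda kv: kv[0])
                if start <= k < end and v is not TOMBSTONE]
```

In this model a read may have to consult several runs, which is the cost that compaction later reduces.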

When a memtable fills, it is flushed to an SST file. Background compaction then merges SST files across levels using either Universal or Level-style compaction, and it can run on multiple threads to keep up with write throughput.
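The core of a compaction step is a merge of sorted runs in which the newest version of each key survives. The following is a simplified sketch of that idea only; the function name and the DELETED marker are assumptions, and it ignores levels, file boundaries, and snapshots.

```python
import heapq

DELETED = object()  # hypothetical tombstone marker


def compact(runs, drop_tombstones=True):
    """Merge sorted runs (newest first) into a single sorted run.

    Each run is a list of (key, value) pairs sorted by key, with unique keys.
    For duplicate keys, the entry from the newest run wins. Dropping
    tombstones is only safe when no older data for the key remains elsewhere,
    so it is left as a switch here.
    """
    # Tag each entry with the age of its run (0 = newest) so that, for equal
    # keys, the merge yields the newest entry first.
    tagged = ([(key, age, value) for key, value in run]
              for age, run in enumerate(runs))
    output = []
    last_key = object()  # sentinel that equals no real key
    for key, age, value in heapq.merge(*tagged):
        if key == last_key:
            continue  # an older version of a key already handled
        last_key = key
        if value is DELETED and drop_tombstones:
            continue
        output.append((key, value))
    return output
```

Because every input run is already sorted, the merge streams through the inputs once and never needs to sort the combined data.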

Key Features

APIs include Get, MultiGet, Iterator (range scans, and prefix scans that can use Bloom filters), Snapshot (a consistent point-in-time view), Put for single writes, and Write for atomic batch updates. RocksDB also supports a read-only mode for higher read performance.
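A point-in-time view can be pictured as a remembered sequence number: reads through the snapshot ignore any write with a higher number. The toy model below illustrates that idea; the class and method names are hypothetical and this is not the RocksDB API.

```python
class VersionedStore:
    """Toy multi-version store illustrating snapshot reads."""

    def __init__(self):
        self.seq = 0        # sequence number of the latest write
        self.versions = {}  # key -> list of (seq, value), oldest first

    def put(self, key, value):
        self.seq += 1
        self.versions.setdefault(key, []).append((self.seq, value))

    def snapshot(self):
        # A snapshot is just the sequence number at the time it was taken.
        return self.seq

    def get(self, key, snapshot=None):
        limit = self.seq if snapshot is None else snapshot
        # Walk versions newest first and return the first one visible.
        for seq, value in reversed(self.versions.get(key, [])):
            if seq <= limit:
                return value
        return None
```

A reader holding a snapshot keeps seeing the same data even while writers continue to update the store.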

Compression options cover Snappy, Zlib, Bzip2, LZ4, and LZ4_HC and are configurable per level. Separately, custom compaction filters let applications drop or rewrite entries during compaction (e.g., TTL-based key removal).
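The TTL-based key removal mentioned above can be pictured as a predicate that is consulted for every entry while compaction rewrites the data. This is a minimal sketch with hypothetical names; it assumes each entry carries its write time, which is an assumption of the example rather than a statement about RocksDB's storage format.

```python
def make_ttl_filter(ttl_seconds, now):
    """Build a predicate that keeps only entries younger than ttl_seconds."""
    def keep(key, value, written_at):
        return now - written_at < ttl_seconds
    return keep


def compact_with_filter(entries, keep):
    """Rewrite (key, value, written_at) entries, dropping those the filter rejects."""
    return [entry for entry in entries if keep(*entry)]


# Example: with a 100 second TTL at time 1000, only "b" survives.
entries = [("a", "1", 100), ("b", "2", 950)]
survivors = compact_with_filter(entries, make_ttl_filter(100, now=1000))
```

Expired entries are removed as a side effect of compaction work that happens anyway, so no separate cleanup pass is needed.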

Durability is provided by the write-ahead (transaction) log: each write can be synced to storage individually or left to be flushed in batches, and the log is replayed on restart. RocksDB offers full and incremental backup, supports replication through GetUpdatesSince, and lets applications insert metadata into the log via PutLogData.
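The log's two roles, crash recovery and feeding a replica, can be sketched with a small append-only file. This is a toy model under stated assumptions: the ToyWAL class, its JSON line format, and the updates_since method are hypothetical stand-ins, loosely inspired by the idea behind GetUpdatesSince rather than reproducing its interface.

```python
import json
import os


class ToyWAL:
    """Toy append-only log: one JSON record [seq, key, value] per line."""

    def __init__(self, path):
        self.path = path

    def append(self, seq, key, value, sync=False):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps([seq, key, value]) + "\n")
            if sync:
                # Synced write: force the record to stable storage now.
                f.flush()
                os.fsync(f.fileno())

    def updates_since(self, seq):
        """Yield records whose sequence number is at least `seq`."""
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record_seq, key, value = json.loads(line)
                if record_seq >= seq:
                    yield record_seq, key, value


def recover(wal):
    """Rebuild the in-memory state by replaying the whole log in order."""
    memtable = {}
    for _, key, value in wal.updates_since(0):
        memtable[key] = value
    return memtable
```

A replica that remembers the last sequence number it applied can ask for everything from the next one onward, which is the same mechanism recovery uses from sequence zero.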

Multiple embedded databases can coexist in a single process using shared Env objects and thread pools; block and table caches improve read performance, and stackable DB interfaces enable extensions such as TTL and backupable DB.

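Memtables are pluggable (skiplist, vector, prefix-hash), and memtable flushes can be pipelined so that new writes proceed while older memtables are still being written out, which increases write throughput. Merge records let a client write an update operand without first reading the existing value; operands are combined lazily during reads and compaction, avoiding read-modify-write cycles.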
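The benefit of merge records is easiest to see with a counter: each increment is written blindly as an operand, and the operands are folded into a value only when someone reads the key or when compaction runs. The sketch below is a toy model with hypothetical names, not the RocksDB merge operator interface.

```python
class ToyMergeStore:
    """Toy store where updates are recorded as operands and combined lazily."""

    def __init__(self, merge_fn):
        self.merge_fn = merge_fn  # combines (existing_value, operand)
        self.base = {}            # key -> fully merged value
        self.operands = {}        # key -> pending operands, oldest first

    def put(self, key, value):
        # A plain write replaces the value and discards pending operands.
        self.base[key] = value
        self.operands.pop(key, None)

    def merge(self, key, operand):
        # Blind write: no read of the existing value is needed.
        self.operands.setdefault(key, []).append(operand)

    def get(self, key):
        # Fold pending operands into the base value at read time.
        value = self.base.get(key)
        for operand in self.operands.get(key, []):
            value = self.merge_fn(value, operand)
        return value

    def compact(self):
        # Collapse pending operands into a single stored value per key.
        for key in list(self.operands):
            self.base[key] = self.get(key)
            del self.operands[key]


def add(existing, operand):
    """Example merge function: integer addition, treating a missing value as 0."""
    return (existing or 0) + operand
```

Usage: create `ToyMergeStore(add)`, call `merge("hits", 1)` a few times, and `get("hits")` returns the running total without any write having read the key first.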

Tools, Testing, and Performance

Utilities such as sst_dump and ldb aid inspection and manual compaction. The unit test suite, run with make check, verifies functionality, while db_bench produces benchmark results for various workloads on flash and in memory.

Images illustrating the architecture and compaction strategies are included in the original article.
