
SessionDB: A High‑Performance LSM‑Based Key/Value Store for Stateless Sessions

The article introduces SessionDB, a Java‑implemented, LSM‑tree‑based key/value storage engine designed to eliminate sticky sessions by providing high‑throughput, durable, and scalable session data handling, and details its architecture, optimizations, sharding strategy, and benchmark comparisons with BerkeleyDB, LevelDB, and RocksDB.

Ctrip Technology

To address the scalability bottleneck caused by sticky sessions, Ctrip's SOA team developed SessionDB, a centralized session server built on a high‑performance, persistent key/value store that follows the Log‑Structured Merge Tree (LSM) algorithm and draws design inspiration from Google LevelDB.

The engine offers high read/write performance: writes cost close to O(1) memory accesses, and reads take O(1) disk operations on average. Data is fully persisted to disk, and capacity can exceed available RAM. A three‑tier storage hierarchy (heap, memory‑mapped files, and disk) keeps heap memory usage low. Operations are thread‑safe and non‑blocking, the store is crash‑resistant and compacts automatically, and the Java implementation exposes a lightweight Map‑like API (Get/Put/Delete).

SessionDB’s LSM implementation simplifies LevelDB’s by sorting only on key hash values rather than on the keys themselves, dropping traversal in key order, which session and cache workloads do not need. The overall architecture consists of the ActiveMapTable (C0), which accepts writes, immutable map tables (Level0), and subsequent sorted map tables (Level1, Level2). Background merger threads merge tables from one level into the next, and Bloom filters and memory‑mapped files accelerate lookups.
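The layered read and write path described above can be sketched roughly as follows. This is a minimal, in‑memory illustration only, not SessionDB's actual code: the class name LsmSketch, the entry‑count threshold, and the tombstone marker are all assumptions, and background merging into Level1/Level2, Bloom filters, and on‑disk persistence are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: one writable table plus a stack of frozen tables.
final class LsmSketch {
    // Marker stored for deleted keys; compared by identity, never by content.
    private static final byte[] TOMBSTONE = new byte[0];

    private final int maxActiveEntries;                 // assumed freeze threshold
    private Map<String, byte[]> active = new HashMap<>();
    // Frozen tables, newest first, so newer values shadow older ones.
    private final Deque<Map<String, byte[]>> immutables = new ArrayDeque<>();

    LsmSketch(int maxActiveEntries) {
        if (maxActiveEntries <= 0) {
            throw new IllegalArgumentException("maxActiveEntries must be positive");
        }
        this.maxActiveEntries = maxActiveEntries;
    }

    synchronized void put(String key, byte[] value) {
        active.put(key, value);
        if (active.size() >= maxActiveEntries) {
            immutables.addFirst(active);                // freeze the full table
            active = new HashMap<>();                   // start a fresh writable table
        }
    }

    synchronized void delete(String key) {
        put(key, TOMBSTONE);                            // deletes are writes of a marker
    }

    synchronized Optional<byte[]> get(String key) {
        byte[] value = active.get(key);                 // 1. newest data first
        if (value == null) {
            for (Map<String, byte[]> table : immutables) { // 2. then newest to oldest
                value = table.get(key);
                if (value != null) {
                    break;
                }
            }
        }
        if (value == null || value == TOMBSTONE) {
            return Optional.empty();
        }
        return Optional.of(value);
    }

    synchronized int immutableTableCount() {
        return immutables.size();
    }
}
```

Because lookups walk from the newest table to the oldest, an overwrite or a delete in a newer table hides any older value for the same key until a merge would physically discard it.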

Index files store fixed‑length 40‑byte entries (hash, offset, length, etc.), enabling fast binary search on the hash and minimal data file reads. Bloom filters further reduce unnecessary disk accesses, while memory‑mapped files provide persistent, GC‑friendly storage.
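As a rough illustration of why fixed‑length entries make the hash lookup cheap, the sketch below binary‑searches a buffer of 40‑byte entries. The field layout (8‑byte hash, 8‑byte offset, 4‑byte length, remainder unused) is an assumption made for the example, since the text only lists "hash, offset, length, etc."; the class and method names are likewise hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a hash-sorted index with fixed-length entries.
final class HashIndexSketch {
    static final int ENTRY_SIZE = 40;   // entry size stated in the article
    // Assumed layout inside one entry (not taken from SessionDB's source):
    static final int HASH_POS = 0;      // bytes [0, 8):   key hash
    static final int OFFSET_POS = 8;    // bytes [8, 16):  offset into the data file
    static final int LENGTH_POS = 16;   // bytes [16, 20): record length; rest unused

    /** Writes entries into a buffer; sortedHashes must be in ascending order. */
    static ByteBuffer build(long[] sortedHashes, long[] offsets, int[] lengths) {
        ByteBuffer buf = ByteBuffer.allocate(sortedHashes.length * ENTRY_SIZE);
        for (int i = 0; i < sortedHashes.length; i++) {
            int base = i * ENTRY_SIZE;
            buf.putLong(base + HASH_POS, sortedHashes[i]);
            buf.putLong(base + OFFSET_POS, offsets[i]);
            buf.putInt(base + LENGTH_POS, lengths[i]);
        }
        return buf;
    }

    /** Returns the slot whose hash equals the argument, or -1 if none does. */
    static int find(ByteBuffer index, long hash) {
        int lo = 0;
        int hi = index.capacity() / ENTRY_SIZE - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            // Slot i always starts at byte i * ENTRY_SIZE, so no scanning is needed.
            long candidate = index.getLong(mid * ENTRY_SIZE + HASH_POS);
            if (candidate < hash) {
                lo = mid + 1;
            } else if (candidate > hash) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    static long offsetAt(ByteBuffer index, int slot) {
        return index.getLong(slot * ENTRY_SIZE + OFFSET_POS);
    }

    static int lengthAt(ByteBuffer index, int slot) {
        return index.getInt(slot * ENTRY_SIZE + LENGTH_POS);
    }
}
```

The same absolute get calls work on a MappedByteBuffer obtained from FileChannel.map, which is how a memory‑mapped index file could be searched without copying it onto the heap. A real store would also have to compare the stored key after a hash match, since distinct keys can share a hash.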

A sharding strategy partitions the database into multiple units (default four), distributing keys by hash modulo the shard count to balance concurrent read/write load.
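A minimal sketch of hash‑modulo shard selection, under stated assumptions: the article does not say which hash function SessionDB uses, so Arrays.hashCode stands in for it here, and ShardRouterSketch is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical sketch: maps a key to one of N shards by hash modulo N.
final class ShardRouterSketch {
    static final int DEFAULT_SHARD_COUNT = 4;   // default stated in the article

    private final int shardCount;

    ShardRouterSketch(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Returns a shard number in the range [0, shardCount). */
    int shardFor(byte[] key) {
        int hash = Arrays.hashCode(key);        // stand-in hash function (assumption)
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(hash, shardCount);
    }
}
```

The same key always lands on the same shard, so each shard can run its own tables and merges independently while concurrent requests spread across them.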

Benchmark tests with 1 million entries (16‑byte keys, 100‑byte values) on a 4‑core Xeon server show SessionDB outperforming BerkeleyDB, LevelDB, and RocksDB in both read and write latency. The authors attribute this result, achieved despite the Java implementation, to SessionDB's index and storage optimizations.

The authors conclude that SessionDB delivers superior random read/write performance for session and cache scenarios and plan future work on a server version, multi‑language clients, and a distributed extension inspired by Amazon Dynamo.

Tags: Java, Sharding, Performance Benchmark, Bloom Filter, LSM, Key-Value Store, SessionDB