
How Ctrip Transformed Kvrocks into TRocks – A High‑Performance Persistent KV Store

This article explains how Ctrip’s technical team identified the limitations of Redis on private cloud, evaluated numerous NoSQL/NewSQL options, selected Kvrocks, and engineered TRocks with functional, availability, and operational enhancements that deliver higher performance, lower cost, and robust data consistency for large‑scale KV workloads.

dbaplus Community

Background

Over recent years Ctrip’s technology assurance team refined Redis governance on private cloud, solving operational issues and gaining extensive experience. To reduce infrastructure costs and meet high‑performance, massive‑scale storage demands, the team explored building a persistent KV database on private cloud that could complement Redis while offering additional capabilities.

Problems Faced

On public cloud, Ctrip reduced costs by storing Redis data on SSDs using Kvrocks, cutting expenses by 60% (see Fig. 1). However, growing business needs required a private‑cloud solution that provides:

Large‑capacity KV storage beyond Redis limits.

Data persistence across restarts.

Reduced Redis operational cost.

Native support for semantics such as select-for-update, so that inventory deductions can be handled without external distributed locks.

Higher consistency, e.g., synchronous replication.

The team aimed to find a persistent KV database compatible with Redis protocols yet offering richer features.

Research and Selection

The team surveyed major NoSQL/NewSQL databases against criteria such as industry adoption, mature middleware, comprehensive cluster‑operation tooling, performance scalability (10×), and extensibility for custom development. After extensive analysis, Kvrocks was chosen because it inherits Redis governance maturity, can reuse existing middleware, and aligns closely with Ctrip’s operational practices.

From Kvrocks to TRocks

1. Functional Enhancements

Distributed Lock – TRocks implements a built‑in key‑level lock, eliminating the need for external lock services and reducing system complexity and security risks.

Requests carry a unique clientid and an auto‑incrementing seq, which are written as a write‑batch to RocksDB and replicated to slaves, guaranteeing atomicity.
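
The idea of committing lock state and the request's (clientid, seq) pair as one unit can be illustrated with a minimal in-memory sketch. This is not TRocks code: the class LockStore, its methods, and the use of seq to detect retried requests are all assumptions made for illustration, and a Python list stands in for the RocksDB write batch and replication stream.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LockStore:
    """Toy stand-in for a store with key-level locks (hypothetical, not TRocks)."""
    owners: Dict[str, str] = field(default_factory=dict)    # key -> clientid holding the lock
    last_seq: Dict[str, int] = field(default_factory=dict)  # clientid -> highest seq applied
    log: List[tuple] = field(default_factory=list)          # batches that would be replicated

    def _commit(self, batch: List[tuple]) -> None:
        # Apply every operation in the batch, then record the batch as one log
        # entry. This mimics an atomic write batch: lock state and seq move together.
        for op in batch:
            if op[0] == "lock":
                self.owners[op[1]] = op[2]
            elif op[0] == "unlock":
                self.owners.pop(op[1], None)
            elif op[0] == "seq":
                self.last_seq[op[1]] = op[2]
        self.log.append(tuple(batch))

    def lock(self, key: str, clientid: str, seq: int) -> bool:
        if seq <= self.last_seq.get(clientid, 0):
            # Assumed behavior: a retried request changes nothing and reports current state.
            return self.owners.get(key) == clientid
        if key in self.owners and self.owners[key] != clientid:
            return False  # held by another client; nothing is committed
        self._commit([("lock", key, clientid), ("seq", clientid, seq)])
        return True

    def unlock(self, key: str, clientid: str, seq: int) -> bool:
        if seq <= self.last_seq.get(clientid, 0):
            return self.owners.get(key) != clientid
        if self.owners.get(key) != clientid:
            return False  # only the holder may release the lock
        self._commit([("unlock", key), ("seq", clientid, seq)])
        return True
```

In this sketch a client that times out and resends lock("stock:1", "A", 1) gets the same answer without producing a second log entry, which is one plausible reason to carry a per-client sequence number with each request.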

Composite Commands – To address multi‑step operations (e.g., setting a hash value and its TTL) that previously required two commands, TRocks adds composite commands that execute atomically while remaining transparent to clients.
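
As a rough illustration of why a composite command helps, the sketch below sets a hash field and the key's TTL under a single lock, so no reader can observe the field without its expiry. The class TinyHashStore and the method name hset_with_ttl are hypothetical; the source does not name the actual TRocks commands.

```python
import threading
import time
from typing import Callable, Dict, Optional


class TinyHashStore:
    """In-memory toy store; not the TRocks implementation."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._lock = threading.Lock()
        self._data: Dict[str, Dict[str, str]] = {}
        self._expire_at: Dict[str, float] = {}

    def _purge_if_expired(self, key: str) -> None:
        deadline = self._expire_at.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._expire_at.pop(key, None)

    def hset_with_ttl(self, key: str, field: str, value: str, ttl_seconds: float) -> None:
        # Hypothetical composite command: both updates happen inside one
        # critical section instead of two separate client round trips.
        with self._lock:
            self._purge_if_expired(key)
            self._data.setdefault(key, {})[field] = value
            self._expire_at[key] = self._clock() + ttl_seconds

    def hget(self, key: str, field: str) -> Optional[str]:
        with self._lock:
            self._purge_if_expired(key)
            return self._data.get(key, {}).get(field)
```

With two separate commands, a client crash between the set and the expire would leave a hash that never expires; the single-step form removes that window.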

2. Availability Enhancements

Adjustable Consistency – TRocks adds semi‑synchronous replication similar to MySQL. Users can configure the minimum number of semi‑sync slaves required before acknowledging a write, improving durability (see Fig. 3).

To avoid cross‑region latency issues, an IDC mode requires at least one slave from a different data center to respond before confirming the write (Fig. 5).
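
The two acknowledgement rules described above (a minimum number of semi-sync slaves, and optionally at least one slave in another data center) can be expressed as a small predicate. This is a sketch under assumptions: the names Ack, write_is_acknowledged, min_acks, and require_remote_idc are invented for illustration and are not TRocks configuration options.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Ack:
    replica: str  # replica identifier
    idc: str      # data center the replica runs in


def write_is_acknowledged(
    acks: Iterable[Ack],
    master_idc: str,
    min_acks: int = 1,
    require_remote_idc: bool = False,
) -> bool:
    """Return True once enough replicas have confirmed the write."""
    unique = {a.replica: a for a in acks}  # count each replica at most once
    if len(unique) < min_acks:
        return False
    if require_remote_idc and not any(a.idc != master_idc for a in unique.values()):
        return False  # IDC mode: no replica outside the master's data center yet
    return True
```

A master using such a check would hold the client reply until the predicate turns true or a timeout expires; what happens on timeout is not covered by this sketch.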

TRocks also suppresses full‑sync replication when only minor inconsistencies exist, performing targeted synchronization from the first divergent sequence (Fig. 7).
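
Targeted synchronization can be pictured as finding the first position where the replica's log differs from the master's and replaying only from there. The sketch below works on plain Python lists of (sequence, payload) entries; the function names and the log representation are assumptions, not the TRocks replication format.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (sequence number, payload)


def first_divergent_index(master_log: List[Entry], replica_log: List[Entry]) -> int:
    """Index of the first entry where the two logs disagree."""
    limit = min(len(master_log), len(replica_log))
    for i in range(limit):
        if master_log[i] != replica_log[i]:
            return i
    return limit  # one log is a prefix of the other


def partial_resync(master_log: List[Entry], replica_log: List[Entry]) -> List[Entry]:
    """Keep the shared prefix, drop the replica's divergent tail, replay the master's."""
    i = first_divergent_index(master_log, replica_log)
    return replica_log[:i] + master_log[i:]
```

Only the entries after the shared prefix are transferred, which is what makes this cheaper than copying the full data set.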

3. Operational Governance Enhancements

Horizontal Scaling – Building on a previous Redis scaling solution, TRocks introduces a BinlogServer‑based scaling mechanism that supports seamless data migration between Redis, Kvrocks, and TRocks clusters.

Write‑rate throttling and BinlogServer rate‑limiting mitigate disk I/O pressure during large migrations.
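
The source does not say which throttling algorithm is used. A token bucket is one common way to cap a write rate, and the sketch below shows that technique only as an illustration; the class name and parameters are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Token-bucket limiter: allows bursts up to capacity, refills at rate_per_sec."""

    def __init__(
        self,
        rate_per_sec: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = float(rate_per_sec)
        self.capacity = float(capacity)
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller should delay or drop this write
```

A migration worker could call try_acquire with the batch size as cost before each write and sleep briefly whenever it returns False.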

Sentinel Multi‑Data‑Center Deployment – TRocks Sentinel is deployed across three data centers (Fig. 9) to provide cross‑region failover. Leader election issues were resolved by extending the random back‑off interval to 100‑200 ms and triggering immediate elections after master loss.
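
One way to read the two changes together is that the first election after master loss starts with no delay, while any retry waits a random 100 to 200 ms so that competing Sentinels are less likely to collide again. The sketch below encodes that reading; it is an assumption about how the pieces fit, and next_election_delay_ms is a hypothetical name.

```python
import random


def next_election_delay_ms(
    attempt: int,
    rng: random.Random,
    low_ms: int = 100,
    high_ms: int = 200,
) -> int:
    """Delay before an election attempt; attempt 0 is the first try after master loss."""
    if attempt == 0:
        return 0  # start immediately once the master is considered lost
    # Retries back off by a random amount (both bounds inclusive).
    return rng.randint(low_ms, high_ms)
```

Passing a seeded random.Random makes the delays reproducible in tests, while production code would use an unseeded generator.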

Performance and Cost Data

After internal rollout, TRocks runs nearly 2,000 instances on private cloud, storing more than 10 TB of data. Benchmarks (Fig. 10-11) show 99.9th-percentile latency comparable to Redis while offering richer commands. A 40-core host with two SATA SSDs in RAID 0 delivers 8-10 k QPS for values under 1 KB; NVMe SSDs increase this by 3-5×.

Because TRocks compresses data 3-7×, a 40-core host can run 20 instances (≈40 GB each), for up to 90 % cost reduction versus Redis.

Future Plans

Planned enhancements include:

More composite commands to reduce round‑trip latency for multi‑step data retrieval.

Support for per‑subkey expiration in hash structures.

Integration of RocksDB checkpointing (already merged from Kvrocks 2.0) to avoid heavy I/O during full sync.

Full migration to NVMe SSDs to further boost performance, especially for large keys (>100 KB).
