Databases 24 min read

Fusion: Didi’s Self‑Developed NewSQL System and Its Evolution from a NoSQL Platform

Fusion, Didi’s home‑grown C++ NoSQL database built on RocksDB, powers over 400 services and 1.5 PB of data, then evolved into the NewSQL layer dise with schema, secondary indexes, transactions and MySQL‑compatible binlog, and now guides a next‑gen distributed database project targeting true elastic scaling, distributed transactions and full SQL compatibility.

Didi Tech

May 23, 2019

Fusion: Didi’s Self‑Developed NewSQL System and Its Evolution from a NoSQL Platform

Jie Mei’s introduction: This article is compiled from the talk of Didi’s database storage expert Yu Wenlong at the 10th DTCC China Database Conference. It mainly introduces Didi’s self‑developed NewSQL system called Fusion.

NewSQL refers to a class of databases that combine the massive‑scale, high‑throughput capabilities of NoSQL with the transactional and SQL features of traditional relational databases.

Can a mature NoSQL system be turned into a NewSQL system? The answer is yes. Didi has a mature self‑developed NoSQL system (Fusion) and, based on it, successfully incubated a NewSQL system. The presentation is divided into three parts: first, the successes of the NoSQL layer; second, the evolution to NewSQL; third, the shortcomings of this approach and future directions.

NewSQL systems must retain the massive data handling and high‑throughput of NoSQL while providing full SQL support and ACID transactions.

Fusion is a C++‑implemented distributed NoSQL database that supports the Redis protocol and persists data using RocksDB. It is already serving more than 400 business services, spanning over 300 clusters, with a total data volume of 1.5 PB and peak QPS exceeding 14 million, all operated with fully automated operations without dedicated ops personnel.

The overall storage product architecture is built on top of the RocksDB engine and adds a network layer, a cluster‑management layer, and an access layer to form the Fusion NoSQL system. On top of Fusion, Didi integrates its scheduling system and Hadoop compute platform to create the DTS service FastLoad. Further extensions on Fusion add schema management, secondary indexes, transactions, and binlog capabilities to build the NewSQL storage component (named dise). A comprehensive intelligent control system, based on Salt‑Stack, handles user access and automated operations.

Chapter 1: The Mature NoSQL Storage System – Fusion

Fusion’s background: it is a self‑developed distributed NoSQL database written in C++, supporting the Redis protocol, with data persisted in RocksDB. It now serves over 400 business services, covering the entire company, with more than 300 clusters, fully automated operations, 1.5 PB of data, and peak QPS over 14 million.

Fusion was created to solve two massive data‑storage problems at Didi: historical orders (hundreds of billions of records) and driver trajectory data (even larger). Traditional Redis and MySQL could not handle this scale, prompting the development of Fusion.

Because the Redis protocol is simple yet rich, Fusion implements Redis‑compatible storage structures on disk and builds a cluster architecture consisting of an access layer, a cluster‑management layer, a persistence layer, and a high‑availability layer.

Five key highlights of Fusion are presented, with four detailed below.

Highlight 1: Data Flow

Fusion must integrate with Didi’s broader development ecosystem, connecting with other storage systems, middleware, offline and real‑time compute platforms. Examples include Hive‑to‑Fusion integration and inter‑Fusion data flow.

To solve Hive‑to‑Fusion data movement, Didi built the one‑stop DTS platform FastLoad. Initially created for the label and feature platforms, FastLoad extracts data from Hive (offline) and loads it into Fusion (online) without involving the Redis proxy, using RocksDB’s ingest feature to achieve high‑speed bulk loading.

FastLoad addresses three core pain points: wasted development resources (each Hive‑to‑Fusion integration required duplicated code), stability concerns (high‑throughput offline loads could affect online services), and low production efficiency (Redis‑protocol bulk loads could not leverage batch or compression features).

FastLoad offers two submission methods: a web console and an open API. Users submit tasks, the system schedules data extraction, sharding, SST file generation, and then streams the SST files directly to Fusion nodes via TCP, bypassing the proxy. After ingestion, users can read the data through the Redis protocol.

Highlight 2: Degradation & Disaster Recovery

Fusion implements master‑downgrade capability: during traffic spikes, read‑write separation is performed at the proxy layer, routing a portion of reads to replicas to protect the master.

For cluster‑level disaster recovery, Fusion synchronizes data between two clusters. If the primary cluster becomes unavailable, users can either switch the VIP to the secondary cluster manually or use a one‑click flow platform to redirect the VIP to the secondary real servers.

The data synchronization between clusters is point‑to‑point, independent of middleware. Each node is aware of the remote cluster’s routing information and adapts to state changes (e.g., leader switch, scaling). A sliding‑window mechanism using unique sequence numbers provides ordered send/ack, ensuring reliable transmission and automatic compensation for data generated while the primary cluster was down.

Highlight 3: Extreme Efficiency

RocksDB‑level optimizations include a key‑granular cache with hotspot prediction (tripling hit rate), a 24‑hour compaction schedule (saving ~25% disk space and reducing latency spikes), and aggressive memory‑usage garbage collection for unused page‑cache and block‑cache.

Highlight 4: Security Guarantees

During upgrades, binary and configuration files are backed up, and a hard‑link backup scheme for RocksDB SST files provides fast, space‑efficient data backups without modifying the service code.

User‑level snapshots are offered via RocksDB’s checkpoint mechanism, enabling near‑instant snapshots.

FastLoad retains multiple data versions to support use‑cases such as A/B testing and data rollback.

Chapter 2: NewSQL Exploration on Top of Fusion

The motivation for NewSQL is to overcome MySQL’s limitations on massive data volumes (flexibility, scalability, cost). Desired NewSQL capabilities include easy schema evolution, unlimited storage, and higher cost‑effectiveness.

Key challenges addressed:

Detecting user schema on a KV store.

Emitting MySQL‑compatible binlog from a KV store.

Implementing secondary indexes.

Providing transaction support.

Architecture: DDL operations are decoupled via a configuration center. The proxy parses SQL, converts it to KV, and writes to Fusion. Fusion generates MySQL‑format binlog, which is pushed to MQ. An index server consumes the binlog and builds secondary indexes based on user‑defined keys.

Schema management separates DDL from data flow; the proxy handles INSERT/REPLACE/DELETE/SELECT/UPDATE, targeting single‑table large‑table scenarios.

Example: a “student” table with three rows and four columns is stored as a Redis hash where the big key is "student: ", fields are column names, and values are cell values.

Binlog compatibility: For updates, the original value is fetched and written together with the new value; inserts only write the new value.

Secondary indexes (unique and non‑unique) are built asynchronously. Because Fusion shards data by hash, a Redis hash‑tag is added to index keys to ensure that all entries of the same index reside on the same node, enabling efficient scans.

Transaction support: Distributed transactions are avoided by requiring users to place rows that need transactional guarantees on the same hash‑tag, turning the problem into a single‑node transaction. RocksDB’s transaction engine is leveraged, and Lua scripts are used for transaction interaction, allowing complex logic to be executed server‑side without maintaining transaction state in the proxy.

Lua scripts can also be embedded in MySQL comments to drive transaction logic when using the MySQL protocol.

Summary of NewSQL deployment: after one year online, the system stores over 400 TB, handles more than 2 million QPS, and serves 58 business lines (average >8 TB per business), a scale that MySQL cannot comfortably handle.

Chapter 3: Distributed Database Design

While the NewSQL solution works well for many use‑cases, it has notable limitations: only single‑node transactions, single‑node indexes, asynchronous indexing, no JOIN support, and no elastic scaling.

Therefore Didi is launching a new distributed database project that aims to provide:

Distributed transactions.

Truly unlimited data and index scaling.

Real‑time indexing.

Elastic scaling.

Strong consistency with multi‑replica high availability.

SQL compatibility.

The proposed architecture follows the classic distributed KV design with range partitioning, Raft‑based strong consistency, automatic split, global scan, and future support for distributed transactions and full SQL parsing.

Current status: a robust KV system with Raft consensus, automatic splitting, and global scan has been built. The next step is to add distributed transaction support, followed by SQL‑parser integration to replace the current NewSQL solution, and eventually achieve full SQL compatibility.

Three‑step summary:

Developed the NoSQL system Fusion, gaining expertise in distribution, persistence, high availability, and data flow.

Explored NewSQL to quickly solve business problems, achieving success for over 50 services while recognizing its limitations.

Planning a ground‑up distributed database to push massive‑scale OLTP further, with phased development: powerful KV → SQL‑parser → high‑level SQL compatibility.

The evolution follows two principles: avoid over‑design and avoid large leaps. By iteratively enhancing a stable architecture, Didi delivers continuous value while expanding technical knowledge.

Original source: ITPUB (itpuber)

Author: Yu Wenlong – Technical Expert at Didi, former engineer at VMware, Taobao, and Alibaba Cloud, responsible for Didi’s self‑developed NoSQL, NewSQL, and DTS projects.

Recruitment notice: The team is hiring NewSQL development experts. Please send resumes to [email protected].

Links to related technical talks:

AI in Mobility and Cloud

Li Hang: Ceph Distributed Storage Overview

Rao Quancheng: Deep Dive into Go Reflection

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Database Architecture NewSQL NoSQL RocksDB

Written by

Didi Tech

Official Didi technology account

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.