How Didi Built Fusion-NewSQL: A High‑Throughput, Low‑Latency NewSQL on Distributed KV

Fusion-NewSQL is Didi's in-house NewSQL system, built on top of the Fusion distributed KV store. It offers MySQL compatibility, high throughput, low latency, flexible schema changes, and secondary indexes, and it integrates with ElasticSearch and Hive. This article covers its architecture, data flow, and future roadmap.

dbaplus Community

Jan 15, 2020

Background

Fusion‑NewSQL is a NewSQL storage system developed by Didi on top of its own distributed KV store, Fusion. It is MySQL‑compatible, supports secondary indexes, and is designed for massive data volume and high request rates.

Problem Statement

Didi’s rapid business growth caused data size and request volume to surge, making traditional sharding and schema‑change approaches cumbersome. Frequent schema modifications, index additions, and the need for secondary‑index support required a more flexible solution.

Open‑Source Evaluation

The team evaluated TiDB but found it unsuitable for three reasons: it could not meet the low-latency requirement (99th percentile under 100 ms), keeping three replicas made storage expensive, and Didi wanted to avoid distributed transactions, which TiDB is designed around.

Design Goals

High throughput, low latency, large capacity.

MySQL protocol compatibility and ecosystem support.

Primary‑key and secondary‑index queries.

Flexible schema changes without affecting online stability.

Architecture Overview

Fusion‑NewSQL consists of the following components:

DiseServer – parses MySQL protocol.

Fusion Data cluster – stores data using RocksDB.

Fusion Index cluster – stores index information.

ConfigServer – manages schema configuration.

Consumer – consumes MySQL binlog from MQ and builds indexes.

External dependencies – MQ and Zookeeper.

Data Storage Structure

MySQL rows are converted to Redis hashmaps. The hashmap key is table_name+primary_key, guaranteeing global uniqueness. Primary‑key queries become simple HGETALL calls.

Secondary indexes are stored as key-value pairs. For a unique index, the key is table_indexname_indexColumnsValue and the value is the rowkey. For a non-unique index, the rowkey is appended to the key (table_indexname_indexColumnsValue_Rowkey) and the value is null, so several rows can share the same index value without their keys colliding.
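The key layout described above can be sketched in Python as follows. This is a minimal illustration, not Fusion-NewSQL's actual code: the "_" separator, the function names, and the string encoding of values are all assumptions.

```python
# Illustrative sketch only: the "_" separator and all function names are
# assumptions, not Fusion-NewSQL's real encoding.
SEP = "_"


def row_key(table, primary_key):
    """Hash key for one row: table name plus primary key."""
    return f"{table}{SEP}{primary_key}"


def row_to_hash(row):
    """Map a row's columns to hash fields (values stored as strings)."""
    return {column: str(value) for column, value in row.items()}


def unique_index_entry(table, index_name, index_values, rowkey):
    """Unique index: the key holds the indexed values, the value is the rowkey."""
    key = SEP.join([table, index_name, *map(str, index_values)])
    return key, rowkey


def non_unique_index_entry(table, index_name, index_values, rowkey):
    """Non-unique index: the rowkey is appended to the key, the value is empty."""
    key = SEP.join([table, index_name, *map(str, index_values), rowkey])
    return key, None


rk = row_key("orders", 1001)
print(rk)
print(row_to_hash({"id": 1001, "city": "beijing"}))
print(unique_index_entry("orders", "uk_order_no", ["A17"], rk))
print(non_unique_index_entry("orders", "idx_city", ["beijing"], rk))
```

Because the rowkey is part of a non-unique index key, two rows with the same indexed value yield two distinct keys that share a common prefix, which is what lets a prefix scan return all of them. A production encoding would also have to handle values that contain the separator; that is omitted here.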

Data Write Flow

User sends MySQL statements via the MySQL‑SDK to DiseServer.

DiseServer validates the SQL against the schema.

Validated SQL is transformed into a Redis hashmap and sent to the Data cluster.

Data nodes write the data to WAL files and persist it in RocksDB.

Background threads consume WAL, convert it to MySQL binlog format, and push it to MQ.

Consumer reads the binlog, builds index entries according to the operation type, and writes them to the Index cluster.

This asynchronous indexing introduces a small time lag between data and index availability.
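The Consumer step, building index entries according to the operation type, might look roughly like the sketch below. The event shape (type, table, rowkey, before, after), the index description, and the ("set", ...) / ("del", ...) operation tuples are hypothetical stand-ins for illustration; the real binlog format and Consumer logic are not described in the source.

```python
# Illustrative sketch only: the event shape, index description, and operation
# tuples are assumptions, not the real binlog or Consumer format.

def index_key(table, index, row, rowkey):
    """Build the index key for one row under one index definition."""
    values = [str(row[col]) for col in index["columns"]]
    parts = [table, index["name"], *values]
    if not index["unique"]:
        parts.append(rowkey)  # non-unique keys embed the rowkey
    return "_".join(parts)


def index_ops(event, indexes):
    """Translate one change event into operations for the Index cluster."""
    table, rowkey, kind = event["table"], event["rowkey"], event["type"]
    ops = []
    for index in indexes:
        old = new = None
        if kind in ("update", "delete"):
            old = index_key(table, index, event["before"], rowkey)
        if kind in ("insert", "update"):
            new = index_key(table, index, event["after"], rowkey)
        if old == new:
            continue  # indexed columns did not change
        if old is not None:
            ops.append(("del", old))
        if new is not None:
            ops.append(("set", new, rowkey if index["unique"] else None))
    return ops


indexes = [{"name": "idx_city", "columns": ["city"], "unique": False}]
event = {
    "type": "update",
    "table": "orders",
    "rowkey": "orders_1",
    "before": {"id": 1, "city": "beijing"},
    "after": {"id": 1, "city": "shanghai"},
}
print(index_ops(event, indexes))
```

In this sketch an update that leaves the indexed columns untouched produces no index operations, while one that changes them deletes the stale entry before writing the new one.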

Query Flow

When a query arrives, DiseServer selects an appropriate index. If no index matches, the query fails (Fusion‑NewSQL does not support non‑indexed field queries). For a secondary‑index query, the system scans the Index cluster to obtain matching primary keys, then retrieves the full rows from the Data cluster via HGETALL and returns MySQL‑compatible results.
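A secondary-index lookup of this kind can be sketched with in-memory stand-ins: a sorted list of keys plays the Index cluster and a dict of dicts plays the Data cluster. The function names, the "_" separator, and the choice to skip index entries whose row is missing are assumptions made for illustration only.

```python
import bisect

# Illustrative sketch only: a sorted list stands in for the Index cluster and a
# dict of dicts for the Data cluster; the key layout and names are assumptions.


def prefix_scan(sorted_keys, prefix):
    """Return all keys that start with prefix, relying on the sort order."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


def query_by_index(index_keys, data, table, index_name, value):
    """Non-unique index lookup: scan the index, then fetch each full row."""
    prefix = f"{table}_{index_name}_{value}_"
    rowkeys = [key[len(prefix):] for key in prefix_scan(index_keys, prefix)]
    # The dict lookup plays the role of HGETALL; entries whose row is no
    # longer present are skipped (an assumption, not documented behavior).
    return [data[rk] for rk in rowkeys if rk in data]


index_keys = sorted([
    "orders_idx_city_beijing_orders_1",
    "orders_idx_city_beijing_orders_2",
    "orders_idx_city_shanghai_orders_3",
])
data = {
    "orders_1": {"id": "1", "city": "beijing"},
    "orders_2": {"id": "2", "city": "beijing"},
    "orders_3": {"id": "3", "city": "shanghai"},
}
print(query_by_index(index_keys, data, "orders", "idx_city", "beijing"))
```

The two-hop shape is the point here: the index scan yields only rowkeys, and every matching row then costs one additional read against the Data cluster.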

Schema Change Handling

Schema change requests are submitted as tickets, approved by a control system, and then pushed to the ConfigServer. After safety checks, the new schema is stored and propagated to all nodes. Existing data is not rewritten; queries use the new schema to provide default values for missing fields or hide extra fields.
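Applying the schema at read time, rather than rewriting stored data, can be illustrated with a small sketch. Representing the schema as a mapping from column name to default value is an assumption; the source does not describe the real schema format.

```python
# Illustrative sketch only: the schema representation is an assumption.

def apply_schema(stored_row, schema):
    """Shape a stored hash to the current schema at read time.

    schema maps column name -> default value. Columns missing from the
    stored row get their default; stored fields that are no longer in the
    schema are left out of the result.
    """
    return {
        column: stored_row.get(column, default)
        for column, default in schema.items()
    }


old_row = {"id": "1", "city": "beijing", "legacy_flag": "x"}
new_schema = {"id": None, "city": "", "status": "0"}
print(apply_schema(old_row, new_schema))
```

Here the row written under the old schema gains a default status and loses legacy_flag in the query result, while the stored hash itself is never modified.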

Index additions are handled in two steps: immediate indexing for new data, and a background scan of historical data (preferably from slaves) to build missing index entries.
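The background scan has to coexist with the indexing of new writes, so it helps if it is idempotent. The sketch below shows one way that could work for a non-unique index, using dicts as stand-ins for the two clusters; the function name, the key layout, and the decision to skip rows that lack an indexed column are assumptions.

```python
# Illustrative sketch only: dicts stand in for the Data and Index clusters,
# and the handling of rows without the indexed column is an assumption.

def backfill_index(rows, index_store, table, index_name, columns):
    """Add missing non-unique index entries for historical rows.

    Returns the number of entries written. Entries that already exist
    (for example, created by the indexing of new writes) are left alone.
    """
    written = 0
    for rowkey, row in rows.items():
        if any(col not in row for col in columns):
            continue  # row predates the column; nothing to index
        values = [str(row[col]) for col in columns]
        key = "_".join([table, index_name, *values, rowkey])
        if key not in index_store:
            index_store[key] = None
            written += 1
    return written


rows = {
    "orders_1": {"id": "1", "city": "beijing"},
    "orders_2": {"id": "2", "city": "shanghai"},
}
index_store = {"orders_idx_city_shanghai_orders_2": None}  # already indexed
print(backfill_index(rows, index_store, "orders", "idx_city", ["city"]))
```

Running the scan a second time writes nothing, so it can be restarted safely after an interruption.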

Ecosystem Integration

Fusion‑NewSQL can export MySQL binlog format to MQ, allowing downstream systems that understand MySQL data to consume it. Hive tables can be loaded via a FastLoad (DTS) tool that converts Hive data into RocksDB SST files and ingests them directly into the Data and Index clusters, bypassing the normal write path.

Complex Query Support via ElasticSearch

For queries that involve multiple range conditions or complex filters, Fusion‑NewSQL provides a special ES index type. Selected fields are indexed into ElasticSearch; at query time, the system generates an ES DSL query, retrieves matching primary keys, and then fetches the full rows from the Data cluster. This hybrid approach balances low‑latency KV indexes with the expressive power of ElasticSearch.
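As a rough illustration of the DSL-generation step, the sketch below turns range and equality conditions into an ElasticSearch bool/filter query body and extracts document ids from a response. It builds plain dictionaries and makes no client calls. The function names, and the assumption that each ES document's _id is the row's rowkey, are hypothetical; the source does not say how Fusion-NewSQL maps rows to ES documents.

```python
import json

# Illustrative sketch only: function names are hypothetical, and using the ES
# document _id as the rowkey is an assumption.


def build_es_query(ranges, terms=None, size=100):
    """Build an ES query body from range and exact-match conditions.

    ranges maps field -> (low, high); either bound may be None.
    terms maps field -> exact value.
    """
    filters = []
    for field, (low, high) in ranges.items():
        bounds = {}
        if low is not None:
            bounds["gte"] = low
        if high is not None:
            bounds["lte"] = high
        filters.append({"range": {field: bounds}})
    for field, value in (terms or {}).items():
        filters.append({"term": {field: value}})
    # Only ids are needed, so the document source is not requested.
    return {"query": {"bool": {"filter": filters}}, "_source": False, "size": size}


def rowkeys_from_response(response):
    """Pull document ids out of a standard ES search response."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


body = build_es_query(
    {"amount": (10, 50), "created_at": (None, 1579046400)},
    {"city": "beijing"},
)
print(json.dumps(body, indent=2))
```

The returned ids would then be used for the same Data cluster lookups as in a KV index query, so ElasticSearch only has to store the indexed fields, not the full rows.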

Current Status and Summary

Fusion‑NewSQL is already serving over 70 core Didi services (orders, billing, user center, trading engine, etc.), handling more than 2 million QPS and storing over 600 TB of data.

While not a universal NewSQL solution, Fusion‑NewSQL demonstrates that extending an existing NoSQL foundation with MySQL protocol support and carefully composed components can achieve high ROI for most business scenarios.

Future Work

Support limited transactions, such as cross-row transactions within a single node.

Replace asynchronous indexing with real-time indexing, so that newly written data is visible to index queries immediately.

Expand supported SQL protocols and functionalities.
