
Evolution of JD Baitiao’s Data Architecture: From MySQL to Apache ShardingSphere

This article chronicles JD Baitiao’s journey from early MySQL and NoSQL solutions through DBRep to the adoption of Apache ShardingSphere, highlighting the technical motivations, decoupling strategies, performance comparisons, and the broader Database Plus vision for scalable, stable financial‑grade data architectures.

Wukong Talks Architecture

JD Baitiao, JD.com's flagship consumer credit service, has grown to serve hundreds of millions of users. That growth has forced its backend data architecture to keep evolving in order to handle massive traffic and meet strict financial‑grade requirements.

1. Technology Lifecycle: From MySQL to NoSQL to DBRep

In the early stage (2014‑2015), the team used a Solr + HBase solution: Solr handled indexing and HBase stored the full data. This relieved pressure on the core database but introduced operational complexity.

In 2015‑2016, the team introduced MongoDB and partitioned data by month to support large‑scale import and export. Query performance improved, but the system suffered from high memory consumption and limited scalability.

By 2016‑2017, data volume had grown past the hundreds‑of‑billions scale and was straining MongoDB. JD Baitiao then built a big‑data platform in which DBRep captured changes from MySQL slave (replica) instances and forwarded them to Elasticsearch (ES) and HBase. This achieved real‑time data flow but increased code coupling.
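The capture-and-forward pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: the article does not describe DBRep's interfaces, so `ChangeEvent`, `IndexSink`, `RowStore`, and `replicate` are hypothetical names, and the two in-memory sinks merely stand in for a search index and a full-row store.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeEvent:
    """One row-level change captured from a replica (hypothetical shape)."""
    table: str
    op: str      # "insert", "update", or "delete"
    key: str
    row: dict


@dataclass
class IndexSink:
    """Stand-in for a search index: keeps only the searchable fields."""
    indexed: tuple
    docs: dict = field(default_factory=dict)

    def apply(self, ev):
        if ev.op == "delete":
            self.docs.pop(ev.key, None)
        else:
            self.docs[ev.key] = {f: ev.row.get(f) for f in self.indexed}

    def search(self, **criteria):
        """Return the keys of documents matching every criterion."""
        return sorted(
            key for key, doc in self.docs.items()
            if all(doc.get(f) == v for f, v in criteria.items())
        )


@dataclass
class RowStore:
    """Stand-in for a full-data store: keeps the whole row by key."""
    rows: dict = field(default_factory=dict)

    def apply(self, ev):
        if ev.op == "delete":
            self.rows.pop(ev.key, None)
        else:
            self.rows[ev.key] = dict(ev.row)


def replicate(events, sinks):
    """Forward each captured change to every sink, preserving order."""
    for ev in events:
        for sink in sinks:
            sink.apply(ev)
```

A query then goes to the index for matching keys and to the row store for the full records. The coupling the article mentions tends to show up in code like `replicate`: every new downstream store or field change means touching the forwarding logic.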

2. Decoupling the Backend Architecture

Rapid product upgrades turned earlier solutions into bottlenecks, leading to high code complexity, maintenance cost, and tight coupling between business logic and data storage.

The team identified four decoupling goals: data‑architecture decoupling, technical‑architecture decoupling, business‑relationship decoupling, and development‑process decoupling.

When evaluating sharding components, four essential characteristics were required: product maturity, extreme performance, ability to handle massive data, and flexible extensibility.

A comparison between a self‑developed sharding framework and Apache ShardingSphere showed that both deliver high performance, but ShardingSphere has lower code coupling, intrudes less on business code, is easier to upgrade, and scales better.

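| Criterion          | Self‑Developed Sharding | ShardingSphere |
| ------------------ | ----------------------- | -------------- |
| Performance        | High                    | High           |
| Code Coupling      | High                    | Low            |
| Business Intrusion | High                    | Low            |
| Upgrade Difficulty | High                    | Low            |
| Scalability        | Average                 | Good           |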

Consequently, JD Baitiao selected Apache ShardingSphere as the financial‑grade sharding solution.

3. Apache ShardingSphere‑JDBC Solution

ShardingSphere‑JDBC is a lightweight Java framework that acts as an enhanced JDBC driver. It is delivered as a JAR, needs no additional deployment, and is fully compatible with JDBC and ORM frameworks. JD Baitiao valued four properties in particular:

Product maturity: years of polishing and an active community.

Excellent performance: micro‑kernel design with minimal overhead.

Low migration effort: native JDBC compatibility reduces development work.

Flexible extensibility: works with migration‑sync components for easy data expansion.

After extensive internal validation, ShardingSphere became the preferred middleware in late 2018.
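To make "low business intrusion" concrete, the sketch below wraps an ordinary database connection so that business code keeps writing SQL against one logical table while a thin layer picks the physical table. It is a toy illustration in Python with `sqlite3`, not ShardingSphere's implementation: `RoutingConnection`, `t_order`, the shard count, and the explicit `shard_key` argument are all assumptions, and the textual rewrite stands in for the SQL parsing a real engine performs.

```python
import sqlite3
import zlib

LOGICAL_TABLE = "t_order"   # hypothetical logical table name
SHARDS = 2                  # hypothetical number of physical tables


class RoutingConnection:
    """Wraps a DB-API connection and rewrites the logical table name
    to a physical one chosen from the sharding key."""

    def __init__(self, conn):
        self._conn = conn
        # Create the physical tables t_order_0 .. t_order_{SHARDS-1}.
        for i in range(SHARDS):
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {LOGICAL_TABLE}_{i} "
                "(user_id TEXT, amount INTEGER)"
            )

    def _physical(self, shard_key):
        # crc32 is stable across processes, unlike Python's built-in hash().
        return f"{LOGICAL_TABLE}_{zlib.crc32(shard_key.encode('utf-8')) % SHARDS}"

    def execute(self, sql, params=(), *, shard_key):
        # Naive textual rewrite; a real sharding engine parses the SQL
        # and extracts the sharding key from the statement itself.
        routed = sql.replace(LOGICAL_TABLE, self._physical(shard_key))
        return self._conn.execute(routed, params)


# Business code only ever mentions the logical table.
db = RoutingConnection(sqlite3.connect(":memory:"))
db.execute(
    "INSERT INTO t_order (user_id, amount) VALUES (?, ?)",
    ("u42", 99),
    shard_key="u42",
)
rows = db.execute(
    "SELECT amount FROM t_order WHERE user_id = ?", ("u42",), shard_key="u42"
).fetchall()
```

The design point is that the routing rule lives in one place behind a familiar interface, so changing the shard layout does not require edits across business code.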

Product Adaptation

To support JD Baitiao’s complex business logic, ShardingSphere upgraded its SQL engine, enhancing compatibility with diverse SQL statements while maintaining near‑native JDBC performance.

Business Cut‑over

Using a custom HASH sharding strategy, the system split data across nearly ten thousand nodes. The migration took about four weeks, during which the old and new clusters ran in parallel and data was verified with self‑developed tools.
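A hash sharding rule of this general kind can be sketched as follows. The article does not disclose the actual hash function, sharding key, or node layout, so `DB_COUNT`, `TABLES_PER_DB`, the `ds_` and `t_bill_` names, and the choice of CRC32 are assumptions made purely for illustration.

```python
import zlib

DB_COUNT = 4        # hypothetical number of databases
TABLES_PER_DB = 8   # hypothetical number of tables in each database


def route(user_id):
    """Map a sharding key to (database index, table index)."""
    # Use a hash that is stable across processes and restarts;
    # Python's built-in hash() is salted per process for strings.
    slot = zlib.crc32(user_id.encode("utf-8")) % (DB_COUNT * TABLES_PER_DB)
    return slot // TABLES_PER_DB, slot % TABLES_PER_DB


def physical_name(user_id):
    """Render the physical location as 'datasource.table'."""
    db, table = route(user_id)
    return f"ds_{db}.t_bill_{table}"
```

Because the same key always lands in the same slot, single-user queries touch one table. The trade-off of plain modulo hashing is that changing the slot count remaps most keys, which is one reason a cut-over like the one described needs a parallel run and verification.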

Value Gains

Simplified upgrade path, allowing developers to focus on business rather than sharding design.

Significant R&D effort savings by avoiding custom sharding development.

Flexible scaling to handle peak events such as "618" and "11.11".

4. Towards a Stable Standard: Database Plus

As data grows in importance, the spread of fragmented database products drives up operational costs. The "Database Plus" concept proposes a unified management layer above databases that enables horizontal scaling, encryption, and plug‑in extensions without altering the underlying databases.

ShardingSphere 5.0 implements this vision, offering a plug‑in architecture that builds a new data‑governance ecosystem, addressing database fragmentation challenges.
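The idea of a management layer with plug‑ins above an unmodified database can be sketched roughly as below. This is a conceptual toy, not ShardingSphere 5.0's plug‑in API: `Storage`, `Plugin`, `MaskingPlugin`, `AuditPlugin`, and `ManagementLayer` are hypothetical names, and Base64 is only a placeholder for a real cipher (it is an encoding, not encryption).

```python
import base64


class Storage:
    """Stand-in for an unmodified underlying database."""

    def __init__(self):
        self.rows = {}

    def put(self, key, value):
        self.rows[key] = value

    def get(self, key):
        return self.rows[key]


class Plugin:
    """Hypothetical plug-in interface: transform values on the way in and out."""

    def on_write(self, value):
        return value

    def on_read(self, value):
        return value


class MaskingPlugin(Plugin):
    """Placeholder for an encryption feature (Base64 is NOT encryption)."""

    def on_write(self, value):
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    def on_read(self, value):
        return base64.b64decode(value.encode("ascii")).decode("utf-8")


class AuditPlugin(Plugin):
    """Counts writes without changing the data."""

    def __init__(self):
        self.writes = 0

    def on_write(self, value):
        self.writes += 1
        return value


class ManagementLayer:
    """Applies plug-ins above the storage; the storage itself is untouched."""

    def __init__(self, storage, plugins):
        self.storage = storage
        self.plugins = list(plugins)

    def put(self, key, value):
        for plugin in self.plugins:
            value = plugin.on_write(value)
        self.storage.put(key, value)

    def get(self, key):
        value = self.storage.get(key)
        # Undo write-side transformations in reverse order.
        for plugin in reversed(self.plugins):
            value = plugin.on_read(value)
        return value
```

Features are added or removed by changing the plug‑in list passed to `ManagementLayer`, while callers and the storage class stay the same.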

5. Returning to Fundamentals

For large‑scale financial, securities, manufacturing, and retail scenarios, the focus should be on middleware that incrementally enhances existing technology stacks rather than pursuing disruptive new systems.

In summary, JD Baitiao’s data‑architecture evolution demonstrates the importance of timely technology selection, systematic decoupling, and leveraging mature open‑source solutions like Apache ShardingSphere to achieve scalable, stable, and maintainable financial‑grade data platforms.

Written by

Wukong Talks Architecture

Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.
