Evolution of JD Baitiao’s Data Architecture: From MySQL to Apache ShardingSphere
This article chronicles JD Baitiao’s journey from early MySQL and NoSQL solutions through DBRep to the adoption of Apache ShardingSphere, highlighting the technical motivations, decoupling strategies, performance comparisons, and the broader Database Plus vision for scalable, stable financial‑grade data architectures.
JD Baitiao, a flagship financial consumption service of JD.com, has grown to serve hundreds of millions of users, prompting continuous evolution of its backend data architecture to meet massive traffic and strict financial‑grade requirements.
1. Technology Lifecycle: From MySQL to NoSQL to DBRep
In the early stage (2014‑2015), a Solr + HBase solution was used, where Solr handled indexing and HBase stored full data, alleviating pressure on the core database but introducing operational complexity.
From 2015 to 2016, the team introduced MongoDB, partitioning data by month to satisfy large‑scale import/export needs. Query performance improved, but the system suffered from high memory consumption and limited scalability.
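As a rough illustration of month-based partitioning, the sketch below maps a record date to a partition name and lists the partitions a date-range export would have to read. The article does not describe the actual naming scheme, so the `bill` base name, the `_yyyyMM` suffix, and the `MonthlyPartitions` helper are all assumptions made for the example.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps dates to month-based partition names.
final class MonthlyPartitions {
    // Assumed suffix format, e.g. 201511 for November 2015.
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    // Name of the partition (for example a collection) holding records for the given date.
    static String partitionFor(String baseName, LocalDate date) {
        return baseName + "_" + YearMonth.from(date).format(SUFFIX);
    }

    // All partitions an export over [from, to] must read, in chronological order.
    static List<String> partitionsBetween(String baseName, LocalDate from, LocalDate to) {
        if (to.isBefore(from)) {
            throw new IllegalArgumentException("'to' must not be before 'from'");
        }
        List<String> names = new ArrayList<>();
        YearMonth last = YearMonth.from(to);
        for (YearMonth m = YearMonth.from(from); !m.isAfter(last); m = m.plusMonths(1)) {
            names.add(baseName + "_" + m.format(SUFFIX));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(partitionFor("bill", LocalDate.of(2015, 11, 11)));
        System.out.println(partitionsBetween("bill",
                LocalDate.of(2015, 11, 20), LocalDate.of(2016, 1, 5)));
    }
}
```

A scheme like this keeps each export confined to a few partitions, but the number of partitions grows with time rather than with load, which is one plausible reason such a design scales poorly.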
By 2016‑2017, data volume had climbed into the hundreds of billions, straining MongoDB. JD Baitiao then built a big‑data platform that used DBRep to capture changes from MySQL slaves and forward them to ES and HBase. This achieved real‑time data flow, but at the cost of tighter code coupling.
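The capture-and-forward pattern described here can be sketched as a small fan-out: each change read from the replica is delivered to every registered downstream store. This is not DBRep's API, which the article does not show. `ChangeFanOut`, `RowChange`, `Sink`, and `RecordingSink` are hypothetical names, and in-memory sinks stand in for real ES and HBase clients.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fan-out of captured row changes to several downstream stores.
final class ChangeFanOut {

    // Assumed shape of one row change read from a MySQL replica's change stream.
    static final class RowChange {
        final String table;
        final String operation; // "INSERT", "UPDATE" or "DELETE"
        final String primaryKey;

        RowChange(String table, String operation, String primaryKey) {
            this.table = table;
            this.operation = operation;
            this.primaryKey = primaryKey;
        }
    }

    // A downstream store that consumes changes (a search index, a full-data store, ...).
    interface Sink {
        void apply(RowChange change);
    }

    // In-memory stand-in used here in place of a real ES or HBase client.
    static final class RecordingSink implements Sink {
        final String name;
        final List<RowChange> received = new ArrayList<>();

        RecordingSink(String name) {
            this.name = name;
        }

        @Override
        public void apply(RowChange change) {
            received.add(change);
        }
    }

    private final List<Sink> sinks = new ArrayList<>();

    void register(Sink sink) {
        sinks.add(sink);
    }

    // Deliver one captured change to every registered sink, in registration order.
    void publish(RowChange change) {
        for (Sink sink : sinks) {
            sink.apply(change);
        }
    }

    public static void main(String[] args) {
        ChangeFanOut fanOut = new ChangeFanOut();
        RecordingSink searchIndex = new RecordingSink("search-index");
        RecordingSink fullStore = new RecordingSink("full-data-store");
        fanOut.register(searchIndex);
        fanOut.register(fullStore);

        fanOut.publish(new RowChange("bill", "INSERT", "1001"));
        System.out.println(searchIndex.name + " received " + searchIndex.received.size());
        System.out.println(fullStore.name + " received " + fullStore.received.size());
    }
}
```

The sketch leaves out ordering guarantees, retries, and idempotent writes, all of which a production pipeline would need. It also hints at where coupling can creep in: every new downstream store means more forwarding code that has to track the source schema.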
2. Decoupling the Backend Architecture
Rapid product upgrades turned earlier solutions into bottlenecks, leading to high code complexity, maintenance cost, and tight coupling between business logic and data storage.
The team identified four decoupling goals: data‑architecture decoupling, technical‑architecture decoupling, business‑relationship decoupling, and development‑process decoupling.
When evaluating sharding components, four essential characteristics were required: product maturity, extreme performance, ability to handle massive data, and flexible extensibility.
A comparison between a self‑developed sharding framework and Apache ShardingSphere showed that both deliver high performance, but ShardingSphere has lower code coupling, less intrusion into business code, easier upgrades, and better scalability.

| Criterion | Self‑Developed Sharding | ShardingSphere |
| --- | --- | --- |
| Performance | High | High |
| Code Coupling | High | Low |
| Business Intrusion | High | Low |
| Upgrade Difficulty | High | Low |
| Scalability | Average | Good |

Consequently, JD Baitiao selected Apache ShardingSphere as the financial‑grade sharding solution.
3. Apache ShardingSphere‑JDBC Solution
ShardingSphere‑JDBC is a lightweight Java framework that acts as an enhanced JDBC driver. It ships as a JAR, needs no separate deployment, and is fully compatible with JDBC and ORM frameworks. Its main strengths:
Product maturity: years of polishing and an active community.
Excellent performance: micro‑kernel design with minimal overhead.
Low migration effort: native JDBC compatibility reduces development work.
Flexible extensibility: works with migration‑sync components for easy data expansion.
After extensive internal validation, ShardingSphere became the preferred middleware in late 2018.
Product Adaptation
To support JD Baitiao’s complex business logic, ShardingSphere upgraded its SQL engine, enhancing compatibility with diverse SQL statements while maintaining near‑native JDBC performance.
Business Cut‑over
Using a custom HASH sharding strategy, the system split data across nearly ten thousand nodes. The migration took about four weeks, during which the old and new clusters ran in parallel and data was verified with self‑developed tools.
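To make the idea of hash-based routing concrete, here is a minimal sketch that maps a sharding key to a physical database and table. The article does not disclose the actual hash function, sharding key, or topology, and the sketch does not use ShardingSphere's own API. `HashShardRouter`, the `ds_N` database names, and the `table_N` naming are assumptions chosen for illustration.

```java
// Hypothetical router: maps a sharding key to "database.table" by hashing.
final class HashShardRouter {
    private final int tablesPerDatabase;
    private final int totalSlots;

    HashShardRouter(int databaseCount, int tablesPerDatabase) {
        if (databaseCount <= 0 || tablesPerDatabase <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        this.tablesPerDatabase = tablesPerDatabase;
        // multiplyExact fails loudly instead of silently overflowing.
        this.totalSlots = Math.multiplyExact(databaseCount, tablesPerDatabase);
    }

    // Stable slot in [0, totalSlots). floorMod keeps the result non-negative
    // even when hashCode() is negative.
    int slotFor(String shardingKey) {
        return Math.floorMod(shardingKey.hashCode(), totalSlots);
    }

    // Physical target for a logical table, e.g. "ds_0.bill_1".
    String route(String logicalTable, String shardingKey) {
        int slot = slotFor(shardingKey);
        int database = slot / tablesPerDatabase;
        int table = slot % tablesPerDatabase;
        return "ds_" + database + "." + logicalTable + "_" + table;
    }

    public static void main(String[] args) {
        HashShardRouter router = new HashShardRouter(4, 8); // 32 slots in total
        System.out.println(router.route("bill", "a")); // prints ds_0.bill_1
        System.out.println(router.route("bill", "z")); // prints ds_3.bill_2
    }
}
```

Because the slot depends only on the key and the slot count, every request for the same key lands on the same table. The trade-off is that changing the slot count remaps most keys, which is why a cut-over of this kind typically needs a migration and verification phase like the one described above.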
Value Gains
Simplified upgrade path, allowing developers to focus on business rather than sharding design.
Significant R&D effort savings by avoiding custom sharding development.
Flexible scaling to handle peak events such as "618" and "11.11".
4. Towards a Stable Standard: Database Plus
As data importance grows, fragmented database products increase operational costs. The "Database Plus" concept proposes a unified management layer above databases, enabling horizontal scaling, encryption, and plug‑in extensions without altering underlying databases.
ShardingSphere 5.0 implements this vision, offering a plug‑in architecture that builds a new data‑governance ecosystem, addressing database fragmentation challenges.
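One way to picture a layer of this kind is as a chain of plug-ins that adjust each request before it reaches an unmodified database. The sketch below is a conceptual model only and does not reflect ShardingSphere 5.0's actual interfaces. `DatabasePlusSketch`, `Plugin`, `ShardingPlugin`, `EncodeColumnPlugin`, and `Gateway` are hypothetical, the databases are in-memory maps, and Base64 is used purely as a placeholder for real encryption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual model of a plug-in layer sitting above unmodified databases.
final class DatabasePlusSketch {

    // Hypothetical write request travelling through the layer.
    static final class WriteRequest {
        final String table;
        final Map<String, String> values;
        String targetDatabase = "ds_0"; // default target when no plug-in reroutes

        WriteRequest(String table, Map<String, String> values) {
            this.table = table;
            this.values = new HashMap<>(values);
        }
    }

    // A feature is a plug-in that adjusts the request before it reaches a database.
    interface Plugin {
        void apply(WriteRequest request);
    }

    // Horizontal scaling: choose a target database from the user id.
    static final class ShardingPlugin implements Plugin {
        private final int databaseCount;

        ShardingPlugin(int databaseCount) {
            this.databaseCount = databaseCount;
        }

        @Override
        public void apply(WriteRequest request) {
            String userId = request.values.get("user_id");
            if (userId != null) {
                request.targetDatabase = "ds_" + Math.floorMod(userId.hashCode(), databaseCount);
            }
        }
    }

    // Column protection. Base64 is a stand-in only; it is NOT encryption.
    static final class EncodeColumnPlugin implements Plugin {
        private final String column;

        EncodeColumnPlugin(String column) {
            this.column = column;
        }

        @Override
        public void apply(WriteRequest request) {
            request.values.computeIfPresent(column, (name, value) ->
                    Base64.getEncoder().encodeToString(value.getBytes(StandardCharsets.UTF_8)));
        }
    }

    // Applies plug-ins in order, then stores the row in an in-memory "database".
    static final class Gateway {
        private final List<Plugin> plugins = new ArrayList<>();
        private final Map<String, List<Map<String, String>>> databases = new HashMap<>();

        Gateway use(Plugin plugin) {
            plugins.add(plugin);
            return this;
        }

        WriteRequest write(WriteRequest request) {
            for (Plugin plugin : plugins) {
                plugin.apply(request);
            }
            databases.computeIfAbsent(request.targetDatabase, k -> new ArrayList<>())
                    .add(request.values);
            return request;
        }

        int rowsIn(String database) {
            return databases.getOrDefault(database, List.of()).size();
        }
    }

    public static void main(String[] args) {
        Gateway gateway = new Gateway()
                .use(new ShardingPlugin(2))
                .use(new EncodeColumnPlugin("id_card"));
        Map<String, String> row = new HashMap<>();
        row.put("user_id", "a");
        row.put("id_card", "1234");
        WriteRequest stored = gateway.write(new WriteRequest("bill", row));
        System.out.println(stored.targetDatabase + " " + stored.values.get("id_card"));
    }
}
```

The point of the design is that features are added or removed by changing the plug-in list, while the storage underneath stays untouched.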
5. Returning to Fundamentals
For large‑scale financial, securities, manufacturing, and retail scenarios, the focus should be on middleware that incrementally enhances existing technology stacks rather than pursuing disruptive new systems.
In summary, JD Baitiao’s data‑architecture evolution demonstrates the importance of timely technology selection, systematic decoupling, and leveraging mature open‑source solutions like Apache ShardingSphere to achieve scalable, stable, and maintainable financial‑grade data platforms.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.