How JD Baitiao Scaled to Billions with Apache ShardingSphere

This article chronicles JD Baitiao's data‑architecture evolution from Solr + HBase to MongoDB and finally to Apache ShardingSphere, highlighting the challenges of massive data growth, the need for decoupling, and the performance, scalability, and operational benefits achieved by adopting ShardingSphere.

Programmer DD

Oct 17, 2021

JD Baitiao used Apache ShardingSphere to solve the problem of storing and scaling trillions of records, laying the foundation for large‑scale promotional activities. Since early 2014, JD Baitiao’s data volume has exploded; each major promotion tests the technical team, and strategic shifts keep driving changes to the data architecture. -- Zhang Dongfang, JD Baitiao R&D Lead

JD Baitiao Data Architecture Evolution

Since its launch in February 2014, JD Baitiao’s data architecture has undergone several upgrades to handle explosive growth and massive data volumes.

2014‑2015: Solr + HBase

Solr served as an index for searchable fields while HBase stored the full data, relieving pressure on the core database but introducing integration complexity.
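The index-plus-store pattern described above can be sketched as a two-step lookup. This is only an illustration with in-memory dictionaries: `IndexPlusStore`, the field names, and the sample orders are hypothetical, and no Solr or HBase client API is used.

```python
from dataclasses import dataclass, field


@dataclass
class IndexPlusStore:
    """Toy stand-in: `index` plays the role of Solr, `store` the role of HBase."""
    index: dict = field(default_factory=dict)  # (field name, value) -> set of row keys
    store: dict = field(default_factory=dict)  # row key -> full record

    def put(self, row_key, record, indexed_fields):
        # Write the full record to the store, then index only the searchable fields.
        self.store[row_key] = record
        for name in indexed_fields:
            self.index.setdefault((name, record[name]), set()).add(row_key)

    def search(self, name, value):
        # Step 1: the index returns matching row keys.
        # Step 2: the full rows are fetched from the store by key.
        row_keys = sorted(self.index.get((name, value), set()))
        return [self.store[k] for k in row_keys]


db = IndexPlusStore()
db.put("order-1", {"user": "u1", "amount": 120}, ["user"])
db.put("order-2", {"user": "u2", "amount": 80}, ["user"])
db.put("order-3", {"user": "u1", "amount": 45}, ["user"])
u1_orders = db.search("user", "u1")
```

Every write touches two systems and every read takes two hops, which hints at where the integration complexity mentioned above comes from.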

2015‑2016: MongoDB Sharding

Data was partitioned by month in a MongoDB cluster, which improved hotspot query efficiency and allowed flexible schema changes, but the setup suffered from limited scalability and high memory consumption.
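As a rough illustration of month-based partitioning, the sketch below maps a record date to a monthly partition name and lists the partitions a date-range query would touch. The naming scheme (`bills_YYYYMM`) and both function names are assumptions made for the example, not details from JD Baitiao's deployment.

```python
from datetime import date


def month_partition(base_name: str, day: date) -> str:
    """Return the name of the monthly partition that holds records for `day`."""
    return f"{base_name}_{day.year:04d}{day.month:02d}"


def partitions_for_range(base_name: str, start: date, end: date) -> list:
    """List every monthly partition a query over [start, end] has to touch."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{base_name}_{year:04d}{month:02d}")
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return names
```

Queries for recent activity hit only the newest partition, which is why hotspot queries benefit, while a long date range fans out across many partitions.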

2016‑2017: DBRep → ES & HBase

With data exceeding hundreds of billions, a DBRep pipeline captured MySQL changes and replicated them to Elasticsearch and HBase, providing real‑time data flow and better scalability, though code coupling remained high.
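The article does not describe DBRep's internals, so the following is only a sketch of the general change-data-capture idea: an ordered stream of change events is applied to more than one downstream store. `ChangeEvent`, `DictSink`, and `replicate` are hypothetical names, and the sinks are in-memory stand-ins rather than real Elasticsearch or HBase clients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    op: str    # "insert", "update", or "delete"
    key: str   # primary key of the changed row
    row: dict  # row contents after the change (empty for deletes)


class DictSink:
    """In-memory stand-in for a downstream store such as Elasticsearch or HBase."""

    def __init__(self):
        self.rows = {}

    def apply(self, event: ChangeEvent):
        if event.op == "delete":
            self.rows.pop(event.key, None)
        else:
            # Inserts and updates are both upserts, so replaying an event is harmless.
            self.rows[event.key] = dict(event.row)


def replicate(events, sinks):
    """Apply each change event, in order, to every sink."""
    for event in events:
        for sink in sinks:
            sink.apply(event)


search_sink, storage_sink = DictSink(), DictSink()
replicate(
    [
        ChangeEvent("insert", "1", {"status": "open"}),
        ChangeEvent("update", "1", {"status": "paid"}),
        ChangeEvent("insert", "2", {"status": "open"}),
        ChangeEvent("delete", "2", {}),
    ],
    [search_sink, storage_sink],
)
```

Treating inserts and updates as upserts keeps the sinks consistent even if the pipeline redelivers an event after a restart.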

Need for Decoupling

Application‑level sharding increased code complexity and upgrade effort, prompting a shift to a dedicated sharding component.

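Comparison of self‑developed sharding vs. ShardingSphere (in each row, the first rating is for the self‑developed framework and the second is for ShardingSphere):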

Performance: high for both.

Code coupling: high vs. low.

Business intrusion: high vs. low.

Upgrade difficulty: high vs. low.

Scalability: average vs. good.

Apache ShardingSphere Solution

ShardingSphere‑JDBC is a lightweight Java framework that acts as an enhanced JDBC driver, requiring no extra deployment.
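To make the "enhanced driver" idea concrete, here is a minimal sketch of what such a layer does conceptually: the application writes SQL against a logical table, and the layer picks a physical database and table from the sharding key before the statement runs. This is not ShardingSphere code; the modulo strategy, the `ds_N` and `table_N` naming, and both function names are assumptions for illustration.

```python
import re


def route(logical_table: str, sharding_key: int, db_count: int, tables_per_db: int):
    """Map a sharding key to a (data source, physical table) pair using modulo."""
    slot = sharding_key % (db_count * tables_per_db)
    db_index, table_index = divmod(slot, tables_per_db)
    return f"ds_{db_index}", f"{logical_table}_{table_index}"


def rewrite(sql: str, logical_table: str, physical_table: str) -> str:
    """Replace whole-word occurrences of the logical table name in the SQL text."""
    return re.sub(rf"\b{re.escape(logical_table)}\b", physical_table, sql)


data_source, table = route("t_bill", sharding_key=11, db_count=2, tables_per_db=4)
routed_sql = rewrite("SELECT * FROM t_bill WHERE user_id = ?", "t_bill", table)
```

Because routing and rewriting happen below the data-access API, business code keeps issuing ordinary SQL against `t_bill`, which is the decoupling the previous section argues for. A real implementation parses the SQL rather than using a regular expression.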

Key features that meet JD Baitiao’s requirements:

Mature product with active community.

Excellent performance due to micro‑kernel design.

Minimal code changes thanks to native MySQL protocol support.

Flexible extension via migration and synchronization components.

After extensive internal validation, ShardingSphere became the preferred sharding middleware for JD Baitiao at the end of 2018.

Product Adaptations

SQL engine upgrades improved compatibility with complex business logic, supporting full SQL routing, distributed primary keys (UUID, Snowflake), and zero‑intrusion hint‑based sharding.
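For readers unfamiliar with Snowflake-style distributed keys, the sketch below shows the commonly described 64-bit layout: a millisecond timestamp, a worker ID, and a per-millisecond sequence. It is an independent illustration, not ShardingSphere's generator; the bit widths follow the usual Snowflake description, while the epoch value and the injectable `clock` parameter are assumptions added for the example.

```python
import threading
import time


class SnowflakeGenerator:
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id: int, epoch_ms: int = 1_388_534_400_000, clock=None):
        # epoch_ms is an arbitrary custom epoch (2014-01-01 UTC), chosen for this sketch.
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp_part = (now - self.epoch_ms) << (self.WORKER_BITS + self.SEQUENCE_BITS)
            return timestamp_part | (self.worker_id << self.SEQUENCE_BITS) | self._sequence
```

IDs from one worker increase over time, and distinct worker IDs keep generators on different nodes from colliding without any coordination between them.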

Performance optimizations include SQL parse result caching, JDBC metadata caching, bind and broadcast tables, and automated execution with stream merging.
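Two of these optimizations are easy to illustrate in isolation. The sketch below caches the result of a (toy) SQL parse so repeated statements skip parsing, and merges per-shard results that are already sorted without loading them all into memory first. Both functions are simplified stand-ins written for this example and do not reflect ShardingSphere's internal classes.

```python
import heapq
from functools import lru_cache


@lru_cache(maxsize=1024)
def parse(sql: str) -> tuple:
    """Toy parser: returns (statement type, table name after FROM, or None)."""
    tokens = sql.split()
    upper = [t.upper() for t in tokens]
    table = tokens[upper.index("FROM") + 1] if "FROM" in upper else None
    return upper[0], table


def stream_merge(shard_results, key):
    """Lazily merge per-shard results that are each already sorted by `key`."""
    return heapq.merge(*shard_results, key=key)


shard_0 = [(1, "a"), (4, "d")]
shard_1 = [(2, "b"), (3, "c"), (5, "e")]
merged = list(stream_merge([shard_0, shard_1], key=lambda row: row[0]))
```

Parameterized SQL makes the parse cache effective, since the same statement text recurs with different bound values. The merge only ever holds one pending row per shard, which matters when an ORDER BY query spans many shards.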

Migration Process

Data was migrated using DBRep and ShardingSphere‑Scaling over four weeks, synchronizing to target clusters while running parallel environments for verification.
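The article does not say how the parallel environments were compared, so the following is just one possible approach: take a snapshot of the same key range from source and target, hash each row in a canonical form, and report keys that are missing, unexpected, or different. The function names and the snapshot shape (`{primary key: row}`) are assumptions for this sketch.

```python
import hashlib


def row_digest(row: dict) -> str:
    """Hash a row's fields in sorted column order so both sides agree on the digest."""
    canonical = "|".join(f"{name}={row[name]!r}" for name in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_tables(source: dict, target: dict) -> dict:
    """Compare two {primary key: row} snapshots and report keys that disagree."""
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    mismatched = sorted(
        k for k in source
        if k in target and row_digest(source[k]) != row_digest(target[k])
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}
```

Running such a check batch by batch while both environments serve traffic gives a concrete signal for when it is safe to cut over.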

Benefits

Simplified upgrade path, allowing developers to focus on business logic.

Reduced development effort by avoiding custom sharding components.

Flexible scaling to handle large promotional events.

At the time of writing, ShardingSphere has over 14K GitHub stars and has been adopted by more than 170 enterprises across finance, e‑commerce, cloud services, and other sectors.

