How JD.com Guarantees Database Performance During Billion‑Scale Sales Events

This article describes how JD.com keeps its MySQL databases fast and highly available during traffic spikes such as the 618 and Double‑11 sales. It covers architecture design, pre‑event preparation, real‑time safeguards, and post‑event analysis.

21CTO

Architecture and Load

JD.com’s application architecture follows a mainstream design: user requests are accelerated by a CDN and routed to an application cluster, which is backed by a message queue (JMQ), a cache cluster (JIMDB and Redis), and finally the database cluster. Over 95% of production databases are MySQL; a small number of specific systems use Oracle, SQL Server, or MongoDB.

The database proxy layer (Jproxy) handles traffic management, connection pooling, and read/write separation. JD.com still uses a traditional MySQL master‑slave architecture because of its proven stability and superior performance compared to distributed solutions. More than 75% of production MySQL instances run in Docker containers.
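
To make the read/write separation idea concrete, here is a minimal sketch of how a proxy might choose a backend for each statement. It is not Jproxy's implementation, which the article does not describe; the `ReadWriteRouter` class, its `route` method, and the rule set (plain reads go to replicas, everything else goes to the master) are assumptions made purely for illustration.

```python
import itertools
import re

# Statements treated as reads; everything else is sent to the master.
_READ_PREFIX = re.compile(r"^\s*(select|show)\b", re.IGNORECASE)
_FOR_UPDATE = re.compile(r"\bfor\s+update\b", re.IGNORECASE)


class ReadWriteRouter:
    """Hypothetical router: reads rotate across replicas, writes go to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master when no replica is configured.
        self._replicas = itertools.cycle(replicas or [master])

    def route(self, sql, in_transaction=False):
        # Statements inside a transaction stay on the master so that
        # reads observe the transaction's own writes.
        if in_transaction or not _READ_PREFIX.match(sql):
            return self.master
        # SELECT ... FOR UPDATE takes row locks, so it also needs the master.
        if _FOR_UPDATE.search(sql):
            return self.master
        return next(self._replicas)


router = ReadWriteRouter("master", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM orders WHERE id = 1"))
print(router.route("UPDATE orders SET status = 'paid' WHERE id = 1"))
```

A production proxy would also have to account for replication lag, for example by pinning a session to the master for a short time after it writes. That is left out here to keep the sketch short.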

During major sales events, network traffic can increase 2‑3×, while MySQL QPS can surge up to tenfold, stressing the database layer dramatically.
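
A tenfold QPS surge translates directly into a sizing question. The following back-of-the-envelope sketch estimates how many read replicas would keep each one below a chosen utilization at peak. The function name, the 60% utilization target, and all figures in the example are hypothetical; the article gives no absolute QPS numbers.

```python
import math


def replicas_needed(baseline_qps, surge_factor, per_replica_qps, target_utilization=0.6):
    """Return the replica count that keeps each replica at or below target_utilization at peak."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    peak_qps = baseline_qps * surge_factor
    # Only a fraction of each replica's capacity is budgeted, leaving headroom.
    usable_qps = per_replica_qps * target_utilization
    return math.ceil(peak_qps / usable_qps)


# Hypothetical figures: 5,000 QPS baseline, 10x surge, 20,000 QPS per replica.
print(replicas_needed(5_000, 10, 20_000))
```

The calculation assumes read traffic spreads evenly across replicas. Hot keys or skewed shards would call for a larger margin.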

Pre‑Promotion Preparation

Key steps before a sales event include:

Communication: Hold joint meetings with development teams to confirm critical systems, identify weak points, and define emergency and downgrade plans.

SQL Optimization: Identify and rewrite slow queries. JD.com’s platform collects slow SQL with an enhanced version of Box’s Anemometer tool and notifies the owners by email.

Capacity Expansion: Use an automated expansion platform to add master/slave instances, leveraging Docker for rapid deployment.

Data Archiving: Implement a data‑migration pipeline that moves stale data to lower‑cost storage (TokuDB) and archives historical data.

Stress Testing: Conduct full‑link and isolated system load tests, simulate order flows with CDN robots, and verify failover mechanisms.
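
As an illustration of the SQL optimization step, slow-query tooling typically groups statements by a normalized "fingerprint", so that queries differing only in literal values are counted together. The sketch below shows that idea in simplified form. It is not the logic of JD.com's platform or of Anemometer itself; the `fingerprint` and `top_slow_queries` helpers and the sample log entries are assumptions for illustration.

```python
import re
from collections import Counter


def fingerprint(sql):
    """Collapse literals so queries that differ only in values share one fingerprint."""
    s = sql.strip().lower()
    s = re.sub(r"'(?:[^'\\]|\\.)*'", "?", s)               # quoted strings -> ?
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "?", s)                # numeric literals -> ?
    s = re.sub(r"\(\s*\?(?:\s*,\s*\?)*\s*\)", "(?+)", s)    # (?, ?, ?) -> (?+)
    return re.sub(r"\s+", " ", s)                           # normalize whitespace


def top_slow_queries(entries, limit=3):
    """entries: iterable of (sql, seconds). Rank fingerprints by total time spent."""
    total = Counter()
    for sql, seconds in entries:
        total[fingerprint(sql)] += seconds
    return total.most_common(limit)


# Hypothetical slow-log entries: (statement, execution time in seconds).
log = [
    ("SELECT * FROM orders WHERE user_id = 17", 2.0),
    ("SELECT * FROM orders WHERE user_id = 99", 3.0),
    ("SELECT * FROM sku WHERE id IN (1, 2, 3)", 1.5),
]
for query, seconds in top_slow_queries(log):
    print(seconds, query)
```

Ranking by total time rather than by single worst execution surfaces the queries that cost the most in aggregate, which are usually the ones worth rewriting before a traffic surge.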

During the Promotion – Real‑Time Safeguards

Operations run 24/7 with strict change‑control: any new deployment requires VP‑level approval and dual‑person DBA verification. Critical DBAs and developers co‑locate to accelerate issue resolution.

The automated operations platform manages asset inventory, Docker containers, MySQL cluster provisioning, automatic MHA failover, DNS updates, backup/restore, and self‑service deployment, dramatically reducing manual DBA workload.
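
One decision inside any automated master failover is which replica to promote. The sketch below picks the healthy replica that has received the most binary log data from the failed master. It compares the `Master_Log_File` and `Read_Master_Log_Pos` values that `SHOW SLAVE STATUS` reports. The `Replica` class and `pick_promotion_candidate` function are hypothetical; this is not MHA's code or JD.com's platform, and a real failover tool must also reconcile missing events across replicas and repoint the others, which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    host: str
    master_log_file: str        # e.g. "mysql-bin.000042"
    read_master_log_pos: int    # byte offset read within that file
    healthy: bool = True


def _log_index(log_file):
    # Binary log names end in a numeric sequence after the last dot.
    return int(log_file.rsplit(".", 1)[1])


def pick_promotion_candidate(replicas):
    """Return the healthy replica that has received the most binlog data, or None."""
    healthy = [r for r in replicas if r.healthy]
    if not healthy:
        return None
    return max(
        healthy,
        key=lambda r: (_log_index(r.master_log_file), r.read_master_log_pos),
    )


candidates = [
    Replica("db-a", "mysql-bin.000041", 900),
    Replica("db-b", "mysql-bin.000042", 100),
    Replica("db-c", "mysql-bin.000042", 500, healthy=False),
]
print(pick_promotion_candidate(candidates).host)
```

Comparing the log file sequence before the offset matters because a replica on a later file is ahead even if its offset within that file is smaller.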

Post‑Promotion Review and Continuous Improvement

After each event, JD.com conducts a thorough retrospective, documenting successes and failures. The focus shifts to enhancing visualization, automation, and intelligent platforms, aligning DBA work with SRE principles and software‑engineer skill sets.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
