Scaling JD.com’s Product Detail Pages with Dynamic, High‑Performance Architecture
This article details the evolution and redesign of JD.com’s product detail page architecture, describing the transition from static HTML generation to a dynamic, high‑performance, multi‑datacenter system built on key‑value storage, Nginx + Lua, asynchronous processing, multi‑level caching, and robust scaling and reliability strategies.
In this talk, the author shares the redesign of JD.com’s product detail page (PDP) architecture, driven by the need for high‑performance real‑time rendering and rapid response to complex, ever‑changing business requirements.
What Is a Product Detail Page
A PDP displays detailed product information and serves as a major traffic and order entry point. JD.com maintains many PDP templates (general, global purchase, flash sale, automotive, clothing, group buying, etc.) that share the same core logic but differ in front‑end behavior.
Personalization requirements and numerous data sources (dozens of backend services) call for an architecture that can absorb urgent changes within minutes, which static HTML generation cannot provide.
Architecture Overview
The system consists of three main parts:
Product Detail Page System – responsible for the static portion of the whole page.
Dynamic Service System and Unified Service System – the Unified Service System handles real‑time data such as inventory; the Dynamic Service System provides data services to other internal systems.
Key‑Value Heterogeneous Data Cluster – stores atomic key‑value data to avoid costly relational joins.
Data is stored dimensionally (product, merchant, shop, etc.) and frequently cached in Redis for fast access.
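As a minimal sketch of dimensional key‑value storage, the snippet below assembles page data from point lookups rather than relational joins. The in‑memory dict, the key format, and the field names (`merchant_id`, `shop_id`) are assumptions made for illustration; they are not taken from JD.com's system.

```python
# Hypothetical in-memory stand-in for the key-value cluster.
kv_store = {}

def put(dimension, entity_id, value):
    """Store one atomic record under a dimension-prefixed key."""
    kv_store[f"{dimension}:{entity_id}"] = value

def get(dimension, entity_id):
    """Fetch one atomic record, or None if it is missing."""
    return kv_store.get(f"{dimension}:{entity_id}")

def load_page_data(product_id):
    """Assemble page data with point lookups instead of relational joins."""
    product = get("product", product_id)
    if product is None:
        return None
    return {
        "product": product,
        "merchant": get("merchant", product["merchant_id"]),
        "shop": get("shop", product["shop_id"]),
    }

# Illustrative data only.
put("product", 1001, {"name": "Example phone", "merchant_id": 7, "shop_id": 42})
put("merchant", 7, {"name": "Example merchant"})
put("shop", 42, {"name": "Example shop"})
page = load_page_data(1001)
```

Because each dimension is keyed independently, a change to a shop record touches one key and never requires rewriting the products that reference it.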
History of PDP Architecture
Architecture 1.0
Technology stack: IIS + C# + SQL Server. Pages queried the product database directly; when the database could no longer keep up with the load, a memcached layer was added in front of it.
Architecture 2.0 (Static Generation)
Static HTML is generated via a workflow: MQ notifies changes → Java workers fetch data from dependencies → generate HTML → rsync to other machines → Nginx serves the static files.
Main drawbacks:
Any category or breadcrumb change forces a full re‑generation of all related products.
Rsync becomes a bottleneck as product count grows.
Frequent page‑level changes cannot be responded to quickly.
Architecture 2.1 (Fragmented Static Generation)
Products are routed to multiple machines by the trailing digits of their product ID; HTML fragments (header, specs, breadcrumbs, etc.) are generated separately and assembled via Nginx SSI.
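The routing step can be sketched as follows, assuming the suffix is the trailing decimal digits of a numeric product ID. The function name, the two‑digit default, and the machine names are hypothetical.

```python
def route_by_trailing_digits(product_id: int, machines: list[str], digits: int = 2) -> str:
    """Pick a machine from the last `digits` decimal digits of the product ID."""
    suffix = product_id % (10 ** digits)
    return machines[suffix % len(machines)]

machines = ["m0", "m1", "m2", "m3"]
target = route_by_trailing_digits(123407, machines)  # suffix 07 -> index 3
```

A scheme like this is simple to operate, but it spreads load evenly only if product IDs and their traffic are evenly distributed across suffixes.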
Main drawbacks:
Too many fragment files, causing inode exhaustion.
Mechanical disks perform poorly with SSI under high concurrency.
Template changes still require massive re‑generation.
When storage capacity ran out, static pages had to be dropped in favor of direct dynamic rendering, which put heavy load on downstream services.
Architecture 3.0 (Fully Dynamic)
Key pain points addressed:
Static capacity limits.
Inability to react to rapid, complex business changes.
The new design keeps the same data‑centric ideas but moves to real‑time rendering:
Data changes are still notified via MQ.
Data heterogeneity workers write raw atomic data to a JIMDB cluster (Redis + persistent engine).
A synchronization worker aggregates data by dimension (basic info, product intro, other info) into separate JIMDB clusters.
Front‑end rendering uses Nginx + Lua to fetch data and render templates on the fly.
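The four steps above can be sketched end to end with in‑memory stand‑ins. Everything here is an assumption made for illustration: the deque replaces the MQ, plain dicts replace the JIMDB clusters, a Python function replaces the Nginx + Lua renderer, and the field‑to‑dimension mapping is invented.

```python
from collections import deque

change_queue = deque()                          # stand-in for the MQ
atomic_store = {}                               # stand-in for the raw atomic-data cluster
dimension_stores = {"basic": {}, "intro": {}}   # stand-ins for per-dimension clusters

# Hypothetical mapping from atomic field to storage dimension.
FIELD_TO_DIMENSION = {"name": "basic", "description": "intro"}

def heterogeneity_worker():
    """Drain change messages and write raw atomic data; return changed SKUs."""
    changed = set()
    while change_queue:
        msg = change_queue.popleft()
        atomic_store[(msg["sku"], msg["field"])] = msg["value"]
        changed.add(msg["sku"])
    return changed

def sync_worker(skus):
    """Aggregate atomic fields into one record per SKU per dimension."""
    for (sku, field), value in atomic_store.items():
        if sku in skus:
            dimension = FIELD_TO_DIMENSION[field]
            dimension_stores[dimension].setdefault(sku, {})[field] = value

def render(sku):
    """Render the page at request time from the aggregated dimensions."""
    basic = dimension_stores["basic"].get(sku, {})
    intro = dimension_stores["intro"].get(sku, {})
    return f"<h1>{basic.get('name', '')}</h1><p>{intro.get('description', '')}</p>"

change_queue.append({"sku": 1001, "field": "name", "value": "Example phone"})
change_queue.append({"sku": 1001, "field": "description", "value": "A sample description."})
sync_worker(heterogeneity_worker())
html = render(1001)
```

Since the page is rendered from stored data on each request, a template change takes effect immediately and no regeneration pass is needed.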
Principles guiding the new system include data closed‑loop, dimensional storage, stateless workers, asynchronous and concurrent processing, multi‑level caching, dynamic rendering, elasticity, and graceful degradation.
Key Design Principles
Data closed‑loop – keep all data within the system, avoiding external dependencies.
Data dimensionalization – store data by product, merchant, shop, etc., enabling efficient retrieval.
System decomposition – split responsibilities across heterogeneous data, synchronization, and front‑end services.
Stateless, task‑oriented workers – horizontal scalability.
Asynchronous + concurrent processing – use message queues and parallel calls to reduce latency.
Multi‑level caching – browser cache, CDN, Nginx shared dict, local and remote Redis clusters.
Dynamic rendering – templates rendered at request time, supporting rapid UI changes.
Elastic scaling – Docker containers and auto‑scaling based on CPU or bandwidth.
Degrade‑switches – centralized feature flags that let the system fall back gracefully under pressure.
Multi‑datacenter active‑active deployment – each datacenter reads its own replica, with failover to other zones.
Comprehensive load testing – offline (ab, JMeter) and online (tcpcopy, traffic replay).
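Of the principles above, multi‑level caching is the easiest to show in a few lines. The sketch below is a generic read‑through cache with backfill; the class name and the dict‑based levels are assumptions, standing in for layers such as an Nginx shared dict, a local Redis, and a remote Redis cluster.

```python
class MultiLevelCache:
    """Read-through cache: check levels fastest-first, backfill on a hit."""

    def __init__(self, levels, loader):
        self.levels = levels    # list of dict-like caches, fastest first
        self.loader = loader    # called only when every level misses

    def get(self, key):
        for index, level in enumerate(self.levels):
            if key in level:
                value = level[key]
                # Copy the value into the faster levels that missed.
                for faster in self.levels[:index]:
                    faster[key] = value
                return value
        value = self.loader(key)
        for level in self.levels:
            level[key] = value
        return value

loader_calls = []

def load_from_backend(key):
    """Hypothetical slow origin lookup."""
    loader_calls.append(key)
    return f"data-for-{key}"

shared_dict, local_cache, remote_cache = {}, {}, {}
cache = MultiLevelCache([shared_dict, local_cache, remote_cache], load_from_backend)
first = cache.get("sku:1001")    # misses everywhere, hits the loader
second = cache.get("sku:1001")   # served from the fastest level
```

A production version would also need expiry per level; that is omitted here to keep the lookup order visible.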
Problems Encountered and Solutions
SSD Performance Issues
Consumer‑grade SSDs (Samsung 840 Pro) showed unstable throughput; switched to enterprise‑grade Intel 3500 drives.
Key‑Value Store Selection
Benchmarked LevelDB, RocksDB, BeansDB, LMDB, Riak; LMDB offered stable performance for mixed read/write workloads.
JIMDB Synchronization Bottlenecks
Large data volumes caused dump‑and‑sync failures; solution: increase SSD count per machine, use dedicated SAS disks for sync, and plan direct memory forwarding.
Master‑Slave Switch Overhead
Original one‑master‑two‑slave setup caused latency spikes during failover; upgraded to one‑master‑three‑slave for smoother transitions.
Shard Configuration Complexity
Introduced Twemproxy to centralize shard logic and automated deployment to reduce manual changes.
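To illustrate what centralizing shard logic means, the sketch below maps a key to a shard with a CRC32 hash and a modulo. This is a generic illustration and does not reproduce Twemproxy's own hashing or distribution code; the function and shard names are hypothetical.

```python
import zlib

def pick_shard(key: str, shards: list[str]) -> str:
    """Map a key to a shard deterministically with CRC32 modulo shard count."""
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]

shards = ["redis-a", "redis-b", "redis-c"]
chosen = pick_shard("sku:1001", shards)
```

When every client embeds a function like this, changing the shard list means redeploying every client; moving it into a proxy confines the change to one place.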
Template Metadata Storage
Moved from storing full HTML fragments to storing only metadata; Lua renders templates using this metadata, reducing storage size.
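A minimal sketch of the idea, using Python's `string.Template` in place of the Lua renderer: only the small metadata record is stored, and the markup comes from a shared template at request time. The template text and field names are assumptions.

```python
import html
from string import Template

# One shared template per fragment type; only metadata is stored per product.
TEMPLATES = {
    "breadcrumb": Template('<a href="$url">$name</a>'),
}

def render_fragment(kind, metadata):
    """Render a fragment from stored metadata, escaping values for HTML."""
    safe = {key: html.escape(str(value)) for key, value in metadata.items()}
    return TEMPLATES[kind].substitute(safe)

fragment = render_fragment("breadcrumb", {"url": "/c/1", "name": "Phones"})
```

Storing `{"url": ..., "name": ...}` instead of the rendered anchor tag keeps records small and lets a markup change apply to every product at once.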
High Inventory Request Volume
During a flash sale, the inventory API received more than 6 million requests per minute; enabling the Nginx proxy cache let repeated requests be served from cache instead of reaching the backend, which stabilized the system.
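The effect of a short‑lived cache in front of a hot endpoint can be sketched as below. This is a generic TTL cache written in Python, not Nginx configuration; the class name, the one‑second TTL, and the injected clock are assumptions made so the behavior is easy to follow.

```python
import time

class TTLCache:
    """Serve a cached value until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}   # key -> (stored_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now, value)
        return value

backend_calls = []

def query_inventory(sku):
    """Hypothetical backend call; counts how often it is reached."""
    backend_calls.append(sku)
    return {"sku": sku, "in_stock": True}

fake_now = [0.0]
inventory_cache = TTLCache(ttl_seconds=1.0, clock=lambda: fake_now[0])
for _ in range(1000):
    inventory_cache.get_or_load("sku:1001", query_inventory)
calls_within_ttl = len(backend_calls)
```

Even a very short TTL collapses a burst of identical requests into one backend call per key per interval, at the cost of serving slightly stale stock data.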
Network Jitter and 502 Errors
Reduced the Twemproxy timeout settings (connection, read, write) to 150 ms and added a fallback to the dynamic services when a timeout occurs.
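A tight timeout paired with a fallback might look like the following sketch. It uses a thread pool to bound the wait; the function names are hypothetical, and the 0.15 s default simply mirrors the 150 ms figure mentioned above.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=4)

def call_with_fallback(primary, fallback, timeout_s=0.15):
    """Return primary() if it finishes within timeout_s, else fallback()."""
    future = _pool.submit(primary)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Best effort only: a call that is already running cannot be cancelled.
        future.cancel()
        return fallback()

def slow_cache_read():
    """Hypothetical cache read that stalls, as during network jitter."""
    time.sleep(0.3)
    return "from-cache"

result = call_with_fallback(slow_cache_read, lambda: "from-dynamic-service", timeout_s=0.05)
```

A short timeout bounds how long a stalled connection can hold a request; with a long timeout, stalled requests pile up and the failure spreads upstream.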
Excessive Traffic on Access Layer
Moved GZIP compression from the access layer to individual services, cutting upstream traffic by ~80% and lowering CPU usage.
Summary
Data closed‑loop
Dimensional data storage
System decomposition
Stateless, task‑oriented workers
Asynchronous + concurrent processing
Multi‑level caching
Dynamic rendering
Elastic scaling
Graceful degradation switches
Active‑active multi‑datacenter deployment
Robust load‑testing strategies
Optimized access‑layer handling (header trimming, stateless domains, selective proxy caching)
Connection pooling and non‑blocking locks for cache stampede protection
Twemproxy for Redis connection reduction
Unix domain sockets to lower TCP overhead
Reasonable timeout configurations
Long‑connection reuse
Service‑oriented design to eliminate direct DB dependencies
Domain‑based client connection partitioning
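One item in this list, non‑blocking locks for cache stampede protection, can be sketched as follows. The class and its serve‑stale behavior are assumptions about one common way to apply the idea, written in Python with `threading.Lock` rather than in Lua.

```python
import threading

class StampedeGuard:
    """Let one caller refresh a key; others get the stale value immediately."""

    def __init__(self, loader):
        self.loader = loader
        self.cache = {}
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def refresh(self, key):
        lock = self._lock_for(key)
        if not lock.acquire(blocking=False):
            # Another caller is refreshing: do not wait, serve what we have
            # (None if the key has never been loaded).
            return self.cache.get(key)
        try:
            self.cache[key] = self.loader(key)
            return self.cache[key]
        finally:
            lock.release()

load_calls = []

def load_from_origin(key):
    """Hypothetical expensive origin lookup."""
    load_calls.append(key)
    return f"fresh:{key}"

guard = StampedeGuard(load_from_origin)
guard.cache["sku:1001"] = "stale"
```

Because callers never block on the lock, an expired hot key produces one origin request rather than one per concurrent reader.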
21CTO
