How Leading Chinese Companies Scale Elasticsearch for Billions of Queries
This article surveys how major Chinese tech firms such as JD.com, Ctrip, Qunar, 58.com and Didi design, scale, and operate massive Elasticsearch clusters for search, real‑time analytics, and security, detailing architecture choices, shard strategies, data pipelines and performance optimizations.
Many Chinese companies—including Ctrip, Didi, Toutiao, Ele.me, 360 Security, Xiaomi and Vivo—have adopted Elasticsearch for both search and broader big‑data analytics. Combined with Kibana, Logstash, and Beats, the Elastic Stack is widely used for near‑real‑time log analysis, metric monitoring, information security, and machine‑learning‑driven anomaly detection.
1. JD Daojia (JD.com To‑Home) Order Center Elasticsearch Evolution
The JD Daojia order center is read‑heavy: order data is stored in MySQL, and query pressure is offloaded to Elasticsearch. The ES cluster now holds over 1 billion documents and processes about 5 billion queries daily. The architecture evolved into a real‑time hot‑standby setup with three layers: VIP load balancing in front, a gateway layer whose nodes act as client nodes, and data nodes that store the shards (each shard has one primary and two replicas). The shard count was tuned to balance single‑ID lookup performance against aggregation query speed, and older orders are archived to a historical store.
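The shard layout described above can be sketched as an index-creation body. This is a minimal illustration, not JD's actual configuration: `number_of_shards` and `number_of_replicas` are standard Elasticsearch index settings, but the primary shard count of 5 and the helper names are assumptions, since the article does not give the tuned value.

```python
import json


def order_index_settings(primary_shards: int, replicas: int = 2) -> dict:
    """Build an index-creation body for the given shard layout."""
    return {
        "settings": {
            # Fixed at index creation; changing it later needs a reindex
            # or the split/shrink APIs.
            "number_of_shards": primary_shards,
            # Two replicas per primary, as in the article; adjustable at runtime.
            "number_of_replicas": replicas,
        }
    }


def total_shard_copies(body: dict) -> int:
    """Count every shard copy (primaries plus replicas) the cluster must host."""
    settings = body["settings"]
    return settings["number_of_shards"] * (1 + settings["number_of_replicas"])


# 5 primaries is an assumed value chosen only for illustration.
body = order_index_settings(primary_shards=5)
print(json.dumps(body, indent=2))
print("shard copies to place:", total_shard_copies(body))
```

The trade-off follows from how searches execute: an unrouted search is sent to one copy of every shard, so more primaries add fan-out overhead to a single‑ID lookup, while aggregations over the full data set can benefit from the extra parallelism.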
2. Ctrip Elasticsearch Use Cases
2.1 Hotel Order Elasticsearch
Ctrip built a real‑time index for hotel orders, exposing a dedicated web service to improve query convenience while maintaining performance. Elasticsearch was chosen for its lightweight footprint, ease of use, and strong distributed support.
2.2 Flight Ticket Elasticsearch Operations
Data flows from Kafka into Elasticsearch via ETL pipelines, with cold data stored in HDFS and hot and warm data kept in databases or caches. The platform supports traditional BI reporting as well as fast, program‑driven decision loops.
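As a rough sketch of the last ETL step, the function below turns a batch of JSON messages (as they might arrive from a Kafka topic) into the newline-delimited body that the Elasticsearch `_bulk` API expects. The index name `flight-tickets`, the `ticket_id` field, and the sample messages are hypothetical; the article does not describe Ctrip's actual schema or pipeline code.

```python
import json
from typing import Iterable


def to_bulk_body(index: str, records: Iterable[dict], id_field: str) -> str:
    """Render records as an NDJSON _bulk body: one action line, one source line."""
    lines = []
    for record in records:
        action = {"index": {"_index": index, "_id": str(record[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # The bulk API requires the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


# Stand-ins for message values consumed from Kafka (hypothetical payloads).
kafka_messages = [
    b'{"ticket_id": 7, "route": "SHA-PEK", "status": "issued"}',
    b'{"ticket_id": 8, "route": "CAN-CTU", "status": "refunded"}',
]
records = [json.loads(message) for message in kafka_messages]
print(to_bulk_body("flight-tickets", records, id_field="ticket_id"), end="")
```

Using the business key as `_id` makes a replayed message overwrite the same document instead of creating a duplicate, which is one common way to keep such a pipeline idempotent.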
2.3 Large‑Scale Cluster Management
The biggest Ctrip log cluster runs 120 data nodes on 70 physical servers. Its key figures:
Documents indexed per day: 60 billion (≈25 TB of new index files)
Peak ingest rate: 1 million docs/s
Historical retention: 10–90 days
Total shards: 17 000 across 3 441 indices
Disk usage: ~1 PB
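A few derived figures help put these numbers in proportion. The short calculation below uses only values quoted above; it assumes decimal units (1 PB = 1 000 TB) and treats the 25 TB of daily index files as the on-disk growth, which the article does not state explicitly.

```python
indices = 3_441
shards = 17_000
disk_tb = 1_000                # ~1 PB, decimal units assumed
new_index_tb_per_day = 25
retention_days = (10, 90)      # shortest and longest retention windows

# Average number of shards behind each index.
shards_per_index = shards / indices

# Average on-disk size per shard, in GB.
avg_gb_per_shard = disk_tb * 1_000 / shards

# Disk needed if ALL data were kept for the shortest or the longest window.
footprint_tb = tuple(new_index_tb_per_day * days for days in retention_days)

print(f"shards per index: {shards_per_index:.2f}")
print(f"average shard size: {avg_gb_per_shard:.1f} GB")
print(f"footprint at {retention_days[0]} / {retention_days[1]} days: "
      f"{footprint_tb[0]} TB / {footprint_tb[1]} TB")
```

The reported ~1 PB falls between the two footprint bounds, which is consistent with only part of the data being retained for the full 90 days.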
3. Qunar Order Center Elasticsearch Solution
Qunar processes over 1 million hotel orders per day, growing to 100 million across platforms. The previous hot‑table sharding could not scale beyond 400 million rows. By abstracting searchable fields into Elasticsearch and keeping detailed order data in MySQL, Qunar achieved a split‑storage model: simple OrderNo queries hit MySQL, while complex searches use ES.
The ES index uses 8 shards and stores 1.4 billion documents (≈200 million active), occupying 64 GB; the cluster’s disks total 240 GB.
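The split‑storage rule (OrderNo lookups go to MySQL, everything else goes to ES) can be expressed as a small routing function. This is only a sketch of the idea: the field name `order_no`, the return labels, and the choice to send mixed queries to ES are assumptions, not details from Qunar's implementation.

```python
def choose_store(query: dict) -> str:
    """Pick the backing store for an order query.

    Returns "mysql" when the only criterion is the order number,
    and "es" for any query that filters on other fields.
    """
    # Ignore criteria the caller left unset.
    fields = {name for name, value in query.items() if value is not None}
    if not fields:
        raise ValueError("query must contain at least one criterion")
    if fields == {"order_no"}:
        return "mysql"   # primary-key style lookup, served by the database
    return "es"          # multi-field search, served by the search index


print(choose_store({"order_no": "H20240001"}))
print(choose_store({"hotel_name": "Grand", "status": "paid"}))
```

Keeping the rule in one place means callers never need to know which store answered, and the boundary can be moved later without touching them.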
4. Elastic Stack in 58 Group Information Security
58.com’s security department deployed the Elastic Stack for log storage, high‑throughput search, and security analytics. The implementation covered storage selection, performance tuning, master‑node and data‑node optimizations, security best practices, and Kibana localization for product and operations teams.
5. Didi Multi‑Cluster Elasticsearch Practice
Since early 2016, Didi has built an Elasticsearch platform now exceeding 3 500 instances and 5 PB of data, with peak write throughput over 20 million operations per second. Use cases include map search for rides, multi‑dimensional queries for customer service and operations, and a large‑scale logging service.
Data ingestion relies on a Sink service that consumes Kafka streams (business logs, MySQL binlogs, custom reports) and writes to Elasticsearch, HDFS, Ceph, etc. The Gateway service fronts all queries, exposing HTTP/REST, TCP, and SQL interfaces, handling access control, rate limiting, index‑storage separation, DSL throttling, and multi‑cluster disaster recovery.
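Rate limiting at a gateway of this kind is often implemented with a token bucket. The sketch below shows that general technique; it is an assumption that Didi's Gateway works this way, and the class name, parameters, and injectable clock are all illustrative.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter is testable
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per caller; "app-a" is a hypothetical application ID.
limits = {"app-a": TokenBucket(rate=100, capacity=200)}
print(limits["app-a"].allow())
```

Passing a larger `cost` for expensive requests is one simple way to approximate the DSL throttling mentioned above, since a heavy aggregation can then consume more of a caller's budget than a point lookup.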
6. Practical Order Search Solution
Elasticsearch’s support for structured queries and real‑time updates addresses the pain points of traditional order reporting systems. The architecture adopts a service‑oriented approach: the ES cluster and sharded databases serve as data sources wrapped by a unified order service API, which is consumed by front‑end, back‑end, and reporting applications.
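One way to picture the unified order service is a facade that searches an index for matching order IDs and then loads the full rows from the sharded databases. The sketch below uses in-memory stand-ins for both data sources; the two-step search-then-load flow, the modulo sharding rule, and every name in it are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional


def shard_for(order_id: int, shard_count: int) -> int:
    """Map an order ID to a database shard (simple modulo rule, assumed)."""
    return order_id % shard_count


class OrderService:
    """Single API in front of a search index and sharded order databases."""

    def __init__(self, search_ids: Callable[[dict], List[int]],
                 db_shards: List[Dict[int, dict]]):
        self._search_ids = search_ids   # stand-in for an ES query returning IDs
        self._db_shards = db_shards     # stand-in for the sharded databases

    def get(self, order_id: int) -> Optional[dict]:
        shard = self._db_shards[shard_for(order_id, len(self._db_shards))]
        return shard.get(order_id)

    def search(self, **filters) -> List[dict]:
        rows = (self.get(order_id) for order_id in self._search_ids(filters))
        return [row for row in rows if row is not None]


orders = {
    1: {"id": 1, "status": "paid", "amount": 30},
    2: {"id": 2, "status": "cancelled", "amount": 12},
    3: {"id": 3, "status": "paid", "amount": 75},
}
db_shards: List[Dict[int, dict]] = [{}, {}]
for oid, row in orders.items():
    db_shards[shard_for(oid, len(db_shards))][oid] = row

# The search index holds only the searchable fields, not the full order.
search_index = [{"id": o["id"], "status": o["status"]} for o in orders.values()]


def search_ids(filters: dict) -> List[int]:
    return [doc["id"] for doc in search_index
            if all(doc.get(key) == value for key, value in filters.items())]


service = OrderService(search_ids, db_shards)
print(service.search(status="paid"))
```

Front‑end, back‑end, and reporting callers would all go through `OrderService`, so the storage layout behind it can change without affecting them.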
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
