
Didi's Multi-Cluster Elasticsearch Architecture: Challenges and Practices

Didi transformed its massive single‑cluster Elasticsearch deployment into a transparent multi‑cluster architecture using TribeNode and cross‑cluster search, isolating workloads, reducing fault impact, and achieving five‑fold scale while preserving a single‑cluster appearance for services, despite added configuration complexity and stability challenges.

Didi Tech

Elasticsearch is a distributed search engine built on Lucene, forming part of the Elastic Stack that provides real‑time search and analytics for logs, search services, and system monitoring.

Didi Elasticsearch Overview

Since early 2016, Didi has built an Elasticsearch platform that now runs over 3,500 instances, stores more than 5 PB of data, and reaches a peak write throughput of over 20 million operations per second. The platform supports core ride-hailing map search, customer-service queries, operational analytics, and a thousand internal platforms.

Single‑Cluster Bottlenecks

When the single cluster grew to several hundred physical machines, it held 3,000+ indices, over 50,000 shards, and PB-scale storage. Stability risks emerged from three directions:

Elasticsearch architectural limits (single‑threaded master task processing, cluster‑state publishing, task queueing).

Index resource sharing (shard allocation across nodes causing hot‑spot interference).

Diverse business scenarios with conflicting performance requirements.

The master node processes all metadata changes on a single thread, publishing the entire ClusterState to every node and blocking until all nodes acknowledge. As the cluster scales, this model leads to long task latency, a buildup of pending tasks, and slow recovery.
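To see why a serial master becomes a bottleneck, here is a toy Python model (the numbers are illustrative, not Didi's): tasks arrive every tick, but the single-threaded publish loop applies only a fixed number per tick, so the pending-tasks queue grows without bound whenever arrivals outpace service.

```python
from collections import deque

def simulate_master(arrivals_per_tick, service_per_tick, ticks):
    """Toy model of a single-threaded master: tasks arrive each tick,
    but only `service_per_tick` are applied serially, so the pending
    queue grows whenever arrivals outpace the publish loop."""
    pending = deque()
    backlog = []
    for _ in range(ticks):
        pending.extend(range(arrivals_per_tick))        # new metadata tasks
        for _ in range(min(service_per_tick, len(pending))):
            pending.popleft()                           # one serial publish
        backlog.append(len(pending))                    # pending_tasks depth
    return backlog

print(simulate_master(5, 3, 4))  # [2, 4, 6, 8]: backlog grows linearly
```

The same effect shows up in a real cluster as a steadily growing `_cluster/pending_tasks` list.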

Because indices share node resources, a small but high-traffic index can affect the stability of other indices on the same nodes, especially when node failures trigger shard reallocation.

Business scenarios vary widely: low-latency map POI queries, high-throughput log ingestion, binlog search, and heavy aggregation for monitoring each impose different QoS demands, making it hard for a single cluster to satisfy them all.

Multi‑Cluster Challenges

To overcome these risks, Didi designed a multi-cluster architecture that remains transparent to business services. Data ingestion still flows through Kafka; the Sink service routes topics to the appropriate clusters, while the Gateway service continues to expose a unified HTTP/REST/SQL interface.

Query routing becomes complex when wildcard index patterns (e.g., `index*`) span multiple clusters, requiring a mechanism to resolve the pattern, fan the query out, and merge results across clusters.
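A minimal sketch of the resolution step, assuming the Gateway keeps a per-cluster index inventory (the cluster and index names below are made up for illustration):

```python
from fnmatch import fnmatch

# Hypothetical index inventory the Gateway might maintain per cluster.
CLUSTER_INDICES = {
    "log-cluster-1": ["app_log_2019.01", "app_log_2019.02"],
    "log-cluster-2": ["nginx_log_2019.01"],
    "doc-cluster-1": ["poi_search"],
}

def resolve_pattern(pattern):
    """Map a wildcard index pattern to the clusters (and the concrete
    indices) it touches, so the query can be fanned out to each
    matching cluster and the results merged afterwards."""
    matches = {}
    for cluster, indices in CLUSTER_INDICES.items():
        hits = [name for name in indices if fnmatch(name, pattern)]
        if hits:
            matches[cluster] = hits
    return matches

print(resolve_pattern("app_log*"))
# -> {'log-cluster-1': ['app_log_2019.01', 'app_log_2019.02']}
```

A pattern that matches indices in several clusters would return multiple entries, each of which becomes one sub-query whose hits are merged before responding.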

TribeNode Introduction

TribeNode (org.elasticsearch.tribe.TribeService) merges the ClusterState of several clusters into a single virtual ClusterState, presenting a unified client node to external queries. It connects to each target cluster as a client node, listens for ClusterState changes, and aggregates them.

Advantages:

Transparent multi‑cluster access.

Simple, elegant implementation with high reliability.

Limitations:

Every connected cluster's master must wait for the TribeNode to acknowledge ClusterState publishes, so a slow or stuck TribeNode can affect the stability of all the clusters it joins.

ClusterState is not persisted; on restart, queries may fail until metadata is re‑fetched.

Index name collisions across clusters require a conflict-resolution preference: keep an arbitrary copy, drop the colliding index, or prefer a specific cluster.
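The collision rule can be sketched as a merge over per-cluster index tables. This is an illustrative re-implementation, not TribeNode's actual code; the `prefer` parameter stands in for its conflict-preference setting (roughly: keep any, drop, or prefer a named cluster):

```python
def merge_cluster_states(states, prefer=None):
    """Merge per-cluster index tables into one virtual view, the way a
    TribeNode builds its merged ClusterState.  On an index-name
    collision, the preferred cluster's copy wins if `prefer` is set;
    otherwise the colliding index is dropped from the merged view."""
    merged, conflicted = {}, set()
    for cluster, indices in states.items():
        for name, meta in indices.items():
            if name in conflicted:
                if cluster == prefer:
                    merged[name] = meta      # preferred cluster overrides
                continue
            if name not in merged:
                merged[name] = meta          # first sighting wins for now
            else:
                conflicted.add(name)
                if cluster == prefer:
                    merged[name] = meta      # preferred cluster overrides
                elif prefer is None:
                    del merged[name]         # no preference: drop entirely
    return merged
```

With a preference set, a duplicate index name resolves to the named cluster's copy; with none, the duplicate disappears from the merged view, which is why collisions were operationally painful.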

Newer Elasticsearch versions introduce Cross‑Cluster Search (CCS) to address these shortcomings.
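With CCS, remote clusters are registered in the coordinating cluster's settings and addressed with a `cluster:index` prefix in the search path. A small helper that builds such an index expression (the cluster and index names are examples, not Didi's):

```python
def ccs_index_expression(targets):
    """Build a Cross-Cluster Search index expression of the form
    'cluster:index,cluster:index', as used in the search URL path,
    e.g. GET /log_cluster:app_log*/_search."""
    return ",".join(f"{cluster}:{index}" for cluster, index in targets)

expr = ccs_index_expression([("log_cluster", "app_log*"),
                             ("doc_cluster", "poi_search")])
print(expr)  # log_cluster:app_log*,doc_cluster:poi_search
```

Unlike TribeNode, the coordinating node only forwards the relevant sub-searches and merges the responses, so it does not have to hold a merged ClusterState or block remote masters.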

Multi‑Cluster Topology

The final architecture separates clusters by workload type: Log clusters, Binlog clusters, Document clusters, and Dedicated clusters. Each public cluster typically contains up to 100 data nodes and is deployed via Didi Cloud for automated scaling.

Multiple TribeNode groups are deployed for redundancy; the Gateway routes requests to the appropriate cluster based on index-to-cluster mappings. An Admin service centrally manages all clusters, indices, and their relationships, and the Sink service has been extracted into a DSink platform that receives index metadata from Admin and writes to the correct clusters.

Benefits of the Multi‑Cluster Architecture

Isolation at the cluster level, allowing critical services to run on dedicated clusters.

Reduced fault impact by separating data types across clusters.

Improved horizontal scalability by adding new clusters as needed.

Transparency for business services: the platform still presents itself as a single massive Elasticsearch cluster.

Practical Experience

Key challenges encountered include TribeNode stability (long initialization, high shard counts causing search latency), configuration and version drift across clusters, and capacity balancing. Solutions involve deploying multiple TribeNode groups, synchronizing cluster settings, version management via internal release pipelines, and automated index capacity planning with cross‑cluster migration.

Conclusion

Didi’s multi‑cluster Elasticsearch architecture successfully mitigated the bottlenecks of a single‑cluster design, supporting a five‑fold increase in platform scale while maintaining stability and isolation. The trade‑off of added architectural complexity is outweighed by the gains in reliability and scalability.

Tags: architecture, scalability, Elasticsearch, Multi-Cluster, Didi, distributed search
Written by

Didi Tech

Official Didi technology account
