How Tencent Scaled Elasticsearch to Thousands of Nodes: Core Kernel Optimizations Revealed

This article details Tencent's large‑scale Elasticsearch deployment, covering its massive usage scenarios, the availability, performance, cost and scalability challenges faced, and the comprehensive kernel‑level optimizations—including memory‑based throttling, storage‑model merging, off‑heap caching, rollup and metadata improvements—that enable PB‑level clusters with high reliability and low expense.

Tencent Tech

May 11, 2020

Background

Elasticsearch is widely used inside Tencent for real‑time log analysis, structured data analysis, full‑text search and many other scenarios. Clusters now reach thousands of nodes and trillions of documents, prompting continuous high‑availability, high‑performance and low‑cost optimizations.

Massive Scale at Tencent

Key application areas include:

Search services : full‑text search for Tencent Docs and for e‑commerce platforms such as Pinduoduo and Mogujie.

Log analysis : end‑to‑end logging for applications, databases, user behavior, network and security data.

Time‑series analysis : cloud monitoring, IoT telemetry and other high‑throughput metrics.

Elasticsearch is also employed for site search, security, APM and more, across public, private and internal clouds.

Pain Points and Challenges

Availability : high load can trigger out‑of‑memory (OOM) failures or a cluster‑wide avalanche, making a 99.9% availability SLA hard to meet.

Performance : search scenarios need average latency under 20 ms with latency spikes kept under 100 ms, and large‑scale analytics also demand low response times.

Cost : storage and memory consumption are high due to low compression ratios.

Scalability : workloads call for millions of shards and thousands of nodes, which exceeds native ES limits (≈10 k shards, ≈100 nodes).

Kernel Optimizations – Availability

System robustness : improved service throttling and node‑balancing to prevent OOM.

Disaster recovery : enhanced replica mechanisms and low‑cost backup for multi‑AZ resilience.

Kernel bugs : fixed master task blockage, distributed deadlocks, slow rolling restarts.

Memory‑based leaky‑bucket throttling : JVM heap acts as a bucket; when usage reaches thresholds, requests are smoothly limited using cosine‑based rate adjustment, protecting both write and query paths.
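
The cosine‑based smoothing can be illustrated with a minimal sketch. The function name `admit_fraction` and the 75% and 95% heap thresholds are assumptions made for illustration; the article does not give the actual values or the real implementation inside the ES kernel.

```python
import math


def admit_fraction(heap_used_ratio, low=0.75, high=0.95):
    """Fraction of incoming requests to admit for a given heap usage ratio.

    `low` and `high` are hypothetical thresholds, not values from the article.
    Below `low` everything is admitted; at or above `high` nothing is.
    In between, a half-cosine curve lowers the admitted rate smoothly.
    """
    if heap_used_ratio <= low:
        return 1.0
    if heap_used_ratio >= high:
        return 0.0
    # Position inside the throttling band: 0.0 at `low`, 1.0 at `high`.
    x = (heap_used_ratio - low) / (high - low)
    return 0.5 * (1.0 + math.cos(math.pi * x))


if __name__ == "__main__":
    for ratio in (0.70, 0.80, 0.85, 0.90, 0.96):
        print(f"heap {ratio:.0%}: admit {admit_fraction(ratio):.2f} of requests")
```

The half‑cosine curve is flat near both thresholds and steepest in the middle of the band, so throttling starts gently and avoids the abrupt cutoff that a single hard limit would produce.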

Kernel Optimizations – Performance

Storage model improvements replace the default size‑based merge with a time‑ordered hierarchical merge, reducing file fragmentation and improving query pruning.
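
One way to picture a time‑ordered hierarchical merge is the sketch below. It is an assumption‑laden illustration, not Tencent's merge policy: the `Segment` fields, the size tiers (`base_mb`, `factor`) and the `merge_factor` are all hypothetical. Segments are sorted by timestamp, and only segments that sit in the same size tier and are adjacent in time are merged together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    min_ts: int    # earliest document timestamp in the segment
    max_ts: int    # latest document timestamp in the segment
    size_mb: int


def tier_of(segment, base_mb=10, factor=10):
    """Size tier: 0 for segments up to base_mb, 1 up to base_mb * factor, ..."""
    tier = 0
    limit = base_mb
    while segment.size_mb > limit:
        limit *= factor
        tier += 1
    return tier


def pick_merges(segments, merge_factor=3):
    """Return groups of segments to merge: same tier and adjacent in time."""
    ordered = sorted(segments, key=lambda s: s.min_ts)
    merges = []
    run = []
    for seg in ordered:
        if run and tier_of(seg) != tier_of(run[-1]):
            run = []  # a tier change breaks the run of time-adjacent segments
        run.append(seg)
        if len(run) == merge_factor:
            merges.append(run)
            run = []
    return merges


if __name__ == "__main__":
    segs = [
        Segment("a", 0, 9, 5),
        Segment("b", 10, 19, 6),
        Segment("c", 20, 29, 4),
        Segment("d", 30, 39, 500),
        Segment("e", 40, 49, 7),
    ]
    for group in pick_merges(segs):
        print([s.name for s in group])
```

Because each merged segment covers one contiguous time range, a query with a time filter can skip whole segments whose range falls outside the filter. A purely size‑based policy could instead combine segments that are far apart in time, producing a wide range that is much harder to prune.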

Execution engine enhancements include:

Composite aggregation : leveraged index sorting and early termination to avoid full scans on each page, dramatically speeding up multi‑field aggregations.

Off‑heap FST cache : moved large Finite State Transducers out of the JVM heap, introduced zero‑copy pointers and a two‑level weak‑reference cache, cutting heap usage and GC pauses while preserving query latency.
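
The composite aggregation item above relies on the index being sorted by the aggregation key. The following sketch shows why that helps, under simplifying assumptions: a single key per document, an in‑memory list standing in for the sorted index, and a hypothetical function name `composite_page`. It is not the ES implementation.

```python
from bisect import bisect_right


def composite_page(sorted_keys, after=None, size=2):
    """Return (buckets, docs_scanned) for one page of a composite aggregation.

    `sorted_keys` holds one bucket key per document, in index-sort order.
    `after` is the last bucket key of the previous page, or None for page one.
    """
    # Index sorting lets us jump straight past everything up to `after`.
    start = 0 if after is None else bisect_right(sorted_keys, after)
    buckets = {}
    scanned = 0
    for i in range(start, len(sorted_keys)):
        key = sorted_keys[i]
        if key not in buckets and len(buckets) == size:
            break  # page is full; in sorted order no later doc belongs to it
        buckets[key] = buckets.get(key, 0) + 1
        scanned += 1
    return list(buckets.items()), scanned


if __name__ == "__main__":
    keys = ["a", "a", "b", "c", "c", "c", "d"]
    page, scanned = composite_page(keys, after=None, size=2)
    print(page, scanned)
    page, scanned = composite_page(keys, after=page[-1][0], size=2)
    print(page, scanned)
```

Without the sort order, every page would have to scan all documents to be sure no matching key was missed. With it, each page touches only the documents that fall inside that page.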

Kernel Optimizations – Cost

Storage cost is reduced via hot‑cold data separation, lifecycle‑managed migration to HDD, and cold backup to COS. Rollup pipelines pre‑aggregate fine‑grained data into hourly/daily buckets, cutting storage by up to tenfold.
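
The rollup idea can be sketched as follows. The input tuple layout `(epoch_seconds, metric, value)`, the function name `rollup_hourly` and the choice of stored statistics are assumptions for illustration, not the format of Tencent's rollup pipeline.

```python
from collections import defaultdict


def rollup_hourly(points):
    """Aggregate (epoch_seconds, metric, value) points into hourly buckets.

    Each bucket keeps count, sum, min and max, so averages can still be
    derived after the fine-grained points have been dropped.
    """
    buckets = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for ts, metric, value in points:
        hour_start = ts - ts % 3600  # truncate the timestamp to the hour
        b = buckets[(hour_start, metric)]
        b["count"] += 1
        b["sum"] += value
        b["min"] = min(b["min"], value)
        b["max"] = max(b["max"], value)
    return dict(buckets)


if __name__ == "__main__":
    raw = [(0, "cpu", 1.0), (60, "cpu", 3.0), (3600, "cpu", 5.0)]
    for key, stats in sorted(rollup_hourly(raw).items()):
        print(key, stats)
```

The saving comes from replacing many raw points per hour with one small summary document per metric and hour. The actual ratio depends on the sampling interval and on how many dimensions are kept in the bucket key.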

Memory cost is lowered by off‑heap FST caching and adaptive eviction, achieving up to 30% higher heap utilization and enabling a single node with 32 GB heap to manage ~50 TB of data.

Kernel Optimizations – Scalability

Metadata synchronization bottlenecks are addressed with task‑oriented shard creation, incremental metadata structures, and cached statistics, expanding supported shard counts to the million‑level and node counts to the thousand‑level, with index creation times under 5 seconds.

Open‑Source Contributions and Future Plans

Tencent’s ES team has contributed over twenty PRs to the upstream project, with a 70% merge rate and six active contributors. Future work focuses on self‑healing capabilities, deeper analytics performance, and a storage‑compute separation architecture built on the CFS shared file system to further cut costs and boost performance.
