Scaling Tencent Meeting Video Stream Quality Analysis with Tencent Cloud Elasticsearch

Facing explosive user growth and a flood of video‑stream quality data, Tencent Meeting migrated its custom Lucene‑based analysis engine to Tencent Cloud Elasticsearch. The move delivered more than 1 million writes per second and automatic sharding, cut latency from hours to seconds, and sustained 99.99% availability, showing that the service is a high‑performance, scalable option for large‑scale video conferencing.

Tencent Cloud Developer

Jul 21, 2020

Tencent Meeting was launched at the end of December 2019 and reached over 10 million daily active users within two months. It has been widely used for pandemic‑control meetings, remote work, and online teaching, becoming a crucial remote communication tool during the COVID‑19 period.

In the first 100 days, the product iterated through 20 versions, continuously expanding to meet growing user demand and topping the free charts in the China App Store.

The massive increase in video‑stream quality reporting data put enormous pressure on the video‑stream quality analysis system. To provide efficient, stable services for all users and to guarantee large‑scale meetings, the analysis team partnered with the Tencent Cloud Elasticsearch (ES) team to explore a cloud‑based data analysis engine.

Rapid user growth forced daily resource expansion: from January 29, cloud hosts were scaled by nearly 15,000 per day, totaling over 100,000 hosts and more than one million CPU cores within eight days.

Video‑stream quality issues such as stutter and audio‑video desynchronization arise from many factors, including network packet loss and unstable connections. To pinpoint problems quickly, the operations team needs extensive runtime data (network type, bitrate, packet loss, CPU/memory usage, OS version, client version, etc.).

Beyond real‑time reporting, multi‑dimensional analysis can uncover anomalies such as regional stutter or version‑specific bugs, presented via dashboards or reports.
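
As an illustration of this kind of multi‑dimensional analysis, the sketch below builds an Elasticsearch search body that buckets quality reports by region and then by client version, averaging two metrics per bucket. It uses the standard `range` query and the `terms` and `avg` aggregations. The field names (`timestamp`, `region`, `client_version`, `packet_loss_pct`, `stutter_ms`) are assumptions made for the example; the article does not describe the actual index mapping.

```python
def build_stutter_breakdown_query(start: str, end: str, top_n: int = 20) -> dict:
    """Build a search body that breaks quality metrics down by region and version.

    All field names are hypothetical; adapt them to the real index mapping.
    """
    return {
        "size": 0,  # return aggregation buckets only, no individual documents
        "query": {
            "range": {"timestamp": {"gte": start, "lt": end}}
        },
        "aggs": {
            "by_region": {
                "terms": {"field": "region", "size": top_n},
                "aggs": {
                    "by_client_version": {
                        "terms": {"field": "client_version", "size": top_n},
                        "aggs": {
                            "avg_packet_loss": {"avg": {"field": "packet_loss_pct"}},
                            "avg_stutter_ms": {"avg": {"field": "stutter_ms"}},
                        },
                    }
                },
            }
        },
    }


# Example: break down the last 15 minutes of reports.
body = build_stutter_breakdown_query("now-15m", "now")
```

A dashboard could send this body to the search endpoint and flag any region or version bucket whose averages stand out from the rest.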

High data‑write throughput exposed a critical bottleneck: the original self‑developed search engine, built on Lucene, could not scale quickly enough under massive write loads.

Selection of Elasticsearch and Technical Considerations

To guarantee efficient, stable communication for all users and rapid issue diagnosis, the team decided to migrate from the self‑built system to Tencent Cloud ES.

Key evaluation criteria included:

1. High Performance

Video‑stream data writes peak at more than 1 million per second. Under such pressure, ES must sustain stable read/write services.

Analysis of the ES translog revealed that rollGeneration and flush operations were mutually exclusive, causing an average rollGeneration latency of 570 ms.

The optimization reworked the flush performed before the old translog is closed, which eliminated the lock contention and improved write performance by more than 20%.

2. Scalability

The original engine required manual sharding and could not auto‑migrate data, making rapid scaling impossible during the pandemic.

Tencent Cloud ES supports automatic sharding and replica management, and can scale to hundreds or thousands of nodes with seamless elastic expansion.

To avoid hotspot creation on newly added nodes, a custom allocation decider was implemented to evenly distribute shards across all nodes.
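
The hotspot problem and one way to avoid it can be sketched in a few lines. This is a simplified model, not the custom decider itself, whose internals the article does not describe. It assumes a balancer that always picks the node holding the fewest shards, and it shows how a per‑node cap on shards of a single index keeps a newly added, empty node from receiving every shard of a new index. The function name and node names are hypothetical.

```python
import math


def allocate_index_shards(existing: dict[str, int], shard_count: int,
                          enforce_cap: bool = True) -> dict[str, int]:
    """Return how many shards of one new index each node receives.

    existing maps node name -> shards already on that node.
    With enforce_cap, no node takes more than ceil(shard_count / node_count)
    shards of this index; otherwise the least-loaded node always wins.
    """
    cap = math.ceil(shard_count / len(existing)) if enforce_cap else shard_count
    assigned = {node: 0 for node in existing}
    for _ in range(shard_count):
        # Only nodes still under the per-index cap are eligible.
        candidates = [n for n in existing if assigned[n] < cap]
        # Pick the least-loaded eligible node; break ties by name.
        target = min(candidates, key=lambda n: (existing[n] + assigned[n], n))
        assigned[target] += 1
    return assigned


cluster = {"old-1": 100, "old-2": 100, "new-1": 0}
naive = allocate_index_shards(cluster, 6, enforce_cap=False)
capped = allocate_index_shards(cluster, 6)
```

In this model the naive run places all six shards on `new-1`, which would then absorb all writes to the new index, while the capped run places two on each node. Stock Elasticsearch offers a related control in the `index.routing.allocation.total_shards_per_node` index setting.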

3. Stability

After migration, write throughput grew from 50 k/s to over 1 M/s. The system handles sudden traffic spikes without service degradation.

Memory‑based leaky‑bucket throttling was introduced to protect the coordinating node’s ingress layer, using a cosine‑based smooth throttling curve to gradually reduce request rates as memory usage rises.
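
A minimal sketch of how a cosine curve can drive a leaky bucket follows. It is a generic illustration, not Tencent Cloud's implementation. The thresholds (`low`, `high`), the bucket parameters, and the choice to scale the leak rate by the curve are all assumptions. Below the low threshold the bucket drains at full speed; between the thresholds the drain rate follows half a cosine wave down to zero, so admission tightens gradually rather than abruptly.

```python
import math


def accept_ratio(mem_usage: float, low: float = 0.6, high: float = 0.9) -> float:
    """Map memory usage (0.0-1.0) to a throttle factor between 1.0 and 0.0."""
    if mem_usage <= low:
        return 1.0
    if mem_usage >= high:
        return 0.0
    progress = (mem_usage - low) / (high - low)
    # Half cosine wave: flat near both thresholds, steepest in the middle.
    return 0.5 * (1.0 + math.cos(math.pi * progress))


class LeakyBucket:
    """Leaky bucket whose drain rate shrinks as memory pressure rises."""

    def __init__(self, capacity: float, base_leak_rate: float):
        self.capacity = capacity
        self.base_leak_rate = base_leak_rate  # units drained per second
        self.level = 0.0
        self.last = 0.0

    def try_admit(self, now: float, mem_usage: float, cost: float = 1.0) -> bool:
        """Admit a request if the bucket has room; now is a timestamp in seconds."""
        leak_rate = self.base_leak_rate * accept_ratio(mem_usage)
        self.level = max(0.0, self.level - (now - self.last) * leak_rate)
        self.last = now
        if self.level + cost > self.capacity:
            return False  # bucket full: reject so the node can recover
        self.level += cost
        return True
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real ingress layer would read a monotonic clock and the node's current heap usage instead.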

4. Ease of Use

The solution needed to be deployed within a week with minimal code changes. Logstash was used as the ingestion layer, leveraging its rich plugin ecosystem to replace the custom Kafka‑based pipeline with near‑zero development effort.

SDKs for multiple languages enabled rapid integration with both front‑end and back‑end components, allowing the entire migration to be completed in a single day.
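
For components that write to ES directly rather than through Logstash, the integration largely comes down to batching documents for the bulk API. The sketch below builds the newline‑delimited JSON body that the `_bulk` endpoint expects, with one action line followed by one source line per document. The index name and document fields are hypothetical, and the article does not say which client libraries or batch sizes the team used.

```python
import json


def build_bulk_body(index_name: str, docs: list[dict]) -> str:
    """Serialize documents into an NDJSON body for the Elasticsearch _bulk API."""
    if not docs:
        return ""
    lines = []
    for doc in docs:
        # Action line: index this document into index_name with an auto-generated ID.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


reports = [
    {"region": "gz", "client_version": "1.2.0", "packet_loss_pct": 0.4},
    {"region": "sh", "client_version": "1.2.1", "packet_loss_pct": 2.1},
]
payload = build_bulk_body("meeting-quality", reports)
```

The resulting string is sent as the request body with the `application/x-ndjson` content type; the official clients provide bulk helpers that do the same serialization.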

Since the full switch to the ELK architecture, the system has sustained more than 1 million writes per second, cut latency from hours to seconds, and maintained 99.99% service availability.

Overall, the case demonstrates how Tencent Cloud Elasticsearch provides high‑performance, scalable, and stable data‑search capabilities for large‑scale video‑conference scenarios, offering a reference solution for the industry.

