Building a Million-User Live-Stream Danmaku System: Bandwidth, Latency, and Reliability Solutions

To support Southeast Asian live-streaming, we designed a custom danmaku system capable of handling up to a million concurrent users per room, tackling bandwidth pressure, weak-network latency, and reliability by employing HTTP compression, response simplification, short-polling, local caching, and lock-free ring buffers.

Java High-Performance Architecture

Background

To better support Southeast Asian live streaming, the product added a danmaku feature. The first version was built on Tencent Cloud but suffered from stutter and low danmaku density, which prompted us to build a custom danmaku system capable of supporting up to one million concurrent users per room.

Problem Analysis

Bandwidth pressure: delivering 15 danmaku messages every 3 seconds, plus HTTP headers, exceeds 3 KB per response. With one million users in a room this amounts to roughly 8 Gbps of traffic, while the available bandwidth is only 10 Gbps.

Weak networks: unstable connections cause danmaku to stutter and get lost.

Performance and reliability: with a million users online, QPS can exceed 300k, so the service must hold up during peak events such as Double Eleven.
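
As a rough check on these figures, the arithmetic can be sketched in Java. The 3-second request interval and the 3,000-byte response size are assumptions read off the numbers above, and `LoadEstimate` is a hypothetical helper, not part of the system described.

```java
// Back-of-the-envelope load estimate (hypothetical helper, assumed inputs).
final class LoadEstimate {
    /** Outbound traffic in Gbps when every user receives one response per interval. */
    static double gbps(long users, long bytesPerResponse, double intervalSeconds) {
        return users * bytesPerResponse * 8 / intervalSeconds / 1e9;
    }

    /** Requests per second when every user sends one request per interval. */
    static double qps(long users, double intervalSeconds) {
        return users / intervalSeconds;
    }

    public static void main(String[] args) {
        // Assumed: 1,000,000 users, 3,000 bytes per response, one request every 3 s.
        System.out.println("Gbps: " + gbps(1_000_000L, 3_000L, 3.0));
        System.out.println("QPS:  " + qps(1_000_000L, 3.0));
    }
}
```

Under these assumptions the two headline numbers come from the same inputs: the bandwidth scales with response size, the QPS only with the request interval.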

Bandwidth Optimization

We reduced bandwidth consumption with the following measures:

Enable HTTP compression. Gzip can shrink responses by more than 40%, about 4‑5% better than deflate.

Simplify response structure.

Optimize content ordering: placing similar strings and numbers together improves gzip compression.

Frequency control: add a request-interval parameter so the server can limit how often clients poll, and space requests further apart during low-traffic periods.
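
The compression measure can be tried out with the JDK's built-in gzip streams. This is a minimal sketch, not the production setup (in practice compression is usually switched on in the web server or gateway); the payload shape with `interval`, `uid`, `nick`, and `text` fields is purely illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipSketch {
    /** Compresses a byte array with gzip. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decompresses a gzip byte array. */
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Builds an illustrative JSON-like response; the field names are assumptions. */
    static String samplePayload(int count) {
        StringBuilder sb = new StringBuilder("{\"interval\":3,\"list\":[");
        for (int i = 0; i < count; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"uid\":").append(100000 + i)
              .append(",\"nick\":\"user").append(i)
              .append("\",\"text\":\"hello from viewer ").append(i).append("\"}");
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        byte[] raw = samplePayload(15).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw=" + raw.length + " bytes, gzip=" + packed.length + " bytes");
    }
}
```

The saving on real traffic depends on the content, so the ratio printed here should not be read as a measurement of the system above.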

Danmaku Stutter and Loss Analysis

The key design decision was choosing a delivery mechanism: push vs pull.

Long Polling via AJAX

The client opens an AJAX request that the server holds until an event occurs, optionally enabling HTTP keep‑alive to save handshake time. Advantages: reduced polling frequency, low latency, and good browser compatibility. Drawback: the server must maintain many open connections.

WebSockets

WebSocket provides true bidirectional communication with minimal header overhead (2‑10 bytes for server‑to‑client frames, plus 4 bytes mask for client‑to‑server), better real‑time performance, and supports binary frames and compression.

However, on weak networks long‑lived TCP connections drop frequently, and neither long polling nor WebSocket detects a dead connection quickly. TCP keep‑alive probes (keepalive_probes, keepalive_time, keepalive_intvl) help but are insufficient under unstable conditions.
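
To see why keep‑alive alone reacts slowly, the approximate detection time for an idle dead connection can be estimated as the idle time plus the probe interval multiplied by the probe count. The sketch below uses the Linux defaults for `tcp_keepalive_time`, `tcp_keepalive_intvl`, and `tcp_keepalive_probes` (7200 s, 75 s, 9); whether the original deployment used these values is not stated, and `KeepAliveEstimate` is a hypothetical helper.

```java
// Rough estimate of how long TCP keep-alive needs to declare an idle peer dead.
final class KeepAliveEstimate {
    /**
     * @param idleSeconds     idle time before the first probe (keepalive_time)
     * @param intervalSeconds gap between unanswered probes (keepalive_intvl)
     * @param probes          unanswered probes before giving up (keepalive_probes)
     */
    static long detectionSeconds(long idleSeconds, long intervalSeconds, int probes) {
        return idleSeconds + intervalSeconds * probes;
    }

    public static void main(String[] args) {
        // Linux defaults: 7200 s idle, 75 s interval, 9 probes.
        System.out.println(detectionSeconds(7200, 75, 9) + " s with Linux defaults");
        // A hypothetical aggressive tuning still takes tens of seconds.
        System.out.println(detectionSeconds(30, 5, 3) + " s with an aggressive tuning");
    }
}
```

Even a tightly tuned configuration leaves a window far longer than a viewer will tolerate, and shorter settings multiply probe traffic across every open connection.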

Given the environment, neither long polling nor WebSocket was suitable, so we adopted a short‑polling approach for danmaku delivery.
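
A client-side short‑polling loop might look like the sketch below. Everything in it is an assumption for illustration: the `PollResponse` shape, the cursor-based resume, and the server-supplied `nextIntervalMillis` are hypothetical names, and the transport is a plain function so the sketch runs without a real HTTP endpoint.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.LongFunction;

final class ShortPollingSketch {
    /** Hypothetical response: new messages, a resume cursor, and the next polling interval. */
    static final class PollResponse {
        final List<String> messages;
        final long cursor;
        final long nextIntervalMillis;

        PollResponse(List<String> messages, long cursor, long nextIntervalMillis) {
            this.messages = messages;
            this.cursor = cursor;
            this.nextIntervalMillis = nextIntervalMillis;
        }
    }

    static final class Poller {
        private final LongFunction<PollResponse> transport; // stands in for an HTTP GET
        private long cursor;
        private long intervalMillis;

        Poller(LongFunction<PollResponse> transport, long initialIntervalMillis) {
            this.transport = transport;
            this.intervalMillis = initialIntervalMillis;
        }

        /** Performs one independent request; a failure keeps the cursor so the next poll resumes. */
        List<String> pollOnce() {
            PollResponse response;
            try {
                response = transport.apply(cursor);
            } catch (RuntimeException e) {
                return Collections.emptyList(); // no connection state to repair, just retry later
            }
            cursor = Math.max(cursor, response.cursor);
            intervalMillis = response.nextIntervalMillis; // server-side frequency control
            return response.messages;
        }

        long intervalMillis() {
            return intervalMillis;
        }
    }
}
```

A caller would invoke `pollOnce()` and then wait `intervalMillis()` before the next request. Because each request stands alone, a dropped request on a weak network costs one interval rather than a reconnect.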

Reliability and Performance

We split the service into two parts: a sending side handling complex logic and a pulling side handling high‑frequency read requests. This prevents the high‑QPS pull service from overwhelming the send service and facilitates independent scaling.

On the pull side we introduced a local cache. The service periodically RPC‑calls the danmaku service to refresh an in‑memory buffer, allowing subsequent requests to read directly from memory, drastically reducing latency and external dependency impact.
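
The local cache idea can be sketched as a background task that swaps in a fresh snapshot while request threads only read memory. This is a simplified illustration: `LocalDanmakuCache` and its `Supplier`-based RPC stand-in are hypothetical, and keeping the last good snapshot when a refresh fails is an assumption about the desired behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class LocalDanmakuCache {
    private final Supplier<List<String>> rpc; // stands in for the call to the danmaku service
    private volatile List<String> snapshot = Collections.emptyList();

    LocalDanmakuCache(Supplier<List<String>> rpc) {
        this.rpc = rpc;
    }

    /** Fetches fresh data; on failure the previous snapshot keeps being served. */
    void refresh() {
        try {
            snapshot = Collections.unmodifiableList(new ArrayList<>(rpc.get()));
        } catch (RuntimeException e) {
            // Upstream is unavailable: keep the last good snapshot.
        }
    }

    /** Hot path: a single volatile read, no remote call. */
    List<String> read() {
        return snapshot;
    }

    /** Starts a periodic refresh on a daemon thread and returns the scheduler for shutdown. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "danmaku-cache-refresh");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleAtFixedRate(this::refresh, 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

With this shape, the number of upstream calls depends on the refresh period and the number of pull-service instances, not on how many viewers are polling.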

Data is sharded by time into a ring buffer that retains only the last 60 seconds. The buffer stores timestamps and their associated danmaku lists, enabling fast, ordered reads without locks: writes are single‑threaded, and reads only access the most recent 30 seconds of data, so a reader never touches the slot the writer is about to overwrite.

On the sending side, we apply rate limiting to discard excess danmaku and use graceful degradation for optional features (avatar fetching, profanity filtering) so core delivery remains unaffected.
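
The time‑sharded ring buffer could be sketched as follows. The one‑slot‑per‑second layout and the 60 s and 30 s windows come from the description above; the class name, the immutable `Slot`, and the use of `AtomicReferenceArray` to publish each slot are assumptions made to keep the sketch safe without locks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReferenceArray;

final class DanmakuRingBuffer {
    static final int SLOTS = 60;       // one slot per second, last 60 s retained
    static final int READ_WINDOW = 30; // readers look back at most 30 s

    /** Immutable bucket: the second it belongs to and that second's danmaku. */
    static final class Slot {
        final long second;
        final List<String> messages;

        Slot(long second, List<String> messages) {
            this.second = second;
            this.messages = Collections.unmodifiableList(new ArrayList<>(messages));
        }
    }

    private final AtomicReferenceArray<Slot> slots = new AtomicReferenceArray<>(SLOTS);

    private static int index(long second) {
        return (int) Math.floorMod(second, (long) SLOTS);
    }

    /** Called by the single writer thread: publishes one second of danmaku. */
    void publish(long second, List<String> messages) {
        slots.set(index(second), new Slot(second, messages));
    }

    /** Called by many readers: returns danmaku newer than lastSeenSecond, in time order. */
    List<String> readSince(long lastSeenSecond, long nowSecond) {
        long from = Math.max(lastSeenSecond + 1, nowSecond - READ_WINDOW + 1);
        List<String> out = new ArrayList<>();
        for (long s = from; s <= nowSecond; s++) {
            Slot slot = slots.get(index(s));
            // The timestamp check skips slots that are empty or hold an older lap.
            if (slot != null && slot.second == s) {
                out.addAll(slot.messages);
            }
        }
        return out;
    }
}
```

Because the writer reuses a slot only 60 seconds after it was filled and readers stay within the newest 30 seconds, the slot being overwritten is always outside the read window.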

Summary

During the Double Twelve event, even when Redis experienced a brief outage, the system supported 700 k concurrent users with high efficiency and stability, meeting the target objectives.

Written by

Java High-Performance Architecture

Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.