Scaling Real‑Time Audio/Video for 1000+ Users: Volcano Engine RTC Insights
Volcano Engine RTC shares its best‑practice solutions for large‑scale interactive scenarios—including thousand‑person audio chat, live‑stream co‑hosting, and cloud rendering—detailing server‑side stream selection, intelligent mixing, and edge‑node integration to reduce latency, bandwidth, and computational load while enhancing user experience.
Overview
Volcano Engine RTC, built on ByteDance's internal RTC platform, provides real‑time communication services to more than 40 business products, including high‑traffic apps such as Douyin. It is used in entertainment, online education, gaming voice, and enterprise communication, with monthly usage reaching billions of minutes.
Thousand‑Person Chat
The thousand‑person chat scenario requires a single channel to support thousands of participants, each with a microphone. If all n hosts publish audio and each subscribes to every other host, the channel carries n(n − 1) subscriptions, so processing cost grows as O(n²) and bandwidth and memory usage become heavy on both server and client.
Server‑Side Stream Selection
To reduce load, Volcano Engine RTC moves the stream‑selection logic to the server. The server selects a small set of m streams (3 ≤ m ≤ 10) from the n incoming streams and forwards only those m streams to each client. Because m is bounded by a small constant, server processing drops from O(n²) to O(n), and client bandwidth and memory demands fall significantly.
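The selection step described above can be sketched as a top-m pick over recent audio levels. This is only an illustration under stated assumptions: the article does not say which metric or algorithm Volcano Engine RTC uses, so ranking by loudness, the `levels` mapping, and the function names `select_streams` and `forwarded_stream_count` are all hypothetical.

```python
import heapq
from typing import Dict, List


def select_streams(levels: Dict[str, float], m: int = 3) -> List[str]:
    """Return the IDs of the m loudest streams, loudest first.

    `levels` maps a stream ID to a recent audio level (higher = louder).
    Ranking by loudness is an assumption; the article does not name the metric.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    # heapq.nlargest costs O(n log m); the stream ID breaks ties deterministically.
    top = heapq.nlargest(m, levels.items(), key=lambda kv: (kv[1], kv[0]))
    return [stream_id for stream_id, _ in top]


def forwarded_stream_count(n: int, m: int) -> int:
    """Upper bound on streams forwarded per interval: at most m to each of n clients."""
    return n * min(m, n)
```

With 1,000 participants and m = 10, `forwarded_stream_count(1000, 10)` gives an upper bound of 10,000 forwarded streams, compared with 999,000 subscriptions when every participant subscribes to every other.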
Video Stream Handling
Audio and video subscription are separated. Clients can subscribe to the video streams of the loudest speakers, optionally displaying up to ten video streams based on volume, which avoids unnecessary video processing while preserving the visual experience.
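On the client side, following the loudest speakers amounts to periodically recomputing which video streams are wanted and changing only the difference. The sketch below assumes the client receives per-user volume reports; `plan_video_subscriptions` and its parameters are hypothetical names, not part of any Volcano Engine RTC SDK.

```python
from typing import Dict, Set, Tuple

MAX_VIDEO_STREAMS = 10  # the article's upper bound on displayed video streams


def plan_video_subscriptions(
    volumes: Dict[str, float],
    current: Set[str],
    limit: int = MAX_VIDEO_STREAMS,
) -> Tuple[Set[str], Set[str]]:
    """Return (to_subscribe, to_unsubscribe) so that video follows the loudest users.

    `volumes` maps a user ID to a reported volume; `current` is the set of
    users whose video is already subscribed.
    """
    # Sort loudest first; the user ID breaks ties so the plan is stable.
    ranked = sorted(volumes, key=lambda uid: (-volumes[uid], uid))
    wanted = set(ranked[:limit])
    return wanted - current, current - wanted
```

Returning a difference rather than the full wanted set keeps already-visible video streams untouched when the ranking changes only slightly.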
Live‑Stream Co‑Hosting
Live co‑hosting combines audio/video from multiple hosts into a single stream for broadcast. Traditional server‑side mixing and transcoding cause black screens, stutters, and quality loss due to multiple encode/decode cycles.
Intelligent Mixing
Volcano Engine adopts a hybrid approach: when client devices have sufficient performance and network quality, mixing is performed on the client; otherwise, server‑side mixing is used. This reduces black frames and improves visual clarity by 20‑40%.
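The hybrid decision can be pictured as a simple predicate over device and network measurements. The article gives no thresholds or metrics, so every field in `ClientStats` and every default value below is an illustrative assumption rather than Volcano Engine's actual policy.

```python
from dataclasses import dataclass
from enum import Enum


class MixMode(Enum):
    CLIENT = "client"
    SERVER = "server"


@dataclass(frozen=True)
class ClientStats:
    cpu_headroom: float  # fraction of CPU still free, 0.0 to 1.0 (assumed metric)
    uplink_kbps: float   # measured uplink bandwidth (assumed metric)
    packet_loss: float   # fraction of packets lost, 0.0 to 1.0 (assumed metric)


def choose_mix_mode(
    stats: ClientStats,
    min_cpu_headroom: float = 0.3,    # hypothetical threshold
    min_uplink_kbps: float = 2000.0,  # hypothetical threshold
    max_packet_loss: float = 0.05,    # hypothetical threshold
) -> MixMode:
    """Mix on the client only when both device and network look healthy."""
    device_ok = stats.cpu_headroom >= min_cpu_headroom
    network_ok = (
        stats.uplink_kbps >= min_uplink_kbps
        and stats.packet_loss <= max_packet_loss
    )
    return MixMode.CLIENT if device_ok and network_ok else MixMode.SERVER
```

Falling back to server-side mixing whenever either check fails matches the article's description: client mixing is preferred, and the server remains the safe default.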
Cloud Rendering
Complex rendering tasks such as high‑quality games or 3D models demand substantial GPU resources, high bitrate, ultra‑low latency (<100 ms), and high reliability. Traditional transmission uses either P2P (low latency, low reliability) or RTC (high reliability, higher latency).
Improved Transmission
By integrating RTC edge nodes with cloud‑rendering servers, the transmission delay between the RTC and rendering services is eliminated, achieving latency of around 80 ms for cloud gaming and cloud phone, and about 135 ms for cloud effects.
Conclusion
The article presents best practices for Volcano Engine RTC in three scenarios: thousand‑person chat (server‑side audio selection and separated audio/video subscription), live‑stream co‑hosting (intelligent client/server mixing to avoid black screens and improve quality), and cloud rendering (edge‑node integration to meet ultra‑low latency and high reliability requirements).
Volcano Engine Developer Services
The Volcano Engine Developer Community, Volcano Engine's TOD community, connects the platform with developers, offering cutting-edge tech content and diverse events, nurturing a vibrant developer culture, and co-building an open-source ecosystem.
