Mastering Web Live Streaming: From Capture to Playback
This article walks through the complete web live‑streaming workflow—covering capture sources, video/audio processing, H.264 encoding, RTMP push, transcoding, CDN distribution, HLS playback, pseudo‑fullscreen tricks, adaptive sizing, playback detection, and a custom SDK—providing practical guidance for building robust live video experiences.
In recent years live streaming has surged, and QQ Music has integrated live streaming capabilities for concerts, hosts, and celebrities. This article introduces the web side of the live streaming process.
Live Streaming Status
Many apps have entered the live streaming industry, showing how popular it is.
The video flow is divided into three stages: generation, transmission, and presentation.
Generation Stage
The generation stage includes audio‑video capture and processing.
Capture sources are classified into five categories:
TV‑based live streams using pre‑recorded video sources such as dramas and variety shows.
Concert live streams captured by cameras and microphones.
Game live streams where the host records both webcam footage and game screen via OBS or similar software.
Outdoor live streams using a phone’s camera and microphone.
Mobile game live streams recorded via emulators on Android or screen‑recording tools on iOS.
DIY Video Processing
Captured video often needs post‑processing, including:
Beauty, smoothing, filters, and special effects, which involve complex operations such as face detection and video compositing.
Watermarking for copyright protection, which can also be handled during transcoding.
DIY Audio Processing
Noise reduction for noisy recordings.
Mixing and voice‑changing effects for certain live scenarios.
Encoding Processing
Encoding is essentially video compression.
Example calculation: a 6‑second 720p raw video is 474 MB; over a 10 Mbps link it would take about 6 minutes to transmit, which is unacceptable for live streaming. Using H.264 compression reduces the size to under 1 MB, cutting transmission time to under 800 ms.
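The arithmetic behind that estimate can be sketched as follows. The frame size and link speed here are illustrative assumptions (720p at 24 bits per pixel, 30 fps, a 10 Mbps link), not measurements:

```javascript
// Rough transfer-time estimate for raw vs. H.264-compressed video.
function transferSeconds(bytes, linkMbps) {
  return (bytes * 8) / (linkMbps * 1e6);
}

const rawBytes = 1280 * 720 * 3 * 30 * 6;  // ~497 MB for 6 s of raw frames
const compressedBytes = 1 * 1024 * 1024;   // ~1 MB after H.264 compression

console.log(transferSeconds(rawBytes, 10));        // hundreds of seconds
console.log(transferSeconds(compressedBytes, 10)); // well under a second
```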
H.264 builds on the MPEG standards and uses three frame types: I (intra‑coded), P (predictive), and B (bidirectional). I‑frames carry a complete image; P‑frames store differences relative to earlier frames, and B‑frames store differences relative to both earlier and later frames.
Compression removes redundancy in two ways:
Intra‑frame compression: exploiting spatial redundancy (neighboring pixels within a frame are correlated) and visual redundancy (the human eye is insensitive to certain details).
Inter‑frame compression: exploiting temporal redundancy (consecutive frames are largely similar), so only the changes between frames need to be stored.
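A toy illustration of why inter‑frame prediction pays off: store only the pixels that changed between two frames. Real codecs work on motion‑compensated blocks rather than per‑pixel diffs, so this is a sketch of the idea, not of H.264 itself:

```javascript
// Encode the next frame as a sparse diff against the previous one.
function diffFrame(prev, next) {
  const delta = [];
  for (let i = 0; i < next.length; i++) {
    if (next[i] !== prev[i]) delta.push([i, next[i]]);
  }
  return delta;
}

// Reconstruct the next frame from the previous frame plus the diff.
function applyDiff(prev, delta) {
  const out = prev.slice();
  for (const [i, v] of delta) out[i] = v;
  return out;
}

const frameA = [10, 10, 10, 10, 10, 10];
const frameB = [10, 10, 99, 10, 10, 10]; // only one "pixel" changed

const delta = diffFrame(frameA, frameB);
console.log(delta.length);             // 1 — far smaller than the full frame
console.log(applyDiff(frameA, delta)); // [ 10, 10, 99, 10, 10, 10 ]
```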
Audio is also compressed, typically using AAC.
In summary, the generation stage flows from capture, through audio/video processing (beauty effects, watermarking, noise reduction), to encoding (H.264 video, AAC audio).
Transmission Stage
The transmission stage moves video from the capture side to the user side.
Push Stream
Push (upstream) uses the RTMP protocol, which features:
A proprietary Adobe protocol; in browsers it requires Flash.
Long‑lived connections with minimal handshake overhead, keeping latency under 2 seconds.
Broad CDN support.
Transcoding
After pushing to the server, Tencent Cloud provides multiple transcoding formats to ensure compatibility across devices: RTMP for native mobile, FLV for PC, and HLS for mobile H5.
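A common compatibility shim picks the playback format from the environment. A minimal sketch, where the user‑agent patterns and format labels are illustrative assumptions (native apps would consume the RTMP stream directly):

```javascript
// Pick a stream format by runtime environment:
// HLS for mobile H5 pages, FLV for PC browsers.
function pickFormat(userAgent) {
  const isMobile = /iPhone|iPad|iPod|Android/i.test(userAgent);
  return isMobile ? 'hls' : 'flv';
}

console.log(pickFormat('Mozilla/5.0 (iPhone; CPU iPhone OS 12_0 like Mac OS X)')); // "hls"
console.log(pickFormat('Mozilla/5.0 (Windows NT 10.0; Win64; x64)'));              // "flv"
```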
Distribution
Distribution (downstream) delivers the stream to users, involving network quality monitoring, optimization, and content moderation.
Presentation Stage
After distribution, the video is played on the user side.
Playback Protocols
Mobile H5 live streaming mainly uses the HLS protocol, originally developed by Apple for iOS devices and later supported on Android 3.0+.
<code>#EXTM3U                  m3u playlist header; must be the first line
#EXT-X-ALLOW-CACHE       whether segments may be cached; here caching is disallowed
#EXT-X-MEDIA-SEQUENCE    sequence number of the first TS segment to be requested
#EXT-X-TARGETDURATION    maximum duration of any TS segment; here 9 s
#EXTINF                  per-segment info, such as its duration</code>
HLS playback flow:
The player requests the .m3u8 playlist.
The server returns the playlist, which defines segment length and count, influencing latency.
The client parses the playlist and sequentially fetches TS segments for playback.
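The playlist format is line‑oriented and straightforward to parse. A minimal sketch that extracts segment URIs and durations from an m3u8 body (production players such as hls.js handle many more tags and edge cases):

```javascript
// Parse the segment list out of a simple HLS media playlist.
// Each #EXTINF line gives the next segment's duration; the following
// non-comment line is that segment's URI.
function parseSegments(m3u8Text) {
  const lines = m3u8Text.split('\n').map(l => l.trim()).filter(Boolean);
  const segments = [];
  let duration = null;
  for (const line of lines) {
    if (line.startsWith('#EXTINF:')) {
      duration = parseFloat(line.slice('#EXTINF:'.length));
    } else if (!line.startsWith('#') && duration !== null) {
      segments.push({ uri: line, duration });
      duration = null;
    }
  }
  return segments;
}

const playlist = [
  '#EXTM3U',
  '#EXT-X-TARGETDURATION:9',
  '#EXTINF:8.33,',
  'segment-001.ts',
  '#EXTINF:8.33,',
  'segment-002.ts',
].join('\n');

console.log(parseSegments(playlist).length); // 2
```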
Common H5 Playback Issues
Pseudo‑Full‑Screen
When true fullscreen is not controllable, CSS can be used to simulate fullscreen by stretching the video element.
On iOS, the video tag must have the playsinline attribute (or webkit-playsinline for iOS 9–10) to avoid the system player taking over. Android WebView (X5 kernel) supports inline playback, but QQ Browser may still hijack the stream unless the page is whitelisted.
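Setting all the attribute variants covers the different environments. A small sketch; x5-video-player-type="h5" is the commonly cited hint for Tencent's X5 kernel, but treat the exact X5 attribute set as an assumption that varies by browser version:

```javascript
// Configure a <video> element for inline playback on mobile:
// playsinline targets iOS 10+, webkit-playsinline targets iOS 9,
// and x5-video-player-type="h5" hints the Android X5 kernel.
function applyInlinePlayback(video) {
  video.setAttribute('playsinline', '');
  video.setAttribute('webkit-playsinline', '');
  video.setAttribute('x5-video-player-type', 'h5');
  return video;
}

// Usage: applyInlinePlayback(document.querySelector('#js_video'));
```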
Adaptive Full‑Screen
Videos of varying dimensions need to adapt to fixed screen sizes. The steps are:
Calculate video width and height.
Compare video aspect ratio with screen aspect ratio.
If the video is wider, scale based on screen width; otherwise, scale based on screen height.
The scaled video is then centered by offsetting it within the screen.
Playback Continuity Detection
Because Android's canplay, canplaythrough, and playing events fire inconsistently across devices, a common hack is to monitor timeupdate progress: if the playback timestamp does not advance over a period, the stream may be stalled.
<code>var playtimeupdate = 0;     // wall-clock time of the last timeupdate event
var lastplaytimeupdate = 0; // value seen at the previous check

$('#js_video').on('timeupdate', function () {
    playtimeupdate = new Date().getTime();
});

// Every 3 s, detect whether playback has stalled: if no timeupdate
// event has fired since the last check, flag the stream as abnormal.
setInterval(function () {
    if (lastplaytimeupdate && lastplaytimeupdate === playtimeupdate) {
        console.log("abnormal");
    } else {
        lastplaytimeupdate = playtimeupdate;
    }
}, 3000);</code>
Custom Full‑Screen UI on PC
When entering fullscreen, browsers assign a very high z-index to the video element, which can obscure custom UI controls. The solution is to request fullscreen on the video's parent container instead of the video element itself.
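A minimal sketch that requests fullscreen on the wrapper element rather than the video, with vendor‑prefixed fallbacks for older browsers (the container selector is illustrative):

```javascript
// Request fullscreen on the video's container so that custom controls
// positioned inside it stay visible above the video.
function requestContainerFullscreen(container) {
  const fn =
    container.requestFullscreen ||
    container.webkitRequestFullscreen ||
    container.mozRequestFullScreen ||
    container.msRequestFullscreen;
  if (fn) {
    fn.call(container);
    return true;
  }
  return false; // Fullscreen API unavailable
}

// Usage: wrap the <video> and the custom controls in one element, then:
// requestContainerFullscreen(document.querySelector('.player'));
```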
Self‑Developed SDK
QQ Music has released a lightweight, highly compatible video SDK for both H5 and PC. The H5 SDK already supports VOD features such as backward/forward, progress dragging, and fullscreen, with live streaming integration in progress. The PC SDK adds volume control, custom fullscreen UI, and quality switching.
Demo pages:
H5 demo: https://y.qq.com/m/demo/demoh5player.html
PC demo: https://y.qq.com/m/demo/demopcplayer.html
H5 SDK is based on Zepto ( http://y.gtimg.cn/music/qmv/qmv.h5.js ), while the PC SDK uses jQuery ( http://y.gtimg.cn/music/qmv/qmv.pc.js ). Example initialization code:
<code>var params = {
title: "QMV Playlist",
container: ".js_videoplayer",
source: ["x00248tvbgv", "t0024928s1j", "g00133xte6r"],
quality: true,
autoplay: false,
mode: 0, // 0 = sequential, 1 = random
useConnectionPlay: false
};
var qmv = new QMV(params);
</code>
References
HLS introduction: https://github.com/ossrs/srs/wiki/v2
Difference between intra‑frame and inter‑frame compression: https://www.zhihu.com/question/20237091
HTML5 video live streaming basics: http://km.oa.com/group/19674/articles/show/266140
Deep dive into the HTML5 video tag on mobile: http://km.oa.com/articles/show/261332
QQ Music Frontend Team