Mastering WebRTC: From RTMP/HLS Basics to Real-Time Audio‑Video Communication
This article explains common audio‑video streaming protocols such as RTMP and HLS and compares their use cases, then dives into WebRTC fundamentals: device detection, media capture, recording, connection setup, codec considerations, and displaying remote streams, providing a practical guide to building real‑time communication in the browser.
Common Audio‑Video Network Communication Protocols
Standard Live Streaming Protocols
Traditional Live Streaming Protocols
These streams prioritize picture quality over low latency, are distributed through a CDN, and typically use RTMP and HLS.
Basic Concepts
RTMP (Real-Time Messaging Protocol) – TCP‑based, widely supported by CDNs, and easy to implement, but not supported by browsers or iOS; Adobe has stopped maintaining it.
HLS (HTTP Live Streaming) – Apple‑defined HTTP‑based protocol that splits the stream into TS segments, which introduces at least one segment of latency; excellent mobile compatibility (iOS, Android), and usable in other browsers via hls.js.
Choosing Between RTMP and HLS
Use RTMP for pushing streams (ingest) from the broadcaster.
Use HLS for mobile web players because browsers do not support RTMP.
iOS requires HLS.
On‑demand video prefers HLS.
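On the player side, a page can prefer native HLS playback (Safari/iOS) and fall back to hls.js elsewhere. The sketch below assumes hls.js has already been loaded on the page; attachHlsSource is a hypothetical helper name, not part of any library.

```javascript
// Attach an HLS manifest URL to a <video> element, preferring native
// HLS support (Safari, iOS WebView) and falling back to hls.js, which
// plays HLS over Media Source Extensions in other browsers.
// `attachHlsSource` is a hypothetical helper name.
function attachHlsSource(video, manifestUrl) {
  if (video.canPlayType('application/vnd.apple.mpegurl')) {
    // Native HLS: just point the element at the manifest.
    video.src = manifestUrl;
    return 'native';
  }
  if (typeof Hls !== 'undefined' && Hls.isSupported()) {
    const hls = new Hls();
    hls.loadSource(manifestUrl);
    hls.attachMedia(video);
    return 'hlsjs';
  }
  return 'unsupported';
}
```

The return value lets the caller report which playback path was chosen, or show an error when neither is available.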
Typical Architecture of Traditional Live Streaming
Consists of a live client, a signaling server, and a CDN network.
The live client handles capture, encoding, pushing, pulling, decoding and playback; the broadcaster pushes encoded streams to the CDN, while viewers pull streams from the CDN and render them.
The signaling server manages room creation, joining, leaving and text chat.
The CDN distributes media data to users.
Real‑Time Live Protocols
WebRTC was created to meet the growing demand for low‑latency, interactive communication.
WebRTC Overview
WebRTC (Web Real‑Time Communication) is an open standard supported by all major browsers, enabling peer‑to‑peer audio‑video communication without plugins.
It abstracts complex media handling (codec, transport, echo cancellation, etc.) behind a simple API.
WebRTC Audio‑Video Communication Process
Audio‑Video Device Detection
Fundamentals of Audio Devices
Audio input devices perform A/D conversion, quantization and encoding to produce digital signals.
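As a toy illustration of the quantization step, mapping a normalized "analog" sample in the range -1.0..1.0 onto a signed 16‑bit PCM value could look like this (a conceptual sketch, not how an actual audio driver works):

```javascript
// Quantize a normalized analog sample in [-1, 1] to a signed 16-bit
// PCM integer, clamping out-of-range input. Conceptual illustration of
// the A/D quantization step, not production audio code.
function quantize16(sample) {
  const clamped = Math.max(-1, Math.min(1, sample));
  return Math.round(clamped * 32767);
}

// A tiny "analog" waveform becomes a sequence of integers:
const pcm = [0, 0.5, -0.5, 1, -1].map(quantize16);
```

Encoding (e.g., compressing these samples with Opus or AAC) then happens on top of this digital representation.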
Fundamentals of Video Devices
Video devices use optical sensors to convert light to RGB data, then DSP processing, conversion to YUV, and compression for transmission.
Getting Device List
navigator.mediaDevices.enumerateDevices() returns the available media input and output devices.

navigator.mediaDevices.enumerateDevices().then(function(deviceInfos) {
  deviceInfos.forEach(function(deviceInfo) {
    console.log(deviceInfo);
  });
});

Device labels are empty unless the page is served over HTTPS and the user has granted media permission.
Device Detection Methods
Inspect deviceInfo.kind to distinguish audio vs video devices.
Default devices are selected automatically; specifying a device ID overrides the default.
Use getUserMedia to test whether a device can provide a usable stream.
Video detection: call getUserMedia for video and display it; if visible, the device works.
Audio detection: capture audio with getUserMedia and visualize the waveform or level changes.
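Putting the device list and the kind check together, a detection pass might be sketched as follows; groupDevicesByKind is a hypothetical helper name:

```javascript
// Group MediaDeviceInfo-like records by kind so a UI can populate
// separate microphone, camera, and speaker pickers.
// `groupDevicesByKind` is a hypothetical helper name.
function groupDevicesByKind(deviceInfos) {
  const groups = { audioinput: [], videoinput: [], audiooutput: [] };
  deviceInfos.forEach(info => {
    if (groups[info.kind]) groups[info.kind].push(info);
  });
  return groups;
}

// In the browser (guarded so the sketch is inert elsewhere):
if (typeof navigator !== 'undefined' && navigator.mediaDevices) {
  navigator.mediaDevices.enumerateDevices()
    .then(deviceInfos => console.log(groupDevicesByKind(deviceInfos)));
}
```

Each group's deviceId values can then be passed to getUserMedia to test whether that specific device yields a usable stream.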
Audio‑Video Capture
Key Concepts
Frame rate – number of frames per second; typical acceptable range is 10‑30 fps, with 60 fps for smoother interaction.
Track – independent media stream component (audio track, video track) that does not intersect with other tracks.
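The frame-rate range above can be expressed directly in getUserMedia constraints; the specific numbers below are illustrative, not required values:

```javascript
// Illustrative capture constraints: accept 10-30 fps, prefer 30,
// and allow up to 60 for smoother interaction.
const captureConstraints = {
  video: {
    frameRate: { min: 10, ideal: 30, max: 60 }
  },
  audio: true
};

// In the browser (guarded so the sketch is inert elsewhere):
if (typeof navigator !== 'undefined' && navigator.mediaDevices) {
  navigator.mediaDevices.getUserMedia(captureConstraints)
    .then(stream => {
      // Audio and video arrive as independent tracks on one stream.
      console.log(stream.getVideoTracks().length,
                  stream.getAudioTracks().length);
    });
}
```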
Capture API
mediaDevices.getUserMedia prompts the user for permission and resolves with a MediaStream:

const mediaStreamConstraints = {
  video: true,
  audio: true
};

navigator.mediaDevices.getUserMedia(mediaStreamConstraints)
  .then(gotLocalMediaStream)
  .catch(err => console.error('getUserMedia error:', err));

function gotLocalMediaStream(mediaStream) {
  // The srcObject property of a media element can be set to a MediaStream.
  document.querySelector('video').srcObject = mediaStream;
}
Taking a Snapshot
Use a canvas drawImage call with the video element as the source, then download the resulting data URL.

const canvas = document.querySelector('canvas');
const video = document.querySelector('video');
canvas.getContext('2d').drawImage(video, 0, 0, canvas.width, canvas.height);

function downLoad(url) {
  const a = document.createElement("a");
  a.download = 'photo';
  a.href = url;
  document.body.appendChild(a);
  a.click();
  a.remove();
}

downLoad(canvas.toDataURL("image/jpeg"));

Audio‑Video Recording
Key Concepts
ArrayBuffer – fixed‑length binary data buffer.
ArrayBufferView – typed array views such as Uint32Array.
Blob – binary large object used to store recorded media.
Recording API
new MediaRecorder(stream[, options]) creates a recorder for a MediaStream. Use the ondataavailable event to collect Blob chunks, then build a Blob and an object URL from them for playback or download.

const buffer = [];
const mediaRecorder = new MediaRecorder(stream);
mediaRecorder.ondataavailable = e => buffer.push(e.data);
mediaRecorder.start(2000); // emit a data chunk every 2000 ms

Establishing a Connection
After capturing media, create an RTCPeerConnection on each side, exchange SDP offers/answers via a signaling server, and exchange ICE candidates.
RTCPeerConnection Workflow
Obtain local media stream with getUserMedia and add it to the peer connection.
Create an SDP offer (A), set local description, send via signaling.
Remote peer (B) sets remote description, creates an answer, sets local description, and sends back.
Both peers exchange ICE candidates and add them with addIceCandidate.
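The candidate exchange in step 4 can be sketched as below. wireIceExchange is a hypothetical helper; in a real application each candidate is serialized and relayed through the signaling server rather than added directly, but wiring two local peer connections together is a common way to demo the flow:

```javascript
// Forward each peer's ICE candidates to the other side. In production
// the candidate travels through the signaling server; here the two
// ends are connected directly for illustration.
// `wireIceExchange` is a hypothetical helper name.
function wireIceExchange(localPc, remotePc) {
  localPc.onicecandidate = event => {
    // A null candidate signals end-of-candidates and is not forwarded.
    if (event.candidate) remotePc.addIceCandidate(event.candidate);
  };
  remotePc.onicecandidate = event => {
    if (event.candidate) localPc.addIceCandidate(event.candidate);
  };
}
```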
// A creates an offer and sets it as its local description,
// then sends the offer to B through the signaling server.
localPeerConnection.createOffer().then(description => {
  return localPeerConnection.setLocalDescription(description);
});

// B receives the offer, sets it as the remote description, creates an
// answer, sets it as its local description, and sends it back to A.
remotePeerConnection.setRemoteDescription(offer)
  .then(() => remotePeerConnection.createAnswer())
  .then(answer => remotePeerConnection.setLocalDescription(answer));

Audio‑Video Codec Overview
Video is a sequence of frames; codecs in the H.26x and MPEG families reduce spatial redundancy (within a frame) and temporal redundancy (between frames) so that video fits within bandwidth constraints.
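As a toy illustration of temporal redundancy: storing only the pixels that changed between consecutive frames is the intuition behind inter-frame (P-frame) compression. This is a conceptual sketch, not a real codec:

```javascript
// Encode a frame as a sparse diff against the previous frame:
// only positions whose pixel value changed are stored.
// Conceptual illustration of temporal redundancy, not a real codec.
function diffFrame(prev, curr) {
  const delta = [];
  curr.forEach((pixel, i) => {
    if (pixel !== prev[i]) delta.push([i, pixel]);
  });
  return delta;
}

// Reconstruct the current frame from the previous frame plus the diff.
function applyDiff(prev, delta) {
  const frame = prev.slice();
  delta.forEach(([i, pixel]) => { frame[i] = pixel; });
  return frame;
}
```

When most pixels are static between frames, the diff is far smaller than the full frame, which is why mostly-still scenes compress so well.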
Displaying Remote Media
When a remote track arrives, the ontrack handler (which replaces the deprecated onaddstream) assigns the associated stream to a video element's srcObject.

localPeerConnection.ontrack = function(event) {
  $remoteVideo.srcObject = event.streams[0];
};

Conclusion
WebRTC encompasses many components; this article provides a high‑level overview of the typical workflow for real‑time audio‑video communication.
