
WebRTC MediaStream and RTCPeerConnection API Overview and Usage

This article provides a comprehensive overview of WebRTC's MediaStream and RTCPeerConnection APIs, explaining core concepts such as tracks, sources, sinks, device enumeration, media constraints, bitrate and resolution settings, compatibility considerations, screen sharing, content hints, and the step‑by‑step workflow for establishing a peer‑to‑peer connection in the browser.

360 Smart Cloud

Preface

WebRTC is a key technology for real‑time communication, enabling low‑latency, high‑quality audio, video, and data transmission for scenarios such as peer‑to‑peer calls, multi‑party conferences, and screen sharing, and it is widely applicable in remote collaboration, online education, telemedicine, and IoT.

MediaStream API

The MediaStream API consists of two main objects: MediaStreamTrack and MediaStream. A MediaStreamTrack represents a single type of media (audio or video) generated from a source, while a MediaStream aggregates multiple tracks, allowing synchronized playback of audio and video.

Each track is composed of a source that provides data and a sink that consumes it.

A MediaStream can contain zero or more tracks; all tracks within a stream are rendered in sync, similar to playing a multimedia file.

Source and Sink

In the WebRTC source code, a media track is built from a corresponding source and sink. In the browser, the source produces media data (e.g., microphone audio, camera video, or a static file), while the sink consumes it for rendering or for transmission via RTCPeerConnection. The peer connection can act as both sink and source, performing operations such as bitrate reduction, scaling, or frame-rate adjustment before forwarding the stream.

Detecting Audio/Video Devices

The MediaDevices interface provides access to input devices like cameras and microphones. Its enumerateDevices() method returns a Promise that resolves to an array of MediaDeviceInfo objects describing each available device.
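A minimal sketch of device enumeration. The grouping helper and its name `groupDevices` are illustrative, not part of the API; only `enumerateDevices()` and the `MediaDeviceInfo` fields (`deviceId`, `kind`, `label`) come from the spec:

```javascript
// Pure helper: group MediaDeviceInfo-like objects by kind
// (testable without a browser; the shape mirrors enumerateDevices() results).
function groupDevices(devices) {
  const byKind = { audioinput: [], audiooutput: [], videoinput: [] };
  for (const d of devices) {
    // d.label stays empty until the user has granted a media permission
    if (byKind[d.kind]) byKind[d.kind].push({ id: d.deviceId, label: d.label });
  }
  return byKind;
}

// Browser usage (requires navigator.mediaDevices):
async function listDevicesByKind() {
  const devices = await navigator.mediaDevices.enumerateDevices();
  return groupDevices(devices);
}
```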

Capturing Local Audio/Video

MediaDevices.getUserMedia() prompts the user for permission and, upon approval, returns a Promise that resolves to a MediaStream containing the requested audio and/or video tracks. If permission is denied or the device is unavailable, the promise is rejected with an error.
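A capture sketch under these assumptions: the function names `startLocalPreview` and `explainGumError` and the element id `preview` are hypothetical, and the error-name mapping covers only the most common rejection reasons:

```javascript
// Pure helper: translate common getUserMedia rejection names for the UI.
function explainGumError(name) {
  switch (name) {
    case 'NotAllowedError':      return 'Permission was denied by the user or policy.';
    case 'NotFoundError':        return 'No device matches the requested constraints.';
    case 'NotReadableError':     return 'The device is already in use or unreadable.';
    case 'OverconstrainedError': return 'The constraints cannot be satisfied.';
    default:                     return 'Capture failed: ' + name;
  }
}

// Browser-only sketch: request mic + camera and render them locally.
async function startLocalPreview(constraints = { audio: true, video: true }) {
  try {
    const stream = await navigator.mediaDevices.getUserMedia(constraints);
    const video = document.getElementById('preview'); // hypothetical <video> element
    video.srcObject = stream;                         // modern sink: assign the stream directly
    await video.play();
    return stream;
  } catch (err) {
    console.error(explainGumError(err.name));
    throw err;
  }
}
```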

Video Constraints

The getUserMedia() call accepts a MediaStreamConstraints object to specify the required tracks and optional constraints, such as the video and audio flags, as well as resolution preferences using width and height. The browser treats these values as "ideal" and may fall back to the nearest supported resolution.

Constraints can also use the keywords min, max, and exact to bound a value's range, and the deviceId property to select a specific capture device.

MediaDevices.getSupportedConstraints() returns a dictionary of all constraints supported by the user agent.
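The constraint shapes above can be sketched with a small builder. The helper name and the fallback values (1280×720, 30 fps) are illustrative choices, not requirements; the field names follow the Media Capture spec:

```javascript
// Pure helper: build a MediaStreamConstraints object for a preferred
// resolution and, optionally, a specific camera.
function buildVideoConstraints({ width = 1280, height = 720, deviceId } = {}) {
  const video = {
    width:     { ideal: width },   // browser picks the closest supported value
    height:    { ideal: height },
    frameRate: { max: 30 },        // an upper bound, not a demand
  };
  if (deviceId) video.deviceId = { exact: deviceId }; // hard-require this camera
  return { audio: true, video };
}

// Browser usage:
// navigator.mediaDevices.getUserMedia(buildVideoConstraints({ deviceId: someId }));
```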

Bitrate Settings

During a call, the video bitrate can be adjusted in real time by modifying RTCRtpEncodingParameters.maxBitrate (in bits per second) inside the parameters returned by RTCRtpSender.getParameters() and applying them with setParameters().
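A sketch of this get/modify/set cycle. The function name is hypothetical; `sender` is assumed to come from pc.addTrack() or pc.getSenders():

```javascript
// Cap the outgoing bitrate on an existing RTCRtpSender.
async function setMaxBitrate(sender, bitsPerSecond) {
  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) {
    params.encodings = [{}]; // some browsers return no encodings before negotiation
  }
  params.encodings[0].maxBitrate = bitsPerSecond; // e.g. 500000 for 500 kbps
  await sender.setParameters(params);
}
```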

Compatibility Issues

Older browsers may not support navigator.mediaDevices.getUserMedia . In such cases, vendor‑specific fallbacks are required. Safari also has limitations: it does not support multiple tabs using getUserMedia simultaneously, and its screen‑sharing API only allows sharing the entire screen.
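A fallback sketch in the style of the widely used adapter.js pattern; in a real project the adapter.js shim itself is the safer choice. The injectable `nav` parameter is an addition here purely so the helper can be exercised outside a browser:

```javascript
// Wrap the legacy, callback-based getUserMedia variants in a Promise.
function getUserMediaCompat(
  constraints,
  nav = typeof navigator !== 'undefined' ? navigator : null
) {
  if (nav && nav.mediaDevices && nav.mediaDevices.getUserMedia) {
    return nav.mediaDevices.getUserMedia(constraints); // modern path
  }
  const legacy =
    nav && (nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia);
  if (!legacy) {
    return Promise.reject(new Error('getUserMedia is not supported here'));
  }
  // Legacy variants take success/error callbacks instead of returning a Promise.
  return new Promise((resolve, reject) =>
    legacy.call(nav, constraints, resolve, reject)
  );
}
```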

Screen Sharing

The MediaDevices.getDisplayMedia() method prompts the user to select a screen, window, or tab to capture, returning a MediaStream just as getUserMedia does. Safari currently only permits sharing the entire screen.
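A browser-only sketch combining getDisplayMedia with the content hints discussed later. The function name, the 15 fps cap, and the optional sender-replacement step are illustrative choices:

```javascript
// Start a screen share and, if a peer connection is supplied,
// swap it in for the current outgoing video track.
async function startScreenShare(pc) {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: { frameRate: { max: 15 } }, // screen content rarely needs high fps
    audio: false,
  });
  const [track] = stream.getVideoTracks();
  track.contentHint = 'detail'; // prefer resolution over frame rate

  if (pc) {
    const sender = pc.getSenders().find((s) => s.track && s.track.kind === 'video');
    if (sender) await sender.replaceTrack(track); // no renegotiation needed
    else pc.addTrack(track, stream);
  }
  return stream;
}
```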

Device Change Monitoring

WebRTC exposes a devicechange event (and the corresponding ondevicechange handler) on navigator.mediaDevices to detect hot‑plugging of media devices.
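A sketch of hot-plug handling. The diffing helper `diffDeviceIds` is a hypothetical utility; only the `devicechange` event wiring (shown in the comments) is the actual API surface:

```javascript
// Pure helper: compare two device lists by deviceId.
function diffDeviceIds(before, after) {
  const prev = new Set(before.map((d) => d.deviceId));
  const next = new Set(after.map((d) => d.deviceId));
  return {
    added:   after.filter((d) => !prev.has(d.deviceId)),
    removed: before.filter((d) => !next.has(d.deviceId)),
  };
}

// Browser wiring:
// let known = await navigator.mediaDevices.enumerateDevices();
// navigator.mediaDevices.addEventListener('devicechange', async () => {
//   const current = await navigator.mediaDevices.enumerateDevices();
//   const { added, removed } = diffDeviceIds(known, current);
//   known = current;
//   // e.g. offer to switch to a newly attached headset
// });
```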

Video Quality Adaptation Strategies

WebRTC provides three degradation preferences:

MAINTAIN_FRAMERATE: keep the frame rate and reduce resolution (suited to video calls).

MAINTAIN_RESOLUTION: keep the resolution and reduce the frame rate (ideal for screen sharing or document views).

BALANCED: a balanced approach, disabled by default and enabled via the WebRTC-Video-BalancedDegradation flag.

These preferences can be steered via the contentHint property of a MediaStreamTrack, which the WebRTC stack translates into a DegradationPreference; browsers that support it also expose a degradationPreference member on RTCRtpSendParameters for setting it directly.
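A sketch of the direct route, assuming a browser that honours degradationPreference in setParameters (support varies); "maintain-resolution" is the Web API spelling of MAINTAIN_RESOLUTION:

```javascript
// Ask the sender to preserve resolution and drop frame rate under congestion
// (e.g. for screen sharing). `sender` comes from pc.addTrack()/pc.getSenders().
async function preferResolution(sender) {
  const params = sender.getParameters();
  params.degradationPreference = 'maintain-resolution';
  await sender.setParameters(params);
}
```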

Audio Content Hints

"speech": optimize for spoken voice, possibly applying noise suppression.

"speech-recognition": prioritize clarity for transcription, disabling consumer‑oriented processing.

"music": preserve musical fidelity by disabling voice‑centric processing.

Video Content Hints

"motion": lower resolution to maintain frame rate during fast motion.

"detail": lower frame rate to preserve resolution, default for screen sharing.

"text": lower frame rate while preserving resolution for text‑heavy content; enables text‑optimized encoding tools when supported (e.g., AV1).

RTCPeerConnection API

The RTCPeerConnection interface represents a WebRTC connection between a local and a remote peer, providing methods to create, maintain, monitor, and close the connection.

Its optional RTCConfiguration object includes properties such as:

iceServers: an array of RTCIceServer objects containing STUN/TURN server URLs.

iceTransportPolicy: controls which ICE candidates are considered ("all", "public" – deprecated, "relay").

rtcpMuxPolicy: determines the RTCP multiplexing strategy ("negotiate" or "require").

bundlePolicy: selects how media tracks share transport channels ("balanced", "max-compat", "max-bundle").

Typical usage steps include creating a new RTCPeerConnection, adding tracks with addTrack, exchanging SDP offers/answers, and handling ICE candidates.
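A configuration sketch using the properties listed above. The STUN/TURN URLs and credentials are placeholders, not real servers:

```javascript
// RTCConfiguration sketch with placeholder servers.
const config = {
  iceServers: [
    { urls: 'stun:stun.example.org:3478' },
    { urls: 'turn:turn.example.org:3478', username: 'user', credential: 'secret' },
  ],
  iceTransportPolicy: 'all',   // or 'relay' to force traffic through TURN
  bundlePolicy: 'max-bundle',  // one transport shared by all media
  rtcpMuxPolicy: 'require',    // multiplex RTP and RTCP on a single port
};

// Browser usage:
// const pc = new RTCPeerConnection(config);
// localStream.getTracks().forEach((t) => pc.addTrack(t, localStream));
```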

Main Methods

addIceCandidate: provide a remote ICE candidate to the connection.

addStream / removeStream: add or remove a MediaStream as a source (deprecated; prefer addTrack / removeTrack).

addTrack / removeTrack: add or remove individual MediaStreamTrack objects.

createOffer / createAnswer: generate SDP for negotiation.

setLocalDescription / setRemoteDescription: set the local or remote SDP.

getStats: retrieve connection statistics.

close: terminate the connection.

Main Events

onaddstream: fired when a remote stream is added (deprecated; use ontrack).

ontrack: fired when a remote track is received.

ondatachannel: fired when a data channel is created.

onicecandidate: fired when a new ICE candidate is gathered.

oniceconnectionstatechange, onsignalingstatechange, onremovestream (deprecated), etc.: provide state-change notifications.

Typical Workflow

Create a new RTCPeerConnection and add local audio/video tracks via addTrack. The remote side listens for ontrack to receive and render the media.

Exchange SDP offers and answers so that each peer learns the other's capabilities and negotiates codecs, parameters, and transport details.

Exchange ICE candidates. When the initiator receives a remote candidate, it calls addIceCandidate. Likewise, the responder adds the initiator's candidates:

pc2.addIceCandidate(new RTCIceCandidate(ice))

After these steps, a basic WebRTC connection is established.
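The whole workflow can be sketched end to end. This version connects two in-page peers (pc1 and pc2) and hands candidates over directly; in a real application each "hand over" step would travel through your own signalling channel, and the function name `connectPeers` is hypothetical:

```javascript
// Offer/answer negotiation plus direct ICE candidate exchange.
async function connectPeers(pc1, pc2, localStream) {
  // 1. Add local tracks on the caller; the callee renders what arrives.
  localStream.getTracks().forEach((t) => pc1.addTrack(t, localStream));
  pc2.ontrack = (e) => { /* attach e.streams[0] to a <video> element */ };

  // 2. Hand each side's ICE candidates to the other as they are gathered.
  pc1.onicecandidate = (e) => { if (e.candidate) pc2.addIceCandidate(e.candidate); };
  pc2.onicecandidate = (e) => { if (e.candidate) pc1.addIceCandidate(e.candidate); };

  // 3. Offer/answer: each peer learns the other's capabilities.
  const offer = await pc1.createOffer();
  await pc1.setLocalDescription(offer);
  await pc2.setRemoteDescription(offer);

  const answer = await pc2.createAnswer();
  await pc2.setLocalDescription(answer);
  await pc1.setRemoteDescription(answer);
}
```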

Tags: JavaScript, real-time communication, WebRTC, RTCPeerConnection, Browser APIs, MediaStream
Written by

360 Smart Cloud

Official service account of 360 Smart Cloud, dedicated to building a high-quality, secure, highly available, convenient, and stable one‑stop cloud service platform.
