
Technical Exploration and Implementation of Cross‑Platform Web Screen Recording Using WebRTC, rrweb, and ffmpeg in Electron

This article presents a comprehensive technical analysis of cross‑platform web screen recording, covering strict business requirements, evaluation of rrweb, ffmpeg, and WebRTC solutions, detailed implementations for video and audio capture in Electron, handling of lock‑screen issues, WebM metadata fixes, memory‑usage constraints, and performance optimizations.

ByteFE

Background

Web screen recording is familiar to many, especially from video conferences, remote desktop software, and work-from-home scenarios. Beyond live sharing, recording real-time operations for later replay is a core business scenario, one that demands high stability and strict, hard metric requirements.

Metric Requirements

Support recordings of any length, including over 6 hours.

Record system audio along with the screen video.

Cross‑platform support for Windows, macOS, and Linux.

Continue recording without interruption while a window is dragged from one screen to another.

Continue recording when the app is minimized, maximized, or fullscreen, limiting capture to the app interior only.

Allow uninterrupted recording without closing the app.

Enable timeline dragging on the web side without fully downloading the recording.

Support simultaneous recording of multiple app tabs.

Support recording of multiple app windows within the same system window.

Record live streams.

Screen recordings must be uploaded and encrypted automatically after finishing; they cannot be stored locally.

Technical Solution Exploration

On the Chromium side there are two main approaches: the rrweb library and the native WebRTC API. In an Electron context a third option, ffmpeg, becomes available.

rrweb

Advantages

Can record audio from the current tab while recording.

Cross‑platform compatible.

Handles window drag, minimize, maximize, and fullscreen continuously.

Produces small recording files.

Allows timeline dragging without full download on the web.

Good performance.

Disadvantages

Cannot record live streams; its design limits use cases.

Stops recording if the app tab is closed, causing data loss.

May affect the page DOM in certain scenarios.

ffmpeg

Advantages

Higher output quality at the same file size.

Good performance.

Supports live‑stream recording.

Disadvantages

Cross‑platform handling is complex.

Recording area is static; moving the app may capture off‑screen content.

Cannot pause/resume when switching app tabs.

Recording does continue while a window is dragged, but the static capture area does not follow the window.

Audio data is temporarily stored locally, exposing it if the app closes.

Does not support recording when multiple app windows are open simultaneously.

WebRTC

Advantages

Meets all 11 metric requirements above.

Disadvantages

Higher CPU usage; performance is poorer.

Native recordings lack duration metadata.

Native recordings do not support timeline dragging.

Very long recordings may exceed 1/10 of disk space and cause errors.

Native recordings consume significant memory.

Video Blobs are easily leaked if references are not released for garbage collection.

Initially we built the first version with rrweb, but its hard limitations (data loss on window close and no live-stream support) forced a switch to the WebRTC solution.

Media Stream Acquisition

In the WebRTC standard, any continuous media source is abstracted as a MediaStream. The overall flow is: acquire a video stream and an audio stream, merge them into one MediaStream, and feed it to MediaRecorder.

Video Stream Acquisition

To obtain a video stream, we first need the MediaSourceId of the target window or screen. Electron provides a generic API for this:

```javascript
import { desktopCapturer } from 'electron';

// Get MediaSourceIds for all windows or screens
desktopCapturer.getSources({
  types: ['screen', 'window'], // Specify whether to capture screens or windows
  thumbnailSize: {
    height: 300, // Thumbnail height
    width: 300   // Thumbnail width
  },
  fetchWindowIcons: true // Capture window icons if available
}).then(sources => {
  sources.forEach(source => {
    console.log(source.appIcon);
    console.log(source.display_id);
    console.log(source.id);
    console.log(source.name);
    console.log(source.thumbnail);
  });
});
```

If you only need the current window’s MediaSourceId:

```javascript
import { remote } from 'electron';

// Get the MediaSourceId of the current window
const mediaSourceId = remote.getCurrentWindow().getMediaSourceId();
```

After obtaining the MediaSourceId, we can request the video stream:

```javascript
import { remote } from 'electron';

// Video stream acquisition
const videoSource = await navigator.mediaDevices.getUserMedia({
  audio: false, // No audio here; audio is captured separately
  video: {
    mandatory: {
      chromeMediaSource: 'desktop',
      chromeMediaSourceId: remote.getCurrentWindow().getMediaSourceId()
    }
  }
});
```

Audio Source Acquisition

Audio capture is more complex because macOS requires a virtual audio driver (e.g., BlackHole) while Windows can capture system audio directly.

Windows Audio Capture

```javascript
// Windows audio stream acquisition
const audioSource = await navigator.mediaDevices.getUserMedia({
  audio: {
    mandatory: {
      // Capture system audio without specifying a source ID
      chromeMediaSource: 'desktop',
    },
  },
  // Video must be requested for desktop audio capture to succeed
  video: {
    mandatory: {
      chromeMediaSource: 'desktop',
    },
  },
});

// Remove the unwanted video track
(audioSource.getVideoTracks() || []).forEach(track => audioSource.removeTrack(track));
```

macOS Audio Capture with BlackHole

macOS does not allow direct system‑audio capture; a virtual driver such as BlackHole is required. The following code checks for installation, requests microphone permission, and obtains the appropriate device ID.

```typescript
import { remote } from 'electron';

const isWin = process.platform === 'win32';
const isMac = process.platform === 'darwin';

declare type AudioRecordPermission =
  | 'ALLOWED'
  | 'RECORD_PERMISSION_NOT_GRANTED'
  | 'NOT_INSTALL_BLACKHOLE'
  | 'OS_NOT_SUPPORTED';

// Check if Soundflower or BlackHole is installed
async function getIfAlreadyInstallSoundFlowerOrBlackHole() {
  const devices = await navigator.mediaDevices.enumerateDevices();
  return devices.some(device =>
    device.label.includes('Soundflower (2ch)') ||
    device.label.includes('BlackHole 2ch (Virtual)'));
}

// Get microphone permission status (BlackHole uses the mic channel)
function getMacAudioRecordPermission() {
  return remote.systemPreferences.getMediaAccessStatus('microphone');
}

// Request microphone permission
function requestMacAudioRecordPermission() {
  return remote.systemPreferences.askForMediaAccess('microphone');
}

async function getAudioRecordPermission(): Promise<AudioRecordPermission> {
  if (isWin) {
    return 'ALLOWED';
  } else if (isMac) {
    if (await getIfAlreadyInstallSoundFlowerOrBlackHole()) {
      if (getMacAudioRecordPermission() !== 'granted') {
        if (!(await requestMacAudioRecordPermission())) {
          return 'RECORD_PERMISSION_NOT_GRANTED';
        }
      }
      return 'ALLOWED';
    }
    return 'NOT_INSTALL_BLACKHOLE';
  } else {
    // Linux not supported yet
    return 'OS_NOT_SUPPORTED';
  }
}
```

After the permission is granted, the audio stream is obtained by selecting the virtual device ID:

```typescript
if (process.platform === 'darwin') {
  const permission = await getAudioRecordPermission();
  switch (permission) {
    case 'ALLOWED': {
      const devices = await navigator.mediaDevices.enumerateDevices();
      const outputDevices = devices.filter(d =>
        d.kind === 'audiooutput' && d.deviceId !== 'default');
      const soundFlowerDevices = outputDevices.filter(d => d.label === 'Soundflower (2ch)');
      const blackHoleDevices = outputDevices.filter(d => d.label === 'BlackHole 2ch (Virtual)');
      const deviceId = soundFlowerDevices.length
        ? soundFlowerDevices[0].deviceId
        : blackHoleDevices.length
          ? blackHoleDevices[0].deviceId
          : null;
      if (deviceId) {
        const audioSource = await navigator.mediaDevices.getUserMedia({
          audio: {
            deviceId: { exact: deviceId },
            sampleRate: 44100,
            // Disable processing to capture raw audio
            echoCancellation: false,
            noiseSuppression: false,
            autoGainControl: false,
          },
          video: false,
        });
      }
      break;
    }
    case 'NOT_INSTALL_BLACKHOLE':
      // Prompt the user to install BlackHole
      break;
    case 'RECORD_PERMISSION_NOT_GRANTED':
      // Prompt the user to grant microphone permission
      break;
    default:
      break;
  }
}
```

Merging Audio and Video Streams

After acquiring both streams, we combine their tracks into a new MediaStream:

```javascript
// Merge audio and video streams
const combinedSource = new MediaStream([
  ...this._audioSource.getAudioTracks(),
  ...this._videoSource.getVideoTracks(),
]);
```

Media Recording

Encoding Format

In Chromium, MediaRecorder only outputs WebM, but it supports several codecs (VP8, VP9, H.264, etc.). The following snippet checks which are supported:

```typescript
const types: string[] = [
  'video/webm',
  'audio/webm',
  'video/webm;codecs=vp9',
  'video/webm;codecs=vp8',
  'video/webm;codecs=daala',
  'video/webm;codecs=h264',
  'audio/webm;codecs=opus',
  'video/mpeg',
];

for (const type of types) {
  console.log(`Is ${type} supported? ${MediaRecorder.isTypeSupported(type) ? 'Yes' : 'No :('}`);
}
```

Testing shows no significant CPU-usage difference between the codecs, so VP9, with its better compression, is recommended.
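Rather than hard-coding VP9, the recording codec can be chosen at runtime from a preference list. A minimal sketch (our illustration, not the article's code; the support predicate is injected so the function can also run outside a browser, where in the renderer you would pass `MediaRecorder.isTypeSupported`):

```javascript
// Return the first mimeType the runtime can record, in preference order.
// `isSupported` is MediaRecorder.isTypeSupported in the renderer; injecting
// it keeps the function usable outside the browser.
function pickMimeType(isSupported, candidates = [
  'video/webm;codecs=vp9',
  'video/webm;codecs=vp8',
  'video/webm',
]) {
  return candidates.find(t => isSupported(t)) ?? null;
}

// In the renderer:
// const mimeType = pickMimeType(t => MediaRecorder.isTypeSupported(t));
```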

Creating the Recorder

```typescript
const recorder = new MediaRecorder(combinedSource, {
  mimeType: 'video/webm;codecs=vp9',
  videoBitsPerSecond: 1.5e6
});

const timeslice = 5000; // Emit a data chunk every 5 seconds
const fileBits: Blob[] = [];

recorder.ondataavailable = (event) => {
  fileBits.push(event.data);
};

recorder.onstop = () => {
  const videoFile = new Blob(fileBits, { type: 'video/webm;codecs=vp9' });
};

recorder.start(timeslice);

// Stop after 30 seconds (for demonstration)
setTimeout(() => recorder.stop(), 30000);
```

Pause / Resume

```javascript
// Pause recording
recorder.pause();
// Resume recording
recorder.resume();
```

Handling Recording Artifacts

MediaRecorder-generated WebM files lack duration and cue information, which makes them unseekable; several Chromium bugs (e.g., 561606, 569840, 599134, 642012) track this limitation. Three remediation strategies are available:

Use ffmpeg to copy and rewrite metadata (requires file I/O).

Use the npm package fix-webm-duration to inject duration (does not add cues).

Parse EBML with ts‑ebml and rebuild missing SeekHead and Cues (full fix).
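The third strategy hinges on EBML's variable-length integer (VINT) encoding, which Matroska/WebM uses for element sizes. As a concrete illustration, here is a minimal encoder written to the Matroska spec (our sketch, not code from ts-ebml, and limited to sizes within Number.MAX_SAFE_INTEGER):

```javascript
// Encode an unsigned integer as an EBML VINT (variable-length integer).
// The position of the first 1 bit in the first byte encodes the total
// byte length; an n-byte VINT holds values up to 2^(7n) - 2, because the
// all-ones pattern is reserved as the "unknown size" marker.
function encodeVint(value) {
  let length = 1;
  while (value > 2 ** (7 * length) - 2) length++;
  const bytes = new Uint8Array(length);
  let v = value;
  for (let i = length - 1; i >= 0; i--) {
    bytes[i] = v % 256;
    v = Math.floor(v / 256);
  }
  bytes[0] |= 0x80 >> (length - 1); // set the length-marker bit
  return bytes;
}

// encodeVint(10)  yields [0x8a]       (1-byte VINT)
// encodeVint(500) yields [0x41, 0xf4] (2-byte VINT)
```

ts-ebml performs this kind of encoding internally when it serializes the rebuilt SeekHead and Cues elements back into the file.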

Rebuilding the SeekHead and Cues requires parsing the file's EBML structure, so a custom LargeFileDecoder slices large Blobs into bounded chunks to stay under the 2 GB ArrayBuffer limit.
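The slicing idea behind such a decoder can be sketched as follows (a minimal illustration of the technique, not the article's actual LargeFileDecoder; it needs a runtime with a global Blob, e.g. a browser or Node 18+):

```javascript
// Read a large Blob in fixed-size slices so that no single ArrayBuffer
// allocation approaches the 2 GB limit. Each slice is yielded as soon as
// it is read, keeping peak memory at roughly one chunk.
async function* readBlobInChunks(blob, chunkSize = 64 * 1024 * 1024) {
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const slice = blob.slice(offset, Math.min(offset + chunkSize, blob.size));
    yield await slice.arrayBuffer();
  }
}
```

A consumer (such as an EBML decoder) then feeds each chunk into its streaming parse state instead of decoding one monolithic buffer.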

Web Worker Processing

To avoid blocking the main thread, the fix runs inside a Web Worker. The worker receives sliced ArrayBuffers as Transferable Objects, decodes them with ts-ebml, builds a new metadata block, and posts the result back. The main thread then reassembles the final Blob without extra copies, so Chromium can continue to reuse the Blob's existing disk-backed storage.
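The zero-copy hand-off relies on Transferable semantics: posting an ArrayBuffer with a transfer list moves its backing store to the worker and detaches it on the sending side. The same behaviour can be observed with structuredClone, which implements identical transfer semantics:

```javascript
// worker.postMessage(data, [buffer]) transfers rather than copies the buffer.
// structuredClone with a transfer list demonstrates the same semantics:
const buffer = new ArrayBuffer(16);
const moved = structuredClone(buffer, { transfer: [buffer] });

// After the transfer, the original is detached (zero length) and the moved
// buffer owns the 16 bytes, with no memcpy having taken place.
console.log(buffer.byteLength); // 0
console.log(moved.byteLength);  // 16
```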

Blob Storage and Memory Limits

Chromium stores Blobs in shared memory, over IPC, or on disk, depending on their size and available resources. On 64-bit Windows the default in-memory limit is 2 GB, which can cause the main process to consume gigabytes of RAM during long recordings. Lowering the in-memory cap in Chromium's BlobMemoryController (e.g., to 200 MB) and raising the allowed disk quota eliminates the apparent memory leak.

Memory‑Leak Prevention in Renderer

All references to Blob objects must be cleared (set to null) after use, and large Blob slices should be reused rather than recreated. Tools such as chrome://blob-internals/, the Chrome DevTools memory profiler, and inspection of the blob_storage directory help detect leaks.
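As a concrete illustration of this rule, here is a small, hypothetical cleanup helper (the state shape and helper name are ours, not from the article):

```javascript
// After a recording has been uploaded, release every handle that can pin
// the Blob's backing storage: the object URL, the chunk array, and the
// assembled-Blob reference itself.
function releaseRecording(state) {
  if (state.objectUrl) {
    URL.revokeObjectURL(state.objectUrl); // drop the blob: URL reference
    state.objectUrl = null;
  }
  state.fileBits.length = 0; // drop every buffered chunk Blob
  state.videoFile = null;    // drop the assembled Blob so it can be GC'd
}
```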

Future Optimizations

The current fix still loads the entire file into memory for EBML parsing. Future work will stream the data to keep memory usage low.

Recruitment

We are the ByteDance Content‑Security Front‑End team, working on AI‑driven global content safety and image annotation. We welcome interns and full‑time engineers. Contact: [email protected]


Written by ByteFE

Cutting-edge tech, article sharing, and practical insights from the ByteDance frontend team.
