How to Build a Web Video Cutter with FFmpeg, WASM, and OffscreenCanvas

This article walks through the design and implementation of a web-based video cutter: its overall architecture, ffmpeg compiled to WebAssembly, OffscreenCanvas rendering, and worker threads, with code examples and practical tips for achieving smooth preview and export.

ELab Team

Recently I have been developing a web video editing tool (a cutter) that lets teachers edit their recorded videos easily.

This project will be shared across several installments. This one introduces the overall pipeline, to give a basic understanding of how it works.

(Figure: the web editor)

(Figure: the export result)

After reading this article, you will:

Understand the basic principles of web video editing tools.

Be able to implement a demo using ffmpeg + wasm + OffscreenCanvas.

Technical stack

The whole stack can be divided into:

Underlying SDK: vesdk

Core interaction layer: the web editing tool

Backend: the video architecture side, which handles video composition

Problem statements

To better understand the whole stack, we raise two questions:

Q1: How is video editing preview achieved on the web?

Q2: How to ensure consistency between preview and composition results?

Q1: How is video editing preview achieved on the web?

Currently there are two main approaches for web video editing:

Using native JavaScript based on browser‑provided APIs.

Compiling existing C/C++ video editing frameworks to WebAssembly and running them in the browser.

Image source: "The Technical Evolution of VESDK: Web Audio/Video Editing" (VESDK技术演进之Web音视频编辑技术)

vesdk uses the second approach (ffmpeg + wasm). The overall flow is as follows:

(Figure: workflow diagram)

Scheduling logic:

During decoding and rendering, frames are produced as fast as possible, placed into a buffer pool, and displayed using requestAnimationFrame with FPS‑based timing.
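A minimal sketch of that scheduling loop, with bufferPool, fps, and ctx as assumed names for illustration (not vesdk's actual API):

const frameDuration = 1000 / fps; // target display time per frame, in ms
let lastPresented = 0;

function present(now) {
  // Only advance when enough time has passed and a decoded frame is ready.
  if (now - lastPresented >= frameDuration && bufferPool.length > 0) {
    ctx.drawImage(bufferPool.shift(), 0, 0); // display the oldest decoded frame
    lastPresented = now;
  }
  requestAnimationFrame(present);
}
requestAnimationFrame(present);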

How audio and text are drawn in editing:

Audio: handled on the main thread through OpenAL; in Emscripten builds, OpenAL calls are implemented on top of the Web Audio API.

Text and effects: drawn directly with WebGL shaders.

YUV

YUV is a color encoding format that can be converted to RGB with simple formulas. Because human vision is less sensitive to chrominance, the U and V channels can be subsampled (e.g., YUV420), so YUV data typically takes less storage than RGB.

R = Y + 1.140 * V
G = Y - 0.394 * U - 0.581 * V
B = Y + 2.032 * U

"Y" represents luminance, while "U" and "V" represent chrominance.

FFmpeg

Raw media data is large, so sources are encoded to compress them and then packaged into containers (encoding and muxing).

During playback, the container is demuxed and decoded to obtain raw data.

FFmpeg is a leading multimedia framework that supports most common container formats, protocols, and codecs.

FFmpeg provides two usage modes:

Command‑line tools (ffmpeg, ffprobe) for reading and writing media files.

# Example: decode template.mp4 into raw YUV420p frames
ffmpeg -i template.mp4 -pix_fmt yuv420p template.yuv

Library APIs: calling FFmpeg's C libraries (e.g., libavformat, libavcodec) from C/C++ to demux and decode media and obtain raw frames.

// Example snippet: demux packets and decode video frames.
void decode() {
    const char *path = "/template.mp4";
    ...
    // Pull packets from the container until EOF.
    while (av_read_frame(avformat_context, packet) >= 0) {
        if (packet->stream_index == videoStream) {
            int ret = avcodec_send_packet(avcodec_context, packet);
            if (ret < 0) break;
            // Drain every frame the decoder produces for this packet.
            while (ret >= 0) {
                ret = avcodec_receive_frame(avcodec_context, frame);
                if (ret < 0) break;   // AVERROR(EAGAIN)/AVERROR_EOF: need more input
                sws_scale(...);       // convert the frame to the target pixel format
                fwrite(...);          // write the raw frame data out
            }
        }
        av_packet_unref(packet);      // release the packet's buffers
    }
}

WebAssembly

WebAssembly is a safe, portable, efficient binary format that can be generated from languages like C++ and run directly in browsers.

Example: a simple addition demo compiled to test.wasm.
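A hypothetical version of that demo: export an add function from C (int add(int a, int b) { return a + b; }), compile it to test.wasm with Emscripten, and call it from the page. This sketch assumes the compiled module needs no imports:

// Load test.wasm and call its exported `add` function.
WebAssembly.instantiateStreaming(fetch('/test.wasm'))
  .then(({instance}) => {
    console.log(instance.exports.add(1, 2)); // -> 3
  });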

OffscreenCanvas

OffscreenCanvas is a canvas that renders off screen, decoupled from the DOM; it can be used both on the main thread and in Web Workers.

It is typically used with workers in two scenarios:

Mode 1: Synchronous display of OffscreenCanvas frames

Process: the worker renders frames into an OffscreenCanvas, then transfers each finished frame back to the main thread for display (see the sketch after this list).

Advantages: Main thread can directly control rendering.

Disadvantages: frame display still goes through the main thread, so a busy UI can delay presentation.
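A minimal sketch of Mode 1 (the canvas size and drawing code are placeholders):

// worker.js: render off-screen, then ship each frame to the main thread.
const offscreen = new OffscreenCanvas(640, 360);
const ctx = offscreen.getContext('2d');
function renderFrame() {
  // ... draw the current frame into ctx ...
  const bitmap = offscreen.transferToImageBitmap();
  postMessage({bitmap}, [bitmap]); // zero-copy transfer to the main thread
}

// main.js: display each transferred bitmap on the visible canvas.
const visibleCtx = document.querySelector('canvas').getContext('bitmaprenderer');
worker.onmessage = ({data}) => visibleCtx.transferFromImageBitmap(data.bitmap);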

Mode 2: Asynchronous display of OffscreenCanvas frames

Process: the main thread calls transferControlToOffscreen() on a visible canvas and posts the resulting OffscreenCanvas to the worker; the worker renders, and frames are committed directly to the compositor (see the sketch after this list).

Advantages: Rendering not blocked by the main thread; avoids heavy computation blocking UI.

Disadvantages: Main thread cannot control rendering.
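And a minimal sketch of Mode 2, again with placeholder drawing code:

// main.js: hand control of the visible canvas to the worker.
const canvas = document.querySelector('canvas');
const offscreen = canvas.transferControlToOffscreen();
worker.postMessage({canvas: offscreen}, [offscreen]);

// worker.js: render directly; frames go straight to the compositor.
onmessage = ({data}) => {
  const ctx = data.canvas.getContext('2d');
  // ... draw frames with ctx; the main thread is no longer involved ...
};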

Experiments show that moving decoding to a worker prevents main‑thread blocking, and using OffscreenCanvas for rendering avoids UI stalls.

Q2: How to ensure preview and composition consistency?

Because browsers have performance limits, front‑end composition can be unstable, so server‑side composition is used.

A draft protocol is defined so that cloud composition replays the same steps as the front-end preview; a hypothetical sketch follows.
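vesdk's actual schema isn't shown in this article, but a draft of this kind might look roughly like the following, where every field name is an assumption for illustration:

// Hypothetical draft document shared by preview and cloud composition.
const draft = {
  version: 1,
  canvas: {width: 1280, height: 720, fps: 30},
  tracks: [
    {type: 'video', clips: [{src: 'template.mp4', in: 0, out: 12.5}]},
    {type: 'text',  clips: [{content: 'Hello', start: 1, end: 4}]},
  ],
};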

Combining ffmpeg + wasm + worker + OffscreenCanvas yields a performant web video editing tool.

Practical demo: Building a web GIF subtitle generator

The goal is a web-based GIF subtitle generator that supports both in-browser (offline) synthesis and playback control.

Previously, GIF subtitles were generated on the server using ffmpeg, similar to video editing.

The new approach adapts this to the web using ffmpeg.wasm, workers, and OffscreenCanvas.

Previous pipeline issues:

Server CPU overload under high traffic.

Generated GIFs lacked preview playback control.

(Figure: the detailed pipeline)

(Figure: the generated result)

Implementation details

We use the pre‑compiled ffmpeg.wasm library.

Step 1: Load worker

useEffect(() => {
  // Spin up the worker that owns ffmpeg.wasm and the OffscreenCanvas.
  const gifWorker = new Worker('http://localhost:3000/gif_worker_offscreen.js');
  gifWorker.onmessage = function(msg) {
    // The worker posts back a blob URL once the GIF is ready.
    if (msg.data.method === 'transfer') {
      setGifSrc(msg.data.url);
    }
  };
  setGifWorker(gifWorker);
  return () => gifWorker.terminate(); // avoid leaking the worker on unmount
}, []);

Step 2: Initialize worker and import ffmpeg

// gif_worker_offscreen.js
importScripts('/ffmpeg.dev.js'); // exposes self.FFmpeg
const {createFFmpeg, fetchFile} = self.FFmpeg;
const ffmpeg = createFFmpeg({ corePath: 'http://localhost:3000/ffmpeg-core.js' });

onmessage = async (event) => {
  const method = event.data.method;
  if (method === 'init') {
    // initialization code...
  }
  // other methods...
};

Step 3: Generate GIF

async function decodeResource() {
  if (!ffmpeg.isLoaded()) await ffmpeg.load(); // lazy-load the wasm core
  // Copy the inputs into ffmpeg.wasm's in-memory filesystem (MEMFS).
  ffmpeg.FS('writeFile', 'template.mp4', await fetchFile('http://localhost:3000/1/template.mp4'));
  // other file writes...
  // Burn the .ass subtitles into the video and emit a GIF.
  await ffmpeg.run('-i', 'template.mp4', '-vf', "subtitles=template.ass:fontsdir=/tmp:force_style='Fontname=Microsoft YaHei'", 'export.gif');
  const data = ffmpeg.FS('readFile', 'export.gif');
  const url = URL.createObjectURL(new Blob([data.buffer], {type: 'image/gif'}));
  postMessage({method: 'transfer', url}); // hand the blob URL back to the page
}

Step 4: Play GIF

async function playCore(ctx) {
  // Number of frames at the chosen sampling interval.
  const totalLength = Math.floor(duration / timeInterval);
  clearInterval(playTimer);
  playTimer = setInterval(async () => {
    if (!canPlay) return; // paused
    playIndex++;
    if (playIndex === totalLength) { // past the last frame: stop
      clearInterval(playTimer);
      return;
    }
    // Frames were pre-extracted into MEMFS as image<N>.jpg.
    const data = ffmpeg.FS('readFile', `image${playIndex}.jpg`);
    const imageBitmap = await self.createImageBitmap(new Blob([data.buffer]));
    ctx.drawImage(imageBitmap, 0, 0); // paint onto the OffscreenCanvas
    imageBitmap.close(); // release the bitmap's memory promptly
  }, timeInterval);
}

Conclusion

This article shares interesting findings from building a video cutter; each aspect can be further explored.

Future posts will dive into the front-end implementation details of the web editing tool.
