
Implementing Multi‑Audio Stream Playback on Web with WebRTC, HTTP‑FLV/HLS and Web Audio API

This article details how to replace WebRTC‑based multi‑audio streaming with HTTP‑FLV or HLS on the web, covering iOS Safari limitations, demuxing FLV, building ADTS headers, decoding with the Web Audio API, channel merging, and real‑time audio visualization using AnalyserNode.

360 Tech Engineering

The web version of a voice‑chat live room originally used WebRTC to transmit multiple audio streams, but due to cloud service constraints and iOS Safari restrictions (which allow only a single HTML5 audio element), the playback needed to be switched to HTTP‑FLV or HLS protocols.

Testing confirms that iOS Safari can play only one audio stream at a time, so the solution delivers HTTP‑FLV/HLS streams via the JavaScript libraries flv.js and hls.js (which build on Media Source Extensions) and uses the Web Audio API to decode, mix, and visualize the audio data.

Implementation steps include:

Fetching the media file as a stream with fetch and a ReadableStream reader.

Demultiplexing (demux) the FLV container to extract AAC packets.

Constructing a 7‑byte ADTS header for each AAC frame using the audioObjectType, samplingFrequencyIndex, and channelConfiguration.

Decoding the ADTS‑wrapped AAC data to an AudioBuffer via AudioContext.decodeAudioData.

Creating multiple AudioBufferSourceNode objects, merging them with AudioContext.createChannelMerger, and routing the result to the destination.

Using an AnalyserNode (FFT size 256) to obtain time‑domain and frequency‑domain data for visualization.
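The demux step above can be illustrated by parsing FLV tag headers directly. A minimal sketch (the helper name parseFlvTagHeader is ours, and real code must also skip the 9‑byte FLV file header and the 4‑byte previous‑tag‑size field between tags):

```javascript
// Parse the 11-byte header of a single FLV tag (illustrative sketch).
// Layout per the FLV spec: tag type (1 byte), data size (3 bytes),
// timestamp (3 bytes, plus 1 extended high byte), stream ID (3 bytes, always 0).
function parseFlvTagHeader(bytes, offset = 0) {
  const type = bytes[offset]; // 8 = audio, 9 = video, 18 = script data
  const dataSize =
    (bytes[offset + 1] << 16) | (bytes[offset + 2] << 8) | bytes[offset + 3];
  const timestamp =
    (bytes[offset + 7] << 24) | // extended timestamp is the most significant byte
    (bytes[offset + 4] << 16) |
    (bytes[offset + 5] << 8) |
    bytes[offset + 6];
  return { type, dataSize, timestamp };
}

// Example: an audio tag with a 4-byte payload at timestamp 0.
const tag = parseFlvTagHeader(
  new Uint8Array([8, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0])
);
// tag → { type: 8, dataSize: 4, timestamp: 0 }
```

For audio tags (type 8), the first byte of the payload identifies the codec; for AAC, the raw frames extracted here are what the ADTS header below gets prepended to.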

Key code snippets:

fetch(url).then(response => {
  const reader = response.body.getReader();
  reader.read().then(function process(result) {
    if (result.done) return;
    // ...process chunk (result.value is a Uint8Array)...
    return reader.read().then(process); // keep pulling until the stream ends
  }).then(() => {
    console.log('done!');
  });
});
// 7-byte ADTS header (no CRC) prepended to each raw AAC frame
const packet = new Uint8Array(7);
packet[0] = 0xff;                // syncword, high 8 bits (0xFFF)
packet[1] = 0xf0;                // syncword, low 4 bits
packet[1] |= (0 << 3);           // MPEG version: MPEG-4
packet[1] |= (0 << 1);           // layer, always 0
packet[1] |= 1;                  // protection absent: no CRC
packet[2] = (audioObjectType - 1) << 6;            // profile
packet[2] |= (samplingFrequencyIndex & 0x0f) << 2; // sampling frequency index
packet[2] |= (0 << 1);           // private bit
packet[2] |= (channelConfiguration & 0x04) >> 2;   // channel config, high bit
packet[3] = (channelConfiguration & 0x03) << 6;    // channel config, low 2 bits
// ...set remaining header bytes...
// Decode the ADTS-wrapped AAC into an AudioBuffer
ctx.decodeAudioData(ADTS.buffer).then(process);

// Play each decoded buffer through a shared ChannelMergerNode
const audioBufferSourceNode = ctx.createBufferSource();
const merger = ctx.createChannelMerger(2);
audioBufferSourceNode.buffer = buffer;
audioBufferSourceNode.connect(merger);
merger.connect(ctx.destination);
audioBufferSourceNode.start(this.startTime);
// Tap the signal with an AnalyserNode for visualization
const analyser = ctx.createAnalyser();
analyser.fftSize = 256;
const bufferLengthAlt = analyser.frequencyBinCount; // fftSize / 2 = 128 bins
const dataArrayAlt = new Uint8Array(bufferLengthAlt);
analyser.getByteFrequencyData(dataArrayAlt);
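The frequency bytes returned by getByteFrequencyData map directly onto bar heights for a canvas visualizer. A minimal sketch (the helper toBarHeights is illustrative, not part of the Web Audio API; the canvas drawing loop itself is omitted):

```javascript
// Scale byte-valued frequency bins (0–255) to bar heights in pixels
// (illustrative helper; call it once per animation frame after
// getByteFrequencyData refills the array).
function toBarHeights(frequencyData, maxHeight) {
  return Array.from(frequencyData, v => (v / 255) * maxHeight);
}

// With fftSize = 256 there are 128 frequency bins, hence 128 bars.
const heights = toBarHeights(new Uint8Array([0, 255]), 100);
// heights → [0, 100]
```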

To avoid blocking the main thread, heavy tasks such as fetching, demuxing, and ADTS packet creation are off‑loaded to Web Workers, communicating via postMessage and Transferable objects.
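A minimal sketch of that hand-off, assuming a hypothetical worker script demux.worker.js: the demuxed chunks are copied into one buffer, which is then listed as a Transferable so it moves between threads without a structured-clone copy.

```javascript
// Concatenate demuxed chunks into a single Uint8Array so one buffer
// can be transferred to/from the worker (illustrative sketch).
function concatChunks(chunks) {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// Browser-only wiring (the worker file name 'demux.worker.js' is illustrative).
if (typeof Worker !== 'undefined') {
  const worker = new Worker('demux.worker.js');
  worker.onmessage = e => {
    // e.data.buffer arrived as a Transferable: no structured-clone copy was made.
    const aac = new Uint8Array(e.data.buffer);
    // ...wrap in ADTS and hand to decodeAudioData on the main thread...
  };
  const merged = concatChunks([new Uint8Array([0xff]), new Uint8Array([0xf1])]);
  // Listing merged.buffer as a Transferable detaches it on this thread.
  worker.postMessage({ buffer: merged.buffer }, [merged.buffer]);
}
```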

References include the FLV specification, ISO/IEC 14496‑3, the MDN Web Audio API documentation, and related articles.

Written by 360 Tech Engineering