Unlocking Web Audio: Mastering Audio DSP with the Web Audio API
This article introduces the Web Audio API, explains core audio DSP concepts such as signal processing and the time and frequency domains, and details the main AudioContext and AudioNode components, followed by practical examples, code snippets, and references for building sophisticated web‑based audio applications.
Introduction
The Web Audio API was proposed in 2011 and implemented the same year by Google Chrome and Mozilla Firefox.
Before that, web audio was primitive and unsuitable for complex scenarios like web games or interactive apps.
The goal is to provide a complete web‑audio solution, including a modern game audio engine and capabilities for mixing, processing, and filtering, covering some of the functions of a digital audio workstation (DAW).
Fundamental Concepts
Audio Digital Signal Processing (Audio DSP)
Audio DSP includes oscillators, filters, synthesisers, and other building blocks.
Sound Signal
Vibrations between 20 Hz and 20 kHz generate sound waves, represented as analogue or digital signals, sampled by microphones or pickups, or directly synthesised.
The vibration frequency is called pitch; the amplitude is called volume.
Continuous‑valued analogue signals are approximated to discrete‑valued digital signals using pulse‑code modulation (PCM).
Typical sample rates are 48 kHz or 44.1 kHz.
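As a sketch of what PCM amounts to, the snippet below quantises a continuous signal into signed 16‑bit integer samples at a fixed rate. The helper `floatTo16BitPCM` is our own illustrative name, not part of the Web Audio API:

```javascript
// Quantise a [-1, 1] float sample to a signed 16-bit PCM value.
function floatTo16BitPCM(x) {
  const clamped = Math.max(-1, Math.min(1, x));
  // 16-bit PCM is asymmetric: -32768 .. 32767.
  return Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
}

// Sample one second of a 440 Hz sine at 44.1 kHz.
const sampleRate = 44100;
const pcm = new Int16Array(sampleRate);
for (let i = 0; i < sampleRate; i++) {
  pcm[i] = floatTo16BitPCM(Math.sin(2 * Math.PI * 440 * i / sampleRate));
}
```

Halving the sample rate or the bit depth coarsens the approximation; 48 kHz / 16‑bit is already transparent for the 20 Hz–20 kHz hearing range.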
Time Domain
The time‑domain waveform shows how a signal varies over time; an oscilloscope can display it.
Frequency Domain
Mathematical transforms convert between time and frequency domains, commonly using the Fourier transform and the Fast Fourier Transform (FFT) algorithm.
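To make the time‑to‑frequency conversion concrete, here is a naive discrete Fourier transform returning the magnitude of each frequency bin. This O(n²) version is purely illustrative; in a browser the AnalyserNode exposes the same information efficiently via an FFT:

```javascript
// Naive discrete Fourier transform: magnitude of each frequency bin.
// O(n^2); real applications use an FFT instead.
function dftMagnitudes(samples) {
  const n = samples.length;
  const mags = new Array(n);
  for (let k = 0; k < n; k++) {
    let re = 0, im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (-2 * Math.PI * k * t) / n;
      re += samples[t] * Math.cos(angle);
      im += samples[t] * Math.sin(angle);
    }
    mags[k] = Math.hypot(re, im); // magnitude of bin k
  }
  return mags;
}
```

A pure sine wave concentrates all its energy in a single bin (plus its mirror), which is exactly what a spectrum view of the frequency domain shows.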
AudioContext
The Web Audio API provides an AudioContext as the DSP operation space, implementing modular routing.
Use connect to link nodes and disconnect when finished.
Audio flows from inputs to outputs.
Control playback with suspend, resume, and close.
For security, a user gesture is required; otherwise the context stays in a suspended state.
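A common pattern is to resume the context from inside a click handler. `ensureRunning` below is a small helper of our own, written so its logic works with any object exposing `state` and `resume()` the way the real AudioContext does:

```javascript
// Resume an AudioContext that the browser created in the "suspended"
// state because of the autoplay policy. Hypothetical helper; accepts
// any object with `state` and `resume()` like the real AudioContext.
async function ensureRunning(ctx) {
  if (ctx.state === 'suspended') await ctx.resume();
  return ctx.state;
}

// In a browser, call it from a user gesture:
// const ctx = new AudioContext();
// document.querySelector('#play')
//   .addEventListener('click', () => ensureRunning(ctx));
```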
AudioNode
An AudioNode is the basic processing unit inside an audio context.
Common nodes:
- ScriptProcessorNode: JavaScript‑based audio generation/processing (deprecated in favour of AudioWorklet, but still widely used).
- AnalyserNode: time‑ and frequency‑domain analysis.
- ChannelMergerNode and ChannelSplitterNode: channel merging and splitting.
- AudioDestinationNode: default output, via AudioContext.destination.
- MediaStreamAudioDestinationNode: WebRTC MediaStream output.
- GainNode: volume gain (a linear multiplier).
- DelayNode: delay effect.
- ConvolverNode: convolution reverb.
- StereoPannerNode: stereo panning.
- PannerNode: 3‑D spatialisation.
- WaveShaperNode: waveshaping distortion.
- DynamicsCompressorNode: compression/side‑chain.
- BiquadFilterNode: EQ filtering.
- OscillatorNode: generates sine, square, sawtooth, triangle, or custom periodic waves.
- AudioBufferSourceNode: plays decoded PCM data.
- MediaElementAudioSourceNode: connects HTML5 media elements.
- MediaStreamAudioSourceNode: connects WebRTC streams.
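Modular routing typically connects a source through one or more effect nodes to the destination. `connectChain` is a hypothetical convenience helper of ours; the browser‑only calls are shown in comments:

```javascript
// Connect a list of nodes in series and return the first one.
// Works with any objects exposing connect(next) - e.g. real AudioNodes.
function connectChain(...nodes) {
  for (let i = 0; i + 1 < nodes.length; i++) {
    nodes[i].connect(nodes[i + 1]);
  }
  return nodes[0];
}

// In a browser:
// const ctx = new AudioContext();
// const osc = new OscillatorNode(ctx, { type: 'sawtooth', frequency: 220 });
// const gain = new GainNode(ctx, { gain: 0.2 });
// connectChain(osc, gain, ctx.destination);
// osc.start();
```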
Example: Using Oscillator, Gain, and Custom Waveforms
See the CodePen example at https://codepen.io/jamesliu96/pen/oNGgWOb
Example: Fade In/Out Mixer
See the CodePen example at https://codepen.io/jamesliu96/pen/jOYedQR
Example: Chime
Melody is transcribed by the author; copyright belongs to the original creator.
Example: Pitcher
Amplitude
Signal level is estimated as the root‑mean‑square (RMS) of the sample buffer.
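A minimal RMS sketch; in a browser the buffer would typically come from AnalyserNode.getFloatTimeDomainData:

```javascript
// Root-mean-square level of a buffer of time-domain samples.
function rms(buf) {
  let sum = 0;
  for (let i = 0; i < buf.length; i++) sum += buf[i] * buf[i];
  return Math.sqrt(sum / buf.length);
}
```

A full‑scale sine wave has an RMS of 1/√2 ≈ 0.707, which is why peak and RMS meters disagree.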
Auto Correlate
A discrete auto‑correlation of the buffer reveals the fundamental period, from which the pitch is derived.
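A simple sketch of that idea: correlate the buffer with lagged copies of itself and read the pitch off the best lag. The function name and the frequency bounds are our own choices, and production pitch trackers add peak interpolation and voicing checks on top of this:

```javascript
// Naive auto-correlation pitch estimator (illustrative sketch).
// buf: time-domain samples; returns estimated frequency in Hz, or -1.
function autoCorrelate(buf, sampleRate, minFreq = 80, maxFreq = 1000) {
  const n = buf.length;
  const minLag = Math.floor(sampleRate / maxFreq);
  const maxLag = Math.min(Math.floor(sampleRate / minFreq), n - 1);
  let bestLag = -1, best = 0;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let sum = 0;
    for (let i = 0; i + lag < n; i++) sum += buf[i] * buf[i + lag];
    const norm = sum / (n - lag); // normalise by overlap length
    if (norm > best) { best = norm; bestLag = lag; }
  }
  return bestLag > 0 ? sampleRate / bestLag : -1;
}
```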
Pitch
Based on the twelve‑tone equal temperament, the standard pitch is 440 Hz.
With p the MIDI note number and f the frequency in Hz:

p = 69 + 12 · log2(f / 440)

so f = 440 Hz gives p = 69.
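The conversion in both directions as a pair of one‑line helpers (the names are ours):

```javascript
// Twelve-tone equal temperament, A440 reference.
// MIDI note number p for a frequency f in Hz, and the inverse.
const freqToMidi = (f) => 69 + 12 * Math.log2(f / 440);
const midiToFreq = (p) => 440 * Math.pow(2, (p - 69) / 12);
```

For example, middle C (MIDI 60) comes out at about 261.63 Hz.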
A440 = 440 Hz = MIDI note #69
References
Web Audio API - Web APIs | MDN
More
https://tonejs.github.io/