Live Streaming Process Model: Capture, Sampling, Encoding, and Audio Channel Technologies
This article explains the live streaming workflow, detailing audio and video capture, digital sampling rates and bit depths, various sound channel configurations from mono to immersive formats, and common audio encoding methods such as PCM, AAC, MP3, and FLAC.
Live Streaming Process Model
The live streaming workflow consists of five main stages; this article focuses on the first two: capture and encoding.
1. Capture
1.1 Audio Capture
Audio capture converts sound into digital data: a microphone turns sound waves into an analog electrical signal, and an analog-to-digital converter (ADC) samples that signal. Playback reverses the process with a digital-to-analog converter (DAC). Two key parameters are sampling rate (frequency) and sampling size (bit depth).
Sampling Rate
Sampling rate indicates how many samples are taken per second, measured in hertz (Hz). Higher rates improve fidelity but increase data size.
8,000 Hz -- telephone sampling rate (sufficient for speech)
11,025 Hz -- AM broadcast sampling rate
24,000 Hz -- FM broadcast sampling rate
44,100 Hz -- audio CD sampling rate
47,250 Hz -- early commercial PCM recorders
48,000 Hz -- digital TV, DVD, professional audio
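The link between sampling rate and fidelity can be sketched with the Nyquist limit: a given rate can only represent frequencies up to half of that rate, and a pure tone above the limit shows up as a lower "alias" frequency. The snippet below is a minimal illustration, not part of any streaming SDK; the function names `nyquist_hz` and `alias_frequency_hz` are made up for this example.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def alias_frequency_hz(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling.

    Tones at or below the Nyquist limit come back unchanged; tones
    above it fold back to the distance from the nearest multiple of
    the sampling rate.
    """
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)


if __name__ == "__main__":
    for rate in (8_000, 44_100, 48_000):
        print(f"{rate} Hz sampling -> up to {nyquist_hz(rate):.0f} Hz of audio")
    # A 5 kHz tone does not fit into telephone-rate audio.
    print(alias_frequency_hz(5_000, 8_000))
```

This is why 8,000 Hz is enough for speech but not for music: anything above 4 kHz has to be filtered out before sampling, or it is folded back into the audible band.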
Higher rates exist, but a 48 kHz sampling rate already captures frequencies up to 24 kHz (half the sampling rate), which is beyond the roughly 20 kHz upper limit of human hearing.
Sampling Size (Bit Depth)
Bit depth determines how many discrete amplitude levels each sample can represent. Common depths are 8 bits (256 levels, low quality) and 16 bits (65,536 levels, CD quality). Higher bit depth yields finer detail at the cost of larger files.
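As a rough sketch of what bit depth means in practice, the snippet below computes the number of levels for a given depth, an approximate dynamic range, and the integer value a sample would be rounded to. It assumes samples are floats in the range -1.0 to 1.0 and are stored as signed integers; the function names are illustrative only.

```python
import math


def quantization_levels(bit_depth):
    """Number of distinct amplitude values a sample can take."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Approximate ratio between the largest and smallest step, in dB."""
    return 20 * math.log10(quantization_levels(bit_depth))


def quantize(sample, bit_depth):
    """Round a float sample in [-1.0, 1.0] to a signed integer code.

    Assumption: signed storage, so 16 bits covers -32768..32767.
    """
    scale = 2 ** (bit_depth - 1)
    value = round(sample * scale)
    # Clamp so that +1.0 does not overflow the positive range.
    return max(-scale, min(scale - 1, value))


if __name__ == "__main__":
    for bits in (8, 16, 24):
        print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 1))
    print(quantize(0.5, 8), quantize(0.5, 16))
```

Each extra bit doubles the number of levels and adds about 6 dB of dynamic range, which is where the quality gap between 8-bit and 16-bit audio comes from.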
1.2 Sound Channels
Channel configurations describe how many separate audio channels are recorded and reproduced. They have evolved as follows:
Mono (1.0) – single speaker.
Stereo (2.0) – left and right speakers.
5.1 – five speakers plus a subwoofer.
7.1 – adds two rear speakers to 5.1.
Dolby Atmos / DTS:X – object-based audio that can map sounds to any speaker layout, including overhead speakers.
Object‑based mixing allows a single audio track to be rendered for any speaker configuration, simplifying production for immersive formats.
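For the fixed channel layouts, the label itself encodes the channel count: the first number is the count of full-range speakers and the second is the count of subwoofer (low-frequency) channels. The helper below is a small sketch of that reading; `channel_count` is a hypothetical name, and the three-part label `7.1.4` (where the last number counts overhead speakers) is included only as an example of an immersive speaker layout. Object-based formats are not tied to one count, so this applies to the speaker layout only, not to the mix.

```python
def channel_count(layout):
    """Total discrete channels implied by a label such as "5.1".

    Each dot-separated part is a count (full-range, subwoofer, and
    optionally overhead speakers), so the total is their sum.
    """
    return sum(int(part) for part in layout.split("."))


if __name__ == "__main__":
    for label in ("1.0", "2.0", "5.1", "7.1", "7.1.4"):
        print(label, channel_count(label))
```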
2. Encoding
2.1 Audio Encoding
Raw audio is captured as PCM (Pulse‑Code Modulation) data, an uncompressed binary representation of the sampled waveform. PCM offers lossless quality but large file size, so it is usually compressed.
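The size of raw PCM follows directly from the capture parameters: sampling rate times bit depth times channel count. The arithmetic below is a minimal sketch with illustrative function names; it uses CD-quality stereo as the example input.

```python
def pcm_bits_per_second(sample_rate_hz, bit_depth, channels):
    """Raw PCM bitrate with no compression and no container overhead."""
    return sample_rate_hz * bit_depth * channels


def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Raw PCM size in bytes for a clip of the given duration."""
    return pcm_bits_per_second(sample_rate_hz, bit_depth, channels) * seconds // 8


if __name__ == "__main__":
    bitrate = pcm_bits_per_second(44_100, 16, 2)
    print(f"CD-quality stereo: {bitrate / 1000:.1f} kbps")
    print(f"One minute: {pcm_bytes(44_100, 16, 2, 60) / 1_000_000:.1f} MB")
```

At roughly 1,411 kbps for CD-quality stereo, uncompressed PCM needs about ten times the bandwidth of a 128 kbps compressed stream, which is why encoding is the next stage.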
Compression Types
Lossy compression removes perceptually irrelevant data (e.g., MP3, AAC, Ogg Vorbis), while lossless compression retains all original information (e.g., FLAC, ALAC, APE).
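The difference between the two families can be illustrated with stand-ins from the Python standard library. This is only a sketch of the principle: `zlib` plays the role of a lossless codec (it is not FLAC), and dropping the low 8 bits of every sample plays the role of a lossy codec (real codecs such as AAC use perceptual models instead). All function names are made up for this example.

```python
import math
import struct
import zlib


def make_pcm16(num_samples=1000, rate=8000, freq=440.0):
    """Generate a mono sine wave as 16-bit little-endian PCM bytes."""
    samples = [
        int(32767 * math.sin(2 * math.pi * freq * n / rate))
        for n in range(num_samples)
    ]
    return struct.pack(f"<{num_samples}h", *samples)


def lossless_roundtrip(pcm):
    """Compress and decompress; every byte must come back unchanged."""
    return zlib.decompress(zlib.compress(pcm))


def lossy_roundtrip(pcm):
    """Throw away the low 8 bits of each sample (stand-in for lossy coding)."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm)
    reduced = [(s >> 8) << 8 for s in samples]
    return struct.pack(f"<{count}h", *reduced)


if __name__ == "__main__":
    original = make_pcm16()
    print("lossless identical:", lossless_roundtrip(original) == original)
    print("lossy identical:", lossy_roundtrip(original) == original)
```

The lossless path reproduces the input exactly, while the lossy path returns audio of the same length that is close to, but not the same as, the original.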
Common Formats
Format -- Characteristics
WAV -- Uncompressed PCM, typically with a 44-byte header; excellent quality, large size.
MP3 -- Lossy; good quality at bitrates of 128 kbps and above, widely supported.
AAC -- Modern lossy codec; high quality at low bitrates, dominant in live streaming.
FLAC -- Lossless; retains PCM quality with moderate compression.
APE -- Lossless; higher compression ratio than FLAC, less common.
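The relationship between WAV and raw PCM can be checked with Python's standard `wave` module, which wraps PCM bytes in a WAV container. The sketch below writes a short stretch of silence into an in-memory buffer and measures how many bytes the container adds; `pcm_to_wav_bytes` is a hypothetical helper name, and the default parameters are just CD-quality stereo.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=44_100, channels=2, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample: 2 = 16 bits
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


if __name__ == "__main__":
    silence = bytes(4 * 100)  # 100 frames of 16-bit stereo silence
    wav_bytes = pcm_to_wav_bytes(silence)
    print("container overhead in bytes:", len(wav_bytes) - len(silence))
    print("magic:", wav_bytes[:4], wav_bytes[8:12])
```

Everything after the header is the PCM data unchanged, so a WAV file is essentially raw PCM plus a description of its sampling rate, bit depth, and channel count.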
Live streaming platforms typically use AAC for audio because it balances quality and bandwidth.
New Oriental Technology