
Fundamentals of Audio and Video: Basics, Encoding, Processing, and Real‑Time Communication

This technical sharing session by a senior audio‑video engineer from 360 Video Cloud explains core concepts of video and audio, their encoding pipelines, media processing techniques, streaming protocols, and the challenges and key technologies behind real‑time communication (RTC).

360 Tech Engineering


1. Video Basics – Basic Concepts

Video consists of a sequence of image frames with accompanying sound. Applications include broadcast TV, disc formats such as VCD and Blu‑ray, and internet video: VOD (e.g., Youku, iQIYI), short video (TikTok, Kuaishou), live streaming (Huajiao, YY), and real‑time calls (Skype, video conferencing). Each frame is a grid of pixels, and each pixel carries three color components. Resolutions keep moving toward higher definitions such as 4K and 8K. Compression is needed because raw video is far too large for typical bandwidth and storage: a 1280×720 stream at 30 frames per second with 3 bytes per pixel needs 1280 × 720 × 3 × 30 ≈ 83 MB/s.
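The raw data rate quoted above is plain arithmetic, and a small helper makes it easy to try other resolutions. This is only an illustrative sketch: the function name `raw_video_rate` is made up for this example, and it assumes 3 bytes per pixel (8 bits per color component, no chroma subsampling).

```python
def raw_video_rate(width, height, fps, bytes_per_pixel=3):
    """Return the uncompressed video data rate in bytes per second.

    Assumes every pixel is stored with `bytes_per_pixel` bytes
    (3 = one byte for each of the three color components).
    """
    return width * height * bytes_per_pixel * fps


rate_720p = raw_video_rate(1280, 720, 30)
print(f"720p30: {rate_720p / 1e6:.1f} MB/s")          # 82.9 MB/s
print(f"720p30: {rate_720p * 8 / 1e6:.0f} Mbit/s")    # 664 Mbit/s

# The same formula applied to a 4K frame size, for comparison.
rate_4k = raw_video_rate(3840, 2160, 30)
print(f"4K30:   {rate_4k / 1e6:.1f} MB/s")
```

Expressed in bits, the 720p figure is several hundred Mbit/s, which is why uncompressed video is impractical over ordinary network links.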

2. Video Compression – Why and How

Compression reduces data size dramatically: the session cites reductions of roughly 500× for H.264 and 1000× for H.265, though the actual ratio depends on the content and the target quality. Compression exploits spatial redundancy (neighboring pixels within a frame are similar) and temporal redundancy (consecutive frames are similar). Most video compression is lossy and discards information that viewers are less likely to perceive.

The encoding pipeline consists of prediction, transform (DCT), quantization, and entropy coding. Intra‑prediction handles spatial redundancy; inter‑prediction (motion estimation and compensation) handles temporal redundancy. The DCT concentrates energy in a few low‑frequency coefficients, leaving many high‑frequency coefficients near zero. Quantization is the step that introduces loss; entropy coding (e.g., CABAC) is lossless. Frames are coded as I, P, or B frames and organized into GOP (group of pictures) structures.
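The transform and quantization steps can be illustrated with a toy example. The sketch below is a simplification, not how H.264 or H.265 are implemented: it applies a textbook 1‑D floating‑point DCT‑II to a single row of 8 pixel values, whereas real encoders apply 2‑D integer transforms to prediction residuals. All function names and the step size of 10 are assumptions chosen for illustration.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(samples))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def quantize(coeffs, step):
    """Lossy step: divide by the step size and round to integers."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Decoder side: scale the integer levels back up."""
    return [level * step for level in levels]


row = [100, 102, 104, 106, 108, 110, 112, 114]   # a smooth gradient
levels = quantize(dct_1d(row), step=10)
restored = idct_1d(dequantize(levels, step=10))

print("quantized levels:", levels)
print("non-zero levels: ", sum(1 for v in levels if v != 0))
print("max pixel error: ", max(abs(a - b) for a, b in zip(row, restored)))
```

For a smooth row like this one, almost all quantized coefficients become zero, so the entropy coder has very little to encode, while the reconstructed pixels differ from the originals by only a few levels. A larger step size gives more zeros and more error.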

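3. Audio Basics – Basic Concepts

Sound is a mechanical wave. Digitally, it is represented as a one‑dimensional waveform: a sequence of amplitude samples over time. Volume is measured in decibels (dB), a logarithmic unit, so a fixed step in dB corresponds to a fixed ratio in amplitude.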
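As a brief illustration of the logarithmic scale, the sketch below converts between an amplitude ratio and decibels using the standard relation dB = 20 · log10(A / A_ref). The function names are chosen for this example only.

```python
import math


def amplitude_to_db(amplitude, reference=1.0):
    """Level of `amplitude` relative to `reference`, in decibels."""
    return 20 * math.log10(amplitude / reference)


def db_to_gain(db):
    """Linear amplitude gain corresponding to a change of `db` decibels."""
    return 10 ** (db / 20)


print(f"{amplitude_to_db(2.0):+.2f} dB")   # doubling the amplitude: +6.02 dB
print(f"{amplitude_to_db(0.5):+.2f} dB")   # halving the amplitude:  -6.02 dB
print(f"{db_to_gain(20):.1f}x")            # +20 dB is a 10x amplitude gain
```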

4. Audio Compression – Principles and Codecs

Audio encoding exploits psychoacoustic masking in both the frequency domain and the time domain: sounds that are masked by louder nearby sounds can be coded coarsely or dropped. Different scenarios demand different codecs, such as low‑latency codecs for real‑time calls and wide‑band codecs for music. Common codecs include AAC, Opus, and MP3.

5. Media Processing

Video: transcoding (adjusting bitrate or resolution), visual effects (filters, stickers, transitions, beauty, collage, speed change), watermarks, and picture‑in‑picture.

Audio: pitch and tempo change, reverb, and fade‑in/fade‑out.
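Of the audio operations listed, fade‑in/fade‑out is the simplest to show in code: each sample is multiplied by a gain that ramps between 0 and 1. The following is a minimal sketch assuming mono samples in a plain Python list and a linear ramp; the function name `apply_fade` is hypothetical, and production tools often use other curve shapes.

```python
def apply_fade(samples, fade_in, fade_out):
    """Apply a linear fade-in over the first `fade_in` samples and a
    linear fade-out over the last `fade_out` samples."""
    n = len(samples)
    out = []
    for i, s in enumerate(samples):
        gain = 1.0
        if fade_in > 0 and i < fade_in:
            gain = min(gain, i / fade_in)              # ramp up from 0
        if fade_out > 0 and i >= n - fade_out:
            gain = min(gain, (n - 1 - i) / fade_out)   # ramp down to 0
        out.append(s * gain)
    return out


constant_tone = [1.0] * 10
print(apply_fade(constant_tone, fade_in=4, fade_out=4))
# [0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0]
```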

6. Bitstream, Container, and Protocol

Bitstream: the raw encoded data output by the encoder, usually unsuitable for direct storage or transmission.

Container: packages bitstreams for storage and transport (e.g., MP4, TS, FLV, AVI).

Protocol: defines how the packaged media is transmitted (e.g., RTMP, HLS, HTTP).

7. Video Production and Consumption

Production involves recording and editing; consumption involves playback on various devices.

8. Comparison of VOD, Live, and Real‑Time Calls

9. Real‑Time Audio‑Video Communication (RTC) – Challenges and Key Technologies

Challenges: acoustic echo, packet loss, network jitter.

Key Technologies:
• Echo cancellation (adaptive filters)
• Audio signal processing (automatic gain control (AGC), noise suppression, non‑linear processing (NLP))
• Forward Error Correction (FEC)
• Jitter buffer
• Packet loss concealment
• Multipoint Control Unit (MCU)
• Scalable Video Coding (SVC)
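To make the idea behind forward error correction concrete, here is a minimal sketch of the simplest possible scheme: one XOR parity packet per group of equal‑length packets, which lets the receiver rebuild any single lost packet without a retransmission. This is an illustrative assumption, not the scheme described in the session; production RTC stacks generally use more capable codes, and the function names below are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Sender side: build one parity packet for a group of packets."""
    return reduce(xor_bytes, packets)


def recover(received, parity):
    """Receiver side: rebuild the single missing packet (marked None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet")
    present = [p for p in received if p is not None]
    # XOR of the parity with every surviving packet yields the lost one.
    restored = reduce(xor_bytes, present, parity)
    result = list(received)
    result[missing[0]] = restored
    return result


group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)

# Simulate losing the second packet in transit.
arrived = [b"AAAA", None, b"CCCC"]
print(recover(arrived, parity))   # [b'AAAA', b'BBBB', b'CCCC']
```

The trade‑off is bandwidth: here every three media packets cost one extra parity packet, and the scheme cannot repair two losses within the same group.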
