How to Build a Low‑Cost, High‑Concurrency Distributed Video Transcoding System on AWS

This article explains the design of a distributed video transcoding platform that leverages AWS Lambda and EC2‑SLAVE to achieve high‑concurrency, low‑latency streaming, detailing architecture, load balancing, health checks, task monitoring, and cost‑saving strategies for scalable cloud‑based video processing.

MXPlayer Technical Team
Overview

Video transcoding is the core technology used in the online video industry. It compresses high‑resolution videos into multiple resolutions for adaptive streaming on various devices. Videos are sliced, each fragment is transcoded into several resolutions, and then merged into a streaming format.

Transcoding is necessary but costly. Managed services such as Amazon Elastic Transcoder are expensive and offer limited performance and flexibility. Beyond transcoding itself, CDN, storage, and bandwidth costs can also be high, especially for large videos.

To address these challenges we built a distributed transcoding system that reduces costs, enables flexible scaling, and improves speed. It supports high‑concurrency multi‑video transcoding, adaptive adjustment, and per‑video bitrate optimization, saving bandwidth and delivering smooth streaming.

Video Download and Access

High Concurrency

For distributed transcoding, the segments of each video are requested concurrently by many AWS Lambda or EC2 compute units. Lambda can scale out to thousands of workers very quickly, but in our setup S3 could not serve that many simultaneous segment requests without added latency. We therefore deploy our own file servers to provide stable, high‑concurrency fragment access.

Minimum Disk Usage

We use HTTP Range requests to fetch only the fragments a worker needs, and FFmpeg's HTTP input can issue range requests when seeking, so a worker does not have to download the whole source file. Multiple file servers can host copies of a video, but we store a single copy per server to avoid redundant downloads and storage. A few high‑performance servers replace many low‑performance nodes, reducing network and disk load.
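A byte-range fetch can be sketched as follows. This is a minimal illustration, not our production client: the URL, offsets, and the helper name `build_range_request` are hypothetical, and the request is only constructed, not sent.

```python
from urllib.request import Request


def build_range_request(url: str, start: int, length: int) -> Request:
    """Build a GET request for `length` bytes beginning at byte offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    end = start + length - 1  # HTTP byte ranges are inclusive on both ends
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


# Hypothetical file-server URL; ask for 512 KiB starting at the 1 MiB mark.
req = build_range_request(
    "http://fileserver.example/videos/source.mp4", 1_048_576, 524_288
)
print(req.get_header("Range"))  # bytes=1048576-1572863
```

A server that honors the header answers with status 206 (Partial Content) and only the requested bytes.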

Intelligent Load Balancing

File servers monitor their own transcoding queues and predict future load. A server expected to become heavily loaded voluntarily yields new download requests to less‑loaded servers, achieving a polite, self‑balancing distribution without a central dispatcher.
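The yielding decision might look like the sketch below. It assumes each server can see its peers' reported load and predicts its queue with a simple linear model; the field names, the five-minute horizon, and the threshold of 50 tasks are illustrative assumptions, not values from our system.

```python
from dataclasses import dataclass


@dataclass
class ServerLoad:
    name: str
    queued_tasks: int    # transcoding tasks currently waiting on this server
    arrival_rate: float  # assumed: new tasks expected per minute
    drain_rate: float    # assumed: tasks completed per minute

    def predicted_queue(self, horizon_min: float) -> float:
        """Linear forecast of the queue length `horizon_min` minutes from now."""
        growth = (self.arrival_rate - self.drain_rate) * horizon_min
        return max(0.0, self.queued_tasks + growth)


def should_yield(me: ServerLoad, peers: list[ServerLoad],
                 horizon_min: float = 5.0, busy_threshold: float = 50.0) -> bool:
    """Yield a new download only if we expect to be busy AND a peer looks lighter."""
    mine = me.predicted_queue(horizon_min)
    if mine < busy_threshold:
        return False
    return any(p.predicted_queue(horizon_min) < mine for p in peers)


busy = ServerLoad("fs-a", queued_tasks=40, arrival_rate=10, drain_rate=6)
calm = ServerLoad("fs-b", queued_tasks=10, arrival_rate=2, drain_rate=4)
print(should_yield(busy, [calm]))  # True
print(should_yield(calm, [busy]))  # False
```

Because every server applies the same rule to itself, no central dispatcher has to make the decision.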

Automatic Health Checking

Each file server continuously monitors CPU, memory, network, and disk usage. If load exceeds a threshold, the server delays new downloads, ensuring high‑priority transcoding tasks are not affected and protecting the server from overload.
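A threshold check of this kind can be sketched as below. The metric names mirror the four resources listed above, but the threshold values and the `HostMetrics` type are assumptions chosen for illustration; collecting the metrics themselves is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostMetrics:
    # Utilization as fractions between 0.0 and 1.0.
    cpu: float
    memory: float
    network: float
    disk: float


# Assumed limits; real values would be tuned per server type.
THRESHOLDS = HostMetrics(cpu=0.85, memory=0.90, network=0.80, disk=0.90)


def overloaded_resources(m: HostMetrics, limits: HostMetrics = THRESHOLDS) -> list[str]:
    """Return the names of all resources whose usage exceeds its limit."""
    names = ("cpu", "memory", "network", "disk")
    return [n for n in names if getattr(m, n) > getattr(limits, n)]


def should_delay_download(m: HostMetrics) -> bool:
    """Delay new downloads as soon as any single resource is over its limit."""
    return bool(overloaded_resources(m))


sample = HostMetrics(cpu=0.92, memory=0.40, network=0.85, disk=0.30)
print(overloaded_resources(sample))   # ['cpu', 'network']
print(should_delay_download(sample))  # True
```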

Timely Disk Recycling

Large source videos (up to 200 GB) are stored on high‑speed SSDs. Disk space is limited, so the system monitors video usage and automatically removes videos no longer needed, freeing space for upcoming tasks while prioritizing active transcoding.
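One way to express this recycling policy is sketched below. It assumes a video is safe to delete once it has no active tasks and that the least recently used idle videos go first; both the eviction order and the `StoredVideo` fields are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredVideo:
    video_id: str
    size_gb: float
    active_tasks: int  # transcoding tasks still reading this file
    last_used: float   # Unix timestamp of the most recent access


def pick_videos_to_delete(videos: list[StoredVideo], free_gb: float,
                          target_free_gb: float) -> list[str]:
    """Choose idle videos, oldest first, until the free-space target is met."""
    victims: list[str] = []
    idle = sorted((v for v in videos if v.active_tasks == 0),
                  key=lambda v: v.last_used)
    for v in idle:
        if free_gb >= target_free_gb:
            break
        victims.append(v.video_id)
        free_gb += v.size_gb
    return victims


library = [
    StoredVideo("a", size_gb=100, active_tasks=0, last_used=10.0),
    StoredVideo("b", size_gb=50, active_tasks=2, last_used=5.0),   # in use, never deleted
    StoredVideo("c", size_gb=80, active_tasks=0, last_used=20.0),
]
print(pick_videos_to_delete(library, free_gb=20, target_free_gb=150))  # ['a', 'c']
```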

Split Main Task and Trigger Sub Task

Split Main Task

Video transcoding is CPU‑intensive. We split each main transcoding job (a specific resolution/bitrate) into many small sub‑tasks, each handling a video segment, allowing concurrent processing across multiple compute units.
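The split can be sketched as follows, assuming fixed-length segments measured in seconds. A real splitter would normally align cuts with keyframes; that detail, the segment length, and the `SubTask` fields are assumptions left out or simplified here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTask:
    index: int
    start_sec: float
    duration_sec: float
    profile: str  # target resolution/bitrate of the parent main task


def split_main_task(video_duration_sec: float, segment_sec: float,
                    profile: str) -> list[SubTask]:
    """Cut one main task into consecutive fixed-length segment sub-tasks."""
    if video_duration_sec <= 0 or segment_sec <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(video_duration_sec / segment_sec)
    return [
        SubTask(
            index=i,
            start_sec=i * segment_sec,
            # The last segment may be shorter than the others.
            duration_sec=min(segment_sec, video_duration_sec - i * segment_sec),
            profile=profile,
        )
        for i in range(count)
    ]


for task in split_main_task(25, 10, "720p"):
    print(task.index, task.start_sec, task.duration_sec)
```

Each sub-task carries everything a worker needs (offset, length, target profile), so sub-tasks can run on any compute unit in any order.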

Trigger Sub Task

Sub‑tasks are placed in an internal queue, then dispatched to EC2‑SLAVE or AWS‑Lambda based on current resource availability, avoiding the need to keep a large pool of idle EC2 instances.
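A dispatcher along these lines is sketched below. It assumes idle EC2‑SLAVE slots are filled first and Lambda absorbs the overflow up to a concurrency budget; that preference order and both capacity parameters are illustrative assumptions.

```python
from collections import deque


def dispatch(queue: deque, free_ec2_slots: int, lambda_budget: int) -> list[tuple[str, str]]:
    """Pop sub-tasks from the queue and assign each to a compute target.

    Tasks that cannot be placed stay in the queue for the next round.
    """
    assignments: list[tuple[str, str]] = []
    while queue and (free_ec2_slots > 0 or lambda_budget > 0):
        task = queue.popleft()
        if free_ec2_slots > 0:
            free_ec2_slots -= 1
            assignments.append((task, "EC2-SLAVE"))
        else:
            lambda_budget -= 1
            assignments.append((task, "AWS-Lambda"))
    return assignments


pending = deque(["seg-0", "seg-1", "seg-2", "seg-3", "seg-4"])
for task, target in dispatch(pending, free_ec2_slots=2, lambda_budget=2):
    print(task, "->", target)
print("still queued:", list(pending))  # still queued: ['seg-4']
```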

Monitor Everything

Machine Level Resource Monitoring

Amazon CloudWatch records the resource usage of the transcoding machines and the master node. We use this data primarily for historical analysis.

Business Level Machine Load Monitoring

File servers report load in real time; the upstream module uses this to decide when to trigger additional transcoding units. EC2 workers also report load to adjust their task intake.

Queue Level Congestion Monitoring

Message queues control module interactions. If a queue becomes congested, upstream distribution rates are throttled, and load‑balancing strategies (preemption or politeness) are applied.

Task Level Status Monitoring

Each transcoding task records state transitions, enabling real‑time observation of main and sub‑tasks, cost calculation, automatic retries for failed sub‑tasks, and overall system health checks.

Main Task Monitoring

State transitions for a main task: Ready → Downloading → Downloaded → Accepted (split into sub‑tasks) → Running → Succeeded or Failed.
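These transitions can be enforced with a small state machine. The sketch below encodes only the transitions listed above; the enum and the `advance` helper are illustrative names, and any extra edges a real system needs (for example a failed download) are deliberately not invented here.

```python
from enum import Enum


class MainState(Enum):
    READY = "Ready"
    DOWNLOADING = "Downloading"
    DOWNLOADED = "Downloaded"
    ACCEPTED = "Accepted"  # split into sub-tasks
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


ALLOWED = {
    MainState.READY: {MainState.DOWNLOADING},
    MainState.DOWNLOADING: {MainState.DOWNLOADED},
    MainState.DOWNLOADED: {MainState.ACCEPTED},
    MainState.ACCEPTED: {MainState.RUNNING},
    MainState.RUNNING: {MainState.SUCCEEDED, MainState.FAILED},
    MainState.SUCCEEDED: set(),  # terminal
    MainState.FAILED: set(),     # terminal
}


def advance(current: MainState, nxt: MainState) -> MainState:
    """Return the new state, or raise if the transition is not allowed."""
    if nxt not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {nxt.value}")
    return nxt


state = MainState.READY
for step in (MainState.DOWNLOADING, MainState.DOWNLOADED,
             MainState.ACCEPTED, MainState.RUNNING, MainState.SUCCEEDED):
    state = advance(state, step)
print(state.value)  # Succeeded
```

Recording every accepted transition with a timestamp is what makes the per-task observation and cost calculation possible.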

Sub Task Monitoring

Sub‑task states: Ready → CRFRunning → CRFSucceeded/CRFFailed/CRFTimeout → CRFReady → Running → Succeeded or Failed/Timeout (a failure or timeout may cause the main task to fail).
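Automatic retries for sub-tasks that end in Failed or Timeout can be sketched as below. The attempt limit of three and the `run_once` callback are assumptions for illustration; the callback stands in for whatever actually executes the segment and reports its final state.

```python
from typing import Callable

RETRYABLE = {"Failed", "Timeout"}


def run_with_retries(run_once: Callable[[int], str],
                     max_attempts: int = 3) -> tuple[str, int]:
    """Run a sub-task until it reaches a non-retryable state or attempts run out.

    Returns the final state and the number of attempts used.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    state = "Failed"
    for attempt in range(1, max_attempts + 1):
        state = run_once(attempt)
        if state not in RETRYABLE:
            return state, attempt
    return state, max_attempts


# Simulated worker: times out once, fails once, then succeeds.
outcomes = iter(["Timeout", "Failed", "Succeeded"])
print(run_with_retries(lambda attempt: next(outcomes)))  # ('Succeeded', 3)
```

Only when the final state is still Failed or Timeout after the last attempt does the failure propagate to the main task.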

Low Cost Transcoding

AWS‑LAMBDA

Sub‑tasks are CPU‑intensive but short‑lived; AWS Lambda provides abundant low‑cost CPU cycles by utilizing idle resources, charging only for execution time.

EC2‑SLAVE

Our custom EC2‑SLAVE module runs on regular EC2 instances, monitors host resources, and opportunistically executes sub‑tasks when CPU is idle, preserving host performance while maximizing free compute capacity.

Dispatching Control

Excessive Lambda or EC2‑SLAVE usage can overload the file server, increasing latency and cost. Dispatch rates are throttled based on file‑server load reports to maintain high concurrency and low latency.
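One simple throttling rule is sketched below: dispatch at full speed while the file server is lightly loaded, back off linearly past a soft limit, and stop at a hard limit. The linear ramp and both limits are assumptions chosen for illustration, not the rule our dispatcher actually uses.

```python
def dispatch_rate(max_rate: float, server_load: float,
                  soft_limit: float = 0.6, hard_limit: float = 0.9) -> float:
    """Sub-tasks per second to dispatch, given file-server load in the range 0..1."""
    if not 0 <= soft_limit < hard_limit:
        raise ValueError("require 0 <= soft_limit < hard_limit")
    if server_load <= soft_limit:
        return max_rate  # server is comfortable: no throttling
    if server_load >= hard_limit:
        return 0.0       # server is saturated: pause dispatching
    # Linear ramp from max_rate at soft_limit down to 0 at hard_limit.
    return max_rate * (hard_limit - server_load) / (hard_limit - soft_limit)


for load in (0.5, 0.75, 0.95):
    print(load, round(dispatch_rate(100.0, load), 1))
```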

High Scalability

The number of file servers and the number of EC2 instances both auto‑scale according to queue length, so the system handles large and small workloads alike and meets diverse business needs.
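A queue-length scaling rule can be sketched as below. The per-instance capacity and the minimum and maximum fleet sizes are hypothetical parameters; the function only computes a target count and does not call any AWS API.

```python
import math


def desired_instances(queue_length: int, tasks_per_instance: int,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Target instance count for the current queue, clamped to fleet limits."""
    if tasks_per_instance <= 0:
        raise ValueError("tasks_per_instance must be positive")
    needed = math.ceil(queue_length / tasks_per_instance)
    return max(min_instances, min(max_instances, needed))


print(desired_instances(0, 10))     # 1
print(desired_instances(95, 10))    # 10
print(desired_instances(1000, 10))  # 20
```

The lower bound keeps a small fleet warm for new work, while the upper bound caps cost during spikes.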

Summary

This article focused on how our distributed transcoding system stabilizes high‑concurrency video services while reducing computing costs. The next article will cover bitrate optimization methods that further lower bandwidth expenses and improve user experience.

Written by

MXPlayer Technical Team

Technical articles and experience sharing. MXPLAYER is the top-ranked online video content platform in India, and also the world's largest player app, with 100M+ DAU and hundreds of millions of MAU.