Inside Youku’s Massive Architecture: Front‑End, Database & Caching Secrets

This article examines Youku’s large‑scale system architecture: its server fleet, its LAMP‑based front‑end framework, MySQL database designs that evolved from master‑slave replication to vertical partitioning and sharding, and the caching strategy and CDN deployment that together support over 100 million daily unique visitors.

21CTO

After introducing YouTube’s technical architecture, this article looks at Youku, a leading Chinese video site, and its own architecture.

1. Basic Data

According to big‑data statistics, Youku receives over 100 million daily unique visitors (UV) and nearly 2 billion page views (PV), ranking 217th globally and 49th in China.

Initially Youku used Dell PowerEdge 1950/860 servers with Dell MD1000 storage arrays; today it operates more than 6,000 servers across major provincial nodes.

2. Front‑End Framework

From the start, Youku built its CMS on a LAMP architecture to render front‑end pages. The modules are well abstracted, offering good extensibility and UI separation, making development and maintenance simple and flexible.

Below is the front‑end module call relationship:

Routing is determined by three values (module, method, and params), which keeps the design concise. The following diagram shows part of Youku’s front‑end architecture:
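As a rough illustration of routing by module, method, and params, the sketch below dispatches a request to a registered handler. It is written in Python for brevity rather than PHP, and the URL layout `/module/method?key=value`, the `ROUTES` registry, and the `show_video` handler are assumptions made for this example, not details of Youku’s CMS.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import parse_qsl, urlsplit

# Hypothetical registry: (module, method) -> handler. Names are illustrative only.
Handler = Callable[[Dict[str, str]], str]
ROUTES: Dict[Tuple[str, str], Handler] = {}


def route(module: str, method: str) -> Callable[[Handler], Handler]:
    """Register a handler under a (module, method) pair."""
    def decorator(func: Handler) -> Handler:
        ROUTES[(module, method)] = func
        return func
    return decorator


@route("video", "show")
def show_video(params: Dict[str, str]) -> str:
    # A real handler would load data and render a template.
    return f"video page for id={params.get('id', '?')}"


def dispatch(url: str) -> str:
    """Split '/module/method?k=v' into module, method, params and call the handler."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2:
        return "404 not found"
    module, method = segments
    params = dict(parse_qsl(parts.query))
    handler = ROUTES.get((module, method))
    if handler is None:
        return "404 not found"
    return handler(params)
```

Because the handler is found purely from the (module, method) pair, adding a page means registering one more function, which is one way to read the "concise design" claim.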

3. Database Structure

Youku’s database architecture has gone through several iterations, starting with a single MySQL server and moving through simple master‑slave replication, SSD optimization, vertical partitioning, and finally horizontal sharding.

1. MySQL Master‑Slave Replication

Master‑slave replication provides read‑write separation, greatly improving read performance. The process is illustrated below:

However, this mechanism introduces several performance bottlenecks:

Writes do not scale, because every write still goes to the single master

Writes cannot be cached

Replication lag between master and slaves

Higher lock contention

Larger tables reduce cache hit rates
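The read‑write separation and the replication‑lag caveat from this section can be sketched as a small router. This is a minimal illustration, not Youku’s implementation: the `ReplicatedRouter` class, the `needs_fresh_data` flag, and the server names are assumptions introduced here.

```python
import itertools
from typing import List


class ReplicatedRouter:
    """Send writes to the master and spread reads across replicas.

    Server names are placeholders; a real setup would hold connections.
    """

    def __init__(self, master: str, replicas: List[str]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self.master = master
        self._replicas = itertools.cycle(replicas)  # simple round-robin

    def pick(self, sql: str, needs_fresh_data: bool = False) -> str:
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        is_read = verb in {"SELECT", "SHOW"}
        # Replicas can lag behind the master, so a read that must see the
        # latest write is sent to the master instead of a replica.
        if is_read and not needs_fresh_data:
            return next(self._replicas)
        return self.master
```

Note that every write still lands on the one master, which is why adding replicas improves read capacity but leaves write scalability unchanged.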

2. MySQL Vertical Partitioning

When business logic is sufficiently independent, placing each business’s data on separate database servers isolates failures and balances load, significantly boosting throughput. The architecture after vertical partitioning is shown below:
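In code, vertical partitioning amounts to a lookup from business domain to database server. The sketch below assumes three hypothetical domains and host names (`user`, `video`, `comment`, and the `db-*.internal` hosts); the article does not say how Youku actually divided its data.

```python
from typing import Dict, List

# Hypothetical business domains and host names, for illustration only.
DOMAIN_TO_DB: Dict[str, str] = {
    "user": "db-user.internal",
    "video": "db-video.internal",
    "comment": "db-comment.internal",
}


def database_for(domain: str) -> str:
    """Return the database host that owns a business domain's tables."""
    try:
        return DOMAIN_TO_DB[domain]
    except KeyError:
        raise ValueError(f"no database configured for domain {domain!r}") from None


def affected_domains(failed_host: str) -> List[str]:
    """List the domains that lose their database if one host goes down."""
    return sorted(d for d, h in DOMAIN_TO_DB.items() if h == failed_host)
```

With one host per domain, `affected_domains` returns a single entry for any failed host, which is the failure isolation the paragraph describes.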

3. MySQL Horizontal Sharding

Horizontal sharding groups users by a rule (e.g., a hash of the user ID) and stores each group’s data in a separate shard. As the user base grows, capacity is added by bringing new shard servers online. The principle diagram is:

To locate a user’s shard, a mapping table stores the relationship between user ID and shard ID; each request first queries this table, then accesses the appropriate shard:
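The lookup flow described here (consult the mapping table, then go to the shard) can be sketched as follows. The `ShardDirectory` class is a hypothetical in‑memory stand‑in for the mapping table, and the modulo rule used when a user is first assigned is an assumption; the article only says users are grouped "by a rule".

```python
from typing import Dict, List


class ShardDirectory:
    """In-memory stand-in for the user ID -> shard ID mapping table."""

    def __init__(self, shard_ids: List[str]) -> None:
        if not shard_ids:
            raise ValueError("at least one shard is required")
        self.shard_ids = list(shard_ids)
        self._user_to_shard: Dict[int, str] = {}

    def assign(self, user_id: int) -> str:
        """Place a user on a shard once and record the choice."""
        if user_id not in self._user_to_shard:
            # Assumed rule: modulo over the shards that exist right now.
            index = user_id % len(self.shard_ids)
            self._user_to_shard[user_id] = self.shard_ids[index]
        return self._user_to_shard[user_id]

    def locate(self, user_id: int) -> str:
        """Step 1 of every request: look up the user's shard."""
        return self._user_to_shard[user_id]

    def add_shard(self, shard_id: str) -> None:
        """Bring a new shard online; existing users stay where they are."""
        self.shard_ids.append(shard_id)
```

Storing the assignment, rather than recomputing a hash on every request, means that adding a shard does not relocate existing users; only newly assigned users can land on the new server.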

Cross‑shard queries are challenging; Youku tries to avoid them, and when necessary uses multi‑dimensional shard indexes, distributed search engines, or as a last resort, distributed database queries, which are costly and impact performance.
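To see why the last‑resort option is costly, consider a "top N videos by views" query over sharded data: every shard has to be asked, and the partial results merged. The sketch below is a generic scatter‑gather illustration under assumed data shapes (`Row` as a view count and title), with in‑memory lists standing in for shard queries; it is not Youku’s query layer.

```python
import heapq
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (view_count, video_title), an assumed shape


def top_n_across_shards(shards: Dict[str, List[Row]], n: int) -> List[Row]:
    """Ask every shard for its local top n, then merge into a global top n."""
    partials: List[Row] = []
    for rows in shards.values():
        # In a real system this would be one network round trip per shard.
        partials.extend(heapq.nlargest(n, rows))
    return heapq.nlargest(n, partials)
```

The work grows with the number of shards, and the slowest shard sets the response time, which is why avoiding cross‑shard queries or serving them from a separate index is usually preferable.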

4. Caching Strategy

Large‑scale sites rely heavily on caching, from HTTP caches to memcached. Youku, however, does not use in‑memory caching, for two reasons:

Avoiding memory copies and lock overhead

Making it quick to remove videos that have been taken down, which is harder when copies sit in a cache

In addition, Squid’s write() consumes memory in the user process, and Lighttpd 1.5’s asynchronous I/O (AIO) reads files into user‑space memory, which lowers efficiency.

Like YouTube, Youku maintains a robust CDN to ensure smooth playback, directing each user to the nearest or best‑performing video or cache server based on geographic location.
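One simple way to express "nearest or best‑performing" is to prefer servers in the user’s region and fall back to the least‑loaded server elsewhere. The sketch below is only an illustration of that idea: the `CdnNode` fields, the `max_load` threshold, and the node names are assumptions, and the article does not describe Youku’s actual selection logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CdnNode:
    name: str
    region: str
    load: float  # fraction of capacity in use, 0.0 to 1.0


def choose_node(nodes: List[CdnNode], user_region: str,
                max_load: float = 0.9) -> CdnNode:
    """Pick the least-loaded usable node, preferring the user's own region."""
    usable = [n for n in nodes if n.load < max_load]
    if not usable:
        raise RuntimeError("no CDN node has spare capacity")
    local = [n for n in usable if n.region == user_region]
    candidates = local or usable  # fall back to any region if none is local
    return min(candidates, key=lambda n: n.load)
```

A production system would also weigh measured latency and whether the node already holds the requested video, which this sketch leaves out.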

5. Summary

This overview presents Youku’s core infrastructure. As technology continuously evolves, this foundation enables rapid response to new business and product demands.
