Youku Architecture Overview: Front‑End Framework, Database Design, and Caching Strategies
The article examines Youku's large‑scale video platform architecture, detailing its traffic statistics, hardware setup, custom front‑end CMS, evolution of MySQL replication to vertical partitioning and sharding, caching policies, and CDN usage that together enable high‑performance video delivery.
Based on 2010 statistics, Youku handled about 89 million daily unique visitors and 1.7 billion page views, making it one of China’s top video sites.
On the hardware side, the service ran primarily on Dell PowerEdge 1950 and 860 servers with Dell MD1000 storage arrays, with more than a thousand servers deployed nationwide.
The front‑end was built on a custom CMS that modularized page rendering, allowing clear separation of UI components and easy extensibility; diagrams illustrate the module‑method‑parameter call relationships.
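The module‑method‑parameter relationship can be illustrated with a small dispatch sketch. This is not Youku's CMS code, which the article does not show; the registry, the `register` decorator, and the module and method names below are all hypothetical, chosen only to show how a page might be described as a list of (module, method, parameters) calls.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical registry: module name -> {method name -> render function}.
MODULES: Dict[str, Dict[str, Callable[..., str]]] = {}


def register(module: str, method: str):
    """Register a render function under a (module, method) pair."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        MODULES.setdefault(module, {})[method] = fn
        return fn
    return decorator


@register("video", "top_list")
def video_top_list(limit: int = 3) -> str:
    # Placeholder markup; a real module would query a data source.
    items = "".join(f"<li>video {i}</li>" for i in range(1, limit + 1))
    return f"<ul>{items}</ul>"


@register("user", "badge")
def user_badge(name: str) -> str:
    return f"<span>{name}</span>"


def render_page(calls: List[Tuple[str, str, Dict[str, Any]]]) -> str:
    """Render a page by invoking each (module, method, params) triple in order."""
    return "\n".join(MODULES[mod][meth](**params) for mod, meth, params in calls)


page = render_page([
    ("user", "badge", {"name": "alice"}),
    ("video", "top_list", {"limit": 2}),
])
```

Because each fragment is addressed by name, a new UI component only needs a new registered function; existing page descriptions stay unchanged.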
The database architecture evolved from a single MySQL instance to master‑slave replication, then to vertical partitioning and finally to horizontal sharding. Master‑slave replication provided read/write separation but introduced bottlenecks such as write scalability limits, replication lag, and lock contention. Vertical partitioning isolated business domains onto separate MySQL servers, improving load distribution, while sharding (hash‑based on user ID) further scaled storage and query capacity across multiple database shards.
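As a rough illustration of how read/write separation and hash‑based sharding combine, the sketch below picks a shard from the user ID and then routes writes to that shard's master and reads to one of its replicas. The server names, the `Shard` type, and the simple modulo hash are assumptions for illustration; the article does not describe Youku's actual hash function or topology.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Shard:
    master: str          # receives all writes for this shard
    replicas: List[str]  # serve reads; may lag behind the master


# Hypothetical topology with two shards.
SHARDS = [
    Shard("db0-master", ["db0-replica-a", "db0-replica-b"]),
    Shard("db1-master", ["db1-replica-a"]),
]


def shard_for(user_id: int) -> Shard:
    """Hash-based placement: a plain modulo over the shard count (assumed)."""
    return SHARDS[user_id % len(SHARDS)]


def pick_server(user_id: int, is_write: bool) -> str:
    """Send writes to the master; spread reads across replicas."""
    shard = shard_for(user_id)
    if is_write or not shard.replicas:
        return shard.master
    return random.choice(shard.replicas)
```

Note that a plain modulo ties every user to the current shard count, so adding a shard moves most users. Reads served by replicas can also return stale data while replication lags, which is one of the bottlenecks the paragraph above mentions.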
To locate a user’s data, Youku maintains a mapping table linking user IDs to shard identifiers, enabling efficient lookup before querying the appropriate shard.
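A mapping‑table lookup of this kind might look like the following sketch. It uses an in‑memory SQLite database as a stand‑in for the MySQL directory table so that the example is self‑contained; the table name `user_shard`, its columns, and the shard addresses are hypothetical.

```python
import sqlite3

# Stand-in for the directory database that maps users to shards.
directory = sqlite3.connect(":memory:")
directory.execute(
    "CREATE TABLE user_shard ("
    " user_id INTEGER PRIMARY KEY,"
    " shard_id INTEGER NOT NULL)"
)
directory.executemany(
    "INSERT INTO user_shard (user_id, shard_id) VALUES (?, ?)",
    [(1001, 0), (1002, 1)],
)

# Hypothetical shard addresses, keyed by shard identifier.
SHARD_ADDRESS = {
    0: "shard0.example.internal",
    1: "shard1.example.internal",
}


def locate_shard(user_id: int) -> int:
    """Return the shard identifier recorded for a user."""
    row = directory.execute(
        "SELECT shard_id FROM user_shard WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no shard recorded for user {user_id}")
    return row[0]


def address_for(user_id: int) -> str:
    """Resolve the shard first, then return the server to query."""
    return SHARD_ADDRESS[locate_shard(user_id)]
```

One consequence of keeping an explicit mapping is that a user can be moved to another shard by updating a single row, at the cost of an extra lookup before each query.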
Youku's caching strategy avoids in‑memory caches such as memcached, citing the overhead of memory copying and locking; instead, it relies on HTTP caching and a CDN. The CDN distributes video files to edge servers across the country so that each user is served from the nearest, best‑performing server, which the article credits for smoother playback than competitors offer.
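To make the HTTP caching idea concrete, here is a minimal sketch of a response helper that sets `Cache-Control` and `ETag` headers and answers a matching `If-None-Match` with `304 Not Modified`. The function name, the five‑minute `max_age` default, and the hash‑derived ETag are assumptions; the article does not state which headers or lifetimes Youku used.

```python
import hashlib
from typing import Dict, Optional, Tuple


def cached_response(
    body: bytes,
    if_none_match: Optional[str] = None,
    max_age: int = 300,  # assumed lifetime in seconds
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) with standard HTTP cache validators."""
    # Derive a strong ETag from the content; quotes are part of the ETag syntax.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
    }
    # A matching validator means the client's copy is still current.
    if if_none_match == etag:
        return 304, headers, b""
    return 200, headers, body


status, headers, payload = cached_response(b"<html>home</html>")
revalidated, _, empty = cached_response(b"<html>home</html>", headers["ETag"])
```

With `public` and a `max-age`, browsers and intermediate caches such as CDN edge nodes can reuse the response without contacting the origin until it expires, after which the ETag allows a cheap revalidation.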
Architecture Digest