Design Analysis of Netflix’s Cloud‑Based Microservices Architecture
This article examines Netflix’s cloud‑based microservices architecture: its client, backend, and Open Connect CDN components; its design goals of high availability, low latency, and scalability; and the trade‑offs, resilience mechanisms, and scaling strategies it employs on AWS to serve millions of streaming users worldwide.
Netflix has grown to serve over 167 million subscribers worldwide, delivering more than 1.65 billion streaming hours daily. To achieve the required reliability and scalability, the company migrated its infrastructure to Amazon Web Services (AWS) in 2008 and rebuilt its platform as a collection of small, independently deployable microservices.
The overall system consists of three logical layers: the client layer (web browsers, iOS/Android apps, smart‑TV applications), the backend layer (AWS EC2, S3, DynamoDB, Cassandra, Hadoop, Spark, Kafka, and custom Netflix services), and the Open Connect content‑delivery network (CDN) composed of Open Connect Appliances (OCAs) placed at ISPs and IXPs.
Playback begins when a client requests a video; the request passes through AWS Elastic Load Balancers to the Zuul API gateway, then to the Playback API, which validates the subscription, selects a healthy OCA via the Steering service, and returns a list of candidate OCAs. The client probes the list, chooses the best OCA, and streams the video.
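The final step of this flow, where the client probes candidate OCAs and picks the best one, can be sketched as a latency race. The sketch below is a simplified illustration, not Netflix’s actual selection algorithm; the hostnames and the `probe` callback are hypothetical, and real clients also weigh ranking hints returned by the Steering service.

```python
from typing import Callable

def choose_best_oca(candidates: list[str],
                    probe: Callable[[str], float]) -> str:
    """Probe each candidate OCA and return the one with the lowest
    measured latency. `probe` returns a latency in seconds and may
    raise OSError for an unreachable appliance, which is skipped."""
    best_url, best_latency = None, float("inf")
    for url in candidates:
        try:
            latency = probe(url)
        except OSError:
            continue  # unreachable OCA: try the next candidate
        if latency < best_latency:
            best_url, best_latency = url, latency
    if best_url is None:
        raise RuntimeError("no reachable OCA in candidate list")
    return best_url

# Simulated probe results for three hypothetical appliances
fake_latencies = {"oca-1.isp.example": 0.042,
                  "oca-2.ixp.example": 0.018,
                  "oca-3.isp.example": 0.067}
best = choose_best_oca(list(fake_latencies), fake_latencies.__getitem__)
print(best)  # oca-2.ixp.example
```

In practice the probe would issue a small HTTP request to each appliance and measure round‑trip time; the dictionary lookup here merely stands in for those measurements.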
The backend microservices handle registration, billing, recommendation, video transcoding, and other business logic. They are stateless, communicate via REST or gRPC, and are protected by Hystrix circuit breakers and EVCache caching. Data is stored in MySQL, Cassandra, Elasticsearch, and Hadoop for batch analytics.
Design goals focus on global high availability, low latency, and horizontal scalability. High availability is achieved through multi‑region AWS deployment, redundant OCAs, and load‑balancing. Low latency is maintained by fast OCA selection, client‑side adaptive bitrate, and timeout‑controlled microservice calls. Scalability is provided by AWS Auto Scaling, the Titus container platform, and partitioned data stores.
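The idea behind timeout‑controlled microservice calls is to give every dependency a hard latency budget and serve a fallback (for example, a cached or default response) when the budget is exceeded. A minimal sketch, assuming a thread‑based wrapper; the function name and parameters are illustrative, and production systems would reuse a pool rather than create one per call:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def call_with_timeout(fn, timeout_s: float, fallback):
    """Run `fn` under a hard latency budget; on timeout return
    `fallback` instead of blocking the caller indefinitely."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            return fallback

# A dependency that is too slow gets replaced by the fallback;
# a fast one returns its real result.
slow = lambda: (time.sleep(0.3), "fresh")[1]
slow_result = call_with_timeout(slow, 0.05, fallback="cached")   # "cached"
fast_result = call_with_timeout(lambda: "fresh", 0.5, fallback="cached")
```

One caveat of this sketch: the pool’s shutdown still waits for the abandoned slow call to finish, so real implementations pair the timeout with cancellation or dedicated bulkhead pools.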
Key trade‑offs include sacrificing strict consistency for lower latency and higher availability, and balancing added instances against diminishing performance gains. The system tolerates failures using adaptive retries, circuit breaking, and Netflix’s chaos‑engineering practices that inject faults to validate self‑healing capabilities.
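Adaptive retries are typically implemented as capped exponential backoff with jitter, so that a fleet of clients retrying a transient failure does not hammer the recovering service in lockstep. A hedged sketch under those assumptions, with illustrative names and defaults:

```python
import random
import time

def retry_with_backoff(fn, max_attempts: int = 4,
                       base_delay: float = 0.1, max_delay: float = 2.0):
    """Retry a flaky call with capped exponential backoff and full
    jitter; re-raises the last error once attempts are exhausted."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: surface the failure
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))  # full jitter

# A call that fails twice, then succeeds on the third attempt
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = retry_with_backoff(flaky, max_attempts=5, base_delay=0.001)
print(result, attempts["n"])  # ok 3
```

The jitter term is the important part: without it, synchronized retries arrive in waves and can keep a struggling service down, which is exactly the kind of failure mode chaos experiments are designed to expose.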
In summary, Netflix’s cloud‑native architecture demonstrates how a large‑scale streaming service can combine microservices, robust caching, automated scaling, and a globally distributed CDN to deliver reliable, low‑latency video to millions of users worldwide.
Top Architect
Top Architect shares practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, as well as architecture evolution with internet technologies. Idea‑driven, sharing‑minded architects are welcome to exchange experience and learn together.