WhatsApp Architecture: High‑Scalability Design and Engineering Insights
The article analyzes WhatsApp's high‑scalability backend architecture, detailing its Erlang‑based server stack, massive user statistics, hardware deployment, custom protocols, performance‑tuning tools, and lessons learned from scaling to billions of messages with a tiny engineering team.
WhatsApp was acquired by Facebook for $19 billion, yet the service is run by only about 32 engineers. They support roughly 450 million active users on a highly optimized backend built primarily on Erlang and FreeBSD.
Statistics
About 450 million active users; 50 billion messages per day across seven platforms.
32 engineers supporting ~14 million users each.
Hundreds of servers, >11 000 CPU cores, several hundred terabytes of RAM.
Peak concurrent connections of about 147 million, with message peaks of 342 k inbound and 712 k outbound per second.
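The ratios in this list can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check: the figure of 450 million active users is an assumption implied by 32 engineers at roughly 14 million users each, and because the core count is given as a lower bound, the per-core results are upper bounds.

```python
# Back-of-the-envelope check of the headline ratios (round figures only).
ACTIVE_USERS = 450_000_000       # assumption: implied by 32 engineers x ~14 million users
ENGINEERS = 32
CPU_CORES = 11_000               # the article says ">11 000", so per-core values are upper bounds
PEAK_OUTBOUND_PER_SEC = 712_000  # peak outbound messages per second

users_per_engineer = ACTIVE_USERS / ENGINEERS
users_per_core = ACTIVE_USERS / CPU_CORES
outbound_per_core = PEAK_OUTBOUND_PER_SEC / CPU_CORES

print(f"{users_per_engineer / 1e6:.1f} million users per engineer")
print(f"{users_per_core:,.0f} users per CPU core (at most)")
print(f"{outbound_per_core:.0f} peak outbound messages per second per core (at most)")
```

Even at peak, each core handles only a few dozen outbound messages per second on average, which suggests that connection count and memory, rather than raw message throughput, are the harder limits.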
Platform
Backend: Erlang (custom‑patched BEAM), FreeBSD 9.2, Yaws/lighttpd, PHP, Mnesia, ejabberd (heavily modified), custom XMPP protocol.
Frontend: iPhone, Android, BlackBerry, Nokia Symbian, Nokia S40, Windows Phone, plus a seventh platform the source does not identify; SQLite on devices.
Hardware
~550 servers total: ~150 chat servers, ~250 MMS servers.
Nodes carry 64 GB–512 GB of RAM and keep most data on SSD; the fleet as a whole has >11 000 CPU cores.
General Design
Core messaging implemented entirely in Erlang; ejabberd used as the initial XMPP server and later heavily rewritten.
Messages are queued on the server until the client retrieves them; SSL sockets protect traffic.
Registration relies on phone numbers and a PIN‑based verification flow.
Multimedia is uploaded via HTTP to SSD (images) or SATA (audio/video) and referenced by URLs.
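The store-and-forward behaviour described above (messages wait on the server until the client retrieves them) can be sketched as a per-recipient queue. This is a minimal illustration, not WhatsApp's implementation: the class and method names are hypothetical, the store is in-memory only, and the explicit acknowledgement step is an assumption added so that a message is not lost if delivery fails midway.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Message:
    msg_id: int
    sender: str
    body: str


class OfflineStore:
    """Hypothetical store: holds messages per recipient until the client acknowledges them."""

    def __init__(self):
        self._queues = defaultdict(deque)  # recipient -> pending messages, oldest first
        self._ids = count(1)               # monotonically increasing message ids

    def enqueue(self, sender, recipient, body):
        """Queue a message for a recipient and return its id."""
        msg = Message(next(self._ids), sender, body)
        self._queues[recipient].append(msg)
        return msg.msg_id

    def fetch(self, recipient):
        """Return pending messages without removing them (delivery may still fail)."""
        return list(self._queues[recipient])

    def ack(self, recipient, msg_id):
        """Drop every message up to and including msg_id once the client confirms receipt."""
        queue = self._queues[recipient]
        while queue and queue[0].msg_id <= msg_id:
            queue.popleft()


store = OfflineStore()
first = store.enqueue("alice", "bob", "hi")
second = store.enqueue("alice", "bob", "are you there?")
pending = store.fetch("bob")    # bob's client reconnects and pulls both messages
store.ack("bob", first)         # the client confirms only the first one
remaining = store.fetch("bob")  # the second message is still queued
```

Separating fetch from ack means the server's copy disappears only after the client confirms it, so the server does not need to keep a long-term message archive.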
Scaling to 2 Million Connections per Server
Initial load: 200 k concurrent connections per server, later increased to 1 million and finally 2 million.
Dynamic capacity planning, load‑spike handling (e.g., major sports events), and fault isolation were essential.
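To see why per-server connection capacity matters for fleet size, the sketch below estimates how many chat servers a given connection peak needs. It is an illustration under stated assumptions: the function name, the 150 million peak, and the 30% headroom reserved for load spikes are all hypothetical values chosen for the example, not figures from WhatsApp.

```python
import math


def servers_needed(peak_connections, per_server_capacity, headroom=0.3, spares=0):
    """Servers required so that each one runs at no more than (1 - headroom) of capacity."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = per_server_capacity * (1 - headroom)  # connections we allow per server
    return math.ceil(peak_connections / usable) + spares


PEAK = 150_000_000  # illustrative assumption for peak concurrent connections

for capacity in (200_000, 1_000_000, 2_000_000):
    print(f"{capacity:>9,} connections/server -> {servers_needed(PEAK, capacity)} servers")
```

Under these assumptions, going from 200 k to 2 million connections per server shrinks the fleet by roughly a factor of ten, which is the practical payoff of the tuning work described in this section.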
Tools & Techniques for Scalability
Custom system activity reporter (wsar) collects OS, hardware, and BEAM metrics at sub‑second intervals.
Hardware performance counters, read with pmcstat, measure the emulator's CPU usage.
DTrace, kernel lock counters, and fprof are used for debugging.
Patch‑level BEAM modifications, custom schedulers, memory allocators (mseg), and real‑time priority for the VM.
Extensive instrumentation of message queues, lock contention, and network stack.
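The idea behind a high-frequency activity reporter can be sketched as a loop that snapshots monotonically increasing counters and turns the differences into per-second rates. This is not wsar's code, which the article does not show: the function names and counter names below are hypothetical, and read_counters stands in for whatever source of OS or VM counters is available.

```python
import time


def rates(prev, curr, elapsed_s):
    """Convert two snapshots of monotonically increasing counters into per-second rates."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return {name: (curr[name] - prev[name]) / elapsed_s for name in curr if name in prev}


def sample(read_counters, interval_s, samples):
    """Poll read_counters every interval_s seconds and yield one dict of rates per poll."""
    prev, prev_t = read_counters(), time.monotonic()
    for _ in range(samples):
        time.sleep(interval_s)
        curr, curr_t = read_counters(), time.monotonic()
        yield rates(prev, curr, curr_t - prev_t)  # use measured elapsed time, not interval_s
        prev, prev_t = curr, curr_t


# Example with hand-written snapshots taken 0.5 s apart (hypothetical counter names).
before = {"msgs_in": 100, "msgs_out": 400}
after = {"msgs_in": 150, "msgs_out": 700}
example = rates(before, after, 0.5)
```

Dividing by the measured elapsed time rather than the nominal interval keeps the rates accurate when the sampler itself is delayed, which is most likely to happen when the system is under load.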
Experience Summary
Continuous measurement, bottleneck elimination, and iterative testing are vital for scaling.
Erlang proved to be a powerful platform for high‑availability, high‑throughput services, despite the need for extensive tuning.
Keeping server count low while providing redundancy, and focusing on user‑centric simplicity, contributed to WhatsApp’s success.
References: HighScalability blog (2014), Zhihu answer, Quora discussion.