Why WhatsApp Skips Kafka: Erlang’s Built‑In Mailbox Powers Billion‑Message Scale
WhatsApp achieves billions of daily messages using Erlang’s lightweight, actor‑model mailboxes instead of Kafka, delivering microsecond latency, simplicity, hardware efficiency, and reliability through a massive cluster of isolated processes.
When engineers think of large-scale messaging, Kafka often comes to mind, but WhatsApp’s architecture tells a different story. Rather than a Kafka cluster, WhatsApp relies on Erlang’s built-in mailbox system, an actor-model queue that predates Kafka by more than two decades.
Erlang as the Core Language
Founded by former Yahoo engineers in 2009, WhatsApp chose Erlang, a functional language created by Ericsson for telecom switches. Its key strength is the actor model: each lightweight process (millions can be created) runs in isolation, communicates via asynchronous message passing, and owns a mailbox that acts as a built‑in queue.
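To make "lightweight" concrete, here is a minimal sketch that spawns many short-lived processes and collects one reply from each. The module name many_procs and the function names are hypothetical, chosen only for illustration; nothing here is taken from WhatsApp's code.

```erlang
-module(many_procs).
-export([run/1]).

%% Spawn N processes; each sends its index back to the parent's mailbox.
run(N) ->
    Parent = self(),
    lists:foreach(
        fun(I) -> spawn(fun() -> Parent ! {done, I} end) end,
        lists:seq(1, N)),
    collect(N, 0).

%% Take N replies out of the parent's mailbox, summing the indices.
collect(0, Sum) ->
    Sum;
collect(Left, Sum) ->
    receive
        {done, I} -> collect(Left - 1, Sum + I)
    end.
```

Calling many_procs:run(1000) returns 500500, the sum of 1 through 1000. Each spawned process exits as soon as it has sent its reply, so the count can be raised well beyond this as long as it stays within the VM's configured process limit.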
Queue as Process Mailbox
In Erlang, a queue is not a separate middleware component; it is the process’s mailbox. Messages are sent with Pid ! Message and retrieved with a receive ... end block. This eliminates the need for brokers, partitions, or ZooKeeper.
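The mailbox behaviour described above can be sketched in a few lines. This example assumes nothing beyond the standard send and receive primitives; the module name mailbox_demo and the {note, ...} message shape are made up for the illustration.

```erlang
-module(mailbox_demo).
-export([run/0]).

run() ->
    Self = self(),
    %% Sending is asynchronous: both messages are queued in this
    %% process's own mailbox and the sender carries on immediately.
    Self ! {note, first},
    Self ! {note, second},
    %% receive removes the oldest message that matches a pattern.
    First = receive {note, A} -> A end,
    Second = receive {note, B} -> B end,
    %% 'after 0' returns at once when no matching message is queued.
    Empty = receive {note, _} -> false after 0 -> true end,
    {First, Second, Empty}.
```

mailbox_demo:run() returns {first, second, true}: messages from one sender arrive in the order they were sent, and the mailbox holds no further matching messages afterwards.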
WhatsApp’s Scalable Architecture
Each user session maps to a dedicated Erlang process. When a user sends a message, it lands in the recipient’s process mailbox; acknowledgments are just another message. Group chats fan out messages to multiple process mailboxes. The overall flow can be illustrated as:
User A → WhatsApp server (Erlang Process A) → Message → Process B mailbox → Process B pulls message → Deliver to User B client
This design runs on an Erlang cluster rather than a central Kafka cluster, allowing millions of concurrent processes to communicate directly in memory.
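The per-session process, the acknowledgment message, and the group fan-out can be sketched roughly as follows. This is an illustration under stated assumptions, not WhatsApp's implementation: the module fanout, the message tuples, and the idea of standing in for the client with the calling process are all hypothetical.

```erlang
-module(fanout).
-export([run/0, session/2]).

%% Hypothetical session process for one user. It forwards each chat
%% message to its "client" (here, the process that started the demo)
%% and sends an acknowledgment back to the sender as another message.
session(User, Client) ->
    receive
        {msg, From, Text} ->
            Client ! {delivered, User, Text},
            From ! {ack, User},
            session(User, Client);
        stop ->
            ok
    end.

%% Fan a group message out to every member's session mailbox.
send_group(Members, Text) ->
    Sender = self(),
    lists:foreach(fun(Pid) -> Pid ! {msg, Sender, Text} end, Members).

run() ->
    Users = [bob, carol],
    Members = [spawn(?MODULE, session, [U, self()]) || U <- Users],
    send_group(Members, "Hello"),
    %% U is already bound by the generator, so each receive waits
    %% for that specific user's message (a selective receive).
    Acks = [receive {ack, U} -> U end || U <- Users],
    Delivered = [receive {delivered, U, T} -> {U, T} end || U <- Users],
    lists:foreach(fun(Pid) -> Pid ! stop end, Members),
    {Acks, Delivered}.
```

fanout:run() returns {[bob,carol],[{bob,"Hello"},{carol,"Hello"}]}. A real deployment would also need to route between nodes and store messages for offline users, which this sketch leaves out.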
Why Not Kafka?
Latency: Erlang processes on the same node exchange messages inside the VM, typically in microseconds, which is orders of magnitude faster than routing each message through Kafka’s disk-backed brokers.
Simplicity: No need to manage partitions, replication, brokers, or external monitoring; the queue is native to the runtime.
Hardware Efficiency: WhatsApp reportedly ran on a few hundred servers with roughly 50 engineers while serving hundreds of millions of users; a separate Kafka tier for the same traffic would have added broker hardware on top of that.
Reliability: Erlang was built for 24/7 telecom systems; isolated processes, supervision, and hot code loading help keep WhatsApp online through failures and upgrades.
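The reliability point rests on process isolation: one process crashing does not take its neighbours down, and an interested process can be told about the crash through its mailbox. A minimal sketch, assuming only the standard spawn_monitor/1 primitive (the module name isolation and the exit reason boom are invented for the example):

```erlang
-module(isolation).
-export([run/0]).

%% A healthy process that answers pings until told to stop.
echo() ->
    receive
        {ping, From} -> From ! pong, echo();
        stop -> ok
    end.

run() ->
    Healthy = spawn(fun echo/0),
    %% spawn_monitor/1 starts a process and monitors it; when the
    %% process exits, a 'DOWN' message lands in our mailbox.
    {Pid, Ref} = spawn_monitor(fun() -> exit(boom) end),
    Reason = receive {'DOWN', Ref, process, Pid, R} -> R end,
    %% The crash did not affect the unrelated healthy process.
    Healthy ! {ping, self()},
    Reply = receive pong -> pong after 1000 -> timeout end,
    Healthy ! stop,
    {Reason, Reply}.
```

isolation:run() returns {boom, pong}. Production systems would normally build on OTP supervisors rather than hand-written monitors; this sketch only shows the underlying mechanism.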
Code Illustrations
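A simple Erlang example shows a process sending a ping and handling a pong. The pong/0 and start/0 functions are filled in here so that the module compiles and runs on its own:
-module(pinger).
-export([start/0, ping/2, pong/0]).

ping(0, Pong_PID) ->
    Pong_PID ! finished,
    io:format("ping finished~n");
ping(N, Pong_PID) ->
    Pong_PID ! {ping, self()},
    receive
        pong -> io:format("Ping received pong~n")
    end,
    ping(N - 1, Pong_PID).

pong() ->
    receive
        finished -> ok;
        {ping, Ping_PID} -> Ping_PID ! pong, pong()
    end.

start() ->
    Pong_PID = spawn(pinger, pong, []),
    spawn(pinger, ping, [3, Pong_PID]).
A simplified message flow in Erlang-style code: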
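-module(chat).
-export([start/0, loop/1]).

%% User A sends a message to process B's mailbox.
start() ->
    ProcessB = spawn(chat, loop, [[]]),
    ProcessB ! {msg, "Hello", user_a}.

%% Process B keeps every received text in its Messages list.
loop(Messages) ->
    receive
        {msg, Text, From} ->
            io:format("Got message ~p from ~p~n", [Text, From]),
            loop([Text | Messages])
    end.
Enduring Relevance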
Even after Meta integrated WhatsApp into a larger infrastructure, the system’s core remains Erlang‑based, leveraging millions of lightweight processes, mailbox‑driven messaging, and a battle‑tested queue model. The story illustrates that older, simpler technologies can outperform newer, flashier ones in massive, latency‑sensitive deployments.
Conclusion
WhatsApp’s secret to handling over 100 billion daily messages is not a modern queue like Kafka or Pulsar, but Erlang’s built‑in actor‑model mailbox—a minimalist, highly efficient, and reliable solution that scales with minimal hardware.