WeChat’s Backend Journey: From Zero to Billions with Scalable Architecture

This article chronicles how WeChat’s backend evolved from a simple messaging prototype to a globally distributed, multi‑data‑center system, detailing its message model, unified sync protocol, three‑layer architecture, platformization, disaster‑recovery design, performance tuning, and emerging resource‑scheduling challenges.

Big Data and Microservices

From Zero to One

WeChat was officially launched on 2011‑01‑21, just two months after the project started. In that period the team focused on three core tasks: defining the message model, establishing a unified data‑sync protocol, and solidifying the backend architecture.

Message Model

The model mirrors email’s store‑and‑forward design: a sent message is first stored temporarily on the server, the server then pushes a notification to the receiver, and the receiver’s client pulls the message.
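The flow can be sketched as a toy in-memory server. This is only an illustration under stated assumptions: the class name `MessageServer`, its methods, and the list standing in for a push channel are hypothetical and are not taken from WeChat's code.

```python
from collections import defaultdict, deque


class MessageServer:
    """Toy store-and-forward server: store first, notify, then let the client pull."""

    def __init__(self):
        self.inbox = defaultdict(deque)   # receiver -> messages cached on the server
        self.notifications = []           # stand-in for a push channel (assumption)

    def send(self, sender, receiver, text):
        # 1. Store the message on the server before anything else.
        self.inbox[receiver].append((sender, text))
        # 2. Push only a lightweight notification, not the message body.
        self.notifications.append(receiver)

    def pull(self, receiver):
        # 3. The receiver's client pulls the cached messages on demand.
        pending = list(self.inbox[receiver])
        self.inbox[receiver].clear()
        return pending


server = MessageServer()
server.send("alice", "bob", "hello")
server.send("carol", "bob", "hi")
print(server.notifications)   # ['bob', 'bob']
print(server.pull("bob"))     # [('alice', 'hello'), ('carol', 'hi')]
print(server.pull("bob"))     # []
```

Keeping the notification separate from the message body means a lost or delayed push does not lose the message; the next pull still returns it.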

WeChat message model

Unified Data‑Sync Protocol

All user data (account, contacts, messages) is synchronized through one protocol built on a lightweight snapshot of three key‑value pairs, one per data type. The client sends its snapshot, and the server computes the diff and returns only the changes, so the client never has to compute a diff itself and both traffic and CPU overhead are reduced.
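A minimal sketch of snapshot-based sync follows. It assumes that each snapshot value is the highest sequence number the client has seen for that data type; the article does not say what the values hold, so the sequence numbers, the `SyncServer` class, and its methods are hypothetical.

```python
class SyncServer:
    """Toy sync server: the client sends a snapshot, the server returns only the diff."""

    def __init__(self):
        # data type -> list of (seq, change); seq grows per data type (assumption)
        self.log = {"account": [], "contacts": [], "messages": []}

    def append(self, kind, change):
        seq = len(self.log[kind]) + 1
        self.log[kind].append((seq, change))
        return seq

    def sync(self, snapshot):
        """Return (changes, new_snapshot) for a client-supplied snapshot."""
        changes, new_snapshot = {}, {}
        for kind, entries in self.log.items():
            seen = snapshot.get(kind, 0)
            # Server-side diff: everything newer than what the client reported.
            newer = [change for seq, change in entries if seq > seen]
            if newer:
                changes[kind] = newer
            new_snapshot[kind] = max(seen, entries[-1][0]) if entries else seen
        return changes, new_snapshot


server = SyncServer()
server.append("contacts", "add bob")
server.append("messages", "alice: hello")
server.append("messages", "alice: are you there?")

snapshot = {"account": 0, "contacts": 0, "messages": 1}
changes, snapshot = server.sync(snapshot)
print(changes)    # {'contacts': ['add bob'], 'messages': ['alice: are you there?']}
print(snapshot)   # {'account': 0, 'contacts': 1, 'messages': 2}
```

The client only stores and echoes back the small snapshot, which is why it needs no diff logic of its own.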

Backend Architecture

WeChat adopts a three‑layer architecture in which each data type (account, message, contact) has dedicated access and storage modules. The access layer provides long‑connection (bidirectional) and short‑connection (client‑initiated) services. The logic layer separates business APIs from common base services. The storage layer comprises data‑access and data‑storage services, built on MySQL and the proprietary SDB key‑table system.

The backend is primarily written in C++ and built on the Svrkit RPC framework, which powers thousands of services and handles tens of trillions of RPC calls daily.

WeChat backend architecture

Asynchronous Queues and Group Chat

Group chat and integrations with external services have variable processing times, so asynchronous queues were introduced to buffer that work. Group messages use write‑side fan‑out: each message is written to every member’s inbox, which keeps the sync logic simple and efficient.
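The combination of an asynchronous queue and write-side fan-out might look like the sketch below. The group table, the inbox layout, the function names, and the choice to skip the sender's own inbox are all simplifying assumptions made for illustration.

```python
import queue
import threading
from collections import defaultdict

groups = {"g1": ["alice", "bob", "carol"]}   # hypothetical group membership
inboxes = defaultdict(list)                   # member -> delivered messages
fanout_queue = queue.Queue()                  # buffers work for the async worker


def send_group_message(group_id, sender, text):
    # The request path only enqueues; delivery happens asynchronously.
    fanout_queue.put((group_id, sender, text))


def fanout_worker():
    while True:
        item = fanout_queue.get()
        if item is None:                      # sentinel: stop the worker
            fanout_queue.task_done()
            break
        group_id, sender, text = item
        # Write-side fan-out: one copy per member inbox (sender skipped here).
        for member in groups[group_id]:
            if member != sender:
                inboxes[member].append((group_id, sender, text))
        fanout_queue.task_done()


worker = threading.Thread(target=fanout_worker, daemon=True)
worker.start()
send_group_message("g1", "alice", "lunch?")
fanout_queue.join()                           # wait until the message is fanned out
fanout_queue.put(None)
worker.join()
print(dict(inboxes))
# {'bob': [('g1', 'alice', 'lunch?')], 'carol': [('g1', 'alice', 'lunch?')]}
```

Because every member ends up with the message in their own inbox, a group message can be synced exactly like a one-to-one message, at the cost of one write per member.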

Single and group chat flow

Micro‑service Evolution (Logicsvr)

Initially a monolithic mmweb CGI host, the system was refactored into multiple Logicsvr services compiled statically with Svrkit, allowing independent deployment and scaling. Today dozens of Logicsvr binaries provide hundreds of CGI APIs across thousands of servers.

Platformization

WeChat’s backend gave rise to separate platforms such as the public account platform, payment platform, and hardware platform, each evolving from specialized handling in the core system.

WeChat platform ecosystem

International Expansion and Multi‑Data‑Center Design

Starting with version 3.0, WeChat added multilingual support and launched its first overseas data center. A master‑master storage architecture was adopted: each user’s data is written to its home data center (master) and asynchronously replicated to the other center, achieving eventual consistency while preserving strong consistency for critical operations like unique WeChat ID allocation.
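The write path described above can be sketched as follows. The data center names, the `Cluster` class, and in particular the single in-process set used to allocate unique IDs are assumptions for illustration; the article does not describe how the strongly consistent path is implemented.

```python
class DataCenter:
    def __init__(self, name):
        self.name = name
        self.store = {}     # (user, key) -> value
        self.outbox = []    # writes accepted here but not yet shipped to peers


class Cluster:
    def __init__(self, home_of):
        self.home_of = home_of   # user -> name of the home data center
        self.dcs = {name: DataCenter(name) for name in set(home_of.values())}
        self.taken_ids = set()   # single authority for unique IDs (assumption)

    def write(self, user, key, value):
        home = self.dcs[self.home_of[user]]
        home.store[(user, key)] = value          # accepted by the home master only
        home.outbox.append((user, key, value))   # replicated later

    def replicate(self):
        # Asynchronous step: ship each data center's pending writes to the others.
        for src in self.dcs.values():
            for dst in self.dcs.values():
                if dst is not src:
                    for user, key, value in src.outbox:
                        dst.store[(user, key)] = value
            src.outbox.clear()

    def claim_id(self, wechat_id):
        # Strongly consistent path: one check-and-set, never replicated lazily.
        if wechat_id in self.taken_ids:
            return False
        self.taken_ids.add(wechat_id)
        return True


cluster = Cluster({"alice": "domestic", "bob": "overseas"})
cluster.write("alice", "nickname", "Ali")
print(("alice", "nickname") in cluster.dcs["overseas"].store)       # False
cluster.replicate()
print(cluster.dcs["overseas"].store[("alice", "nickname")])          # Ali
print(cluster.claim_id("ali_2011"), cluster.claim_id("ali_2011"))    # True False
```

Since each user's data is written only at its home data center, the two masters do not accept conflicting writes for the same user, and asynchronous replication is enough to reach eventual consistency.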

Multi‑data‑center architecture

Three‑Zone Disaster Recovery

After a massive outage in 2013, WeChat redesigned its data center topology to deploy services across three physically isolated zones. Each zone runs a full set of services, and data is replicated with at least two copies across zones, enabling automatic failover without service interruption.
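To illustrate why two copies in different zones survive the loss of any single zone, here is a small placement sketch. The zone names, the hash-based placement rule, and both helper functions are hypothetical; the article does not describe WeChat's actual placement policy.

```python
import zlib

ZONES = ["zone-a", "zone-b", "zone-c"]   # hypothetical zone names


def replica_zones(key, copies=2):
    """Pick `copies` distinct zones for a key, spread deterministically by hash."""
    start = zlib.crc32(key.encode()) % len(ZONES)
    return [ZONES[(start + i) % len(ZONES)] for i in range(copies)]


def readable_zone(key, down):
    """Return a healthy zone holding the key, or None if every replica is down."""
    for zone in replica_zones(key):
        if zone not in down:
            return zone
    return None


for key in ["user:1", "user:2", "user:3"]:
    # Losing any single zone still leaves at least one replica reachable.
    assert all(readable_zone(key, {zone}) is not None for zone in ZONES)
    print(key, replica_zones(key))
```

With two copies in distinct zones, a key becomes unreadable only when both of its zones fail at the same time.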

Performance Optimizations

Key improvements include adding coroutine support to Svrkit, which allows asynchronous handling without changes to existing service code, and a FastReject QoS mechanism that protects services from overload‑induced cascading failures.
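The general idea behind rejecting early under overload can be sketched as below. This is not FastReject's actual algorithm, which the article does not describe; the queue-length limit, the wait-time limit, and the `FastRejectQueue` class are assumptions chosen to show the technique.

```python
import time
from collections import deque


class FastRejectQueue:
    """Admit work only while there is room; reject or drop the rest right away."""

    def __init__(self, max_pending, max_wait_s):
        self.max_pending = max_pending
        self.max_wait_s = max_wait_s
        self.pending = deque()            # (enqueue_time, request)

    def submit(self, request, now=None):
        now = time.monotonic() if now is None else now
        if len(self.pending) >= self.max_pending:
            return False                  # reject immediately instead of queueing
        self.pending.append((now, request))
        return True

    def next_request(self, now=None):
        now = time.monotonic() if now is None else now
        while self.pending:
            enqueued, request = self.pending.popleft()
            if now - enqueued <= self.max_wait_s:
                return request
            # Waited too long: the caller has likely timed out, so skip the work.
        return None


q = FastRejectQueue(max_pending=2, max_wait_s=1.0)
print(q.submit("r1", now=0.0))    # True
print(q.submit("r2", now=0.5))    # True
print(q.submit("r3", now=0.6))    # False
print(q.next_request(now=1.2))    # r2
```

Rejecting quickly gives callers a fast failure they can handle, instead of letting queued requests time out and trigger retries that spread the overload to other services.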

Security Hardening

A ticket‑based authentication system was introduced, where every client request carries a server‑issued ticket that is validated at each backend hop, preventing unauthorized data access.
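A minimal sketch of ticket validation follows, assuming an HMAC-signed ticket that carries the user and an expiry time. The ticket format, the shared `SECRET`, and both function names are hypothetical; the article does not specify what WeChat's tickets contain or how they are signed.

```python
import hashlib
import hmac

SECRET = b"demo-secret"   # hypothetical key; a real system would manage keys securely


def issue_ticket(user, expires_at):
    """Server side: sign the user and expiry so later hops can verify them."""
    payload = f"{user}|{expires_at}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def validate_ticket(ticket, user, now):
    """Check signature, owner, and expiry; every backend hop would call this."""
    try:
        t_user, t_expires, sig = ticket.split("|")
        expires_at = int(t_expires)
    except ValueError:
        return False                      # malformed ticket
    expected = hmac.new(
        SECRET, f"{t_user}|{t_expires}".encode(), hashlib.sha256
    ).hexdigest()
    signature_ok = hmac.compare_digest(sig.encode(), expected.encode())
    return signature_ok and t_user == user and now < expires_at


ticket = issue_ticket("alice", expires_at=1000)
print(validate_ticket(ticket, "alice", now=500))    # True
print(validate_ticket(ticket, "bob", now=500))      # False: ticket belongs to alice
print(validate_ticket(ticket, "alice", now=1500))   # False: ticket has expired
```

Because the check depends only on the ticket and the key, a storage-side service can refuse a request for one user's data that arrives with another user's ticket, even if an upstream service has been compromised.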

New Challenges

WeChat is building a resource‑scheduling system (Yard) to automate service deployment and elastic resource allocation, and developing high‑availability storage solutions such as PhxSQL (Paxos‑based MySQL) alongside the existing Quorum‑based KVSvr.

Written by Big Data and Microservices
