WeChat Backend Architecture: Synchronization Protocol, RPC Framework, and Multi-IDC Design

This article outlines WeChat's backend architecture. It covers the demanding product requirements (low latency, timely notifications, power and bandwidth efficiency), the challenge of synchronizing diverse data between backend and terminals, and the solutions adopted: a minimal sync protocol, efficient notification mechanisms, a three-layer backend, a unified RPC framework, coroutine-based high-concurrency RPC, and multi-IDC distribution with eventual consistency and disaster-recovery strategies.

Architect

Problem

Extreme Business Features

Smooth message sending and receiving

Timely notifications

Power saving

Bandwidth saving

Thin client

Challenging Backend‑Terminal Synchronization

Synchronizing diverse data: account info, contacts, messages, Moments, etc.

Timely notification and sync

Reliable sync over mobile networks

Saving bandwidth and power

Solution

Minimal Synchronization Protocol

The backend and terminal only need to exchange a single number, allowing the backend to know all data missing on the terminal.

Change sequence number / version number:

Each change to a user's data is assigned a monotonically increasing global sequence number.

Every data batch sent from the backend includes the maximum sequence number of that batch.

The terminal includes the highest sequence number it has already received in each request.
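
The three rules above can be sketched in a few lines. This is a minimal illustration, not WeChat's implementation: the names UserLog, Terminal, sync, and pull, the per-user change log, and the batch limit are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class UserLog:
    """Hypothetical per-user change log kept by the backend."""
    next_seq: int = 1
    changes: list = field(default_factory=list)  # (seq, payload), ascending by seq

    def append(self, payload):
        # Every change gets the next monotonically increasing sequence number.
        seq = self.next_seq
        self.next_seq += 1
        self.changes.append((seq, payload))
        return seq

    def sync(self, client_seq, limit=100):
        # One number from the terminal is enough to find everything it is missing.
        batch = [c for c in self.changes if c[0] > client_seq][:limit]
        max_seq = batch[-1][0] if batch else client_seq
        return batch, max_seq


class Terminal:
    """Hypothetical client that remembers only the highest sequence it has received."""

    def __init__(self):
        self.seq = 0
        self.items = []

    def pull(self, log, limit=100):
        # Keep requesting with the latest sequence number until nothing is missing.
        while True:
            batch, max_seq = log.sync(self.seq, limit)
            if not batch:
                break
            self.items.extend(payload for _, payload in batch)
            self.seq = max_seq
```

Because the terminal advances its sequence number only after it has stored a batch, a request lost on a flaky mobile network is simply repeated with the same number, and the backend resends the same range.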

Efficient Notification Mechanism

iOS – Apple Push Notification service (APNs)

Android and others – long connections

GPRS/EDGE signaling storm optimization

Adaptive heartbeat interval adjustment
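
The source does not describe how the heartbeat interval is adapted. One plausible approach, sketched here purely as an assumption, is to lengthen the interval while heartbeats keep succeeding and to step back when the long connection is found dead. The class name AdaptiveHeartbeat and the default bounds and step size are hypothetical.

```python
class AdaptiveHeartbeat:
    """Probe for the longest heartbeat interval the network path tolerates."""

    def __init__(self, min_s=60, max_s=600, step_s=30):
        self.min_s = min_s
        self.step_s = step_s
        self.interval = min_s   # start conservatively
        self.ceiling = max_s    # longest interval not yet known to fail

    def on_success(self):
        # The connection survived this interval, so try a longer one next time.
        self.interval = min(self.interval + self.step_s, self.ceiling)

    def on_failure(self):
        # The connection was dropped (for example by a NAT timeout):
        # the current interval is too long, so cap future probing below it.
        self.ceiling = max(self.min_s, self.interval - self.step_s)
        self.interval = self.ceiling
```

A longer interval means fewer radio wake-ups, which serves the power-saving and bandwidth-saving goals listed earlier, while the step back on failure keeps notifications timely.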

Three‑Layer Backend Architecture

Unified RPC Framework

Generate server and client code from Protocol Buffer definitions

Server: developers implement the defined interfaces

Client: applications call the generated client APIs locally

Hide network details

Support TCP/UDP based calls

Support long and short connections

Rich features

Sharding‑based SET distribution

Stateless storage using consistent hashing

Transparent service redirection

Comprehensive automated monitoring (QPS, response time, queue time, per‑interface call frequency and status code distribution, service call topology)
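
The consistent-hashing item in the list above can be illustrated with a small hash ring. This is a generic textbook sketch rather than the framework's actual code; the class name HashRing, the use of MD5, and the virtual-node count are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Map keys to nodes so that adding a node moves only a fraction of the keys."""

    def __init__(self, nodes, vnodes=100):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node is placed on the ring many times (virtual nodes) to even out load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def lookup(self, key):
        # The owner is the first ring point clockwise from the key's hash.
        idx = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

When a node is added, every key either stays on its previous node or moves to the new one, which is what lets callers locate data without a central routing table.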

High‑Concurrency Coroutine RPC

A synchronous call model on the server is easier to learn, use, and debug than an asynchronous one, but the number of processes and threads a single server can host is limited, which caps concurrency.

RPC based on user‑space threads (coroutines)

A single machine can support tens of thousands to a hundred thousand user‑space threads, limited only by CPU and memory.

Improves concurrency and performance.

Implementation of user‑space thread RPC

Based on makecontext / getcontext / swapcontext

Hook network calls: read / write / epoll

User‑space thread scheduling
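
The source implementation is built in C on makecontext / getcontext / swapcontext. The sketch below swaps in Python generators to illustrate the same three ingredients: a context switch (yield), a hooked network call that parks the task instead of blocking, and a user-space scheduler driven by an event loop. The names Scheduler, hooked_recv, and handler are hypothetical.

```python
import selectors
import socket
from collections import deque


class Scheduler:
    """Run many user-space tasks on one OS thread."""

    def __init__(self):
        self.ready = deque()
        self.selector = selectors.DefaultSelector()
        self.waiting = 0

    def spawn(self, task):
        self.ready.append(task)

    def run(self):
        while self.ready or self.waiting:
            if not self.ready:
                # Nothing runnable: block in the event loop until a socket is readable.
                for key, _ in self.selector.select():
                    self.selector.unregister(key.fileobj)
                    self.waiting -= 1
                    self.ready.append(key.data)
                continue
            task = self.ready.popleft()
            try:
                sock = next(task)  # run until the task yields a socket to wait on
            except StopIteration:
                continue           # task finished
            self.selector.register(sock, selectors.EVENT_READ, task)
            self.waiting += 1


def hooked_recv(sock, n):
    """Reads like a blocking recv to the caller, but yields to the scheduler first."""
    yield sock
    return sock.recv(n)


def handler(sock, results):
    # Handler code stays in the synchronous style; only the hook knows about scheduling.
    data = yield from hooked_recv(sock, 1024)
    results.append(data)
```

Each parked task costs only a generator object here (a small stack in the C version), which is why the count is bounded by memory and CPU rather than by OS thread limits.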

Nearby Access

Clients connect to the geographically closest IDC

Nearby network entry points covering major carriers

CDN for image upload/download

Tencent self‑built CDN

Akamai

Multi‑IDC Distribution Improves User Experience

Complex domestic network environment

Over 100 million overseas users distributed globally, facing diverse network conditions

Each IDC provides full functionality and all required data

Some data is shared across all IDCs, and some is independent to each IDC

Globally consistent account information

User data isolated per IDC: each user belongs to one IDC, which holds that user's attributes, relationship graph, and messages; SNS data such as photos, comments, and likes is shared selectively to reduce bandwidth

Highly Reliable Eventual Consistency for Data Distributed Across IDCs

Primary‑backup model for account and SNS data

The IDC where the user resides is the primary IDC

All other IDCs act as backups for that user

Updates propagate from primary to backups

Cross-IDC updates without strict real-time requirements go through a ZooKeeper-mediated primary-backup task queue

Consolidate cross‑IDC access interfaces

Redo mechanisms ensure reliable cross‑IDC updates

Data sequence numbers guarantee eventual consistency during redo
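
To see why sequence numbers make redo safe, consider a backup that remembers the highest sequence number it has applied per key and ignores anything older or equal. This is a minimal sketch under assumptions: the class name BackupStore is hypothetical, and each update is assumed to carry the full new value.

```python
class BackupStore:
    """Backup-IDC replica that tolerates duplicated and reordered updates."""

    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> highest sequence number applied so far

    def apply(self, key, seq, value):
        if seq <= self.versions.get(key, 0):
            return False    # duplicate or stale redo: safe to ignore
        self.data[key] = value
        self.versions[key] = seq
        return True
```

With this check the primary can redo an update as many times as needed after a timeout. Whatever order the retries arrive in, the backup converges on the value with the highest sequence number.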

Relationship‑graph cross‑IDC updates

Privacy control requires real‑time updates

Direct cross‑IDC network calls

Backend batch processing retries failed requests

Fault Tolerance and Disaster Recovery Mechanisms

Single IDC

Users are distributed by SET; each SET is independent

High‑availability remote disaster recovery

Each service’s primary IDC has a disaster‑recovery IDC

Challenges: seamless client connection during primary‑backup IDC switch and data consistency between them

Source: http://blog.xiayf.cn/2013/10/23/learning-in-tencent-backend-arch-of-weixin/
