How WeChat & Momo Scale IM: Lessons on Battery, Network, and Custom Protocols

This article analyzes the architectural choices behind WeChat and Momo instant‑messaging services, covering battery and traffic constraints, network reliability, the shift from XMPP to proprietary long/short connections, protocol design with protobuf, and operational strategies for scaling massive user bases.

21CTO
Analysis of WeChat, Momo and other IM services (note: the original analysis is dated).

Battery is the biggest bottleneck for mobile devices; developers must avoid unnecessary background processes and tune heartbeat intervals. Traffic is also critical for users with limited data plans, so every kilobyte counts.

Network Challenges

Instant messaging must work smoothly on any network. XMPP works well on strong networks but fails in weak conditions, leading to poor user experience. WeChat abandoned XMPP in favor of a hybrid long‑link/short‑link approach.

WeChat Connection Design

Two domains are used:

short.weixin.qq.com – HTTP‑based short link (port 8080, binary protobuf body) for login, friend management, message sync, user avatar, logout, and activity logging.

long.weixin.qq.com – TCP long link (port 8080, similar to Microsoft ActiveSync) for sending/receiving text, voice, image, video, etc.

All of the long-link requests listed above run over the TCP long connection. Image and video uploads are split into two requests: the first carries a thumbnail, the second the full data.

Data Transfer Details

Incremental upload strategy: ~8 KB chunks are sent, each confirmed by the server before proceeding.
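The chunk-and-confirm loop described above can be sketched as follows. This is a minimal illustration, not WeChat's actual client code: `send_chunk`, `FakeServer`, the retry count, and the "bytes confirmed so far" acknowledgement format are all assumptions made for the example.

```python
from typing import Callable, Iterator, Tuple

CHUNK_SIZE = 8 * 1024  # roughly the 8 KB chunk size mentioned in the text


def iter_chunks(data: bytes, size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, chunk) pairs that cover the payload in order."""
    for offset in range(0, len(data), size):
        yield offset, data[offset:offset + size]


def upload(data: bytes, send_chunk: Callable[[int, bytes], int],
           max_retries: int = 3) -> int:
    """Send one chunk at a time and wait for the ack before sending the next.

    send_chunk(offset, chunk) is a hypothetical transport call that returns
    the total number of bytes the server has confirmed so far.
    Returns the number of transmissions, retries included.
    """
    sends = 0
    for offset, chunk in iter_chunks(data):
        expected = offset + len(chunk)
        for _ in range(max_retries):
            sends += 1
            if send_chunk(offset, chunk) == expected:
                break  # chunk confirmed, move on to the next one
        else:
            raise ConnectionError(f"chunk at offset {offset} was not acknowledged")
    return sends


class FakeServer:
    """In-memory stand-in for the server; loses the first attempt at chosen offsets."""

    def __init__(self, drop_offsets=()):
        self.received = bytearray()
        self._drop = set(drop_offsets)

    def send_chunk(self, offset: int, chunk: bytes) -> int:
        if offset in self._drop:
            self._drop.discard(offset)  # simulate a single lost packet
            return len(self.received)
        if offset == len(self.received):  # accept only the next expected chunk
            self.received += chunk
        return len(self.received)
```

Because only one unconfirmed chunk is ever in flight, a dropped connection costs at most one chunk of retransmission, which is the property that matters on a weak mobile link.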

Upload flow: thumbnail → text message → full file. Download flow: the thumbnail first, then the original image. Unlike uploads, a download is not chunked; the entire payload is pushed in one batch.

Protobuf vs JSON

Protobuf, a Google‑originated binary serialization format, offers cross‑language support via the protoc compiler and reduces payload size compared to JSON, which is text‑based.
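To make the size difference concrete, the sketch below hand-encodes a small message in the protobuf wire format (varint keys and values, length-delimited strings) and compares it with compact JSON. It avoids the protoc toolchain so that it stays self-contained; the message fields and field numbers are hypothetical and are not taken from WeChat's schema.

```python
import json


def varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def field_varint(field_no: int, value: int) -> bytes:
    """Wire type 0 (varint): key is (field_no << 3) | 0, then the value."""
    return varint(field_no << 3) + varint(value)


def field_bytes(field_no: int, value: bytes) -> bytes:
    """Wire type 2 (length-delimited): key, length, then the raw bytes."""
    return varint((field_no << 3) | 2) + varint(len(value)) + value


# Hypothetical chat message; names and field numbers are illustrative only.
msg = {"msg_id": 150, "from_uid": 10001, "to_uid": 10002, "text": "hello"}

pb = (
    field_varint(1, msg["msg_id"])
    + field_varint(2, msg["from_uid"])
    + field_varint(3, msg["to_uid"])
    + field_bytes(4, msg["text"].encode("utf-8"))
)
js = json.dumps(msg, separators=(",", ":")).encode("utf-8")

print(len(pb), len(js))
```

For this message the binary encoding takes 16 bytes against 61 bytes of compact JSON, mostly because field names are replaced by one-byte numeric keys. The ratio depends heavily on the message shape, so it should be read as an illustration rather than a benchmark.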

Momo Design

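Early Momo ran on XMPP with 300,000 to 400,000 connections and suffered from high traffic overhead, unreliable delivery across varied networks, and a complex login handshake. The tight coupling between server and client also caused message loss whenever either side entered a “half‑closed” state, that is, when one end had dropped the connection while the other still treated it as open.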

Optimization Strategies

Connection layer: simple, asynchronous message forwarding; supports 700 k connections per server. Logic layer: handles session verification, message storage, and asynchronous queues.

Momo replaced XMPP with a private protocol inspired by the Redis protocol, with three goals:

High efficiency on weak networks.

Reliability – no message loss.

Ease of extension.
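The article does not give Momo's wire format, so the following is only a sketch of what a Redis-inspired framing could look like: each message is an array of length-prefixed byte strings, as in the Redis serialization protocol. The `encode` and `decode` names and the example command are assumptions. The sketch covers framing only; guaranteeing no message loss would additionally need per-message acknowledgements, which are not shown.

```python
from typing import List, Optional, Tuple


def encode(parts: List[bytes]) -> bytes:
    """Frame a message as a Redis-style array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for p in parts:
        out.append(b"$%d\r\n" % len(p))
        out.append(p + b"\r\n")
    return b"".join(out)


def _read_line(buf: bytes, pos: int) -> Optional[Tuple[bytes, int]]:
    """Return (line, next_pos) for the CRLF-terminated line at pos, if complete."""
    end = buf.find(b"\r\n", pos)
    if end == -1:
        return None
    return buf[pos:end], end + 2


def decode(buf: bytes) -> Optional[Tuple[List[bytes], int]]:
    """Try to parse one frame from the start of buf.

    Returns (parts, bytes_consumed), or None when the frame is incomplete,
    so the caller can keep buffering on a slow or flaky link.
    """
    line = _read_line(buf, 0)
    if line is None:
        return None
    head, pos = line
    if not head.startswith(b"*"):
        raise ValueError("expected array header")
    parts: List[bytes] = []
    for _ in range(int(head[1:])):
        line = _read_line(buf, pos)
        if line is None:
            return None
        head, pos = line
        if not head.startswith(b"$"):
            raise ValueError("expected bulk string header")
        size = int(head[1:])
        if len(buf) < pos + size + 2:  # payload plus trailing CRLF not here yet
            return None
        parts.append(buf[pos:pos + size])
        pos += size + 2
    return parts, pos
```

Explicit length prefixes keep parsing cheap and binary-safe, and new commands or extra arguments can be added without changing the framing, which fits the stated goals of efficiency and ease of extension.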

Smart Routing & Connection Policy

Multiple ports and dual‑protocol support (TCP & HTTP) to bypass carrier restrictions.

Concurrent IP/port/protocol testing based on a candidate IP list.

Clients report latency when idle; backend updates IP lists accordingly.

Automatic fallback from TCP to HTTP if TCP fails.

Prefer the nearest reachable IP; avoid DNS reliance on mobile networks.
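The connection policy above can be sketched with concurrent probes. This is a simplified illustration under several assumptions: `Endpoint`, `pick_endpoint`, and `fake_probe` are hypothetical names, measured latency stands in for "nearest", and the probe is simulated rather than opening real sockets. The addresses come from a documentation-only range.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int
    protocol: str  # "tcp" or "http"


Probe = Callable[[Endpoint], Awaitable[None]]


async def timed_probe(ep: Endpoint, probe: Probe,
                      timeout: float) -> Optional[Tuple[float, Endpoint]]:
    """Return (latency_seconds, endpoint), or None if the probe fails or times out."""
    loop = asyncio.get_running_loop()
    start = loop.time()
    try:
        await asyncio.wait_for(probe(ep), timeout)
    except (OSError, asyncio.TimeoutError):
        return None
    return loop.time() - start, ep


async def pick_endpoint(candidates: List[Endpoint], probe: Probe,
                        timeout: float = 2.0) -> Optional[Endpoint]:
    """Probe every candidate at once; take the fastest TCP one, else fall back to HTTP."""
    results = await asyncio.gather(
        *(timed_probe(ep, probe, timeout) for ep in candidates)
    )
    reachable = sorted((r for r in results if r is not None), key=lambda r: r[0])
    for proto in ("tcp", "http"):  # TCP first, HTTP only as a fallback
        for _, ep in reachable:
            if ep.protocol == proto:
                return ep
    return None


async def fake_probe(ep: Endpoint) -> None:
    """Simulated network: port 8080 is blocked, the rest answer after a short delay."""
    if ep.port == 8080:
        raise ConnectionRefusedError("blocked by carrier (simulated)")
    await asyncio.sleep(0.05 if ep.protocol == "tcp" else 0.01)


CANDIDATES = [
    Endpoint("203.0.113.1", 8080, "tcp"),
    Endpoint("203.0.113.2", 443, "tcp"),
    Endpoint("203.0.113.1", 80, "http"),
]

chosen = asyncio.run(pick_endpoint(CANDIDATES, fake_probe))
```

In a real client the candidate list would be the server-supplied IP list rather than DNS results, and the measured latencies are what the client would report back while idle.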

WNS (Wireless Network Services)

WNS provides solutions for common mobile‑Internet problems: data channels, large‑scale long‑link management, monitoring, login authentication, and rate limiting. Reported performance metrics include a 99.9% connection success rate, a 0.02% crash rate, and sub‑second latency even under extreme network conditions.

WeChat Backend Architecture

The backend design focuses on three things: converging distributed‑systems problems so that they are solved in one place, separating backend logic from data storage, and keeping data consistent across multiple replicas.

Key components:

Sequence number generator for ordered writes.

Paxos‑like consensus and quorum algorithms for conflict resolution.

Replication and sharding strategies using KV groups, consistent hashing, and automatic rebalancing.

Storage models: pure‑memory, Bitcask, small‑table systems, LSM‑tree.
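Of the components listed, consistent hashing is the easiest to show in a few lines. The sketch below is a generic textbook ring with virtual nodes, not WeChat's implementation; `HashRing`, the node names, the MD5-based hash, and the number of virtual nodes are all assumptions.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each node owns several virtual points on the ring."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._ring: List[int] = []        # sorted virtual-node positions
        self._owner: Dict[int, str] = {}  # position -> physical node
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # 64-bit collisions between virtual nodes are ignored in this sketch.
        for i in range(self._vnodes):
            h = _hash(f"{node}#{i}")
            self._owner[h] = node
            bisect.insort(self._ring, h)

    def remove(self, node: str) -> None:
        self._ring = [h for h in self._ring if self._owner[h] != node]
        self._owner = {h: n for h, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring position clockwise from the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect(self._ring, _hash(key)) % len(self._ring)
        return self._owner[self._ring[idx]]
```

When a node is added, the only keys that change owner are the ones that now map to the new node; everything else stays where it was. That limited data movement is what makes automatic rebalancing of KV groups practical.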

Design goals include high throughput, asynchronous processing, low complexity, and libco‑based concurrency.

Automatic repair mechanisms prevent error accumulation via full scans and proactive health checks.

---

Source: CSDN article
