Design and Optimization of Large‑Scale Instant Messaging Backend Architecture
This article analyses the architecture of high-traffic instant-messaging services such as WeChat and Momo. It covers long-connection handling, short-connection HTTP versus long-connection TCP protocols, custom binary messaging, smart routing, load balancing, sharding, replication, and the engineering trade-offs required for massive scalability and reliability.
The article begins with an overview of IM traffic characteristics—massive concurrent connections, limited file descriptors per machine, and the need for virtual machines or kernel tuning to increase socket capacity, eventually leading to distributed solutions.
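As a rough illustration of the per-machine descriptor limit mentioned above, the following Python sketch reads the limit for the current process and tries to raise the soft limit. It assumes a Unix-like host (the standard `resource` module is not available on Windows); the helper names `fd_limits` and `raise_soft_limit` are hypothetical and not taken from the article.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Try to raise the soft descriptor limit to `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards. Each open TCP connection
    consumes one descriptor, so this value bounds connections per process.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target <= soft:
        return soft  # never lower the limit in this sketch
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        return soft  # the platform refused the new value; keep the old one
    return target


if __name__ == "__main__":
    print("before:", fd_limits())
    print("soft limit now:", raise_soft_limit(65535))
```

Raising the hard limit itself, or system-wide ceilings such as `fs.file-max` on Linux, needs administrator privileges, and even a tuned machine has a finite capacity, which is why the article moves on to distributing connections across many servers.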
It then describes WeChat's dual-protocol design: a short-connection HTTP service (short.weixin.qq.com) for login, friend management, sync, and other control APIs, and a long-connection TCP service (long.weixin.qq.com) for real-time text, voice, image, and video transmission. Example request headers are shown below:
POST /cgi-bin/micromsg-bin/auth HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0
Content-Type: application/x-www-form-urlencoded
Host: short.weixin.qq.com
Content-Length: 174

POST /cgi-bin/micromsg-bin/newsync HTTP/1.1
Host: short.weixin.qq.com
User-Agent: Android QQMail HTTP Client
Cache-Control: no-cache
Connection: Keep-Alive
Content-Type: application/octet-stream
Content-Length: 206

The short service runs on port 8080 using protobuf-encoded bodies, while the long service also uses port 8080 with a binary protocol similar to Microsoft ActiveSync. Media uploads follow an incremental 8 KB chunk strategy, first sending a thumbnail, then the full file.
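The upload order described above (thumbnail first, then the file in 8 KB pieces) can be sketched as a simple generator. This is an illustration only: the tuple layout, the `start_offset` resume parameter, and sending the thumbnail as a single unit are assumptions, not details of WeChat's actual protocol.

```python
CHUNK_SIZE = 8 * 1024  # 8 KB, the chunk size quoted in the article


def upload_plan(thumbnail, payload, start_offset=0, chunk_size=CHUNK_SIZE):
    """Yield (kind, offset, data) steps: the thumbnail, then file chunks.

    `start_offset` lets an interrupted upload resume from the last
    acknowledged byte instead of starting over (hypothetical behaviour).
    """
    if start_offset == 0:
        # The receiver can show a preview before the full file arrives.
        yield ("thumbnail", 0, thumbnail)
    for offset in range(start_offset, len(payload), chunk_size):
        yield ("chunk", offset, payload[offset:offset + chunk_size])


if __name__ == "__main__":
    image = bytes(20000)
    for kind, offset, data in upload_plan(b"tiny-preview", image):
        print(kind, offset, len(data))
```

Small chunks keep the cost of a dropped connection low on a weak network: at most one 8 KB piece has to be resent.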
Problems with XMPP (large XML payloads, poor performance on weak networks) are highlighted, and the article notes that both WeChat and Momo moved away from XMPP to custom, lightweight protocols that resemble TCP/IP with a unique ID instead of an IP address.
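To make the contrast with XML concrete, here is a minimal sketch of a length-prefixed binary frame addressed by a numeric user ID. The field layout (4-byte body length, 8-byte recipient ID, 2-byte message type) is entirely hypothetical; neither WeChat's nor Momo's real wire format is given in the article.

```python
import struct

# Hypothetical header: body length (uint32), recipient ID (uint64),
# message type (uint16), all big-endian with no padding: 14 bytes total.
HEADER = struct.Struct("!IQH")


def encode_frame(recipient_id, msg_type, body):
    """Prefix `body` with a fixed-size binary header."""
    return HEADER.pack(len(body), recipient_id, msg_type) + body


def decode_frames(buffer):
    """Split a byte stream into complete frames.

    Returns (frames, leftover), where each frame is
    (recipient_id, msg_type, body) and `leftover` holds the bytes of a
    frame that has not fully arrived yet.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        length, recipient_id, msg_type = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # body is incomplete; wait for more data
        frames.append((recipient_id, msg_type, bytes(buffer[offset + HEADER.size:end])))
        offset = end
    return frames, bytes(buffer[offset:])


if __name__ == "__main__":
    stream = encode_frame(10001, 1, b"hello") + encode_frame(10002, 2, b"")
    print(decode_frames(stream))
```

The header costs a fixed 14 bytes per message regardless of content, and the recipient ID plays the role that a destination address plays in an IP packet.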
For Momo, the early use of XMPP is described, followed by its shortcomings (high traffic, unreliable on 2G/3G, complex handshake). The redesign adopts a private protocol inspired by Redis, aiming for efficiency on weak networks, reliability (no message loss), and easy extensibility.
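The article does not say how the redesigned protocol achieves "no message loss". One common approach, shown here purely as an assumption, is to number each message, keep it until the peer acknowledges it, resend anything unacknowledged after a reconnect, and have the receiver drop duplicates. The class names `Outbox` and `Inbox` are hypothetical.

```python
class Outbox:
    """Sender side: keep every message until the peer acknowledges it."""

    def __init__(self):
        self._next_seq = 1
        self._pending = {}  # seq -> payload, kept in send order

    def send(self, payload):
        """Record the message and return the sequence number to put on the wire."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = payload
        return seq

    def ack(self, seq):
        """The peer confirmed receipt, so the message can be forgotten."""
        self._pending.pop(seq, None)

    def unacked(self):
        """Messages to resend after a reconnect, oldest first."""
        return list(self._pending.items())


class Inbox:
    """Receiver side: deliver each sequence number at most once."""

    def __init__(self):
        self._seen = set()

    def receive(self, seq, payload):
        """Return the payload on first delivery, or None for a duplicate."""
        if seq in self._seen:
            return None
        self._seen.add(seq)
        return payload


if __name__ == "__main__":
    out, inbox = Outbox(), Inbox()
    first = out.send(b"hi")
    out.send(b"are you there?")
    out.ack(first)
    for seq, payload in out.unacked():  # resend after a dropped connection
        print(seq, inbox.receive(seq, payload))
```

Resending gives at-least-once delivery; the duplicate check on the receiving side turns that into exactly-once delivery from the user's point of view. A production version would also need to bound the size of the seen set.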
Smart routing and connection strategy are discussed: supporting both TCP and HTTP, concurrent IP/port testing, dynamic IP list updates, automatic fallback from TCP to HTTP, and preference for the nearest reachable IP. Load‑balancing challenges such as single‑point failures and DNS latency are also mentioned.
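The connection strategy above can be sketched as follows: probe every candidate endpoint concurrently, prefer TCP, pick the fastest reachable address, and fall back to the HTTP list only when no TCP endpoint answers. Using connect latency as a stand-in for "nearest" is an assumption, as are the function names; the `probe` parameter is injectable so the logic can be exercised without a network.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor


def tcp_probe(endpoint, timeout=2.0):
    """Return the TCP connect latency in seconds, or None if unreachable."""
    host, port = endpoint
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def pick_endpoint(tcp_endpoints, http_endpoints, probe=tcp_probe):
    """Return (transport, endpoint) for the best reachable server, or None.

    All endpoints of one transport are probed in parallel. The HTTP list
    is only consulted when every TCP endpoint fails.
    """
    for transport, endpoints in (("tcp", tcp_endpoints), ("http", http_endpoints)):
        if not endpoints:
            continue
        with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
            latencies = list(pool.map(probe, endpoints))
        reachable = [(lat, ep) for lat, ep in zip(latencies, endpoints) if lat is not None]
        if reachable:
            _, best = min(reachable, key=lambda pair: pair[0])
            return transport, best
    return None


if __name__ == "__main__":
    # Placeholder addresses; a real client would load these from a
    # server-pushed IP list that is refreshed dynamically.
    print(pick_endpoint([("127.0.0.1", 8080)], [("127.0.0.1", 80)]))
```

Shipping a list of IPs and probing them directly also sidesteps the DNS latency and single-point-of-failure concerns mentioned above, since the client no longer depends on one resolved address.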
The WNS (Wireless Network Services) layer is introduced to address mobile‑internet issues like high latency, low bandwidth, packet loss, and carrier restrictions. Performance metrics (development time, connection success rate >99.9%, crash rate 0.02%) are presented.
Finally, the article delves into backend data storage: replication, sharding, consistency models, storage engines such as Bitcask (which keeps its key index in memory) and LSM-trees, and automated migration. It outlines four design goals: high throughput, asynchronous processing, manageable complexity, and automatic error recovery.
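To illustrate how sharding, replication, and low-cost migration can fit together, here is a small consistent-hash ring. The article does not state which partitioning scheme these services use, so consistent hashing is an assumption chosen because adding a node moves only a fraction of the keys; `HashRing` and its parameters are hypothetical names.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes and replica selection."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node appears `vnodes` times to even out the load.
        self._ring = sorted((_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, key, count=2):
        """Return up to `count` distinct nodes for `key`.

        The first node is the primary shard; the rest hold replicas.
        """
        count = min(count, self._node_count)
        if count <= 0:
            return []
        start = bisect.bisect(self._points, _hash(key))
        chosen = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == count:
                    break
        return chosen


if __name__ == "__main__":
    ring = HashRing(["db-a", "db-b", "db-c"])
    print(ring.replicas("user:42"))
```

When a fourth node joins, only the keys whose ring position now falls just before one of its virtual nodes change primary, so a migration job has to copy just that slice rather than reshuffle every shard.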
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.