How WeChat Optimizes Mobile TCP Connections: Timeout, Strategies, and IP Sorting
This article examines WeChat's Mars STN module, detailing TCP connection timeout handling, the trade‑offs between serial, concurrent and composite connection strategies, and the evolution of IP‑Port sorting algorithms—from random combinations to history‑aware and forgetting mechanisms—to achieve high performance, low load, and high availability on mobile networks.
Introduction
WeChat's Mars is an open‑source, cross‑platform C++ library used on Android, iOS, Windows, Mac, and other platforms. It consists of several independent components: COMM (basic libraries such as sockets, threads, message queues, coroutines), XLOG (high‑performance logging), SDT (network diagnostics), and STN (signalling transmission network).
TCP Connection Basics
TCP provides reliable end‑to‑end transmission, and a connection is established via a three‑way handshake. Even the simple call
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
leaves room for optimization.
Connection Timeout and Retransmission
In unstable mobile networks, TCP timeout and retransmission are critical. Traditional TCP stacks may wait up to 75 seconds before reporting a timeout, which is unacceptable for most mobile apps. Since the protocol stack cannot be changed, an application‑level timeout mechanism is needed.
Choosing an appropriate timeout (e.g., 4 s, 10 s, 20 s, 30 s) depends on scenarios such as network unavailability, server overload, or weak signal conditions.
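An application-level connect timeout can be sketched in a few lines of Python. This is only an illustration, not Mars code: the helper name connect_with_timeout and the 10 s default are assumptions chosen for the example.

```python
import socket

def connect_with_timeout(host, port, timeout_s=10.0):
    """Return a connected socket, or None if the attempt fails or takes too long.

    Hypothetical helper: the name and the 10 s default are illustrative only.
    """
    try:
        # create_connection applies timeout_s to each address it tries,
        # instead of waiting for the OS retransmission schedule to give up.
        return socket.create_connection((host, port), timeout=timeout_s)
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return None
```

The caller decides what to do with None, for example moving on to the next IP&Port. Note that the returned socket keeps the same timeout for later reads and writes unless it is changed with settimeout.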
Platform Differences
Android typically uses exponential back‑off intervals (1, 2, 4, 8, 16, 32 s) with a total timeout of about 63 s, while iOS retransmits more aggressively at first (1, 1, 1, 1, 1, 2, 4, 8, 16, 32 s), totaling about 67 s.
Both platforms end up with a total timeout of roughly one minute, which degrades user experience.
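The totals can be checked with a little arithmetic. The sketch below assumes that the Android series ends with a 32 s interval (which is what makes it sum to the stated 63 s) and that each interval is the wait after one SYN before the next one is sent, with the last interval ending in failure. The function name syns_sent_before is hypothetical.

```python
from itertools import accumulate

# Retransmission intervals in seconds (the final 32 s on Android is an assumption).
ANDROID_INTERVALS_S = [1, 2, 4, 8, 16, 32]
IOS_INTERVALS_S = [1, 1, 1, 1, 1, 2, 4, 8, 16, 32]

def syns_sent_before(intervals, app_timeout_s):
    """Count SYNs (initial plus retransmissions) sent before an app-level timeout."""
    # The first SYN goes out at t=0; the last interval ends in giving up, not a resend.
    send_times = [0] + list(accumulate(intervals))[:-1]
    return sum(1 for t in send_times if t < app_timeout_s)

print(sum(ANDROID_INTERVALS_S), sum(IOS_INTERVALS_S))  # total seconds before the OS gives up
print(syns_sent_before(ANDROID_INTERVALS_S, 10), syns_sent_before(IOS_INTERVALS_S, 10))
```

Under these assumptions, a 10 s application timeout still lets several SYNs go out on either platform, which is one way to judge whether a shorter timeout gives the network a fair chance.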
Connection Termination
The four‑way handshake ends with the side that closes first entering TIME_WAIT, which lasts twice the maximum segment lifetime (MSL), typically 30–60 s. Excessive TIME_WAIT sockets can exhaust server resources; mitigations include using long‑lived connections or having the client initiate the close, so that TIME_WAIT lands on the client rather than the server.
Connection Strategies
Serial vs. Concurrent vs. Composite
Serial connections try one IP&Port at a time, using minimal resources but being slow. Concurrent connections launch multiple attempts simultaneously, achieving the fastest availability but increasing server load.
WeChat adopts a "composite" strategy: start with one connection, and if it does not succeed within 4 s, launch a second, continuing up to five IP&Port pairs. In the common case only one connection is opened, so server load stays close to that of the serial strategy, while a slow first attempt delays the user by about 4 s rather than a full connection timeout.
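The staggered start can be sketched with asyncio. This is a minimal illustration rather than the Mars implementation: composite_connect and its connect parameter (any coroutine function that takes an endpoint and either returns a connection or raises) are names invented for the example, and starting the next attempt immediately after a fast failure is an assumption.

```python
import asyncio

async def composite_connect(endpoints, connect, stagger=4.0, max_attempts=5):
    """Start attempts one by one, `stagger` seconds apart; return the first success.

    Returns None if every attempt fails. `connect` is a caller-supplied coroutine
    function (hypothetical interface) that raises on failure.
    """
    queue = list(endpoints[:max_attempts])
    pending = set()
    try:
        while queue or pending:
            if queue:
                pending.add(asyncio.ensure_future(connect(queue.pop(0))))
            # Only cap the wait while there is another endpoint left to launch.
            timeout = stagger if queue else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
            # No success yet: loop to launch the next endpoint (if any).
        return None
    finally:
        # Drop the attempts that lost the race.
        for task in pending:
            task.cancel()
```

A real caller could pass a wrapper around asyncio.open_connection as connect. Because losing attempts are cancelled as soon as one succeeds, the server normally sees a single connection per client.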
IP&Port Sorting Algorithms
Components of IP&Port
IP sources are prioritized as WXDNS IP, DNS IP, Auth IP, and Hardcode IP. Ports are typically two per service for redundancy.
Algorithm 1: Random Combination
Combine IP and port lists randomly, ensuring successive entries differ in both IP and port. This yields high performance and avoids bias toward any single resource, but under poor network conditions failed attempts can quickly get every resource banned.
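One simple way to get a random order in which neighbours differ in both IP and port is to shuffle both lists and alternate the ports. The sketch below is an assumption about how this could be done, not the Mars algorithm: it uses each IP once, and it relies on the IPs being unique and on at least two distinct ports being available. The name random_combination is hypothetical.

```python
import random

def random_combination(ips, ports, rng=None):
    """Pair each IP with a port so that adjacent entries differ in IP and port.

    Assumes unique IPs and at least two distinct ports.
    """
    rng = rng or random.Random()
    ips, ports = list(ips), list(ports)
    rng.shuffle(ips)
    rng.shuffle(ports)
    # Cycling through the shuffled ports guarantees neighbours never share a port.
    return [(ip, ports[i % len(ports)]) for i, ip in enumerate(ips)]
```

The first few entries of the result would feed one composite attempt, so a failure on one IP or one port is never retried immediately on the next try.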
Algorithm 2: History‑Aware ("Learn from History")
Maintain separate regular and fallback lists, and use four regular resources plus one fallback resource per composite attempt. Record success or failure for each IP&Port and score resources from those records, with thresholds that trigger a ban (e.g., more than 3 failures among the most recent 8 records within 10 minutes). Resources with no records receive random scores.
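The bookkeeping described above can be sketched as follows. The ban rule (more than 3 failures among the last 8 records within 10 minutes), the random score for unrecorded resources, and the four-plus-one mix come from the text; the success-ratio score and all class and function names are assumptions made for illustration.

```python
import random
from collections import defaultdict, deque

WINDOW_S = 10 * 60   # only records from the last 10 minutes count
MAX_RECORDS = 8      # keep the 8 most recent results per IP&Port
BAN_FAILURES = 3     # more than this many recent failures triggers a ban

class History:
    def __init__(self):
        self._records = defaultdict(lambda: deque(maxlen=MAX_RECORDS))

    def record(self, endpoint, ok, now):
        self._records[endpoint].append((now, ok))

    def _recent(self, endpoint, now):
        return [ok for ts, ok in self._records.get(endpoint, ()) if now - ts <= WINDOW_S]

    def is_banned(self, endpoint, now):
        return self._recent(endpoint, now).count(False) > BAN_FAILURES

    def score(self, endpoint, now):
        recent = self._recent(endpoint, now)
        if not recent:
            return random.random()               # unrecorded: random score
        return recent.count(True) / len(recent)  # assumed formula: success ratio

def sort_endpoints(endpoints, history, now):
    """Drop banned endpoints and order the rest by descending score."""
    usable = [ep for ep in endpoints if not history.is_banned(ep, now)]
    return sorted(usable, key=lambda ep: history.score(ep, now), reverse=True)

def pick_for_attempt(regular, fallback, history, now):
    """Four regular resources plus one fallback per composite attempt."""
    return sort_endpoints(regular, history, now)[:4] + sort_endpoints(fallback, history, now)[:1]
```

Because old records fall out of the 10 minute window, a banned resource becomes eligible again on its own once its failures age out.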
Algorithm 3: Forgetting History
To avoid stale high‑scoring entries after a failure, implement dual‑layer history (in‑memory and file), refresh file history every 24 hours, and blend historical and fresh scores, allowing rapid recovery after outages.
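The blending idea can be illustrated with a small function. Everything specific here is an assumption: the source does not give the formula, so the 0.7 weight, the reading of "refresh every 24 hours" as "ignore file history older than 24 hours", and the name blended_score are all hypothetical.

```python
FILE_HISTORY_TTL_S = 24 * 60 * 60  # file-backed history is considered stale after 24 h

def blended_score(memory_score, file_score, file_saved_at, now, fresh_weight=0.7):
    """Combine the fresh in-memory score with the older file-backed score.

    Either score may be None when that layer has no record for the resource.
    """
    file_is_stale = file_score is None or now - file_saved_at >= FILE_HISTORY_TTL_S
    if file_is_stale:
        return memory_score   # forget the old history entirely
    if memory_score is None:
        return file_score     # nothing fresh yet: fall back to the file layer
    # Weight recent behaviour more heavily so a resource can recover quickly.
    return fresh_weight * memory_score + (1 - fresh_weight) * file_score
```

With a weighting like this, a resource that scored well last week but is failing now drops quickly, and one that failed during an outage is not penalized indefinitely.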
Implementation details can be found in the simple_ipport_sort implementation in the Mars source code.
Conclusion
Optimizing TCP connections in mobile environments requires balancing timeout values, connection strategies, and IP&Port ordering. WeChat's Mars STN module evolved from simple serial attempts to a composite approach, and from random sorting to history‑aware and forgetting algorithms, achieving high performance, low server load, and high availability.
WeChat Client Technology Team
Official account of the WeChat mobile client development team, sharing development experience, cutting‑edge tech, and little‑known stories across Android, iOS, macOS, Windows Phone, and Windows.
