How We Optimized WeChat’s Heartbeat to Cut Power and Bandwidth Usage
This article analyzes the heartbeat mechanisms of Line, WhatsApp, and WeChat and presents a simplified adaptive heartbeat algorithm, its testing methodology, a comparison of push strategies, GCM characteristics, and practical suggestions for reducing power consumption, network load, and latency.
1. Main Goal
The primary objective is to find the largest possible heartbeat interval for TCP keep‑alive connections without compromising message timeliness, thereby reducing Android WeChat’s channel resource consumption, server load, and battery drain. The approach references valuable practices from WhatsApp and Line and combines factors affecting TCP connection lifespan to implement an adaptive heartbeat algorithm, using GCM as an auxiliary channel for new‑message notifications.
2. WhatsApp, Line, and WeChat Push Strategy Analysis
2.1 WhatsApp
On devices without GCM support, WhatsApp uses a long‑connection plus heartbeat strategy similar to WeChat's, with a 4 min 45 s interval on both Wi‑Fi and mobile networks; after five heartbeats it actively disconnects and then reconnects. On GCM‑enabled devices, WhatsApp relies on GCM to wake the client: once woken, the client establishes a long connection, the server sends push messages over it directly, and the connection is closed after ten minutes without messages. When a later message arrives, the server sends a GCM notification, prompting the client to re‑establish the long connection.
2.2 Line
Line employs different strategies in various regions:
USA (with GCM): maintains a 7‑minute heartbeat on CDMA2000, keeps the long connection for half an hour, then disconnects. Upon receiving a GCM message, it reconnects and repeats the cycle.
Domestic, i.e. mainland China (no GCM): two strategies were observed. One uses a long connection plus heartbeat (4 min 45 s on Wi‑Fi, 7 min on mobile). The other uses polling (illustrated in Figure 2‑1): the client periodically sends a request, the server replies and closes the connection, and the client reconnects after a configurable interval.
Taiwan (no GCM): similar to the domestic polling strategy.
2.3 WeChat
WeChat does not use GCM; it maintains its own TCP long connection with a fixed heartbeat.
2.4 Typical Heartbeat Values
Platform | Wi‑Fi | Mobile
WhatsApp | 4 min 45 s | 4 min 45 s
Line | 3 min 20 s | 7 min
GCM | 15 min | 28 min
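To make the cost difference between these intervals concrete, the short sketch below converts each one into heartbeats per day. The 4 min 45 s value is attributed to WhatsApp based on Section 2.1; the dictionary labels and the function name are illustrative only.

```python
# Heartbeat intervals in seconds, taken from the values listed above.
INTERVALS_S = {
    "WhatsApp (Wi-Fi and mobile)": 4 * 60 + 45,
    "Line (Wi-Fi)": 3 * 60 + 20,
    "Line (mobile)": 7 * 60,
    "GCM (Wi-Fi)": 15 * 60,
    "GCM (mobile)": 28 * 60,
}

SECONDS_PER_DAY = 24 * 60 * 60


def heartbeats_per_day(interval_s: int) -> int:
    """Number of complete heartbeat periods that fit into one day."""
    return SECONDS_PER_DAY // interval_s


for name, interval_s in INTERVALS_S.items():
    print(f"{name}: {heartbeats_per_day(interval_s)} heartbeats/day")
```

Under this simple count, a 28-minute interval needs roughly one sixth of the wake-ups that a 4 min 45 s interval does, which is the saving the adaptive algorithm is trying to approach.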
2.5 Advantages of the Strategies
a) WeChat: shorter heartbeat interval yields the most timely new‑message alerts.
b) GCM (used by Line and WhatsApp): saves power and reduces system load by offloading push to the cloud.
c) Line polling: provides timely messages when active and saves power when idle.
2.6 Disadvantages of the Strategies
a) WeChat: higher heartbeat frequency increases power, data usage, and signaling load.
b) Line polling: can cause message delays up to 2.5 hours.
c) WhatsApp and Line GCM reliance: depend on Google’s push service; instability leads to delayed messages.
d) Domestic 2G/3G networks: frequent GCM disconnections result in poor push timeliness.
3. GCM Research
3.1 GCM Characteristics
a) Devices running Android versions below 2.2 lack GCM support; versions 2.2‑3.0 require the Google Play Store and a configured Google account, while 4.0.4+ works without an account.
b) GCM only transmits data (<4 KB) and leaves processing to the developer.
c) Android apps can receive messages without running, via broadcast.
d) GCM does not guarantee message order or delivery.
3.2 GCM Heartbeat Strategy and Issues
a) GCM keeps long connections alive with a 15‑minute Wi‑Fi heartbeat and 28‑minute mobile heartbeat.
b) Google can change the heartbeat interval for all Android devices (has not done so yet).
c) Fixed, long intervals cause NAT aging to drop connections on networks with short NAT timeouts (e.g., 2G), leading to delayed push.
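A rough way to see the cost of item c) is to estimate how long a device can be unreachable. The sketch below assumes a simplified model that is not spelled out in the article: the NAT mapping is dropped exactly at its timeout after the last packet, and the client only notices at its next heartbeat. The 5-minute timeout is the value reported in Appendix 6.1 for some 2G/3G networks; the function name is hypothetical.

```python
def worst_case_blackout_s(heartbeat_s: int, nat_timeout_s: int) -> int:
    """Seconds between NAT expiry and the next heartbeat, when push cannot arrive.

    Simplified model (an assumption): the connection is idle between heartbeats,
    and the client detects the dead connection only when it sends a heartbeat.
    """
    return max(0, heartbeat_s - nat_timeout_s)


GCM_MOBILE_HEARTBEAT_S = 28 * 60   # GCM mobile heartbeat from item a)
SHORT_NAT_TIMEOUT_S = 5 * 60       # short NAT timeout listed in Appendix 6.1

blackout_min = worst_case_blackout_s(GCM_MOBILE_HEARTBEAT_S, SHORT_NAT_TIMEOUT_S) // 60
print(f"worst-case blackout: {blackout_min} min")
```

In this model a 28-minute heartbeat behind a 5-minute NAT leaves a window of up to 23 minutes in every cycle during which a push cannot be delivered.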
3.3 GCM Availability and Stability
Testing shows low domestic availability due to OEM customizations, required Google accounts on older Android versions, and short NAT timeouts on 2G/3G. Some carriers block port 5228, preventing GCM connections. In contrast, US and Taiwan 3G networks exhibit high stability, with rare disconnections.
3.4 GCM Server Types
a) HTTP Server: synchronous API, up to 1000 devices per request.
b) XMPP Server: asynchronous API, single‑device or per‑user multi‑device, concurrency < 1000, requires Google whitelist.
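Because the HTTP server accepts at most 1000 devices per request, a sender with a larger audience has to split its recipient list. The following is a minimal sketch of that batching step only; it does not call any GCM API, and the names `batches` and `registration_ids` are illustrative.

```python
from typing import Iterator, List, Sequence

MAX_DEVICES_PER_REQUEST = 1000  # per-request limit stated for the HTTP server


def batches(registration_ids: Sequence[str],
            size: int = MAX_DEVICES_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` recipients."""
    for start in range(0, len(registration_ids), size):
        yield list(registration_ids[start:start + size])


# Example: 2500 placeholder IDs are split into three requests.
ids = [f"device-{i}" for i in range(2500)]
print([len(batch) for batch in batches(ids)])
```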
4. Potential Improvements for WeChat
Key improvement areas:
Public push channel
Use GCM as an auxiliary channel
Adaptive heartbeat interval optimization
4.1 Public Push Channel
Because GCM reliability is low domestically, many apps implement their own push, causing frequent wake‑ups and high power consumption. Third‑party public push services (e.g., Tencent’s Xinge) can be evaluated and adopted.
4.2 Using GCM as an Auxiliary Channel
GCM can be used to notify devices of new messages when the primary TCP connection fails, with controlled notification intervals (e.g., every five minutes).
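The throttling described here can be sketched on the server side as a per-device rate limiter. This is only an illustration under stated assumptions: the class name, the `tcp_alive` flag, and the in-memory dictionary are hypothetical, and the 300-second default simply mirrors the five-minute example above.

```python
from typing import Dict


class AuxiliaryNotifier:
    """Decides when a fallback notification may be sent for a device."""

    def __init__(self, min_interval_s: float = 300.0):
        self.min_interval_s = min_interval_s
        self._last_sent: Dict[str, float] = {}  # device id -> time of last notification

    def should_notify(self, device_id: str, tcp_alive: bool, now_s: float) -> bool:
        # The auxiliary channel is only used when the primary connection is down.
        if tcp_alive:
            return False
        last = self._last_sent.get(device_id)
        if last is not None and now_s - last < self.min_interval_s:
            return False  # still inside the throttle window
        self._last_sent[device_id] = now_s
        return True


notifier = AuxiliaryNotifier()
print(notifier.should_notify("device-1", tcp_alive=False, now_s=0.0))    # first notification
print(notifier.should_notify("device-1", tcp_alive=False, now_s=120.0))  # throttled
```

Passing `now_s` explicitly instead of reading a clock keeps the sketch deterministic and easy to test.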
4.3 Adaptive Heartbeat Interval Optimization
Factors influencing TCP connection lifespan include NAT timeout, DHCP lease time bugs, and network state changes. The adaptive algorithm measures NAT timeout, then dynamically adjusts heartbeat intervals to stay just below the critical value, using a “delayed heartbeat test” that requires three consecutive short‑heartbeat successes before considering the network stable.
Algorithm overview (Figure 4‑1):
During stable periods, the system monitors failures; if all five delayed‑heartbeat tests fail, it recomputes a safer interval (Figure 4‑2). Weekly, the system re‑enters the adaptive calculation phase to adapt to changing NAT timeouts.
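The referenced figures are not reproduced here, so the following is only a sketch of how such a state machine could look, not the WeChat implementation. The three-success warm-up and the five-failure threshold come from the text; the minimum, maximum, and step values, the phase names, and the step-down rule in the stable phase are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveHeartbeat:
    min_s: int = 180          # assumed shortest ("short heartbeat") interval
    max_s: int = 600          # assumed upper bound for probing
    step_s: int = 60          # assumed probing increment
    warmup_needed: int = 3    # consecutive short-heartbeat successes (from the text)
    fail_limit: int = 5       # failed tests before recomputing (from the text)
    interval_s: int = 0
    last_good_s: int = 0
    phase: str = "warmup"
    streak: int = 0           # successes during warm-up, failures during stable

    def __post_init__(self) -> None:
        self.interval_s = self.last_good_s = self.min_s

    def on_result(self, ok: bool) -> int:
        """Record one heartbeat outcome and return the next interval in seconds."""
        if self.phase == "warmup":
            self.streak = self.streak + 1 if ok else 0
            if self.streak >= self.warmup_needed:
                self.phase, self.streak = "probing", 0
                self.interval_s = min(self.min_s + self.step_s, self.max_s)
        elif self.phase == "probing":
            if ok:
                self.last_good_s = self.interval_s
                if self.interval_s >= self.max_s:
                    self.phase = "stable"
                else:
                    self.interval_s = min(self.interval_s + self.step_s, self.max_s)
            else:
                # The delayed heartbeat failed: settle on the last interval that worked.
                self.interval_s = self.last_good_s
                self.phase = "stable"
        else:  # stable
            self.streak = 0 if ok else self.streak + 1
            if self.streak >= self.fail_limit:
                # Assumed recovery rule: fall back by one step, never below min_s.
                self.interval_s = self.last_good_s = max(self.min_s, self.interval_s - self.step_s)
                self.streak = 0
        return self.interval_s

    def restart_probe(self) -> None:
        """Re-enter the adaptive phase, for example from a weekly timer."""
        self.phase, self.streak = "warmup", 0
        self.interval_s = self.last_good_s = self.min_s
```

With the default values, three successes at 180 s start probing at 240 s; successes at 240 s and 300 s followed by a failure at 360 s leave the interval at 300 s, and five consecutive failures in the stable phase then lower it to 240 s.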
4.4 Redundant Sync and Heartbeat
Additional sync/heartbeat actions are triggered when the user lights the screen, switches to foreground, or reconnects to the network, ensuring timely message delivery.
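One way to wire these triggers is a small dispatcher that maps the three events to a sync request. The sketch below is illustrative: the event names are placeholders rather than Android API constants, and the debounce gap is an added assumption to avoid back-to-back syncs when, say, the screen turns on and the app comes to the foreground at the same moment.

```python
from typing import Optional

# Placeholder event names for the three triggers described above.
SYNC_TRIGGERS = {"screen_on", "foreground", "network_reconnected"}


class RedundantSync:
    def __init__(self, min_gap_s: float = 10.0):  # debounce gap is an assumption
        self.min_gap_s = min_gap_s
        self._last_sync_s: Optional[float] = None

    def on_event(self, event: str, now_s: float) -> bool:
        """Return True if this event should trigger an extra sync/heartbeat."""
        if event not in SYNC_TRIGGERS:
            return False
        if self._last_sync_s is not None and now_s - self._last_sync_s < self.min_gap_s:
            return False  # a sync just happened; skip the duplicate
        self._last_sync_s = now_s
        return True


sync = RedundantSync()
print(sync.on_event("screen_on", now_s=0.0))    # triggers a sync
print(sync.on_event("foreground", now_s=2.0))   # suppressed by the debounce gap
```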
5. Risks and Mitigations
5.1 DHCP Lease Issues
Android’s bug of not renewing expired DHCP leases can cause sudden TCP connection loss. Mitigation: collect heartbeat success rates, report to backend, and adjust heartbeat ranges per region.
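The mitigation can be sketched as a backend aggregator that tracks heartbeat outcomes per region and interval. Everything specific here is an assumption for illustration: the class and method names, the region keys, and the 95% reliability threshold are not taken from the article.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class HeartbeatStats:
    def __init__(self) -> None:
        # (region, interval_s) -> [successes, total reports]
        self._counts: Dict[Tuple[str, int], List[int]] = defaultdict(lambda: [0, 0])

    def record(self, region: str, interval_s: int, ok: bool) -> None:
        counts = self._counts[(region, interval_s)]
        counts[0] += 1 if ok else 0
        counts[1] += 1

    def success_rate(self, region: str, interval_s: int) -> Optional[float]:
        counts = self._counts.get((region, interval_s))
        return counts[0] / counts[1] if counts else None

    def max_reliable_interval(self, region: str, threshold: float = 0.95) -> Optional[int]:
        """Largest reported interval whose success rate meets the assumed threshold."""
        good = [interval for (r, interval), (ok, total) in self._counts.items()
                if r == region and ok / total >= threshold]
        return max(good) if good else None


stats = HeartbeatStats()
for _ in range(20):
    stats.record("region-a", 240, ok=True)
for i in range(20):
    stats.record("region-a", 300, ok=(i % 2 == 0))
print(stats.success_rate("region-a", 300), stats.max_reliable_interval("region-a"))
```

The result of `max_reliable_interval` could then serve as the upper bound of the heartbeat range pushed to clients in that region.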
5.2 Other TCP Longevity Factors
Readers are invited to contribute other factors that affect how long a TCP connection stays alive.
6. Appendices
6.1 NAT Timeout Overview
Explanation of NAT behavior and a table of observed NAT timeout values for various carriers and regions.
Region / Network | NAT Timeout
China Mobile 2G/3G | 5 min
China Unicom 2G | 5 min
China Telecom 3G | > 28 min
USA 3G | > 28 min
Taiwan 3G | > 28 min
Long‑connection heartbeat must be less than NAT timeout; otherwise the connection is dropped and push cannot be delivered until reconnection.
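As a small illustration of this rule, the sketch below checks a candidate heartbeat against the observed values in the table. The "> 28 min" entries are treated as lower bounds, so anything above 28 minutes cannot be confirmed as safe for those networks; the function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple

# network -> (timeout in seconds, True if the value is only a lower bound "> N")
NAT_TIMEOUTS: Dict[str, Tuple[int, bool]] = {
    "China Mobile 2G/3G": (5 * 60, False),
    "China Unicom 2G": (5 * 60, False),
    "China Telecom 3G": (28 * 60, True),
    "USA 3G": (28 * 60, True),
    "Taiwan 3G": (28 * 60, True),
}


def confirmed_safe_networks(heartbeat_s: int) -> List[str]:
    """Networks on which the heartbeat is known to be shorter than the NAT timeout."""
    safe = []
    for name, (timeout_s, is_lower_bound) in NAT_TIMEOUTS.items():
        # For a "> N" entry, a heartbeat of exactly N is still below the real timeout.
        if heartbeat_s < timeout_s or (is_lower_bound and heartbeat_s == timeout_s):
            safe.append(name)
    return safe


print(confirmed_safe_networks(4 * 60 + 45))  # short enough for every listed network
print(confirmed_safe_networks(7 * 60))       # too long for the 5-minute networks
```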
6.2 Android DHCP Lease Bug
Android may not renew DHCP leases, continuing to use expired IPs, leading to silent TCP failures. The issue sometimes self‑corrects halfway through the lease period.
WeChat Client Technology Team
Official account of the WeChat mobile client development team, sharing development experience, cutting‑edge tech, and little‑known stories across Android, iOS, macOS, Windows Phone, and Windows.
