How to Simulate 10 Billion WeChat Red Packet Requests on a Single Server

This article describes a backend engineering exercise that simulates the load of 10 billion shake‑red‑packet requests on a single‑machine prototype holding one million connections and reaching a peak of 60 k QPS. It covers the methodology, hardware setup, code design, monitoring, and performance analysis.

21CTO

Mar 6, 2020

Background

Two rates are used throughout: QPS (queries per second) and PPS (packets per second). Shaking a red packet means a client requests a red packet and receives one if any is available. Sending a red packet means the system generates a red packet for a set of users, each of whom can claim a portion of its amount.

Goal

The objective is to support one million concurrent connections on a single machine, simulate the shake‑red‑packet and send‑red‑packet processes, and achieve a peak QPS of 60 000 while maintaining stable operation.

Target Determination

Based on the production environment of 638 servers handling roughly 1.43 billion users, the per‑machine user capacity is estimated at about 900 k. Assuming 600 active servers, the per‑machine QPS target is calculated as 23 k–60 k, with specific test targets of 30 k and 60 k QPS.

Hardware & Software

Software stack: Golang 1.8r3, shell scripts, and Python. Golang was chosen for prototype speed.

Server: Dell R2950 with an 8‑core CPU and 16 GB RAM, running Ubuntu 12.04. The machine was not dedicated to this test.

Clients: 17 ESXi 5.0 VMs running Debian 5.0, each with 4 CPUs and 5 GB RAM. Each VM establishes 60 000 connections, which gives about one million connections in total.

Technical Implementation

Connections are partitioned into multiple independent sets (SETs), each managing a few thousand connections, which reduces the number of goroutines needed. Each connection has exactly one goroutine that reads its messages, and a dedicated goroutine per SET processes them, which minimizes lock contention. NTP synchronizes server clocks, and clients use timestamps to decide when to send requests. To hit a target rate, client IDs are divided into groups: for example, 100 000 users at 5 000 QPS gives 100 000 / 5 000 = 20 groups of 5 000 clients. One group sends in each second, so every client sends once every 20 seconds.
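The two ideas above can be sketched in Go. This is a minimal illustration under stated assumptions, not the article's actual code: the names Message, Set, ShouldSend, and SimulateSecond are hypothetical, a single SET stands in for the many SETs of the real design, and short‑lived goroutines stand in for per‑connection readers.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical request read from one client connection.
type Message struct {
	ClientID int
}

// Set owns a subset of the connections. Reader goroutines push into inbox;
// one worker goroutine drains it, so handled needs no lock.
type Set struct {
	inbox   chan Message
	done    chan struct{}
	handled int // written only by the worker goroutine
}

// NewSet starts the single worker goroutine for this SET.
func NewSet(buffer int) *Set {
	s := &Set{inbox: make(chan Message, buffer), done: make(chan struct{})}
	go func() {
		for range s.inbox {
			s.handled++ // real code would process the shake request here
		}
		close(s.done)
	}()
	return s
}

// Close stops the worker and returns how many messages it processed.
// Callers must make sure no reader is still sending on inbox.
func (s *Set) Close() int {
	close(s.inbox)
	<-s.done
	return s.handled
}

// ShouldSend reports whether a client sends during the given second.
// Clients are split into totalUsers/qps groups; one group fires per second.
func ShouldSend(unixSecond int64, clientID, totalUsers, qps int) bool {
	if qps <= 0 {
		return false
	}
	groups := totalUsers / qps
	if groups < 1 {
		groups = 1
	}
	return int(unixSecond%int64(groups)) == clientID%groups
}

// SimulateSecond runs one simulated second and returns the number of
// requests the SET worker handled.
func SimulateSecond(unixSecond int64, totalUsers, qps int) int {
	set := NewSet(1024) // buffer size is an arbitrary assumption
	var wg sync.WaitGroup
	for id := 0; id < totalUsers; id++ {
		if !ShouldSend(unixSecond, id, totalUsers, qps) {
			continue
		}
		wg.Add(1)
		go func(id int) { // stands in for one connection's reader goroutine
			defer wg.Done()
			set.inbox <- Message{ClientID: id}
		}(id)
	}
	wg.Wait()
	return set.Close()
}

func main() {
	// 100 000 users at 5 000 QPS: 20 groups, so 5 000 requests in any second.
	fmt.Println(SimulateSecond(7, 100000, 5000)) // prints 5000
}
```

Because only the worker goroutine touches a SET's state, no mutex is needed on the hot path. The time‑modulo rule only yields the intended rate if the machines agree on the current second, which is why clock synchronization matters.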

Monitoring

Two main metrics are collected: per‑second request counts (embedded counters in the code) and network packet statistics (using a Python script combined with ethtool). Monitoring screenshots illustrate packet flow and request rates.
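An embedded per‑second request counter of the kind described above might look like the following sketch. It assumes an atomic counter that is read and reset once per second; the QPSCounter type and its method names are hypothetical, not taken from the prototype.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// QPSCounter counts requests with an atomic integer so that request
// handlers never block on a lock just to record a hit.
type QPSCounter struct {
	n int64
}

// Inc records one handled request.
func (c *QPSCounter) Inc() { atomic.AddInt64(&c.n, 1) }

// Snapshot returns the count since the previous snapshot and resets it.
// Called once per second, the returned value is that second's QPS.
func (c *QPSCounter) Snapshot() int64 { return atomic.SwapInt64(&c.n, 0) }

func main() {
	var c QPSCounter
	for i := 0; i < 30000; i++ {
		c.Inc()
	}
	fmt.Println(c.Snapshot()) // prints 30000
	fmt.Println(c.Snapshot()) // prints 0, the counter was reset
}
```

In a running server, a goroutine driven by a time.Ticker with a one‑second period could call Snapshot and write the value to a log for graphing.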

Practice Phases

Phase 1: Start the server and monitoring services, then launch the 17 client VMs to establish one million connections. Verify the connection counts with ss, for example with this alias:

alias ss2='ss -ant | grep 1025 | grep EST | awk -F: "{print \$8}" | sort | uniq -c'

Phase 2: Increase client QPS to 30 k via the HTTP interface, run a red‑packet generator at 200 packets/s (40 k packets in total), and observe stable QPS around 30 k on both the client and server graphs.

Phase 3: Raise client QPS to 60 k, repeat the generation and consumption of red packets, and note the increased variance in QPS and packet counts due to network saturation.

Data Analysis

Client QPS graphs show three intervals (connection establishment, 30 k QPS, 60 k QPS). Server QPS graphs mirror the client behavior but exhibit a noticeable dip around minute 22, indicating a need for further code optimization. Combined client‑server graphs confirm that the system meets design expectations at 30 k QPS, while 60 k QPS introduces instability.

Additional metrics include red‑packet generation counts, shake‑red‑packet acquisition rates (≈200 per second at 30 k QPS), and Golang pprof data showing GC pauses under 10 ms on the older hardware.

Conclusion

The prototype successfully demonstrates a design that supports one million users and 30 k–60 k QPS per machine, accurately simulating the shake‑red‑packet and send‑red‑packet workflows of a large‑scale messaging platform. With 600 such machines, the system could process 10 billion shake requests in roughly seven minutes, validating the scalability of the approach.
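The fleet‑level estimate can be checked with a few lines of arithmetic. The sketch below assumes every one of the 600 machines sustains the same rate for the whole run; the function name SecondsToServe is hypothetical.

```go
package main

import "fmt"

// SecondsToServe returns how long a fleet needs to serve totalRequests
// when each machine sustains perMachineQPS.
func SecondsToServe(totalRequests float64, machines int, perMachineQPS float64) float64 {
	return totalRequests / (float64(machines) * perMachineQPS)
}

func main() {
	const total = 10e9 // 10 billion shake requests
	for _, qps := range []float64{30000, 60000} {
		s := SecondsToServe(total, 600, qps)
		fmt.Printf("%.0f QPS per machine: %.1f minutes\n", qps, s/60)
	}
	// Output:
	// 30000 QPS per machine: 9.3 minutes
	// 60000 QPS per machine: 4.6 minutes
}
```

Under these assumptions the seven‑minute figure falls between the two tested rates.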
