How to Simulate 100 Billion Red Packet Requests on a Single Server
This article details a practical approach to building a single‑machine prototype that supports one million concurrent connections and processes up to 60,000 QPS, simulating the massive load of 100 billion WeChat red‑packet requests with Go, Linux, and custom monitoring tools.
1. Introduction
Inspired by the InfoQ presentation "How to Build a Spring Festival Bonus System" (2015), the author explores whether the ideas can be reproduced locally to handle 100 billion red‑packet requests. The goal is a single‑machine prototype that supports one million connections and a peak QPS of 60 k.
2. Background Knowledge
QPS: Queries per second.
PPS: Packets per second.
Shake‑red‑packet: Client sends a request; if a red packet is available the server returns it.
Send‑red‑packet: Server creates a packet with a certain amount, assigns it to several users, and users request to claim portions of the amount.
3. Define Goals
3.1 User Count
Based on the original system (638 servers, ~1.43 billion registered accounts), a single server would nominally handle roughly 2.28 million users, but the realistic figure is far lower: WeChat had about 540 million users in total, and fewer still were concurrently online during the 2015 Spring Festival.
3.2 Server Count
The real system used 638 servers; assuming 600 are active for the experiment.
3.3 Per‑Machine Load
Each server should support about 900 k users (540 million / 600).
3.4 Peak QPS per Machine
The original peak was 14 million QPS across all servers, i.e., ~23 k QPS per server; the author targets at least 30 k and up to 60 k QPS.
3.5 Red‑Packet Distribution Rate
The system should be able to issue about 83 packets per second per server (50 k total / 600).
4. Summary of Requirements
Support at least 1 million concurrent connections.
Handle a minimum of 30 k QPS, with targets of 30 k, 40 k and 60 k.
Shake‑red‑packet: process 83 successful shakes per second (≈2.3 k shake requests per second, the rest are failures).
Send‑red‑packet: support 200 packets per second distribution.
5. Differences from a Real System
| Aspect | Real Service | Prototype |
|---|---|---|
| Business complexity | Higher | Very simple |
| Protocol | Protobuf + encryption | Simple custom protocol |
| Payment | Complex | None |
| Logging | Complex | None |
| Performance requirements | Higher | None |
| User distribution | Hash-based across servers | Sequential IDs, easier optimization |
| Security controls | Complex | None |
| Hot update & version control | Complex | None |
| Monitoring | Fine-grained | Simple |
6. Basic Software and Hardware
6.1 Software
Golang 1.8r3, shell, Python. The prototype uses Go because the initial version already met the requirements despite some limitations.
Server OS: Ubuntu 12.04.
Client OS: Debian 5.0.
6.2 Hardware
Server: Dell R2950, 8‑core CPU, 16 GB RAM (non‑dedicated, shared with other workloads).
Client: ESXi 5.0 VM, 4 cores, 5 GB RAM, 17 instances, each establishing 60 k connections to achieve 1 million simulated clients.
7. Technical Analysis and Implementation
7.1 Achieving 1 Million Connections on One Machine
The author had already built a prototype capable of handling a million connections; modern servers can easily support this.
Source code and documentation:
https://github.com/xiaojiaqi/C1000kPracticeGuide
7.2 Reaching 30 k QPS
Client‑side QPS
With 1 million connections, 30 k QPS means each connection sends a shake request roughly every 33 seconds. The client coordination is achieved by synchronising time via NTP and using a simple modulo algorithm to decide which users send a request in a given second.
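The article names the technique (NTP time sync plus a modulo over user IDs) but not the code, so the following is a minimal sketch of how that scheduling could look. The function name and the slot count of 33 (derived from 1,000,000 users / 30 k QPS ≈ one shake per user every 33 s) are illustrative assumptions.

```go
package main

import "fmt"

// slots spreads users across 33 one-second slots, since at 30k QPS each
// of the 1M simulated users should fire roughly once every 33 seconds.
// (Illustrative value; the article does not show the client code.)
const slots = 33

// shouldShake reports whether the user with the given ID should send a
// shake request during the given (NTP-synchronised) epoch second.
func shouldShake(userID, epochSecond int64) bool {
	return userID%slots == epochSecond%slots
}

func main() {
	// In any given second, roughly 1/33 of all users fire.
	count := 0
	for uid := int64(0); uid < 1000000; uid++ {
		if shouldShake(uid, 42) {
			count++
		}
	}
	fmt.Println(count) // 30303 users fire in this second, ≈30k QPS
}
```

Because every client derives the same schedule from the wall clock, no central coordinator is needed; NTP keeps the 17 client VMs within a fraction of a second of each other.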
The number of established connections can be checked with a small ss alias (cleaned up from the original):

alias ss2='ss -ant | grep 1025 | grep EST | awk -F: "{print \$8}" | sort | uniq -c'

Server-side QPS
The server mainly processes incoming shake requests. Two additional tasks are required: a per‑second request counter and network monitoring using a Python script combined with ethtool.
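The article mentions a per-second request counter but does not show it; a minimal sketch in Go could use an atomic counter that handlers increment and a ticker goroutine that swaps to zero once per second, logging the previous value as that second's QPS. The type and method names are assumptions.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// QPSCounter is a lock-free per-second request counter: handlers call
// Inc, and a reporting goroutine calls Snapshot once per second.
type QPSCounter struct {
	n uint64
}

// Inc records one handled request.
func (c *QPSCounter) Inc() { atomic.AddUint64(&c.n, 1) }

// Snapshot returns the count accumulated since the last call and resets it.
func (c *QPSCounter) Snapshot() uint64 { return atomic.SwapUint64(&c.n, 0) }

func main() {
	var c QPSCounter
	go func() {
		for range time.Tick(time.Second) {
			fmt.Println("qps:", c.Snapshot())
		}
	}()
	for i := 0; i < 1000; i++ {
		c.Inc() // stands in for one handled shake request
	}
	time.Sleep(1100 * time.Millisecond) // let one report fire
}
```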
7.3 Shake‑Red‑Packet Business
The server continuously generates packets. When a client requests a shake, the server checks the packet queue; if a packet exists it is returned, otherwise a failure response is sent. To reduce lock contention, users are partitioned into separate buckets, and a high‑performance queue such as Disruptor could be used for further optimisation.
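The bucketed-queue idea above can be sketched with buffered channels, one per bucket, so a shake either claims a packet or fails immediately without blocking. The bucket count and types here are illustrative, not the prototype's actual code.

```go
package main

import "fmt"

// Packet is a red packet waiting to be claimed.
type Packet struct{ ID, Amount int }

// Users are partitioned into buckets to reduce contention; each bucket
// owns a buffered channel acting as its packet queue. (16 is an
// illustrative bucket count.)
const buckets = 16

var queues [buckets]chan Packet

func init() {
	for i := range queues {
		queues[i] = make(chan Packet, 1024)
	}
}

// Shake tries to claim a packet for userID from its bucket's queue.
// A non-blocking receive either hands out a packet or reports failure.
func Shake(userID int) (Packet, bool) {
	select {
	case p := <-queues[userID%buckets]:
		return p, true // a packet was available
	default:
		return Packet{}, false // queue empty: this shake fails
	}
}

func main() {
	queues[3%buckets] <- Packet{ID: 1, Amount: 100}
	if p, ok := Shake(3); ok {
		fmt.Println("got packet", p.ID)
	}
	if _, ok := Shake(4); !ok {
		fmt.Println("empty-handed")
	}
}
```

Since each bucket's channel is only drained by requests hashed to it, most shakes never contend with each other; a ring buffer such as Disruptor, as the article notes, would push this further.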
7.4 Send‑Red‑Packet Business
The server randomly creates packets, assigns them to a few users, and those users request to claim portions of the amount. Payment handling is omitted in the prototype.
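The article does not specify how the prototype splits a packet's amount among claimants; the sketch below uses the widely described "double average" scheme (each draw uniform in [1, 2 × remaining/left]) as one plausible choice, guaranteeing every share is at least 1 cent and the shares sum exactly to the total.

```go
package main

import (
	"fmt"
	"math/rand"
)

// splitPacket divides total (in cents) among n claimants.
// Invariant: each intermediate draw leaves at least 1 cent per
// remaining claimant, so the last share is always >= 1.
func splitPacket(total, n int) []int {
	shares := make([]int, n)
	remaining := total
	for i := 0; i < n-1; i++ {
		left := n - i
		max := 2*remaining/left - 1 // "double average" upper bound
		if max < 1 {
			max = 1
		}
		s := rand.Intn(max) + 1 // uniform in [1, max]
		shares[i] = s
		remaining -= s
	}
	shares[n-1] = remaining // last claimant takes the rest
	return shares
}

func main() {
	shares := splitPacket(10000, 10) // 100.00 yuan among 10 users
	sum := 0
	for _, s := range shares {
		sum += s
	}
	fmt.Println(shares, sum) // shares always sum to 10000
}
```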
7.5 Monitoring
A lightweight monitoring module (borrowed from the fakewechat project) aggregates per‑client counters and logs them for later analysis. In production, a time‑series database like OpenTSDB would be used.
8. Code Implementation and Analysis
The prototype uses Go goroutines. Managing >1 million connections would otherwise require millions of goroutines; the design groups connections into multiple independent SET objects, each handling a few thousand connections, thereby reducing the total goroutine count dramatically.
Each SET has its own receive queue and a single worker goroutine that processes three types of messages: client shake requests, other client messages (e.g., chat), and server responses. The shake‑request handling checks the SET’s packet queue and returns either a packet or a failure message.
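The SET design described above can be sketched as follows: one buffered inbox per SET, a single worker goroutine draining it, and a switch over the three message kinds the article lists. Field and type names are illustrative, since the article describes the design rather than the code.

```go
package main

import "fmt"

// MsgKind mirrors the three cases a SET worker distinguishes.
type MsgKind int

const (
	Shake MsgKind = iota // client shake request
	ClientMsg            // other client traffic, e.g. chat
	ServerResp           // server response to forward
)

type Msg struct {
	Kind   MsgKind
	ConnID int
}

// A Set groups a few thousand connections behind one receive queue and a
// single worker goroutine, so a million connections need only a few
// hundred workers instead of a goroutine per connection.
type Set struct {
	inbox   chan Msg
	packets chan int // queued red-packet IDs for this SET
	done    chan struct{}
}

func NewSet() *Set {
	s := &Set{
		inbox:   make(chan Msg, 4096),
		packets: make(chan int, 1024),
		done:    make(chan struct{}),
	}
	go s.worker()
	return s
}

func (s *Set) worker() {
	defer close(s.done)
	for m := range s.inbox {
		switch m.Kind {
		case Shake:
			select {
			case id := <-s.packets:
				fmt.Printf("conn %d wins packet %d\n", m.ConnID, id)
			default:
				fmt.Printf("conn %d: no packet\n", m.ConnID)
			}
		case ClientMsg, ServerResp:
			// forward chat traffic / responses; elided in this sketch
		}
	}
}

func main() {
	s := NewSet()
	s.packets <- 7
	s.inbox <- Msg{Kind: Shake, ConnID: 1}
	s.inbox <- Msg{Kind: Shake, ConnID: 2}
	close(s.inbox)
	<-s.done
}
```

Because a single worker owns each SET's state, the packet queue needs no locking within a SET; contention only exists between SETs, which share nothing.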
9. Practice
The experiment proceeds in three phases:
1. Start the server and monitoring, then launch 17 client VMs to establish 1 million connections. Verify connection counts with ss.
2. Set client QPS to 30 k via an HTTP interface; observe stable QPS and packet distribution at 200 packets/s.
3. Increase client QPS to 60 k and repeat the packet generation and consumption steps.
10. Data Analysis
Client‑side QPS over time shows three intervals (connection ramp‑up, 30 k QPS, 60 k QPS). Fluctuations are caused by goroutine scheduling, network latency, and occasional packet loss.
Server‑side QPS mirrors the client graph, with a noticeable dip around 22:57 due to code‑level bottlenecks.
Combined graphs confirm the system behaves as expected.
Red‑packet generation and consumption remain stable throughout the test.
11. Conclusion
The prototype successfully simulates a system that supports one million users and sustains at least 30 k QPS (up to 60 k QPS) for shake‑red‑packet and send‑red‑packet operations. With 600 such servers, the entire 100 billion request workload could be completed in roughly seven minutes, demonstrating that the design goals are achievable.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"