How to Simulate 10 Billion Red Packet Requests on a Single Server

This article details a practical approach to building a single‑machine prototype that supports one million concurrent connections and processes up to 60,000 QPS, simulating the massive load of 10 billion WeChat red‑packet requests with Go, Linux, and custom monitoring tools.

Programmer DD

1. Introduction

Inspired by the InfoQ presentation "How to Build a Spring Festival Bonus System" (2015), the author explores whether the ideas can be reproduced locally to handle 10 billion red‑packet requests. The goal is a single‑machine prototype that supports one million connections and a peak QPS of 60 k.

2. Background Knowledge

QPS: Queries per second.

PPS: Packets per second.

Shake‑red‑packet: Client sends a request; if a red packet is available the server returns it.

Send‑red‑packet: Server creates a packet with a certain amount, assigns it to several users, and users request to claim portions of the amount.

3. Define Goals

3.1 User Count

Based on the original system (638 servers, ~1.43 billion users), a single server would handle roughly 2.28 million users, but realistic concurrent online users are far lower (≈540 million total, ≈5.4 million active during the 2015 Spring Festival).

3.2 Server Count

The real system used 638 servers; assuming 600 are active for the experiment.

3.3 Per‑Machine Load

Each server should support about 900 k users (540 million / 600).

3.4 Peak QPS per Machine

The original peak was 14 million QPS across all servers, i.e., ~23 k QPS per server; the author targets at least 30 k and up to 60 k QPS.

3.5 Red‑Packet Distribution Rate

The system should be able to issue about 83 packets per second per server (50 k total / 600).

4. Summary of Requirements

Support at least 1 million concurrent connections.

Handle a minimum of 30 k QPS, with targets of 30 k, 40 k and 60 k.

Shake‑red‑packet: process 83 successful shakes per second (≈2.3 k shake requests per second, the rest are failures).

Send‑red‑packet: support 200 packets per second distribution.

5. Differences from a Real System

Aspect                       | Real Service               | Prototype
-----------------------------|----------------------------|------------------------------------
Business Complexity          | Higher                     | Very simple
Protocol                     | Protobuf + encryption      | Simple custom protocol
Payment                      | Complex                    | None
Logging                      | Complex                    | None
Performance                  | Higher                     | None
User Distribution            | Hash-based across servers  | Sequential IDs, easier optimization
Security Controls            | Complex                    | None
Hot Update & Version Control | Complex                    | None
Monitoring                   | Fine-grained               | Simple

6. Basic Software and Hardware

6.1 Software

Golang 1.8r3, shell, and Python. Go was chosen because the first version of the prototype already met the requirements, despite some limitations.

Server OS: Ubuntu 12.04.

Client OS: Debian 5.0.

6.2 Hardware

Server: Dell R2950, 8‑core CPU, 16 GB RAM (non‑dedicated, shared with other workloads).

[Figure: Server hardware]

CPU details:

[Figure: CPU info]

Client: ESXi 5.0 VM, 4 cores, 5 GB RAM, 17 instances, each establishing 60 k connections to achieve 1 million simulated clients.

7. Technical Analysis and Implementation

7.1 Achieving 1 Million Connections on One Machine

The author had already built a prototype capable of handling a million connections; modern servers can easily support this.

Source code and documentation:

https://github.com/xiaojiaqi/C1000kPracticeGuide

7.2 Reaching 30 k QPS

Client‑side QPS

With 1 million connections, 30 k QPS means each connection sends a shake request roughly every 33 seconds. The client coordination is achieved by synchronising time via NTP and using a simple modulo algorithm to decide which users send a request in a given second.
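A minimal sketch of that scheduling idea in Go, assuming each simulated user has a numeric ID and the clock is already NTP-synchronised; the userCount and targetQPS constants are illustrative, not taken from the original code:

    package main

    import (
        "fmt"
        "time"
    )

    const (
        userCount = 1000000 // simulated connections on this client fleet
        targetQPS = 30000   // shake requests the fleet should send per second
    )

    // usersForSecond returns the user IDs that should shake during the given
    // (NTP-synchronised) second: a plain modulo over the user ID space, so each
    // user fires roughly once every userCount/targetQPS ≈ 33 seconds.
    func usersForSecond(sec int64) []int {
        interval := int64(userCount / targetQPS) // ≈ 33
        ids := make([]int, 0, targetQPS)
        for id := 0; id < userCount; id++ {
            if int64(id)%interval == sec%interval {
                ids = append(ids, id)
            }
        }
        return ids
    }

    func main() {
        sec := time.Now().Unix()
        fmt.Printf("second %d: %d users shake this second\n", sec, len(usersForSecond(sec)))
    }

Since every client VM derives the same selection from the shared NTP-synchronised clock, the instances need no further coordination.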

The number of established connections on port 1025 can be checked with an ss alias, for example:

    alias ss2="ss -ant | grep 1025 | grep EST | awk -F: '{print \$8}' | sort | uniq -c"

Server‑side QPS

The server mainly processes incoming shake requests. Two additional tasks are required: a per‑second request counter and network monitoring using a Python script combined with ethtool.
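The per-second counter can be as small as an atomic integer that one goroutine reads and resets every second; a sketch along those lines (countRequest and the log format are assumptions, not the prototype's actual code):

    package main

    import (
        "log"
        "sync/atomic"
        "time"
    )

    var requestCount int64 // bumped once per incoming shake request

    // countRequest is called by every connection handler when a request arrives.
    func countRequest() {
        atomic.AddInt64(&requestCount, 1)
    }

    // reportQPS logs and resets the counter once per second, producing the
    // server-side QPS series used in the graphs later in the article.
    func reportQPS() {
        for range time.Tick(time.Second) {
            log.Printf("qps=%d", atomic.SwapInt64(&requestCount, 0))
        }
    }

    func main() {
        go reportQPS()
        // ... accept connections and call countRequest() for each shake request.
        select {}
    }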

[Figure: Network monitoring tool]

7.3 Shake‑Red‑Packet Business

The server continuously generates packets. When a client requests a shake, the server checks the packet queue; if a packet exists it is returned, otherwise a failure response is sent. To reduce lock contention, users are partitioned into separate buckets, and a high‑performance queue such as Disruptor could be used for further optimisation.
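In Go, that queue check maps naturally onto a buffered channel per bucket with a non-blocking receive. A sketch under that assumption (the Packet and bucket types are illustrative):

    package main

    import "fmt"

    // Packet is a red packet waiting to be claimed.
    type Packet struct {
        ID     int
        Amount int // in cents
    }

    // A bucket holds its own packet queue, so different user partitions
    // never contend on the same lock or channel.
    type bucket struct {
        packets chan Packet
    }

    // shake tries to grab a packet without blocking: success returns the
    // packet, otherwise the caller sends a failure response to the client.
    func (b *bucket) shake() (Packet, bool) {
        select {
        case p := <-b.packets:
            return p, true
        default:
            return Packet{}, false
        }
    }

    func main() {
        b := &bucket{packets: make(chan Packet, 1024)}
        b.packets <- Packet{ID: 1, Amount: 100}

        if p, ok := b.shake(); ok {
            fmt.Println("got packet", p.ID)
        } else {
            fmt.Println("no packet, send failure response")
        }
    }

The default branch of the select is what turns an empty queue into an immediate failure response instead of a blocked connection.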

7.4 Send‑Red‑Packet Business

The server randomly creates packets, assigns them to a few users, and those users request to claim portions of the amount. Payment handling is omitted in the prototype.
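Claiming portions boils down to splitting one total amount across a few users. A toy split in cents, assuming a uniform random cut and at least one cent per claimer (this is not WeChat's real allocation algorithm):

    package main

    import (
        "fmt"
        "math/rand"
    )

    // splitAmount divides total (in cents, total >= n) among n claimers,
    // guaranteeing every share is at least 1 cent and the shares sum to total.
    func splitAmount(total, n int) []int {
        shares := make([]int, n)
        remaining := total
        for i := 0; i < n-1; i++ {
            // leave at least 1 cent for each claimer still waiting
            limit := remaining - (n - 1 - i)
            share := 1 + rand.Intn(limit)
            shares[i] = share
            remaining -= share
        }
        shares[n-1] = remaining
        return shares
    }

    func main() {
        fmt.Println(splitAmount(1000, 5)) // five shares summing to 1000
    }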

7.5 Monitoring

A lightweight monitoring module (borrowed from the fakewechat project) aggregates per‑client counters and logs them for later analysis. In production, a time‑series database like OpenTSDB would be used.

[Figure: Monitoring screenshot]

8. Code Implementation and Analysis

The prototype uses Go goroutines. Managing >1 million connections would otherwise require millions of goroutines; the design groups connections into multiple independent SET objects, each handling a few thousand connections, thereby reducing the total goroutine count dramatically.

[Figure: Architecture diagram]
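A sketch of that grouping, assuming connections are assigned to sets by a hash of their connection ID; the Set and Message shapes and the set count are illustrative, not the original code:

    package main

    import "net"

    // Set owns a slice of connections and a single inbound message queue,
    // so the number of long-lived worker goroutines grows with the number
    // of sets, not with the number of connections.
    type Set struct {
        id    int
        conns map[int]net.Conn
        queue chan Message // drained by one worker goroutine (see the next sketch)
    }

    // Message is whatever the custom protocol decodes into.
    type Message struct {
        ConnID int
        Kind   int // shake request, chat, ...
        Body   []byte
    }

    const numSets = 256 // roughly 4,000 connections per set at 1 million connections

    // setFor picks the set that owns a given connection.
    func setFor(sets []*Set, connID int) *Set {
        return sets[connID%numSets]
    }

    func main() {
        sets := make([]*Set, numSets)
        for i := range sets {
            sets[i] = &Set{id: i, conns: map[int]net.Conn{}, queue: make(chan Message, 1024)}
        }
        _ = setFor(sets, 123456) // the set that will own this connection's messages
    }

At one million connections, 256 sets leave a few thousand connections per set, yet only one long-lived worker goroutine per set is needed to drain the queues.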

Each SET has its own receive queue and a single worker goroutine that processes three types of messages: client shake requests, other client messages (e.g., chat), and server responses. The shake‑request handling checks the SET’s packet queue and returns either a packet or a failure message.
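Continuing with the same assumed shapes, the single worker goroutine of a set could look roughly like this (message kinds and the response path are simplified placeholders):

    package main

    import "fmt"

    const (
        msgShake = iota // client shake request
        msgChat         // other client traffic, e.g. chat
        msgReply        // response generated by the server side
    )

    type message struct {
        connID int
        kind   int
    }

    type packet struct{ id int }

    // run is the single worker goroutine of one set: it drains the set's queue
    // and, for shake requests, pops the set's packet queue or answers failure.
    func run(queue chan message, packets chan packet, out chan string) {
        for m := range queue {
            switch m.kind {
            case msgShake:
                select {
                case p := <-packets:
                    out <- fmt.Sprintf("conn %d wins packet %d", m.connID, p.id)
                default:
                    out <- fmt.Sprintf("conn %d: no packet", m.connID)
                }
            case msgChat, msgReply:
                // forward to the peer connection / write back to the client
            }
        }
    }

    func main() {
        queue := make(chan message, 8)
        packets := make(chan packet, 8)
        out := make(chan string, 8)
        packets <- packet{id: 7}
        go run(queue, packets, out)
        queue <- message{connID: 1, kind: msgShake}
        fmt.Println(<-out)
    }

A real implementation would write the response back over the client's connection; the out channel here merely stands in for that path.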

9. Practice

The experiment proceeds in three phases:

Phase 1: Start the server and monitoring, then launch 17 client VMs to establish 1 million connections. Verify the connection counts with ss.

Phase 2: Set client QPS to 30 k via an HTTP interface (sketched after this list), observe that QPS stays stable, and confirm red packets are distributed at 200 packets/s.

Phase 3: Increase client QPS to 60 k and repeat the packet generation and consumption checks.
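The article mentions the HTTP interface only in passing; the sketch below shows one plausible shape for it, an endpoint that atomically updates the QPS target the client send loop consults (path, port, and parameter names are assumptions):

    package main

    import (
        "fmt"
        "log"
        "net/http"
        "strconv"
        "sync/atomic"
    )

    // targetQPS is read by the client send loop before every scheduling round.
    var targetQPS int64 = 30000

    func main() {
        // e.g. curl "http://client-host:8080/setqps?qps=60000"
        http.HandleFunc("/setqps", func(w http.ResponseWriter, r *http.Request) {
            qps, err := strconv.ParseInt(r.URL.Query().Get("qps"), 10, 64)
            if err != nil || qps <= 0 {
                http.Error(w, "bad qps", http.StatusBadRequest)
                return
            }
            atomic.StoreInt64(&targetQPS, qps)
            fmt.Fprintf(w, "target qps set to %d\n", qps)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }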

10. Data Analysis

Client‑side QPS over time shows three intervals (connection ramp‑up, 30 k QPS, 60 k QPS). Fluctuations are caused by goroutine scheduling, network latency, and occasional packet loss.

[Figure: Client QPS graph]

Server‑side QPS mirrors the client graph, with a noticeable dip around 22:57 due to code‑level bottlenecks.

[Figure: Server QPS graph]

Combined graphs confirm the system behaves as expected.

[Figure: Combined QPS graph]

Red‑packet generation and consumption remain stable throughout the test.

[Figure: Packet generation graph]

11. Conclusion

The prototype successfully simulates a system that supports one million users and sustains at least 30 k QPS (up to 60 k QPS) for shake‑red‑packet and send‑red‑packet operations. With 600 such servers, the entire 10 billion request workload could be completed in roughly seven minutes, demonstrating that the design goals are achievable.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: Golang, Load Testing, distributed-systems, high-performance
Written by Programmer DD, a tinkering programmer and author of "Spring Cloud Microservices in Action".