Simulating 10 Billion Red Packet Requests: A Go‑Based High‑QPS Backend Blueprint
This article details a step‑by‑step engineering effort to model and benchmark a high‑throughput red‑packet service capable of handling 10 billion requests, covering target metrics, hardware setup, Go implementation, load generation, monitoring, and performance analysis.
Introduction
This work reproduces a large‑scale “shake‑red‑packet” benchmark originally described in a 2015 article. The goal is to simulate 10 billion shake‑red‑packet requests in a local environment, validate the feasibility of the load, and extract backend design lessons.
Background
Key concepts:
QPS – queries (requests) per second.
Shake‑red‑packet – a client request for a random red packet; the server returns a packet if one is available.
Send‑red‑packet – creation of a fixed‑amount red packet assigned to a set of users; users later claim portions of the amount.
Objectives
The prototype targets the following load characteristics derived from public 2016 data:
Total active users ≈ 540 million; with 600 servers this yields ~900,000 users per server.
Peak per‑server QPS ≈ 23 k (stress‑test target up to 60 k).
Shake‑red‑packet delivery rate ≈ 83 packets / s per server (packets handed out, as distinct from the request QPS above).
Send‑red‑packet rate ≈ 200 packets / s per server.
Software & Hardware
Software
Go 1.8rc3, shell scripts, Python for auxiliary tools. Server OS: Ubuntu 12.04. Client OS: Debian 5.0.
Hardware
Server: Dell R2950, 8‑core CPU, 16 GB RAM (non‑dedicated). Client side: 17 ESXi 5.0 VMs, each with 4 cores and 5 GB RAM, establishing 60 k TCP connections per VM to simulate 1 million concurrent clients.
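The sizing figures in the objectives and hardware sections follow from simple division. The sketch below reproduces two of them; the function names are illustrative and not taken from the project.

```go
package main

import "fmt"

// perServer divides a cluster-wide total evenly across servers.
func perServer(total, servers int) int {
	return total / servers
}

// vmsNeeded returns how many client VMs are required when each VM can hold
// at most connsPerVM connections (ceiling division).
func vmsNeeded(targetConns, connsPerVM int) int {
	return (targetConns + connsPerVM - 1) / connsPerVM
}

func main() {
	fmt.Println("users per server:", perServer(540000000, 600))
	fmt.Println("client VMs needed:", vmsNeeded(1000000, 60000))
}
```

With 540 million users on 600 servers, each server carries 900,000 users, which is why a 1 million connection test is a reasonable single-server target. Sixteen VMs at 60 k connections each would fall short of 1 million, so 17 are used.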
Technical Implementation
1. One‑million connections on a single machine
Go’s goroutine model allows a single host to maintain one million TCP connections. The implementation is available at:
https://github.com/xiaojiaqi/C1000kPracticeGuide
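As a rough illustration of the goroutine-per-connection pattern (not the code from the repository above), the following sketch accepts TCP connections, gives each one a goroutine, and tracks the open count with an atomic counter. The `Server` type and `connectAndPing` helper are hypothetical names. Reaching one million real connections would also require raising OS limits such as the open file descriptor limit, which is outside this sketch.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync/atomic"
)

// Server tracks how many client connections are currently open.
type Server struct {
	active int64
}

// Active returns the current number of open connections.
func (s *Server) Active() int64 { return atomic.LoadInt64(&s.active) }

// Serve accepts connections until ln is closed, one goroutine per connection.
func (s *Server) Serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		atomic.AddInt64(&s.active, 1)
		go s.handle(conn)
	}
}

// handle echoes bytes back until the peer disconnects.
func (s *Server) handle(conn net.Conn) {
	defer atomic.AddInt64(&s.active, -1)
	defer conn.Close()
	io.Copy(conn, conn)
}

// connectAndPing starts a server on a free local port, opens n client
// connections, and waits for one echoed byte on each so that every
// connection is known to be registered. It returns the server's count
// while all clients are still connected.
func connectAndPing(n int) int64 {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	s := &Server{}
	go s.Serve(ln)

	buf := make([]byte, 1)
	for i := 0; i < n; i++ {
		c, err := net.Dial("tcp", ln.Addr().String())
		if err != nil {
			panic(err)
		}
		defer c.Close()
		if _, err := c.Write([]byte("x")); err != nil {
			panic(err)
		}
		if _, err := io.ReadFull(c, buf); err != nil {
			panic(err)
		}
	}
	return s.Active()
}

func main() {
	fmt.Println("open connections:", connectAndPing(100))
}
```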
2. Achieving 30 k QPS
Client side: All client VMs synchronize clocks via NTP. Each client decides whether to send a request in the current second by splitting users into N = total_users / target_QPS groups and applying the rule time() % N == user_id % N. With 1 million users, N = 20 yields 50 k requests per second, and N ≈ 33 yields roughly 30 k.
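The time-slot rule can be sketched in a few lines of Go. The function names below are illustrative assumptions; only the modulo rule itself comes from the text. Each user fires once per cycle of `groups` seconds, so the per-second request count is roughly the user count divided by the number of groups.

```go
package main

import "fmt"

// groupsFor returns how many time slots are needed so that totalUsers
// clients, each sending once per cycle, produce about targetQPS requests/s.
func groupsFor(totalUsers, targetQPS int) int {
	g := totalUsers / targetQPS
	if g < 1 {
		g = 1
	}
	return g
}

// shouldSend reports whether userID fires during second now (Unix time).
// userID is assumed to be non-negative.
func shouldSend(now int64, userID, groups int) bool {
	return now%int64(groups) == int64(userID%groups)
}

// sendersAt counts the users that fire during second now.
func sendersAt(now int64, totalUsers, groups int) int {
	n := 0
	for id := 0; id < totalUsers; id++ {
		if shouldSend(now, id, groups) {
			n++
		}
	}
	return n
}

func main() {
	const users = 1000000
	for _, qps := range []int{50000, 30000} {
		g := groupsFor(users, qps)
		fmt.Printf("target=%d groups=%d senders=%d\n",
			qps, g, sendersAt(1700000000, users, g))
	}
}
```

Because the rule depends only on the shared clock and the user ID, the client VMs need no coordination beyond NTP to hit the aggregate rate.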
Server side: The server increments a per‑second counter for processed requests. Network traffic is monitored with a Python wrapper around ethtool. The monitoring script logs packets per second and displays them in a simple UI.
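A per-second request counter of this kind is commonly built on an atomic integer that a reporter drains once per interval. The sketch below is one such arrangement and is an assumption about the approach, not the project's code; `QPSCounter` and `simulate` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// QPSCounter counts requests; a reporter drains it once per interval.
type QPSCounter struct {
	n int64
}

// Inc records one processed request. It is safe for concurrent use.
func (c *QPSCounter) Inc() { atomic.AddInt64(&c.n, 1) }

// Drain returns the count since the previous Drain and resets it to zero.
func (c *QPSCounter) Drain() int64 { return atomic.SwapInt64(&c.n, 0) }

// simulate has workers goroutines each record perWorker requests and
// waits for all of them to finish.
func simulate(c *QPSCounter, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				c.Inc()
			}
		}()
	}
	wg.Wait()
}

func main() {
	c := &QPSCounter{}
	simulate(c, 8, 1000)
	// In a server, a time.Ticker firing every second would call Drain
	// and log or publish the value.
	fmt.Println("requests this interval:", c.Drain())
}
```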
3. Shake‑red‑packet logic
The server continuously generates red packets at a fixed rate and stores them in buckets, with users partitioned across the buckets. When a client request arrives, the server checks that user's bucket; if a packet exists it is returned, otherwise a failure response is sent. Bucket partitioning reduces lock contention. For higher throughput, a Disruptor‑style ring buffer could replace the simple queue.
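A minimal sketch of the bucket idea, assuming users map to buckets by `userID % bucketCount` and each bucket is a mutex-guarded FIFO slice. The type and method names are hypothetical, and the real project may partition differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Packet is a red packet with an amount in cents.
type Packet struct {
	Amount int
}

// bucket is one shard: its own lock and its own queue of packets.
type bucket struct {
	mu      sync.Mutex
	packets []Packet
}

// Buckets shards packets so that users in different shards never
// contend on the same lock.
type Buckets struct {
	shards []bucket
}

// NewBuckets creates n empty shards.
func NewBuckets(n int) *Buckets {
	return &Buckets{shards: make([]bucket, n)}
}

// Put adds a packet to shard i; a generator would call this at a fixed rate.
func (b *Buckets) Put(i int, p Packet) {
	s := &b.shards[i%len(b.shards)]
	s.mu.Lock()
	s.packets = append(s.packets, p)
	s.mu.Unlock()
}

// Shake returns the oldest packet in userID's shard, or false if the
// shard is empty. userID is assumed to be non-negative.
func (b *Buckets) Shake(userID int) (Packet, bool) {
	s := &b.shards[userID%len(b.shards)]
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.packets) == 0 {
		return Packet{}, false
	}
	p := s.packets[0]
	s.packets = s.packets[1:]
	return p, true
}

func main() {
	b := NewBuckets(4)
	b.Put(1, Packet{Amount: 100})
	fmt.Println(b.Shake(5)) // user 5 maps to shard 1, which holds a packet
	fmt.Println(b.Shake(5)) // shard 1 is now empty
}
```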
4. Send‑red‑packet logic
Red packets are created with random amounts and assigned to a small set of users. Clients request to claim a portion of the amount. The prototype omits payment processing and encryption.
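The text does not specify how a claimed portion is chosen, so the following is only one plausible splitting rule, offered as an assumption: each claim takes a uniformly random amount while reserving at least one cent for every remaining share, and the last claim takes whatever is left. `RedPacket`, `Claim`, and `claimAll` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
)

// RedPacket holds the unclaimed amount (in cents) and the claims left.
type RedPacket struct {
	Remaining int
	Shares    int
}

// Claim removes a random portion and returns it. It reports false once
// every share has been claimed. The packet must satisfy
// Remaining >= Shares, which each call preserves.
func (p *RedPacket) Claim(r *rand.Rand) (int, bool) {
	if p.Shares <= 0 {
		return 0, false
	}
	amt := p.Remaining // the final share takes everything that is left
	if p.Shares > 1 {
		// Reserve 1 cent for every share still to be claimed after this one.
		amt = 1 + r.Intn(p.Remaining-(p.Shares-1))
	}
	p.Remaining -= amt
	p.Shares--
	return amt, true
}

// claimAll drains a packet of total cents split into shares claims.
// It assumes total >= shares >= 1.
func claimAll(total, shares int, seed int64) []int {
	r := rand.New(rand.NewSource(seed))
	p := &RedPacket{Remaining: total, Shares: shares}
	var out []int
	for {
		amt, ok := p.Claim(r)
		if !ok {
			return out
		}
		out = append(out, amt)
	}
}

func main() {
	parts := claimAll(1000, 5, 42)
	sum := 0
	for _, a := range parts {
		sum += a
	}
	fmt.Println(parts, "sum:", sum)
}
```

Whatever rule is used, the invariant worth checking is that the claimed portions always add up to the packet's original amount.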
5. Monitoring
Monitoring reuses code from another project (https://github.com/xiaojiaqi/fakewechat). Each client and the server periodically push their counters to a central collector, which aggregates and visualises the data. Logs are persisted for offline analysis.
6. Code architecture
Connections are grouped into multiple independent SET objects, each managing a few thousand connections. Every connection keeps one goroutine for reading, while each SET owns a single worker goroutine that serves all of its connections. This keeps the total goroutine count at roughly the number of connections plus a small overhead.
Inside a SET, a worker goroutine processes three message types:
Shake‑red‑packet request.
Other client messages (e.g., chat).
Server responses.
The red‑packet generator pushes packets into each SET’s queue at a steady pace, ensuring fairness across SETs.
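The SET structure can be sketched with channels: each SET has an inbox for messages, a queue of packets, and one worker goroutine that drains the inbox. This is an illustrative reconstruction under stated assumptions, not the project's code. The type names, channel layout, and round-robin `distribute` function are hypothetical, and the goroutines that read from connections and feed the inbox are not shown.

```go
package main

import "fmt"

// MsgType identifies the three kinds of message a SET worker handles.
type MsgType int

const (
	ShakeRequest MsgType = iota
	ClientMessage
	ServerResponse
)

// Message is one unit of work for a SET worker.
type Message struct {
	Type   MsgType
	UserID int
}

// Set owns a share of the connections and a single worker goroutine.
type Set struct {
	inbox   chan Message
	packets chan int // amounts pushed by the red-packet generator
	done    chan struct{}
	handled [3]int // per-type counters, touched only by the worker
	won     int    // shake requests that received a packet
}

// NewSet creates a SET with buffered queues and starts its worker.
func NewSet(buf int) *Set {
	s := &Set{
		inbox:   make(chan Message, buf),
		packets: make(chan int, buf),
		done:    make(chan struct{}),
	}
	go s.run()
	return s
}

// run processes messages until the inbox is closed.
func (s *Set) run() {
	defer close(s.done)
	for m := range s.inbox {
		s.handled[m.Type]++
		if m.Type == ShakeRequest {
			select {
			case <-s.packets:
				s.won++
			default: // no packet available: the client gets a miss
			}
		}
	}
}

// Close stops the worker after it has drained the inbox.
func (s *Set) Close() {
	close(s.inbox)
	<-s.done
}

// distribute hands n packets of the given amount to the sets round-robin,
// so every SET receives the same share.
func distribute(sets []*Set, n, amount int) {
	for i := 0; i < n; i++ {
		sets[i%len(sets)].packets <- amount
	}
}

func main() {
	sets := []*Set{NewSet(16), NewSet(16)}
	distribute(sets, 4, 100)
	for i := 0; i < 3; i++ {
		sets[0].inbox <- Message{Type: ShakeRequest, UserID: i}
	}
	sets[0].inbox <- Message{Type: ClientMessage, UserID: 7}
	for _, s := range sets {
		s.Close()
	}
	fmt.Println("set 0 handled:", sets[0].handled, "packets won:", sets[0].won)
}
```

Because only the worker touches a SET's counters and packet queue, no lock is needed inside a SET; contention is limited to the channel operations.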
Full source code is hosted at:
https://github.com/xiaojiaqi/10billionhongbaos
Experimental Procedure
Phase 1 – Connection establishment
Start the server and monitoring service, then launch the 17 client VMs to create 1 million TCP connections. Verify connection counts with:
alias ss2='ss -ant | grep 1025 | grep EST | awk -F: "{print \$8}" | sort | uniq -c'
Phase 2 – 30 k QPS
Set client QPS to 30 k via an HTTP control interface, start a red‑packet generator at 200 packets / s, and observe that clients receive roughly 200 packets per second.
Phase 3 – 60 k QPS
Increase client QPS to 60 k, repeat the packet generation, and confirm the system continues to process the load, albeit with higher variance.
Data Analysis
Client‑side and server‑side QPS were recorded and plotted using Python and gnuplot. Three distinct regions appear: baseline, 30 k QPS, and 60 k QPS. Fluctuations are attributed to goroutine scheduling, network latency, and occasional packet loss.
Additional charts show red‑packet generation rates, per‑second client acquisition, and a Go pprof snapshot confirming acceptable GC pauses on the legacy hardware.
Conclusion
The prototype demonstrates that a single server can sustain 1 million concurrent connections and up to 60 k QPS. At that rate, 600 such servers would process 10 billion shake‑red‑packet requests in under 5 minutes (10^10 / (600 × 60,000) ≈ 278 s); at 30 k QPS per server the same load takes about 9 minutes. Differences from a production system include the absence of protobuf encryption, payment integration, sophisticated logging, and advanced monitoring, but the core scalability principles are validated.
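As a rough check on the fleet-level figure, the time to serve a fixed number of requests is the total divided by the aggregate rate. The sketch below evaluates it for the three per-server rates mentioned in this article; `secondsToServe` is an illustrative name, and the calculation assumes a perfectly even, sustained load.

```go
package main

import "fmt"

// secondsToServe returns how long a fleet needs to serve total requests
// when every server sustains qpsPerServer requests per second.
func secondsToServe(total, servers, qpsPerServer float64) float64 {
	return total / (servers * qpsPerServer)
}

func main() {
	for _, qps := range []float64{23000, 30000, 60000} {
		s := secondsToServe(1e10, 600, qps)
		fmt.Printf("%.0f QPS per server: %.0f s (%.1f min)\n", qps, s, s/60)
	}
}
```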
IT Architects Alliance
