Design and Performance Optimization of a High‑Concurrency Volunteer Registration System
This article recounts the end-to-end design, bottleneck analysis, and iterative performance tuning (covering MySQL, Redis, RocketMQ, compression, sharding, and connection-pool adjustments) that enabled a volunteer registration platform to meet demanding high-concurrency and data-accuracy requirements under limited resources.
In the preface, the author explains that after reading many high‑concurrency articles, they decided to share the practical experience of a volunteer registration system project that faced strict performance and data‑accuracy demands.
The project required handling up to 40,000 QPS for various student‑side queries and 4,000 QPS for teacher‑side operations, with strict latency, fault‑recovery, and data‑integrity constraints, all on a limited number of physical machines.
Initial analysis identified the main challenges as achieving the required concurrency while guaranteeing precise data without loss, and the need for data masking and anti‑tampering.
Benchmarking showed a single‑node MySQL could only deliver about 5k TPS and 12k QPS, insufficient for the target, leading to the decision to avoid MySQL‑only solutions and consider alternative architectures.
Redis was then evaluated; a single‑node Redis achieved 100k QPS for GET and 80k TPS for SET, but its volatility required a high‑availability design.
The chosen architecture combined Redis for fast reads/writes, RocketMQ for asynchronous persistence, and MySQL for durable storage, with the workflow: Redis transaction → RocketMQ synchronous disk write → Redis commit → MySQL asynchronous insert.
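The ordering of that write path can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: a dict stands in for Redis, a list stands in for a RocketMQ topic with synchronous flush, and SQLite stands in for MySQL. The class name `WritePath`, the table `registration`, and its columns are hypothetical and do not come from the original system.

```python
import json
import sqlite3


class WritePath:
    """Toy model: stage change -> durable log -> cache commit -> async DB insert."""

    def __init__(self):
        self.cache = {}        # stand-in for Redis (assumption)
        self.durable_log = []  # stand-in for a synchronously flushed RocketMQ topic
        self.db = sqlite3.connect(":memory:")  # stand-in for MySQL
        self.db.execute(
            "CREATE TABLE registration (student_id TEXT PRIMARY KEY, choices TEXT)"
        )

    def submit(self, student_id, choices):
        """Synchronous part of a request; the database is not touched here."""
        staged = json.dumps(choices)                   # 1. stage the change
        self.durable_log.append((student_id, staged))  # 2. persist the message first
        self.cache[student_id] = staged                # 3. then commit to the cache
        return True

    def drain(self):
        """Asynchronous consumer: replays logged messages into the database."""
        applied = 0
        while self.durable_log:
            student_id, staged = self.durable_log.pop(0)
            # An upsert keeps the consumer idempotent if a message is redelivered.
            self.db.execute(
                "INSERT OR REPLACE INTO registration (student_id, choices) "
                "VALUES (?, ?)",
                (student_id, staged),
            )
            applied += 1
        self.db.commit()
        return applied
```

The point of the ordering is that the cache is only updated once the message is durable, so a crash between steps leaves a message that can be replayed rather than a cache entry with no path to the database.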
Fault‑recovery strategies were added: using Redis transactions, synchronously persisting RocketMQ messages, and a timestamp‑based reconciliation task that periodically aligns Redis and MySQL data and alerts on prolonged inconsistencies.
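A timestamp-based reconciliation pass might look roughly like the following. The article does not describe the actual record layout, so this sketch assumes each side exposes a mapping from record id to a last-update timestamp in seconds; the function name `find_stale` and the `grace_seconds` parameter are hypothetical.

```python
def find_stale(cache_rows, db_rows, now, grace_seconds):
    """Return ids whose cache version has been ahead of the DB for too long.

    cache_rows and db_rows map record id -> last-update timestamp (seconds).
    A record is in sync when the DB timestamp is at least the cache timestamp.
    """
    stale = []
    for key, cache_ts in cache_rows.items():
        db_ts = db_rows.get(key)
        in_sync = db_ts is not None and db_ts >= cache_ts
        # Recent writes are tolerated: the async insert may still be in flight.
        if not in_sync and now - cache_ts > grace_seconds:
            stale.append(key)
    return sorted(stale)
```

Records returned by such a pass would be candidates for re-sync, and an alert would fire only if they stay inconsistent across several runs.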
The first stress tests confirmed data consistency but reached only ~4k TPS, prompting deeper profiling. Profiling with Arthas showed that the slowest step was Redis data modification, caused by large JSON payloads saturating upstream bandwidth.
To reduce bandwidth usage, the JSON payloads were compressed with GZIP before being stored in Redis, and oversized strings were split across multiple keys.
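The compress-then-split idea can be sketched as follows. A plain dict stands in for Redis, and the key layout (`<key>:<index>` for chunks and `<key>:n` for the chunk count) is an assumption made for illustration, not the scheme used in the original system.

```python
import gzip
import json


def pack(key, obj, chunk_size, store):
    """Serialize obj as compact JSON, gzip it, and store it in fixed-size chunks."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    blob = gzip.compress(raw)
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]
    for index, chunk in enumerate(chunks):
        store[f"{key}:{index}"] = chunk
    store[f"{key}:n"] = len(chunks)  # readers need the count to fetch every chunk
    return len(chunks)


def unpack(key, store):
    """Reassemble the chunks in order, decompress, and parse the JSON."""
    count = store[f"{key}:n"]
    blob = b"".join(store[f"{key}:{i}"] for i in range(count))
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

Compression trades CPU time for bandwidth, so it pays off mainly when payloads are large and repetitive, as JSON with many similar entries tends to be.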
Subsequent tests doubled TPS to ~8k, yet still fell short of the 20k goal. The next bottleneck was RocketMQ’s synchronous disk writes, and query interfaces also hit bandwidth limits.
Horizontal scaling was applied: multiple RocketMQ brokers and Redis hash‑slot partitioning based on the last digits of identification numbers, achieving more balanced load distribution.
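Routing by trailing digits could be as simple as the sketch below. The article does not say how many digits were used or how non-numeric characters were handled, so the two-digit default, the digit filtering, and the function name `shard_for` are all assumptions.

```python
def shard_for(id_number, shard_count, digits=2):
    """Map an identification number to a shard index using its trailing digits."""
    # Keep ASCII digits only, so a trailing check character such as "X" is ignored.
    numeric = [ch for ch in id_number if ch in "0123456789"]
    if not numeric:
        raise ValueError("id_number contains no digits")
    suffix = int("".join(numeric[-digits:]))
    return suffix % shard_count
```

With two digits and four shards, the 100 possible suffixes split evenly at 25 per shard, which is one reason trailing digits tend to balance better than leading ones that often encode a region or date.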
An unexpected performance drop was traced to Redis hash‑slot calculation on the master node, which added 20‑30% latency; moving the slot calculation to the client side restored QPS to ~20k and TPS to ~12k.
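The article does not state which slot scheme was in use. If it was the standard Redis Cluster mapping (an assumption), a client can compute the slot itself as CRC16 of the key modulo 16384 and send the command straight to the owning node. The sketch below uses Python's `binascii.crc_hqx`, which implements the same CRC-CCITT polynomial; `node_for` and its even split of slots across nodes are hypothetical.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in Redis Cluster


def key_slot(key):
    """Compute the Redis Cluster hash slot for a key, honoring {hash tags}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # only a non-empty tag is hashed
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


def node_for(key, node_count):
    """Assumed layout: the slot range is divided evenly across node_count nodes."""
    return key_slot(key) * node_count // SLOT_COUNT
```

Keys that share a hash tag, such as `{user1}:profile` and `{user1}:choices`, land in the same slot, which keeps multi-key operations on one node.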
Further tuning showed that the built-in Tomcat container limited concurrent connections to about 1k per instance. Because per-node throughput is bounded by the number of concurrent connections divided by the response time, slow responses capped TPS at around 3k per node. Bringing response times down to 100 ms or less allowed the system to approach the intended 40k QPS.
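The connection-limit reasoning follows Little's law: throughput is at most concurrent connections divided by response time. The short calculation below is a back-of-the-envelope check, not a measurement from the project; the function names are hypothetical, and the 300 ms case is included only because it is the latency at which a 1k-connection node would top out near 3k requests per second.

```python
def max_rps(connections, latency_ms):
    """Upper bound on requests per second for one node (Little's law)."""
    return connections * 1000 / latency_ms


def nodes_needed(target_rps, connections, latency_ms):
    """Minimum node count to reach target_rps, ignoring all other bottlenecks."""
    numerator = target_rps * latency_ms
    denominator = connections * 1000
    return -(-numerator // denominator)  # ceiling division on integers


print(max_rps(1000, 100))              # bound per node at 100 ms
print(round(max_rps(1000, 300)))       # bound per node at 300 ms
print(nodes_needed(40000, 1000, 100))  # nodes for 40k QPS at 100 ms
```

Under these assumptions, a node with 1k connections is bounded at 10,000 requests per second at 100 ms, so 40k QPS needs at least four such nodes before any other limit is considered.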
Additional refinements included reducing Redis lock/transaction usage, employing pipelining, and adjusting connection pools, which finally pushed TPS to ~25k and QPS to >40k for critical interfaces.
Before launch, the deployment consisted of 8+4+1+2 Java service nodes, multiple Nginx instances, 4 RocketMQ brokers, 4 Redis nodes, a primary‑secondary MySQL pair, and an Elasticsearch service, with automated failover and partition‑aware routing.
Post‑deployment, the system successfully handled the required load, recovered quickly from node failures, and validated the high‑concurrency design, while also providing valuable lessons on monitoring, middleware selection, and performance bottleneck identification.
Rare Earth Juejin Tech Community
