Design and Optimization of a High‑Concurrency Lottery System
This article details how a large‑scale lottery service was architected and tuned for extreme traffic spikes by applying server‑side rate limiting, application‑level throttling, behavior‑based filtering, caching strategies, database optimizations, and hardware upgrades, resulting in a ten‑fold performance improvement.
The author introduces a typical high‑concurrency scenario—an online lottery during a major promotion where daily unique visitors exceed one million—and explains the need for both traffic shaping and performance tuning.
Server‑side rate limiting is achieved with an A10 hardware load balancer enforcing per‑IP request caps (200 requests per minute), by lowering Tomcat's maxThreads from 500 to 400, and by further capping concurrent requests at 350 via a Semaphore in the application.
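For reference, a Tomcat HTTP connector with the lowered thread cap might look like this; the article does not show its server.xml, so every attribute other than maxThreads is an illustrative default:

```xml
<!-- server.xml sketch: maxThreads lowered to 400 as described above;
     the other attributes are common defaults, not taken from the article -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="400"
           acceptCount="100"
           connectionTimeout="20000" />
```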
Application‑level throttling includes:
Using a Java Semaphore (set to 350 permits) to quickly reject excess requests with a friendly “no prize” response.
Implementing real‑time human‑machine detection based on click patterns, IP, User‑Agent, and device identifiers to filter scripted or bot traffic.
Maintaining a risk‑based blacklist for known abusive accounts.
Additional rule‑based limits such as activity‑specific caps.
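The Semaphore fast-fail in the first bullet can be sketched as follows; the class and method names are illustrative, not from the article's codebase:

```java
import java.util.concurrent.Semaphore;

/** Sketch of the fast-fail throttle described above: a Semaphore bounds
 *  in-flight draws, and requests that cannot get a permit are answered
 *  immediately with a friendly "no prize" message instead of queuing. */
public class LotteryThrottle {
    private final Semaphore permits;

    public LotteryThrottle(int maxConcurrent) {    // the article uses 350
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Returns the draw outcome, or a fast-fail response when over capacity. */
    public String draw() {
        if (!permits.tryAcquire()) {               // non-blocking: reject at once
            return "Sorry, no prize this time";    // friendly rejection
        }
        try {
            return doDraw();                       // placeholder for real draw logic
        } finally {
            permits.release();                     // always return the permit
        }
    }

    private String doDraw() {
        return "prize-result";                     // stand-in result
    }
}
```

Because tryAcquire never blocks, an overloaded server sheds excess load in microseconds rather than letting requests pile up in Tomcat's queue.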
Performance optimization focuses on reducing database pressure:
Two‑tier caching: a distributed cache (Ycache, a Memcached derivative) for large user‑related data, and a local cache (EhCache or a custom ConcurrentHashMap) for static hot data.
Avoiding heavyweight transactions; instead using optimistic locking with version numbers (update award set award_num=award_num-1, version=version+1 where id=#{id} and version=#{version} and award_num>0) and unique indexes to guarantee single‑winner integrity.
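The conditional UPDATE above behaves like a compare-and-set: the row changes only if the caller's version is still current and stock remains. A minimal in-memory simulation of that semantics follows; the class and method are hypothetical, and a real system would issue the SQL through JDBC or MyBatis:

```java
/** In-memory stand-in for the optimistic-locking UPDATE described above:
 *  the write succeeds only when the caller's version still matches and
 *  award stock is positive, mirroring "0 rows affected" otherwise. */
public class AwardRow {
    int awardNum;   // remaining prize stock (award_num)
    int version;    // optimistic-lock version column

    public AwardRow(int awardNum, int version) {
        this.awardNum = awardNum;
        this.version = version;
    }

    /** Mirrors the conditional UPDATE: returns true iff a row was "affected". */
    public synchronized boolean tryDecrement(int expectedVersion) {
        if (version != expectedVersion || awardNum <= 0) {
            return false;   // stale version or stock exhausted: no rows affected
        }
        awardNum--;         // award_num = award_num - 1
        version++;          // version = version + 1
        return true;
    }
}
```

A caller that reads version 5, loses the race, and retries with the stale version simply gets false back, with no row lock held across the user interaction.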
Database and hardware bottlenecks were uncovered through JMeter load tests and VisualVM profiling, revealing connection pool limits and slow I/O on an old mechanical disk. Raising the DB connection pool to 100 and swapping to SSD reduced average response time from >600 ms to 136 ms under 441 concurrent requests, comfortably handling the projected 190 k requests per minute.
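The article does not name its connection-pool implementation; if it were HikariCP, raising the pool to 100 connections would look roughly like this (the property keys are HikariCP's real configuration names, and the values other than the pool size are illustrative guesses):

```properties
# Hypothetical HikariCP tuning: pool raised to 100 connections per the article
maximumPoolSize=100
# the remaining values are illustrative, not from the article
minimumIdle=20
connectionTimeout=3000
```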
The article concludes with additional ideas such as message‑queue buffering, asynchronous RPC calls, read‑write splitting, activity‑specific databases, in‑memory databases, and future hardware upgrades, followed by reflections on the importance of holistic performance engineering.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.