How We Scaled a Lottery System to Over 1M Daily Users: Architecture & Performance Hacks
This article details the end‑to‑end architecture and step‑by‑step performance tuning of a high‑traffic lottery platform, covering server‑level rate limiting, application‑level throttling, semaphore usage, user‑behavior detection, caching strategies, database optimizations, and hardware upgrades that together enabled stable handling of millions of daily requests.
Overall Design Overview
The lottery feature experiences occasional traffic spikes, especially during major promotions, when daily unique visitors exceed one million. To handle such bursts, the system was refactored around two main strategies: rate limiting (traffic shaping) and performance optimization across the whole stack.
1. Server‑Level Rate Limiting
We use an A10 hardware load balancer (commercial alternative to Nginx) in front of Tomcat web servers. Two key configurations were applied:
CC (Challenge Collapsar) protection : limit each IP to 200 requests per minute; excess requests are rejected. This can be configured directly on A10 or via Nginx's request‑limiting module (ngx_http_limit_req_module).
Tomcat concurrency tuning : the previous setting of maxThreads=500 caused timeouts under heavy load. After load testing we reduced it to maxThreads=400 to cap the number of concurrent requests and prevent downstream timeouts.
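To make the per‑IP rule concrete, here is a minimal Java sketch of a fixed‑window counter that allows 200 requests per IP per minute. It only illustrates the idea; it is not how A10 or Nginx implement their limits, and the class name `IpRateLimiter`, its method names, and the sample IP address are assumptions made for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Fixed-window counter: at most `limit` requests per IP per window (illustrative only). */
class IpRateLimiter {
    /** Immutable snapshot of one IP's current window. */
    private static final class Window {
        final long start;  // window start time, in milliseconds
        final int count;   // requests counted in this window
        Window(long start, int count) { this.start = start; this.count = count; }
    }

    private final int limit;
    private final long windowMillis;
    private final ConcurrentHashMap<String, Window> windows = new ConcurrentHashMap<>();

    IpRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request is allowed, false if this IP is over its limit. */
    boolean allow(String ip, long nowMillis) {
        Window w = windows.compute(ip, (key, old) -> {
            if (old == null || nowMillis - old.start >= windowMillis) {
                return new Window(nowMillis, 1);          // start a fresh window
            }
            // Cap the counter at limit + 1 so it cannot overflow under attack.
            return new Window(old.start, Math.min(old.count + 1, limit + 1));
        });
        return w.count <= limit;
    }
}

class IpRateLimiterDemo {
    public static void main(String[] args) {
        IpRateLimiter limiter = new IpRateLimiter(200, 60_000);
        int accepted = 0;
        for (int i = 0; i < 250; i++) {                   // 250 requests in the same minute
            if (limiter.allow("203.0.113.7", 1_000)) accepted++;
        }
        System.out.println("accepted=" + accepted);       // prints accepted=200
    }
}
```

A fixed window is the simplest variant; it lets a client burst at a window boundary, which is why production limiters often use sliding windows or leaky buckets instead.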
2. Application‑Level Rate Limiting
At the code level we introduced two mechanisms:
Semaphore control : a Java Semaphore with 350 permits caps the number of requests that run the draw logic at once. Of the 400 Tomcat threads, that leaves 50 free to return fallback responses, so excess requests receive a quick “no prize” response instead of hanging. This improves user experience during peak seconds.
User‑behavior identification : using real‑time data (click patterns, IP, User‑Agent, device ID) we feed requests to a risk‑assessment module. Requests lacking legitimate interaction are flagged and rejected, cutting malicious traffic roughly in half.
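The semaphore control described above can be sketched as follows. This is a minimal illustration assuming a non‑blocking `tryAcquire()`; the class `DrawGate`, its method names, and the sample prize strings are hypothetical, not the original code.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Caps concurrent draws; requests over the cap get an immediate fallback. */
class DrawGate {
    private final Semaphore permits;

    DrawGate(int maxConcurrentDraws) {
        this.permits = new Semaphore(maxConcurrentDraws);
    }

    /** Runs the draw if a permit is free; otherwise returns the fallback without waiting. */
    String handle(Supplier<String> draw, String fallback) {
        if (!permits.tryAcquire()) {       // non-blocking: never queues the request
            return fallback;
        }
        try {
            return draw.get();
        } finally {
            permits.release();             // always return the permit, even on exceptions
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}

class DrawGateDemo {
    public static void main(String[] args) {
        DrawGate gate = new DrawGate(350); // 350 permits, as in the article
        System.out.println(gate.handle(() -> "prize: coupon", "no prize"));
    }
}
```

Using `tryAcquire()` rather than `acquire()` is the design choice that matters here: a blocked thread would still occupy a Tomcat worker, which is exactly what the fast fallback is meant to avoid.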
3. Application‑Level Performance Optimization
The main bottleneck was database pressure. We applied several tactics:
Caching :
Distributed cache (Ycache, a Memcached‑based component) stores large user‑related data.
Local cache (EhCache or a home‑grown ConcurrentHashMap‑based cache) holds small, rarely‑updated data such as activity rules.
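As an illustration of the local‑cache tier, the sketch below builds a small time‑to‑live cache on ConcurrentHashMap. It is an assumption about what such a home‑grown cache might look like, not the original implementation; `LocalTtlCache`, the 60‑second TTL, and the sample rule text are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Tiny in-process cache with a fixed time-to-live per entry (illustrative only). */
class LocalTtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;  // absolute expiry time, in milliseconds
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlMillis;

    LocalTtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /**
     * Returns the cached value, or calls `loader` when the entry is missing or expired.
     * The loader runs inside compute(), so it must not touch this cache itself.
     */
    V get(K key, long nowMillis, Function<K, V> loader) {
        Entry<V> e = map.compute(key, (k, old) -> {
            if (old != null && nowMillis < old.expiresAt) {
                return old;                                   // still fresh
            }
            return new Entry<>(loader.apply(k), nowMillis + ttlMillis);
        });
        return e.value;
    }
}

class LocalTtlCacheDemo {
    public static void main(String[] args) {
        LocalTtlCache<String, String> rules = new LocalTtlCache<>(60_000);
        long now = System.currentTimeMillis();
        // The loader stands in for a database or distributed-cache lookup.
        System.out.println(rules.get("activity-1", now, id -> "3 draws per day"));
        System.out.println(rules.get("activity-1", now + 1, id -> "not reloaded"));
    }
}
```

Because every application node keeps its own copy, this only suits data that tolerates being stale for up to one TTL, which matches the "small, rarely‑updated" criterion above.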
Transaction avoidance : Instead of heavyweight JDBC transactions that hold a DB connection for the entire request, we used optimistic locking (version field) and unique indexes to guarantee that only one award record is inserted per user.
UPDATE award SET award_num = award_num - 1, version = version + 1 WHERE id = #{id} AND version = #{version} AND award_num > 0;
4. Database and Hardware
Initial load tests with 50 concurrent users showed average response times >600 ms and peaks >1000 ms, mainly due to DB connection limits (30) and a mechanical hard drive. After increasing the connection pool to 100 and swapping the test server’s disk to SSD, performance improved dramatically.
Final benchmark: with 441 concurrent threads the average latency was 136 ms, which corresponds to roughly 190 k requests per minute. That falls inside the estimated peak range of 150 k‑250 k requests per minute, clearing the lower bound but not the upper one.
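The throughput figure can be cross‑checked from the other two benchmark numbers with Little's Law (throughput = concurrency / latency). The sketch below assumes a closed‑loop load test in which each thread sends its next request as soon as the previous one returns; the class and method names are hypothetical.

```java
/** Back-of-the-envelope throughput check using Little's Law. */
class ThroughputEstimate {
    /** throughput = concurrency / latency, scaled from per-second to per-minute. */
    static double requestsPerMinute(int concurrentThreads, double avgLatencyMillis) {
        double perSecond = concurrentThreads / (avgLatencyMillis / 1000.0);
        return perSecond * 60.0;
    }

    public static void main(String[] args) {
        // 441 threads at 136 ms average latency, as reported in the benchmark.
        double rpm = requestsPerMinute(441, 136.0);
        System.out.printf("%.0f requests/minute%n", rpm);  // about 194.6 k
    }
}
```

The result, about 194.6 k requests per minute, agrees with the reported ~190 k.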
5. Additional Optimization Ideas
Message queue to decouple the spin‑wheel UI from the result generation, allowing true request queuing.
Asynchronous processing for the heavy RPC call that consumed ~50 % of request time.
Read‑write splitting (discarded for this case due to consistency requirements).
Activity‑level database sharding to isolate load.
In‑memory databases for ultra‑low latency.
Hardware upgrades (SSD already proved effective; future upgrades could further raise capacity).
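Of the ideas above, asynchronous processing is the easiest to sketch. The example below starts the slow remote call on a separate pool, does the local work in the meantime, and joins the two results. It is a minimal illustration under assumptions: `AsyncDraw`, the two suppliers, and the pool size are hypothetical stand‑ins, since the article does not name the RPC involved.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Overlaps a slow remote call with local work instead of running them back to back. */
class AsyncDraw {
    /** Starts `slowRpc` on `pool`, runs `localWork` on the caller thread, then merges both. */
    static String draw(Supplier<String> slowRpc, Supplier<String> localWork, ExecutorService pool) {
        CompletableFuture<String> remote = CompletableFuture.supplyAsync(slowRpc, pool);
        String local = localWork.get();        // runs while the RPC is in flight
        return local + "|" + remote.join();    // wait only for whatever RPC time remains
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Both suppliers are placeholders for the real RPC and the real draw logic.
            String result = draw(() -> "profile-ok", () -> "prize: none", pool);
            System.out.println(result);        // prints prize: none|profile-ok
        } finally {
            pool.shutdown();
        }
    }
}
```

This variant still waits for the RPC before responding; if the RPC result is not needed in the response at all, handing it to a message queue (the first idea in the list) removes it from the request path entirely.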
6. Key Takeaways
High traffic spikes often contain a large proportion of scripted requests; behavior‑based filtering is essential to protect genuine users.
Performance tuning must consider the entire stack—from front‑end throttling to JVM settings, database configuration, and underlying hardware.
Never rely solely on code‑level optimizations; hardware bottlenecks (e.g., old HDDs) can nullify all other efforts.
