
Design and Optimization of a High‑Concurrency Lottery System

This article details how a large‑scale lottery service was architected and tuned for extreme traffic spikes by applying server‑side rate limiting, application‑level throttling, behavior‑based filtering, caching strategies, database optimizations, and hardware upgrades, resulting in a ten‑fold performance improvement.

Wukong Talks Architecture

The author introduces a typical high‑concurrency scenario—an online lottery during a major promotion where daily unique visitors exceed one million—and explains the need for both traffic shaping and performance tuning.

Server‑side rate limiting is achieved with an A10 hardware load balancer enforcing per‑IP request caps (200 requests per minute) and by lowering Tomcat's maxThreads from 500 to 400; concurrency is then limited further, to 350, via a Semaphore in the application.
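The Semaphore layer described here can be sketched as a fast-fail gate in front of the draw logic. Class and method names below are illustrative, not from the original article:

```java
import java.util.concurrent.Semaphore;

// Fast-fail throttle: a Semaphore capped below Tomcat's worker-thread
// limit, so excess requests are rejected immediately instead of queuing.
public class LotteryThrottle {
    // 350 permits, matching the cap described in the article
    private final Semaphore permits = new Semaphore(350);

    public String draw() {
        // tryAcquire() returns false at once when no permit is free,
        // so overloaded callers get a cheap, friendly rejection
        if (!permits.tryAcquire()) {
            return "Sorry, no prize this time";
        }
        try {
            return doDraw(); // the real (expensive) lottery logic
        } finally {
            permits.release();
        }
    }

    private String doDraw() {
        return "You won!"; // placeholder for the DB-backed draw
    }
}
```

The key design point is `tryAcquire()` rather than `acquire()`: rejected requests cost almost nothing, which is what keeps the server responsive during a spike.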

Application‑level throttling includes:

Using a Java Semaphore (set to 350 permits) to quickly reject excess requests with a friendly “no prize” response.

Implementing real‑time human‑machine detection based on click patterns, IP, User‑Agent, and device identifiers to filter scripted or bot traffic.

Maintaining a risk‑based blacklist for known abusive accounts.

Additional rule‑based limits such as activity‑specific caps.
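The blacklist and rule-based limits above can be combined into a single admission check. This is a minimal sketch under assumed names; real bot detection (User-Agent and device fingerprints, click-pattern analysis) would feed the blacklist asynchronously:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Rule-based filtering layer: a risk blacklist plus a per-IP counter.
public class RequestFilter {
    private final Set<String> blacklist = ConcurrentHashMap.newKeySet();
    private final Map<String, Integer> perIpCount = new ConcurrentHashMap<>();
    private static final int MAX_PER_IP = 200; // per-minute cap from the article

    public void ban(String userId) {
        blacklist.add(userId);
    }

    // Returns true if the request may proceed to the draw logic.
    // A production version would reset counters on a one-minute window.
    public boolean allow(String userId, String ip) {
        if (blacklist.contains(userId)) {
            return false;
        }
        // merge() atomically increments the per-IP counter
        int n = perIpCount.merge(ip, 1, Integer::sum);
        return n <= MAX_PER_IP;
    }
}
```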

Performance optimization focuses on reducing database pressure:

Two‑tier caching: a distributed cache (Ycache, a Memcached derivative) for large user‑related data and a local cache (EhCache or a custom `ConcurrentHashMap`) for static hot data.

Avoiding heavyweight transactions; instead using optimistic locking with version numbers (`update award set award_num=award_num-1 where id=#{id} and version=#{version} and award_num>0`) and unique indexes to guarantee single‑winner integrity.
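The guarded UPDATE above is essentially a compare-and-set: the row changes only if the version and remaining stock still match what the reader saw (and a successful update would also bump the version column). The same semantics can be shown with an in-memory analogue; this sketch stands in for the SQL, it is not the article's code:

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory analogue of optimistic locking on an award-stock row.
public class AwardStock {
    // immutable snapshot of (version, remaining awards), like one DB row
    private record Row(int version, int awardNum) {}

    private final AtomicReference<Row> row;

    public AwardStock(int awardNum) {
        this.row = new AtomicReference<>(new Row(0, awardNum));
    }

    // Equivalent of the guarded UPDATE: decrement stock and bump the
    // version, but only if nothing changed since we read the row.
    public boolean tryDecrement() {
        Row seen = row.get();                   // SELECT version, award_num
        if (seen.awardNum() <= 0) {
            return false;                       // the award_num > 0 guard
        }
        Row next = new Row(seen.version() + 1, seen.awardNum() - 1);
        // compareAndSet fails if another thread won the race
        return row.compareAndSet(seen, next);
    }

    public int remaining() {
        return row.get().awardNum();
    }
}
```

Losing the race simply means the caller retries or reports "no prize", which is far cheaper than holding a row lock inside a transaction.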

Database and hardware bottlenecks were uncovered through JMeter load tests and VisualVM profiling, which revealed connection‑pool limits and slow I/O on an aging mechanical disk. Raising the DB connection pool to 100 and swapping the disk for an SSD cut average response time from over 600 ms to 136 ms at 441 concurrent requests, comfortably handling the projected 190,000 requests per minute.
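A quick Little's-law check confirms these numbers are self-consistent (this arithmetic is mine, not from the article): sustained throughput is roughly concurrency divided by latency, so 441 in-flight requests at 136 ms each yields about 3,243 requests per second, or ~194,000 per minute, just above the projected 190,000.

```java
// Little's-law sanity check on the reported load-test figures.
public class CapacityCheck {
    // throughput per minute = (concurrency / latency in seconds) * 60
    public static long perMinuteThroughput(int concurrency, double latencySeconds) {
        return Math.round(concurrency / latencySeconds * 60);
    }

    public static void main(String[] args) {
        // 441 concurrent requests at 136 ms average latency
        System.out.println(perMinuteThroughput(441, 0.136)); // prints 194559
    }
}
```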

The article concludes with additional ideas such as message‑queue buffering, asynchronous RPC calls, read‑write splitting, activity‑specific databases, in‑memory databases, and future hardware upgrades, followed by reflections on the importance of holistic performance engineering.
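The message-queue buffering idea can be sketched with a bounded in-process queue; a real deployment would use an external broker, and the class below is only an assumed shape for the technique:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Buffer between the web tier and the database: requests that pass the
// throttle are enqueued, and a worker drains them at a rate the DB can
// sustain.
public class DrawBuffer {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1000);

    // Producer side: offer() fails fast when the buffer is full, so
    // overload degrades to "try again later" instead of piling up.
    public boolean submit(String userId) {
        return queue.offer(userId);
    }

    // Consumer side: poll() returns null immediately when the buffer
    // is empty; a worker thread would loop on this.
    public String pollNext() {
        return queue.poll();
    }
}
```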

Tags: performance optimization, backend architecture, load balancing, high concurrency, Semaphore, rate limiting, database tuning
Written by

Wukong Talks Architecture

Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.
