Optimizing a High‑Concurrency Lottery System: Caching, Queueing, Optimistic Locking, and Read/Write Splitting
The article analyzes a lottery‑service bottleneck caused by massive concurrent database reads and writes and presents a comprehensive set of backend optimization techniques—including caching, queue‑based peak‑shaving, optimistic locking, asynchronous processing, read‑write splitting, and semaphore‑based rate limiting—to improve throughput and stability under high load.
1. Project Consideration
The lottery activity sent SMS reminders to users about their draw entitlements; when users flooded onto the draw page, it quickly became unavailable. Logs showed the bottleneck was the database: severe read‑write contention caused many connections to time out. Monitoring revealed the lottery microservice's QPS surged 12× and the DB's QPS rose 10×, a classic high‑concurrency I/O bottleneck.
2. Optimization Ideas
Based on senior engineers' experience and online references, the main measures are downgrade, rate limiting, caching, and message queues, with the principle of minimizing direct DB exposure by handling most requests at the service layer.
3. Optimization Details
1. Lottery Detail Page
a. Enable online caching
Although cache logic existed, the switch was not turned on. Enabling it reduces DB concurrent I/O pressure and lock contention.
b. Local cache eviction strategy
Instead of clearing the entire cache when it reaches its size limit, use an eviction algorithm such as LRU, LFU, or NRU. Example Guava cache configuration:
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Initial capacity 10, hard cap 100; once the cap is reached, excess
// entries are evicted by Guava's size-based (approximately LRU) policy.
Cache<String, Object> cache = CacheBuilder.newBuilder()
        .initialCapacity(10)
        .maximumSize(100)
        .build();
2. Lottery Logic
a. Queue‑based peak shaving
Introduce a single‑process queue; incoming draw requests are enqueued and processed one by one, eliminating the QPS spike. When the queue length exceeds a threshold (e.g., 1000 for 100 prizes), further requests are immediately returned as “no win”. Tair can record the queue length, and when prizes are exhausted the queue is cleared.
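The enqueue-or-reject behavior above can be sketched with a bounded `BlockingQueue`; the class and field names here are illustrative, not from the original system:

```java
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of queue-based peak shaving: requests beyond the threshold are
// rejected immediately as "no win" instead of piling onto the database.
public class DrawQueue {
    static final int MAX_QUEUE_LENGTH = 1000; // e.g., 10x the 100 available prizes
    private final LinkedBlockingQueue<Long> queue =
            new LinkedBlockingQueue<>(MAX_QUEUE_LENGTH);

    /** Returns true if enqueued; false means respond "no win" at once. */
    public boolean tryEnqueue(long userId) {
        return queue.offer(userId); // non-blocking; fails fast when full
    }

    /** A single consumer drains requests one by one, flattening the QPS spike. */
    public Long takeNext() throws InterruptedException {
        return queue.take();
    }
}
```

A single consumer thread calling `takeNext()` in a loop serializes the draw logic, which is what removes the spike the article describes.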
b. Replace pessimistic row locks with optimistic locks
The original code used `FOR UPDATE` pessimistic locking, causing many threads to wait indefinitely and exhaust DB connections. Switching to optimistic locking with a version field allows concurrent updates; only requests with a matching version succeed, others receive a failure response.
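The version-field idea can be shown with a minimal in-memory sketch. In the real system this is a conditional `UPDATE ... WHERE version = ?` against the prize table; the class below only mirrors that check-and-bump semantics and its names are invented for illustration:

```java
// In-memory sketch of optimistic locking with a version field.
// Mirrors the conditional UPDATE:
//   UPDATE prize SET stock = stock - 1, version = version + 1
//   WHERE id = ? AND version = ? AND stock > 0
public class OptimisticStock {
    private int stock;
    private long version = 0;

    public OptimisticStock(int initialStock) { this.stock = initialStock; }

    public synchronized long currentVersion() { return version; }

    /**
     * Only the request whose remembered version still matches succeeds;
     * everyone else gets an immediate failure instead of blocking on a
     * FOR UPDATE row lock and holding a DB connection.
     */
    public synchronized boolean tryDecrement(long expectedVersion) {
        if (version != expectedVersion || stock <= 0) {
            return false; // stale version or sold out => failure response
        }
        stock--;
        version++;
        return true;
    }
}
```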
c. Asynchronous handling of non‑critical steps
After a successful draw, send SMS via a dedicated thread pool to improve overall request throughput.
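A minimal sketch of the hand-off, assuming a fixed-size pool and a placeholder for the SMS gateway call (both illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: SMS delivery runs on a dedicated pool so the draw request
// thread can return to the user immediately.
public class SmsNotifier {
    private final ExecutorService smsPool = Executors.newFixedThreadPool(4);

    /** Submits the SMS job and returns without waiting for delivery. */
    public Future<?> sendAsync(long userId, String message) {
        return smsPool.submit(() -> {
            // Call the SMS gateway here; failures can be logged and retried.
            System.out.println("SMS to " + userId + ": " + message);
        });
    }

    public void shutdown() { smsPool.shutdown(); }
}
```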
d. Database read‑write separation
Redirect read‑heavy queries to a replica, relieving the primary DB of read load while writes continue on the master.
e. Semaphore control per time slice
Limit the number of concurrent users entering the draw window, preventing overload and pairing with the queue for additional rate limiting.
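This admission control maps directly onto `java.util.concurrent.Semaphore`; the permit count below is an illustrative assumption, not a figure from the article:

```java
import java.util.concurrent.Semaphore;

// Sketch of semaphore-based admission control: only PERMITS users may be
// inside the draw window at once; the rest fail fast and can be degraded
// to a "no win" response.
public class DrawGate {
    private static final int PERMITS = 200; // tune to what the DB can sustain
    private final Semaphore gate = new Semaphore(PERMITS);

    /** Non-blocking entry attempt; false => degrade instead of queueing. */
    public boolean enter() {
        return gate.tryAcquire();
    }

    /** Must be called (e.g., in a finally block) when the draw completes. */
    public void leave() {
        gate.release();
    }
}
```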
f. Message‑based persistence
Enqueue data changes (e.g., in Tair) and let a scheduled task batch‑write them to the DB, drastically reducing concurrent DB writes while requiring careful consistency handling.
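The buffer-then-batch pattern can be sketched as below; the in-process queue stands in for Tair, and the one-second interval and string payload are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: buffer data changes in memory and flush them to the DB in
// batches on a schedule, turning N concurrent writes into one batch write.
public class BatchWriter {
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Record one data change instead of writing it to the DB immediately. */
    public void record(String change) { pending.add(change); }

    /** Drain the buffer and write it as one batch; returns the batch size. */
    public int flush() {
        List<String> batch = new ArrayList<>();
        String item;
        while ((item = pending.poll()) != null) batch.add(item);
        if (!batch.isEmpty()) {
            // batchInsert(batch); // one multi-row INSERT in the real system
        }
        return batch.size();
    }

    /** One batched DB write per second instead of one write per request. */
    public void start() {
        scheduler.scheduleAtFixedRate(this::flush, 1, 1, TimeUnit.SECONDS);
    }
}
```

The consistency caveat the article raises applies here: changes buffered but not yet flushed are lost if the process dies, so the real system must be able to replay or tolerate that window.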
g. Conditional rate limiting and degradation
When concurrency exceeds safe limits, treat excess requests as “no win” to preserve overall system availability.
3. Additional Considerations
a. Prevent malicious abuse
Apply per‑UID request caps at the service entry point to mitigate CC attacks and excessive QPS.
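A per-UID cap can be sketched with a concurrent counter map; the cap value and class names are illustrative, and a production version would keep the counters in Redis/Tair with a TTL so they reset per time window:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a per-UID request cap at the service entry point.
public class UidLimiter {
    private static final int MAX_REQUESTS_PER_UID = 10;
    private final ConcurrentHashMap<Long, AtomicInteger> counts =
            new ConcurrentHashMap<>();

    /** Returns false once a UID exceeds its cap (likely abusive traffic). */
    public boolean allow(long uid) {
        AtomicInteger c = counts.computeIfAbsent(uid, k -> new AtomicInteger());
        return c.incrementAndGet() <= MAX_REQUESTS_PER_UID;
    }
}
```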
b. Pre‑select winning candidates
Randomly choose a pool of potential winners (e.g., 500 out of 100 000 users for 100 prizes) and only route those candidates through the full draw logic, filtering out the majority early to reduce DB load.
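One simple way to realize this pre-selection is a uniform random sample via shuffle; this is a sketch of the idea, not the article's actual selection code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: pre-select a small candidate pool (e.g., 500 of 100,000 users
// for 100 prizes) so only candidates run the full draw logic.
public class CandidateSelector {
    /** Uniformly samples poolSize user IDs; the rest are filtered out early. */
    public static List<Long> pickCandidates(List<Long> allUserIds, int poolSize) {
        List<Long> shuffled = new ArrayList<>(allUserIds);
        Collections.shuffle(shuffled); // uniform random order
        return shuffled.subList(0, Math.min(poolSize, shuffled.size()));
    }
}
```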
4. Architecture Diagram
Source: CSDN Blog
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
