How to Build a Lightning‑Fast, Stable Seckill System That Handles Millions of Requests

Learn how to design a robust, high-performance seckill (flash-sale) architecture that withstands millions of concurrent requests: decompose the business flow, define precise technical metrics, and implement a four-layer system (access, traffic shaping, business logic, and data) using CDN, APISIX, Redis, RocketMQ, and MySQL, with detailed code examples.

NiuNiu MaTe

1. From the User Perspective: Decompose the Business

The user journey from clicking "Buy Now" to receiving a success SMS can be split into six key steps. By breaking the process into fine‑grained stages, we can identify where to add throttling, asynchronous handling, or intercept invalid requests.

2. From the Technical Perspective: Define Metrics

2.1 Concurrency Capability

For a Double-11-level flash sale, the system must sustain a peak of 1,000,000 QPS, with at least 50,000 TPS on the core order path; anything less and users see sluggish, timed-out interactions.

2.2 Stability Metric

99% of requests must receive a response within 200 ms, and the recovery time (RTO) for the seckill service must be ≤ 5 minutes.

2.3 Data Consistency

Inventory must be 100 % accurate; over‑selling even one item leads to massive complaints. Payment and order status must be synchronized within 10 seconds.

2.4 Security Metric

Block more than 99 % of scripted traffic; black‑listed users must be completely denied participation.

3. Architecture Blueprint: Four‑Layer Skeleton

3.1 Access Layer – Block Garbage Traffic

Use CDN + APISIX gateway. The CDN serves static assets, while APISIX provides flexible rule-based rate limiting that can be adjusted rapidly mid-event.

CDN Acceleration

All product images, activity copy, and countdown animations are cached on Alibaba Cloud CDN. Requests are served from the nearest node (e.g., Beijing users from Beijing nodes).

Regional Routing

North‑China users are routed to North‑China gateways, East‑China users to East‑China gateways, reducing cross‑region latency.

Cache Warm‑up

One hour before the flash‑sale, pre‑heat product detail pages to all CDN nodes; after the event, expire the cache immediately to avoid stale inventory display.

3.2 Traffic‑Shaping Layer – Buffer the Surge

Seckill traffic spikes like a tsunami: 80 % of requests arrive within the first 10 seconds. This layer provides buffer + queue to transform 1 million requests in 10 seconds into 1 million requests over 10 minutes.

Message Queue Buffer

Requests are first placed into RocketMQ. Consumers process them at a controlled rate based on service capacity. Each user's place in line is tracked separately with a Redis list:

# Push user ID into the seckill queue
LPUSH seckill:queue:{productId} {userId}
# Get current queue length (the user's position)
LLEN seckill:queue:{productId}

User Queue Feedback

Calculate an estimated wait time and return a message like "You are #58 in line, estimated wait 2 minutes" to reduce anxiety‑driven refreshes.
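Turning a queue position into that message is simple arithmetic over the consumers' drain rate. A minimal sketch — the class and method names are illustrative, and in practice `ratePerSecond` would come from live consumer metrics rather than a constant:

```java
public class QueueEta {
    /**
     * Estimate wait time from queue position and consumer throughput.
     * position      - the user's place in line (from LLEN after LPUSH)
     * ratePerSecond - how many requests consumers drain per second
     */
    public static long estimateWaitSeconds(long position, long ratePerSecond) {
        if (ratePerSecond <= 0) throw new IllegalArgumentException("rate must be positive");
        // Round up so a non-empty queue never reports a zero-second wait
        return (position + ratePerSecond - 1) / ratePerSecond;
    }

    /** Builds the user-facing message shown while waiting. */
    public static String feedbackMessage(long position, long ratePerSecond) {
        long waitSec = estimateWaitSeconds(position, ratePerSecond);
        return String.format("You are #%d in line, estimated wait %d minute(s)",
                position, Math.max(1, waitSec / 60));
    }

    public static void main(String[] args) {
        System.out.println(feedbackMessage(58, 1));
    }
}
```

Returning a concrete position and ETA is cheap (one `LLEN`) and measurably reduces the refresh storms that anonymous spinners provoke.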

MQ Back‑pressure Handling

If a partition accumulates more than 100 k messages, temporarily scale the consumer group to roughly twice its normal size to drain the backlog, then scale back down.

3.3 Business Logic Layer – Core Processing

Implemented with Spring Cloud Alibaba (Nacos, Sentinel, Dubbo) to avoid piecing together components.

Three micro‑services:

Qualification Service – checks login, blacklist, previous participation.

Inventory Deduction Service – handles Redis pre‑deduction and MySQL final deduction.

Order Generation Service – creates orders and integrates with payment channels.

Caffeine Local Cache

Store user qualification, level, and recent purchase history in Caffeine to avoid repeated remote calls.

// Build the local cache once: bounded size, 5-minute expiry
Cache<String, Boolean> caffeineCache = Caffeine.newBuilder()
    .maximumSize(100_000).expireAfterWrite(Duration.ofMinutes(5)).build();
// Update Caffeine cache (user qualification)
caffeineCache.put("user:12345:product:67890:qualification", true);
// Retrieve qualification (returns null, not false, on a cache miss)
Boolean hasQualification = caffeineCache.getIfPresent("user:12345:product:67890:qualification");

3.4 Data Layer – High Read/Write Support

Use Redis Cluster for hot data (real‑time inventory, queue positions) and MySQL for cold data (orders, payments) with Sharding‑JDBC for horizontal scaling.

Redis Cluster

16 master-replica shards; each shard handles ~50 k QPS, totaling ~800 k QPS across the cluster.

MySQL Sharding

Orders and payment records are stored in 8 databases, each split into 16 tables (128 tables total) to keep a single table under 500 k rows, improving query speed five‑fold.
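With 8 databases × 16 tables, the route for any row follows from its sharding key. A hedged sketch assuming `user_id` is the sharding key and the naming scheme `order_db_N.t_order_NN` — Sharding-JDBC would express this declaratively as inline sharding expressions, but the arithmetic is the same:

```java
public class OrderShardRouter {
    static final int DB_COUNT = 8;
    static final int TABLES_PER_DB = 16;

    /** Database index 0..7: low bits of the sharding key. */
    public static int dbIndex(long userId) {
        return (int) (userId % DB_COUNT);
    }

    /** Table index 0..15 within that database. */
    public static int tableIndex(long userId) {
        return (int) ((userId / DB_COUNT) % TABLES_PER_DB);
    }

    /** Physical location, e.g. "order_db_1.t_order_07". */
    public static String route(long userId) {
        return String.format("order_db_%d.t_order_%02d", dbIndex(userId), tableIndex(userId));
    }

    public static void main(String[] args) {
        System.out.println(route(12345L));
    }
}
```

Sharding by `user_id` keeps one user's orders in one table, so "my orders" queries never fan out across shards.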

Read‑Write Separation

Writes (order creation, inventory deduction) go to the master; reads (order lookup, payment status) go to replicas, reducing master load by ~40 %.

4. Critical Details to Avoid Pitfalls

4.1 Prevent Over‑selling

Use Redis pre-deduction + MySQL optimistic lock. A Lua script atomically decrements stock in Redis; the MySQL update succeeds only when the row's version still matches.

-- Warm-up script (run once before the event): load initial stock into Redis
redis.call("HSET", KEYS[1], "available", ARGV[1])

-- Deduction script (run atomically per purchase request)
local available = redis.call("HGET", KEYS[1], "available")
if not available or tonumber(available) < tonumber(ARGV[2]) then
  return 0 -- insufficient stock
end
redis.call("HINCRBY", KEYS[1], "available", -tonumber(ARGV[2]))
return 1 -- success

MySQL optimistic‑lock update:

UPDATE seckill_stock
SET available_stock = available_stock - 1,
    version = version + 1
WHERE product_id = #{productId}
  AND available_stock >= 1
  AND version = #{version};
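Under contention the version check will fail for all but one racer, so the caller needs a bounded re-read-and-retry loop. A sketch that simulates the version-matched update in memory with an `AtomicReference` (a stand-in for the database row; the CAS plays the role of `AND version = ?`):

```java
import java.util.concurrent.atomic.AtomicReference;

public class OptimisticStock {
    /** Immutable snapshot of the seckill_stock row. */
    record Stock(int available, int version) {}

    private final AtomicReference<Stock> row;

    public OptimisticStock(int initial) {
        this.row = new AtomicReference<>(new Stock(initial, 0));
    }

    /** Mirrors the UPDATE ... WHERE available_stock >= 1 AND version = ? pattern. */
    public boolean deductOne(int maxRetries) {
        for (int i = 0; i < maxRetries; i++) {
            Stock current = row.get();                 // SELECT available_stock, version
            if (current.available() < 1) return false; // sold out: stop retrying
            Stock next = new Stock(current.available() - 1, current.version() + 1);
            if (row.compareAndSet(current, next)) return true; // version matched, 1 row updated
            // else: another request won the race; re-read and retry
        }
        return false; // too much contention; let the caller fail the purchase
    }

    public int available() { return row.get().available(); }

    public static void main(String[] args) {
        OptimisticStock stock = new OptimisticStock(1);
        System.out.println(stock.deductOne(3)); // first deduction succeeds
        System.out.println(stock.deductOne(3)); // sold out
    }
}
```

Bounding the retries matters: since Redis pre-deduction has already filtered most traffic, a handful of attempts is enough, and giving up fast beats spinning on a hot row.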

4.2 Cache Optimization

Mitigate cache penetration (lookups of nonexistent keys), breakdown (a hot key expiring under load), and avalanche (mass simultaneous expiry) using Bloom filters, empty-value caching, distributed locks, and randomized TTLs.

Bloom filter + empty‑value cache to block requests for nonexistent product IDs.

Distributed lock + never‑expire hot keys to prevent stampedes.

Randomized TTL + local Caffeine fallback to spread expiration spikes.
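The randomized-TTL part reduces to adding uniform jitter on top of a base expiry so that keys written together do not all expire in the same instant. A minimal sketch, with illustrative durations:

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

public class CacheTtl {
    /**
     * Base TTL plus a uniform random jitter in [0, maxJitter], so a batch of
     * keys warmed at the same moment spreads its expirations out over time.
     */
    public static Duration randomizedTtl(Duration base, Duration maxJitter) {
        long jitterMs = ThreadLocalRandom.current().nextLong(maxJitter.toMillis() + 1);
        return base.plusMillis(jitterMs);
    }

    public static void main(String[] args) {
        // e.g. 10 minutes base, up to 2 minutes of jitter
        System.out.println(randomizedTtl(Duration.ofMinutes(10), Duration.ofMinutes(2)));
    }
}
```

The resulting TTL is what you pass to `EXPIRE`/`SETEX`; the local Caffeine copy then absorbs the brief window while a Redis key is being refilled.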

4.3 Distributed Consistency

Absolute consistency is impossible; aim for eventual consistency using lightweight TCC transactions, transactional MQ messages, and a per‑minute reconciliation task that aligns Redis and MySQL stock.

TCC Transaction Flow

Try: validate eligibility, pre-deduct Redis stock.

Confirm: permanently deduct MySQL stock and create order.

Cancel: on timeout or failure, roll back Redis stock and remove queue entry.
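The three phases above can be sketched over in-memory counters standing in for Redis and MySQL (a production system would use a TCC framework such as Seata rather than this hand-rolled shape; all names here are illustrative). Note the guards: Confirm and Cancel are idempotent, and Cancel of an unknown order is a no-op, which covers the classic empty-rollback case:

```java
import java.util.HashSet;
import java.util.Set;

public class TccSeckill {
    private int redisStock;   // stand-in for the Redis pre-deduction counter
    private int mysqlStock;   // stand-in for the MySQL row
    private final Set<String> reserved = new HashSet<>(); // pending Try reservations

    public TccSeckill(int stock) { this.redisStock = stock; this.mysqlStock = stock; }

    /** Try: validate eligibility and pre-deduct Redis stock. */
    public boolean tryReserve(String orderId) {
        if (redisStock < 1 || reserved.contains(orderId)) return false;
        redisStock--;
        reserved.add(orderId);
        return true;
    }

    /** Confirm: permanently deduct MySQL stock for a successful reservation. */
    public boolean confirm(String orderId) {
        if (!reserved.remove(orderId)) return false; // idempotent: repeat confirm is a no-op
        mysqlStock--;
        return true;
    }

    /** Cancel: on timeout or failure, roll the Redis pre-deduction back. */
    public boolean cancel(String orderId) {
        if (!reserved.remove(orderId)) return false; // guards empty rollback / repeat cancel
        redisStock++;
        return true;
    }

    public int redisStock() { return redisStock; }
    public int mysqlStock() { return mysqlStock; }

    public static void main(String[] args) {
        TccSeckill tcc = new TccSeckill(100);
        if (tcc.tryReserve("order-1")) tcc.confirm("order-1");
        System.out.println("redis=" + tcc.redisStock() + " mysql=" + tcc.mysqlStock());
    }
}
```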

Transactional MQ (RocketMQ)

After order creation, send a transactional message; the inventory service consumes it and performs stock deduction. Failure triggers retries and eventual dead‑letter handling.
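RocketMQ achieves this with a half message that stays invisible to consumers until the local transaction commits (in the real client this is `TransactionMQProducer` plus a `TransactionListener`; broker setup is omitted here). A stripped-down simulation of just that commit/rollback handshake:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class TxMessageSim {
    private final Map<String, String> halfMessages = new HashMap<>(); // staged, invisible
    private final List<String> delivered = new ArrayList<>();          // visible to consumers

    /** Stage the half message, run the local transaction, then commit or roll back. */
    public boolean sendInTransaction(String msgId, String payload, Supplier<Boolean> localTx) {
        halfMessages.put(msgId, payload);   // step 1: half message, not yet deliverable
        boolean committed;
        try {
            committed = localTx.get();      // step 2: local transaction (e.g. create order)
        } catch (RuntimeException e) {
            committed = false;
        }
        String staged = halfMessages.remove(msgId);
        if (committed && staged != null) {
            delivered.add(staged);          // step 3a: commit => consumers can now see it
        }                                   // step 3b: rollback => message discarded
        return committed;
    }

    public List<String> delivered() { return delivered; }

    public static void main(String[] args) {
        TxMessageSim mq = new TxMessageSim();
        mq.sendInTransaction("m1", "deduct-stock:67890", () -> true);
        System.out.println(mq.delivered());
    }
}
```

The piece this simulation skips is the back-check: if the producer dies between steps 1 and 3, the real broker calls `checkLocalTransaction` to ask whether the order actually exists before deciding the half message's fate.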

Real‑time Reconciliation

Every minute compare Redis and MySQL inventory. If the discrepancy exceeds a threshold (e.g., >10), correct Redis using MySQL as the source of truth.
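The per-pass decision is a pure comparison, which makes it easy to test in isolation. A sketch with illustrative names; the scheduled job would fetch the two counters, call this, and write the correction back to Redis:

```java
public class StockReconciler {
    /** Result of one reconciliation pass. */
    public record Correction(boolean applied, int correctedRedisValue) {}

    /**
     * Compare Redis and MySQL stock; if they drift by more than `threshold`,
     * take MySQL as the source of truth and return the value Redis should hold.
     */
    public static Correction reconcile(int redisStock, int mysqlStock, int threshold) {
        int drift = Math.abs(redisStock - mysqlStock);
        if (drift > threshold) {
            return new Correction(true, mysqlStock);  // correct Redis from MySQL
        }
        return new Correction(false, redisStock);     // small drift: leave it, just log
    }

    public static void main(String[] args) {
        System.out.println(reconcile(480, 500, 10)); // drift of 20 exceeds threshold 10
    }
}
```

Tolerating small drift is deliberate: pre-deductions that are mid-flight between Redis and MySQL look like drift, and correcting them would itself cause inconsistency.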

5. Optimization Directions

Filter invalid requests early – CDN, APISIX, and application‑level checks block up to 80 % of noise.

Asynchronous processing – offload SMS, notifications, and points updates to ordinary RocketMQ queues.

Elastic scaling – Kubernetes HPA automatically expands instances during the traffic surge and shrinks afterward, saving ~40 % of server costs.
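The elastic-scaling bullet maps onto a standard HPA manifest; a hedged sketch, with the deployment name and replica bounds as illustrative values to tune against load-test results:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: seckill-order-service      # illustrative deployment name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: seckill-order-service
  minReplicas: 4                   # steady-state footprint
  maxReplicas: 40                  # ceiling for the flash-sale surge
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # scale out before pods saturate
```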

6. Design Takeaways

1. Front‑end traffic interception must be upstream. Block invalid traffic at CDN and gateway before it reaches the database.

2. Data consistency is non‑negotiable. Combine Redis atomic operations, MySQL optimistic locks, TCC, and reconciliation to guarantee no over‑selling.

3. Monitoring and disaster recovery cannot be ignored. Full‑link monitoring, chaos engineering, and degradation plans ensure resilience.

4. Continuous optimization is essential. As traffic patterns and business evolve, the architecture must be iteratively refined.

Tags: System Architecture · High Concurrency · RocketMQ · TCC · Seckill · APISIX
Written by NiuNiu MaTe

Joined Tencent (nicknamed "Goose Factory") through campus recruitment at a second‑tier university. Career path: Tencent → foreign firm → ByteDance → Tencent. Started as an interviewer at the foreign firm and hopes to help others.