How to Build a Lightning‑Fast, Stable Seckill System That Handles Millions of Requests
Learn how to design a robust, high‑performance seckill (flash‑sale) architecture that withstands millions of concurrent requests: decompose the business flow, define precise technical metrics, and implement a four‑layer system (access, traffic shaping, business logic, and data) using CDN, APISIX, Redis, RocketMQ, and MySQL, with detailed code examples.
1. From the User Perspective: Decompose the Business
The user journey from clicking "Buy Now" to receiving a success SMS can be split into six key steps: the purchase click, qualification checks, queuing, inventory deduction, order creation, and payment with notification. By breaking the process into fine‑grained stages, we can identify where to add throttling, asynchronous handling, or invalid‑request interception.
2. From the Technical Perspective: Define Metrics
2.1 Concurrency Capability
For a Double‑11‑level flash sale, peak traffic reaches 1,000,000 QPS at the entry layer, and the core order path must sustain at least 50,000 TPS; otherwise pages become sluggish and purchases stall.
2.2 Stability Metric
99% of requests must receive a response within 200 ms, and the recovery time objective (RTO) for the seckill service must be ≤ 5 minutes.
2.3 Data Consistency
Inventory must be 100% accurate; over‑selling even one item leads to massive complaints. Payment and order status must be synchronized within 10 seconds.
2.4 Security Metric
Block more than 99% of scripted traffic; blacklisted users must be completely denied participation.
3. Architecture Blueprint: Four‑Layer Skeleton
3.1 Access Layer – Block Garbage Traffic
Use a CDN plus the APISIX gateway. The CDN handles static assets, while APISIX provides flexible, rule‑based rate limiting that can be adjusted rapidly during a flash sale.
CDN Acceleration
All product images, activity copy, and countdown animations are cached on Alibaba Cloud CDN. Requests are served from the nearest node (e.g., Beijing users from Beijing nodes).
Regional Routing
North‑China users are routed to North‑China gateways, East‑China users to East‑China gateways, reducing cross‑region latency.
Cache Warm‑up
One hour before the flash‑sale, pre‑heat product detail pages to all CDN nodes; after the event, expire the cache immediately to avoid stale inventory display.
3.2 Traffic‑Shaping Layer – Buffer the Surge
Seckill traffic spikes like a tsunami: 80% of requests arrive within the first 10 seconds. This layer provides buffering and queuing to turn 1 million requests in 10 seconds into 1 million requests over 10 minutes, smoothing a ~100,000 QPS spike into roughly 1,700 processed requests per second.
Message Queue Buffer
Requests are first placed into RocketMQ, and consumers drain them at a controlled rate matched to service capacity. Alongside the MQ, a Redis list records each user's place in line, which powers the queue feedback below:
# Push user ID into the seckill queue
LPUSH seckill:queue:{productId} {userId}
# Get current queue length (position)
LLEN seckill:queue:{productId}
User Queue Feedback
Calculate an estimated wait time and return a message like "You are #58 in line, estimated wait 2 minutes" to reduce anxiety‑driven refreshes.
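A minimal sketch of how that feedback could be produced with the Jedis client; consumeRatePerSecond is an assumed metric sampled from recent consumer throughput, not something specified in the original flow:
// Enqueue the user, then report queue position and an estimated wait.
String enqueueWithFeedback(Jedis jedis, long productId, long userId, long consumeRatePerSecond) {
    String queueKey = "seckill:queue:" + productId;
    jedis.lpush(queueKey, String.valueOf(userId));      // join the queue
    long position = jedis.llen(queueKey);               // queue length = this user's position
    long waitMinutes = Math.max(1, position / consumeRatePerSecond / 60);
    return "You are #" + position + " in line, estimated wait " + waitMinutes + " minutes";
}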
MQ Back‑pressure Handling
If a queue accumulates more than 100,000 messages, temporarily double the consumer instances to drain the backlog.
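A sketch of the paced consumer, assuming RocketMQ's Java client plus Guava's RateLimiter; the topic SECKILL_ORDER and handleOrder(...) are illustrative names, and exception handling is omitted:
// Consume at a fixed ceiling so downstream services are never overwhelmed.
DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("seckill-order-consumer");
consumer.setNamesrvAddr("127.0.0.1:9876");
consumer.subscribe("SECKILL_ORDER", "*");                        // illustrative topic
RateLimiter limiter = RateLimiter.create(5_000);                 // ~5k messages/s
consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) -> {
    for (MessageExt msg : msgs) {
        limiter.acquire();                                       // block until a permit frees up
        handleOrder(new String(msg.getBody(), StandardCharsets.UTF_8)); // hypothetical handler
    }
    return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
});
consumer.start();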
3.3 Business Logic Layer – Core Processing
Implemented with Spring Cloud Alibaba (Nacos for registry and configuration, Sentinel for flow control, Dubbo for RPC) to avoid piecing together disparate components.
Three micro‑services:
Qualification Service – checks login, blacklist, and previous participation (see the sketch after this list).
Inventory Deduction Service – handles Redis pre‑deduction and MySQL final deduction.
Order Generation Service – creates orders and integrates with payment channels.
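A minimal sketch of how the Qualification Service might chain its checks against Redis; the key names and the loggedIn flag are illustrative assumptions:
// Returns true only if the user may participate: logged in, not blacklisted,
// and has not already joined this sale. Key names are illustrative.
boolean qualify(Jedis jedis, long userId, long productId, boolean loggedIn) {
    if (!loggedIn) return false;
    if (jedis.sismember("seckill:blacklist", String.valueOf(userId))) return false;
    // SETNX-style flag: the first write wins, so repeat participation is rejected
    String joinedKey = "seckill:joined:" + productId + ":" + userId;
    return jedis.setnx(joinedKey, "1") == 1;
}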
Caffeine Local Cache
Store user qualification, level, and recent purchase history in Caffeine to avoid repeated remote calls.
// Build the local cache (com.github.benmanes.caffeine.cache.Cache)
Cache<String, Boolean> caffeineCache =
        Caffeine.newBuilder().maximumSize(100_000).expireAfterWrite(Duration.ofMinutes(5)).build();
// Update Caffeine cache (user qualification)
caffeineCache.put("user:12345:product:67890:qualification", true);
// Retrieve qualification (getIfPresent returns null on a miss)
Boolean hasQualification = caffeineCache.getIfPresent("user:12345:product:67890:qualification");
3.4 Data Layer – High Read/Write Support
Use Redis Cluster for hot data (real‑time inventory, queue positions) and MySQL for cold data (orders, payments) with Sharding‑JDBC for horizontal scaling.
Redis Cluster
3 masters + 3 replicas, 16 shards; each shard handles ~50k QPS, for ~800k QPS in total.
MySQL Sharding
Orders and payment records are spread across 8 databases, each split into 16 tables (128 tables in total), keeping any single table under 500,000 rows and improving query speed five‑fold.
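To make the routing concrete, here is an illustrative sketch of mapping an order ID onto the 8 databases × 16 tables; in practice Sharding‑JDBC derives this from configuration, and the modulo scheme below is an assumption:
// Illustrative shard routing for 8 databases × 16 tables = 128 physical tables.
String routeOrderTable(long orderId) {
    int dbIndex = (int) (orderId % 8);            // which database: order_db_0 .. order_db_7
    int tableIndex = (int) (orderId / 8 % 16);    // which table: t_order_0 .. t_order_15
    return "order_db_" + dbIndex + ".t_order_" + tableIndex;
}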
Read‑Write Separation
Writes (order creation, inventory deduction) go to the master; reads (order lookup, payment status) go to replicas, reducing master load by ~40%.
4. Critical Details to Avoid Pitfalls
4.1 Prevent Over‑selling
Use Redis pre‑deduction + a MySQL optimistic lock. The Lua script atomically decrements stock; MySQL updates only when the version matches.
-- Warm-up, executed once before the sale starts: load initial stock into Redis
redis.call("HSET", KEYS[1], "available", ARGV[1])

-- Deduction script, executed atomically (EVAL) on every purchase attempt
local available = redis.call("HGET", KEYS[1], "available")
if not available or tonumber(available) < tonumber(ARGV[2]) then
    return 0 -- insufficient stock
end
redis.call("HINCRBY", KEYS[1], "available", -tonumber(ARGV[2]))
return 1 -- success

MySQL optimistic‑lock update:
UPDATE seckill_stock
SET available_stock = available_stock - 1,
version = version + 1
WHERE product_id = #{productId}
AND available_stock >= 1
AND version = #{version};
4.2 Cache Optimization
Mitigate cache penetration, breakdown, and avalanche using Bloom filters, empty‑value caching, distributed locks, and random TTL.
Bloom filter + empty‑value cache to block requests for nonexistent product IDs (see the sketch after this list).
Distributed lock + never‑expire hot keys to prevent stampedes.
Randomized TTL + local Caffeine fallback to spread expiration spikes.
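A minimal sketch combining the Bloom filter, empty‑value caching, and randomized TTL using Guava's BloomFilter; the jedis handle, loadFromDb(...), and the "NULL" marker are illustrative assumptions:
// Guava Bloom filter over all valid product IDs (1M entries, 0.1% false positives)
BloomFilter<Long> validProducts = BloomFilter.create(Funnels.longFunnel(), 1_000_000, 0.001);
validProducts.put(67890L);  // populate from the product table at startup

String getProduct(long productId) {
    if (!validProducts.mightContain(productId)) {
        return null;                                   // definitely nonexistent: never hits Redis/MySQL
    }
    String cached = jedis.get("product:" + productId);
    if ("NULL".equals(cached)) return null;            // empty-value cache absorbs repeated misses
    if (cached != null) return cached;
    String fromDb = loadFromDb(productId);             // hypothetical DB loader
    int ttl = 300 + ThreadLocalRandom.current().nextInt(120);  // randomized TTL spreads expirations
    jedis.setex("product:" + productId, fromDb == null ? 60 : ttl, fromDb == null ? "NULL" : fromDb);
    return fromDb;
}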
4.3 Distributed Consistency
Absolute consistency is impossible; aim for eventual consistency using lightweight TCC transactions, transactional MQ messages, and a per‑minute reconciliation task that aligns Redis and MySQL stock.
TCC Transaction Flow (a minimal interface sketch follows this list)
Try : validate eligibility, pre‑deduct Redis stock.
Confirm : permanently deduct MySQL stock and create order.
Cancel : on timeout or failure, roll back Redis stock and remove queue entry.
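A minimal interface sketch of that flow; the method names are assumptions, not a specific TCC framework's API:
// Illustrative TCC contract for the seckill order flow.
public interface SeckillTccAction {
    boolean tryReserve(long userId, long productId);  // Try: check eligibility, pre-deduct Redis stock
    boolean confirm(long userId, long productId);     // Confirm: deduct MySQL stock, create the order
    boolean cancel(long userId, long productId);      // Cancel: restore Redis stock, remove queue entry
}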
Transactional MQ (RocketMQ)
After order creation, send a transactional message; the inventory service consumes it and performs stock deduction. Failure triggers retries and eventual dead‑letter handling.
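A sketch of the producer side using RocketMQ's transactional message API; the group and topic names, orderPayload, createOrder(...), and orderExists(...) are illustrative:
// The half message stays invisible to consumers until the local transaction commits.
TransactionMQProducer producer = new TransactionMQProducer("seckill-tx-producer");
producer.setTransactionListener(new TransactionListener() {
    @Override
    public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
        boolean ok = createOrder(msg);                 // hypothetical local transaction
        return ok ? LocalTransactionState.COMMIT_MESSAGE
                  : LocalTransactionState.ROLLBACK_MESSAGE;
    }
    @Override
    public LocalTransactionState checkLocalTransaction(MessageExt msg) {
        // Broker callback when commit/rollback was lost: re-check local order state.
        return orderExists(msg) ? LocalTransactionState.COMMIT_MESSAGE
                                : LocalTransactionState.ROLLBACK_MESSAGE;
    }
});
producer.start();
producer.sendMessageInTransaction(
        new Message("STOCK_DEDUCT", orderPayload), null);  // consumed by the inventory service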
Real‑time Reconciliation
Every minute compare Redis and MySQL inventory. If the discrepancy exceeds a threshold (e.g., >10), correct Redis using MySQL as the source of truth.
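A sketch of the reconciliation job, assuming Spring's @Scheduled plus a hypothetical stockMapper DAO and activeProducts set:
// Runs every minute; MySQL is the source of truth when the drift is too large.
@Scheduled(fixedRate = 60_000)
public void reconcileStock() {
    for (long productId : activeProducts) {                        // hypothetical product set
        long redisStock = Long.parseLong(
                jedis.hget("seckill:stock:" + productId, "available"));
        long dbStock = stockMapper.selectAvailableStock(productId); // hypothetical DAO call
        if (Math.abs(redisStock - dbStock) > 10) {
            jedis.hset("seckill:stock:" + productId, "available", String.valueOf(dbStock));
            log.warn("Stock drift corrected for product {}: redis={}, db={}",
                     productId, redisStock, dbStock);
        }
    }
}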
5. Optimization Directions
Filter invalid requests early – CDN, APISIX, and application‑level checks block up to 80% of noise.
Asynchronous processing – offload SMS, notifications, and points updates to ordinary RocketMQ queues.
Elastic scaling – Kubernetes HPA automatically expands instances during the traffic surge and shrinks afterward, saving ~40% of server costs.
6. Design Takeaways
1. Intercept traffic as far upstream as possible. Block invalid traffic at the CDN and gateway before it ever reaches the database.
2. Data consistency is non‑negotiable. Combine Redis atomic operations, MySQL optimistic locks, TCC, and reconciliation to guarantee no over‑selling.
3. Monitoring and disaster recovery cannot be ignored. Full‑link monitoring, chaos engineering, and degradation plans ensure resilience.
4. Continuous optimization is essential. As traffic patterns and business evolve, the architecture must be iteratively refined.
NiuNiu MaTe
Joined Tencent (nicknamed "Goose Factory") through campus recruitment from a second‑tier university. Career path: Tencent → foreign firm → ByteDance → Tencent. Began working as an interviewer at the foreign firm and hopes to help others.
