Mastering Flash Sale Systems: Redis, MQ, and DB Optimization Guide

This comprehensive guide walks you through the core challenges of high‑concurrency flash‑sale systems and presents a layered architecture with design principles, Redis caching, message‑queue integration, MySQL persistence, Lua scripting, monitoring, stress‑testing, anti‑fraud measures, and practical deliverables for a production‑ready implementation.

Ray's Galactic Tech

Overview & Core Challenges

Flash‑sale (秒杀) systems face extreme concurrent read/write traffic, inventory contention, overselling risks, anti‑fraud requirements, user‑experience expectations, and overall system stability.

Instantaneous QPS spikes (hundreds to thousands of times normal traffic)

Inventory competition leading to over‑sell or under‑sell

Cheat prevention and fairness

Clear user feedback and queuing indicators

System stability to avoid impacting the main site

Design Principles

Traffic Shaping: Intercept and dilute traffic at edge and gateway layers.

Read‑Write Separation: Reads from cache, writes via asynchronous queues.

Extreme Performance: Use Redis atomic operations, Lua scripts, and async MQ.

Business Isolation: Deploy the flash‑sale service independently from the main site.

Eventual Consistency: Allow brief inconsistency while preventing oversell and ensuring final persistence.

Layered Architecture Overview

The system consists of CDN‑served static pages, a gateway layer, a dedicated flash‑sale service, Redis cache, a message queue, and a relational database.

Key Components & Implementation Details

Frontend & Client

Static resources hosted on CDN; the flash‑sale page is fully static to reduce site load.

Dynamic API wrapper hides the real flash‑sale endpoint.

Client shows countdown, disables button after click, and polls for result.

CAPTCHA or behavioral verification for high‑risk scenarios.

Gateway Layer

Rate limiting (token‑bucket / leaky‑bucket) for the flash‑sale endpoint.

Anti‑bot measures: sliding CAPTCHA, per‑user/IP/token limits.

Circuit breaker to degrade gracefully on backend failures.
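
To make the token‑bucket option concrete, here is a minimal in‑process sketch. It is an illustration only: a real gateway would normally use its built‑in limiter, and the class name, parameters, and the injectable `now` argument (added so the behaviour can be checked without sleeping) are assumptions, not part of the original design.

```python
import time
from typing import Optional


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token and return True if the request may pass."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bucket with `capacity` equal to the allowed burst and `rate` equal to the sustained limit rejects the excess immediately instead of queueing it, which suits a flash sale where late requests have little chance of success anyway.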

Flash‑Sale Service & Cache (Redis)

Pre‑warm product inventory and flash‑sale data into Redis.

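All reads on the request path go to Redis; the request path does not touch the database during the sale.

Use DECR for a bare atomic stock decrement; use a Lua script when the decrement must be combined with deduplication and queue insertion.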

Combine stock decrement, user marking, and queue push into a single atomic operation.

Message Queue (MQ)

Asynchronous write operations via Kafka or RocketMQ.

MQ smooths traffic spikes and reliably delivers order‑creation messages.

Consumers process messages at a controlled rate to avoid DB overload.

Database & Persistence

Dedicated order table with minimal fields, sharded as needed.

Unique index on (user_id, sku_id) guarantees idempotency.

Consumers perform deduplication, idempotent retries, and dead‑letter handling.

Redis Key Design (Example)

seckill:stock:{sku_id}        -> integer (remaining stock)
seckill:users:{sku_id}        -> set (user IDs that successfully purchased)
seckill:queue:{sku_id}        -> list or stream storing order messages (reliable async processing)
seckill:delay                -> sorted set (score = expire_ts) for timeout rollback detection
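
If the deployment uses Redis Cluster, a script may only touch keys that hash to the same slot, and when a key contains `{...}` only the text inside the first pair of braces is hashed. One reading of the key names above is that the braces are meant literally, as such a hash tag. The helper below is a small sketch under that assumption; the function name is hypothetical.

```python
from typing import Dict


def seckill_keys(sku_id: int) -> Dict[str, str]:
    """Build the per-SKU keys with a shared hash tag so they land in one slot."""
    tag = f"{{{sku_id}}}"  # renders as "{1001}"; the braces are literal
    return {
        "stock": f"seckill:stock:{tag}",
        "users": f"seckill:users:{tag}",
        "queue": f"seckill:queue:{tag}",
    }
```

On a single Redis instance the braces have no special meaning, so the same key names work in both setups.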

Redis Lua Script (Atomic Check‑Decrement‑Dedup‑Queue)

-- KEYS[1] = stock key (seckill:stock:{sku})
-- KEYS[2] = users set (seckill:users:{sku})
-- KEYS[3] = queue list (seckill:queue:{sku})
-- ARGV[1] = user_id
-- ARGV[2] = order_payload (json string)

local stock = tonumber(redis.call('GET', KEYS[1]) or '-1')
if stock <= 0 then
  return {err="OUT_OF_STOCK"}
end

if redis.call('SISMEMBER', KEYS[2], ARGV[1]) == 1 then
  return {err="ALREADY_BUY"}
end

redis.call('DECR', KEYS[1])
redis.call('SADD', KEYS[2], ARGV[1])
redis.call('RPUSH', KEYS[3], ARGV[2])

return {ok="OK"}

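Explanation: The script atomically checks stock, rejects duplicate purchases, decrements inventory, records the user, and pushes the order payload to a Redis list. Because the payload is queued inside Redis, an MQ outage does not block or drop the order at this step; the message waits in the list until it is consumed or forwarded. How durable that list is depends on the Redis persistence and replication settings.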
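
Because the script reports OUT_OF_STOCK and ALREADY_BUY as error replies, a caller has to separate these expected business outcomes from real failures. The sketch below shows one way to do that. It assumes redis-py, where `register_script` returns a callable invoked as `script(keys=[...], args=[...])` and error replies are raised as `redis.exceptions.ResponseError`; to stay dependency‑free it catches `Exception` and inspects the message. The function name `try_purchase` is hypothetical.

```python
from typing import Any, Callable

BUSINESS_ERRORS = ("OUT_OF_STOCK", "ALREADY_BUY")


def try_purchase(script: Callable[..., Any], sku_id: int, user_id: int, payload: str) -> str:
    """Run the script and return 'OK', 'OUT_OF_STOCK', or 'ALREADY_BUY'."""
    keys = [
        f"seckill:stock:{sku_id}",
        f"seckill:users:{sku_id}",
        f"seckill:queue:{sku_id}",
    ]
    try:
        script(keys=keys, args=[user_id, payload])
    except Exception as exc:  # with redis-py this is redis.exceptions.ResponseError
        for code in BUSINESS_ERRORS:
            if code in str(exc):
                return code  # expected outcome, not a failure
        raise  # anything else (network, script bug) is a real error
    return "OK"
```

With redis-py the `script` argument would come from something like `redis.Redis().register_script(lua_source)`, where `lua_source` holds the script text shown above.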

Queue Consumer (Pseudo‑code, Python style)

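import json
import time

PAY_TIMEOUT_S = 15 * 60  # assumed payment window; not specified in the text

while True:
    # The Lua script RPUSHes to the tail, so pop from the head to keep FIFO order.
    item = redis.blpop(queue_key, timeout=5)
    if item is None:
        continue
    _key, raw = item  # BLPOP returns a (key, value) pair
    payload = json.loads(raw)
    # try_insert_order relies on the unique index; a duplicate-key error counts as success.
    success = try_insert_order(payload)
    if success:
        expire_ts = int(time.time()) + PAY_TIMEOUT_S
        redis.zadd("seckill:delay", {payload["order_no"]: expire_ts})
    else:
        handle_failure(payload)  # retry, then dead-letter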

Idempotency is enforced by a UNIQUE index on (user_id, sku_id); duplicate inserts are treated as successful.
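
The "duplicate insert counts as success" rule can be sketched as follows. To keep the example runnable without a database server it uses SQLite from the Python standard library as a stand‑in for MySQL, with a reduced table; the function names are hypothetical. With MySQL the same pattern would catch the driver's duplicate‑key error instead of `sqlite3.IntegrityError`.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Reduced version of seckill_order, enough to show the unique constraint."""
    conn.execute(
        """CREATE TABLE seckill_order (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               order_no TEXT NOT NULL UNIQUE,
               user_id INTEGER NOT NULL,
               sku_id INTEGER NOT NULL,
               UNIQUE (user_id, sku_id)
           )"""
    )


def insert_order_idempotent(conn: sqlite3.Connection, payload: dict) -> str:
    """Insert the order; return 'created' or 'duplicate'. Both mean success."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO seckill_order (order_no, user_id, sku_id) VALUES (?, ?, ?)",
                (payload["order_no"], payload["user_id"], payload["sku_id"]),
            )
    except sqlite3.IntegrityError:
        return "duplicate"  # redelivered message: the row already exists
    return "created"
```

Returning a distinct value for duplicates lets the consumer log redeliveries while still treating the message as handled.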

MySQL Table Design (Simplified)

CREATE TABLE seckill_order (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  order_no VARCHAR(64) NOT NULL UNIQUE,
  user_id BIGINT NOT NULL,
  sku_id BIGINT NOT NULL,
  amount DECIMAL(10,2),
  status TINYINT NOT NULL DEFAULT 0, -- 0: processing, 1: paid, 2: cancelled
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY ux_user_sku (user_id, sku_id)
) ENGINE=InnoDB;

Delayed Rollback (Unpaid Orders)

After order creation, add order_no to seckill:delay (sorted set) with score = expiration timestamp.

A scheduled task runs every second, using ZRANGEBYSCORE to fetch orders whose score (expiration timestamp) is at or before the current time.

If the order is still unpaid, increment stock with INCR seckill:stock:{sku}.

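Remove the user from seckill:users:{sku} (or keep the entry, based on business rules). Note that the unique index on (user_id, sku_id) still holds the cancelled row, so allowing a repurchase also requires the consumer to reuse or replace that row rather than insert a new one.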

Mark the order status as cancelled.

Use DB status checks to avoid duplicate rollbacks.
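
The rollback steps above can be sketched as a single sweep function. This is an illustration under assumptions: `client` is expected to behave like a redis-py client (`zrangebyscore`, `incr`, `srem`, `zrem`), and `cancel_if_unpaid` is a hypothetical callback that flips the order from processing to cancelled in the database and returns `(sku_id, user_id)`, or `None` if the order was already paid or cancelled. That DB check is what prevents duplicate rollbacks.

```python
import time
from typing import Any, Callable, Optional, Tuple


def sweep_expired(
    client: Any,
    cancel_if_unpaid: Callable[[str], Optional[Tuple[int, int]]],
    now: Optional[float] = None,
    batch: int = 100,
) -> int:
    """Roll back up to `batch` expired orders; return how many were cancelled."""
    now = time.time() if now is None else now
    cancelled = 0
    expired = client.zrangebyscore("seckill:delay", "-inf", now, start=0, num=batch)
    for raw in expired:
        order_no = raw.decode() if isinstance(raw, bytes) else raw
        result = cancel_if_unpaid(order_no)  # DB status check guards the rollback
        if result is not None:
            sku_id, user_id = result
            client.incr(f"seckill:stock:{sku_id}")  # return the unit to stock
            # Optional, depending on business rules: let the user try again.
            client.srem(f"seckill:users:{sku_id}", user_id)
            cancelled += 1
        # Paid or cancelled, the order no longer needs timeout tracking.
        client.zrem("seckill:delay", order_no)
    return cancelled
```

The entry is removed from the sorted set only after the database decision, so a crash mid‑sweep leaves it in place to be retried on the next run.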

Idempotency & Compensation Strategies

Consumer relies on the unique DB constraint; duplicate inserts are ignored.

If messages are lost during transfer, run an offline script to compare Redis and DB and compensate missing records.

For cases where stock was deducted in Redis but DB write failed and no message exists, use a compensation script or audit queue to correct the discrepancy.
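
At its core, the offline comparison is a set difference per SKU. A minimal sketch, assuming the inputs have already been fetched (for example, the members of seckill:users:{sku} on one side and the user_id values of seckill_order rows for that SKU on the other); the function name is hypothetical:

```python
from typing import Iterable, Set, Tuple


def reconcile(redis_users: Iterable[int], db_users: Iterable[int]) -> Tuple[Set[int], Set[int]]:
    """Return (missing_in_db, missing_in_redis) for one SKU."""
    in_redis, in_db = set(redis_users), set(db_users)
    # missing_in_db: stock was deducted but no order row exists (needs compensation).
    # missing_in_redis: an order row exists without a purchase mark (needs review).
    return in_redis - in_db, in_db - in_redis
```

Such a comparison is only meaningful once the queue has drained; while messages are still in flight, those users will appear as missing in the database.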

Anti‑Fraud / Bot Prevention (Advanced)

Dynamic flash‑sale path: obtain a short‑lived token (10 s) before calling the sale API.

CAPTCHA & behavioral risk control: slider, fingerprint, device ID, phone/real‑name verification for high‑value items.

Rate limiting per user/IP/token at the gateway.

Black/white lists to block known bot IPs/User‑Agents.

Risk scoring combining IP, UA, mouse trajectory, and historical behavior for tiered handling.
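
One possible way to implement the short‑lived token from the first item is an HMAC‑signed expiry bound to the user and SKU. The sketch below uses only the Python standard library; the token layout, the `SECRET` placeholder, and the function names are assumptions for illustration, and the `now` parameter exists only to make the expiry testable.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-with-a-real-secret"  # hypothetical; load from secure config
TTL_S = 10  # token lifetime from the text


def _sign(user_id: int, sku_id: int, expires: int) -> str:
    msg = f"{user_id}:{sku_id}:{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def issue_token(user_id: int, sku_id: int, now: Optional[float] = None) -> str:
    """Return '<expires>.<signature>' valid for TTL_S seconds."""
    expires = int(time.time() if now is None else now) + TTL_S
    return f"{expires}.{_sign(user_id, sku_id, expires)}"


def verify_token(token: str, user_id: int, sku_id: int, now: Optional[float] = None) -> bool:
    """Check expiry and signature; the token only works for this user and SKU."""
    try:
        expires_s, sig = token.split(".", 1)
        expires = int(expires_s)
    except ValueError:
        return False  # malformed token
    if (time.time() if now is None else now) > expires:
        return False  # expired
    expected = _sign(user_id, sku_id, expires)
    # Constant-time comparison avoids leaking signature prefixes via timing.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Because the token is self‑verifying, the gateway can reject stale or forged requests without a Redis lookup. A signed token does not prevent replay within its lifetime; that is left to the per‑user deduplication in Redis.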

Capacity Estimation (Example)

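Assume each flash‑sale request performs four Redis operations (GET, DECR, SADD, RPUSH) at a target of 100 k QPS:

100,000 × 4 = 400,000 ops/sec

The Lua script above also runs SISMEMBER, so a fully successful request costs five operations (500,000 ops/sec at the same QPS). Requests rejected at the stock check run only GET, so treat these figures as upper bounds.

Deployment recommendations:

Use Redis Cluster, sharding by SKU or hash to avoid a hotspot on a single shard.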

Reserve 1.5‑2× redundancy and adjust instance specs and shard count based on benchmark results.
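
The shard count follows from the estimate with simple arithmetic. The sketch below is illustrative only: the per‑shard throughput must come from your own benchmark, and the 100,000 ops/sec figure in the usage note is an assumed placeholder, not a measured value.

```python
import math


def shards_needed(total_ops: int, per_shard_ops: int, headroom: float) -> int:
    """Shards required to serve total_ops/sec with the given redundancy factor."""
    if per_shard_ops <= 0:
        raise ValueError("per_shard_ops must be positive")
    return math.ceil(total_ops * headroom / per_shard_ops)
```

For instance, `shards_needed(400_000, 100_000, 2.0)` returns 8 and `shards_needed(400_000, 100_000, 1.5)` returns 6, assuming a hypothetical benchmarked ceiling of 100,000 ops/sec per shard. This only helps if the load actually spreads: all keys of a single hot SKU still live on one shard.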

Monitoring Metrics & Alerts (Essential)

Redis

ops/sec, used_memory, key count, blocked_clients, latency, slowlog.

MQ

Enqueue rate, consume rate, consumer lag, pending message count.

Database

TPS, slow queries, row lock wait time, connection count, InnoDB status.

Application

Flash‑sale API QPS, success rate, p95/p99 latency, error rate.

Business

Remaining stock, successful order count, timeout rollback count.

Alert Examples

Redis ops or latency exceeds threshold.

MQ backlog exceeds threshold.

DB slow queries > N.

Success rate drops below X%.

Stress‑Test Plan (Practical)

Preparation: Simulate the full flow (token/CAPTCHA → wait → flash‑sale request → order status query).

Scenarios:

Spike: reach target concurrency (e.g., 100 k) within 1‑2 seconds.

Tail load: sustain 20 k QPS for 10 minutes.

Target Metrics: p95 request latency < 200 ms (gateway to app), with MQ consumer lag staying bounded and draining after the spike.

Tools: Locust, K6, or Gatling in distributed mode.

Note: Run tests in a pre‑production environment to avoid impacting production.
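
To check the latency target against raw samples exported from a test run, a nearest‑rank percentile is enough. This is a minimal sketch; the function names are hypothetical, and load‑testing tools usually report percentiles themselves, possibly with a slightly different interpolation method.

```python
import math
from typing import Sequence


def percentile(samples_ms: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_latency_target(samples_ms: Sequence[float], target_ms: float = 200.0) -> bool:
    """True if the 95th-percentile latency is below the target."""
    return percentile(samples_ms, 95) < target_ms
```

Evaluating the spike and tail‑load scenarios separately keeps a short burst of slow requests from being averaged away by the longer steady phase.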

Common Pitfalls & Mitigations

MQ write failure causing message loss – write first to a local Redis queue, then transfer to MQ, or use transactional/half‑message patterns.

Inventory display delay or cache penetration – pre‑warm cache, apply local rate limiting, and refresh near‑real‑time.

Duplicate rollbacks – check DB order status before rolling back.

Redis single‑node or single‑shard bottleneck – horizontal sharding, hash‑based SKU partitioning, or further split hot keys.

Deliverables (Ready to Deploy)

Complete Redis Lua scripts and consumer implementations (Go/Python) with retry and dead‑letter logic.

Stress‑test scripts (Locust/K6) that emulate the full user flow.

Visual architecture diagram (SVG/PNG): CDN → Gateway → Redis → MQ → DB.

Operations monitoring and alert list (Prometheus + Grafana rule examples).

Flash‑sale runbook: warm‑up, execution, fault handling, rollback steps.

Conclusion

This guide provides a production‑ready flash‑sale system implementation, integrating system design, engineering details, and operational safeguards.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.