How to Prevent Order Loss in a 100k TPS Flash Sale When the Master DB Crashes – 5 Practical Solutions
The article dissects a high‑traffic flash‑sale interview question—how to guarantee zero order loss at 100,000 TPS when the master MySQL instance fails—by explaining the underlying performance‑consistency conflict, the three skills interviewers assess, and presenting five concrete, code‑driven solutions ranging from MySQL parameter tuning to semi‑sync replication, local message tables, group replication, and Redis‑Kafka traffic shaping.
During a ByteDance interview a candidate was asked: in a 100 k TPS flash‑sale scenario, a user sees an "order successful" popup, but the master database crashes and the replica does not contain the order. The article first explains why this "lost‑order" problem occurs: (1) the conflict between ultra‑high concurrency and durable persistence (with innodb_flush_log_at_trx_commit=2, a common high‑throughput setting, MySQL writes the redo log only to the OS page cache on commit and fsyncs it roughly once per second, so a crash can lose the most recent transactions) and (2) the gap between asynchronous master‑slave replication and the requirement of zero data loss.
What interviewers really test
Fundamental principle understanding: whether the candidate knows that "success" can mean only an in‑memory state, not a persisted one, and where the replication delay lies.
Scenario‑adaptation ability: choosing between async, semi‑sync, or full‑sync replication based on TPS and business criticality.
Fallback design thinking: having a backup plan for when the primary solution (e.g., semi‑sync) times out.
Five practical solutions
1. Parameter optimization – trade a small performance hit for full durability
Modify my.cnf to enforce immediate redo‑log flush and binlog sync:
# my.cnf
[mysqld]
innodb_flush_log_at_trx_commit = 1 # flush redo log on each commit
sync_binlog = 1 # sync binlog to disk
innodb_log_buffer_size = 64M
innodb_log_file_size = 2G
innodb_log_files_in_group = 3

Advantages: simple, no code changes, and no committed transaction is lost after a crash. Drawbacks: roughly a 10‑15% throughput penalty at 100 k TPS from the per‑commit fsyncs.
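To see why the relaxed setting is risky at this scale, a back‑of‑the‑envelope sketch (the 100 k TPS figure is from the scenario; the 1‑second interval matches the once‑per‑second background flush used when innodb_flush_log_at_trx_commit is not 1):

```java
// Worst-case order loss when the redo log is fsynced once per second
// instead of on every commit. Class and method names are illustrative.
public class DurabilityWindow {
    static long worstCaseLostOrders(long tps, double flushIntervalSeconds) {
        return Math.round(tps * flushIntervalSeconds);
    }

    public static void main(String[] args) {
        System.out.println(worstCaseLostOrders(100_000, 1.0)); // prints 100000
    }
}
```

In other words, at 100 k TPS a single ill‑timed crash can put up to 100,000 "successful" orders at risk, which is why solution 1 pays the fsync cost up front.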
2. Semi‑synchronous replication (AFTER_SYNC mode)
Configure MySQL master and slave plugins and set the wait point to AFTER_SYNC with a 500 ms timeout:
# Master
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 500;
SET GLOBAL rpl_semi_sync_master_wait_point = AFTER_SYNC;
# Slave
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;

In application code, wrap order creation in a retryable method that logs a warning when semi‑sync times out and falls back to async replication.
@Slf4j
@Service
public class SeckillOrderService {
    @Autowired private OrderDAO orderDAO;

    @Retryable(maxAttempts = 3, backoff = @Backoff(delay = 100))
    public OrderResult createOrder(OrderRequest request) {
        try {
            return doCreateOrder(request); // semi-sync write path
        } catch (DatabaseTimeoutException e) {
            // the semi-sync wait exceeded rpl_semi_sync_master_timeout;
            // the master has silently degraded to async replication
            log.warn("Semi-sync timeout, degraded to async, retrying order creation");
            throw e; // rethrow so @Retryable triggers another attempt
        }
    }
}

Pros: 99.9% of transactions survive a master crash; minimal performance loss (5‑8%). Cons: a residual ~0.1% risk when the semi‑sync wait times out and replication degrades to async.
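Because the fallback to async is silent, it is worth monitoring whether semi‑sync is still active. A quick check on the master, using the standard semi‑sync status variables (the alerting wiring around it is left out):

```sql
-- ON means semi-sync is active; OFF means the master has degraded to async
SHOW STATUS LIKE 'Rpl_semi_sync_master_status';

-- how many times the wait expired and the master fell back to async
SHOW STATUS LIKE 'Rpl_semi_sync_master_no_times';
```

Alerting on a transition to OFF (or a rising no_times counter) turns the silent degradation into an operational signal.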
3. Local message table + scheduled compensation
Create a seckill_message table to record pending orders, then use a scheduled job to reconcile missing orders.
CREATE TABLE seckill_message (
    message_id  VARCHAR(64) PRIMARY KEY COMMENT 'globally unique ID',
    user_id     BIGINT NOT NULL,
    goods_id    BIGINT NOT NULL,
    order_id    BIGINT DEFAULT NULL,
    status      TINYINT NOT NULL DEFAULT 0 COMMENT '0-pending, 1-done, 2-failed',
    create_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    update_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    KEY idx_status_update_time (status, update_time)
) ENGINE=InnoDB COMMENT='flash-sale local message table';

Business logic inserts a pending message, attempts order creation, and updates the status; a compensation job runs every 30 seconds to reconcile messages that never completed.
@Service
public class SeckillMessageService {
    @Autowired private JdbcTemplate jdbcTemplate;
    @Autowired private SeckillOrderService orderService;
    @Autowired private OrderDAO orderDAO; // needed by the compensation job below

    @Transactional
    public SeckillResult createOrderWithMessage(SeckillRequest request) {
        String msgId = UUID.randomUUID().toString().replace("-", "");
        // record the intent first, so a crash before order creation leaves a trace
        jdbcTemplate.update(
            "INSERT INTO seckill_message (message_id, user_id, goods_id, status) VALUES (?, ?, ?, 0)",
            msgId, request.getUserId(), request.getGoodsId());
        try {
            SeckillResult r = orderService.createOrder(request);
            if (r.isSuccess()) {
                jdbcTemplate.update(
                    "UPDATE seckill_message SET status = 1, order_id = ? WHERE message_id = ?",
                    ((Order) r.getData()).getOrderId(), msgId);
            }
            return r;
        } catch (Exception e) {
            jdbcTemplate.update("UPDATE seckill_message SET status = 2 WHERE message_id = ?", msgId);
            throw e;
        }
    }

    @Scheduled(fixedDelay = 30000)
    public void compensatePendingMessages() {
        // pick up both pending (0) and failed (2) messages from the last hour
        List<Message> pending = jdbcTemplate.query(
            "SELECT * FROM seckill_message WHERE status IN (0, 2) AND create_time > DATE_SUB(NOW(), INTERVAL 1 HOUR)",
            new MessageRowMapper());
        for (Message m : pending) {
            if (!orderDAO.existsByMessageId(m.getMessageId())) {
                orderService.retryCreateOrder(m);
            }
        }
    }
}

Advantages: near‑zero loss even if semi‑sync fails; negligible extra latency. Drawbacks: an extra table to maintain and a short window of inconsistency until the compensation job runs.
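One subtlety worth calling out: the compensation job can fire for a message whose order actually was created (e.g., a crash after the order insert but before the status update), so the retry path must be idempotent on message_id. A minimal sketch of that invariant, where an in‑memory Set stands in for the orderDAO.existsByMessageId database lookup and the "order insert" is a hypothetical stand‑in:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: compensation must be idempotent keyed on message_id.
// The Set models "does an order row for this message already exist?".
public class IdempotentCompensator {
    private final Set<String> persistedMessageIds = new HashSet<>();

    /** Returns true only when a new order was actually created. */
    public boolean compensate(String messageId) {
        if (persistedMessageIds.contains(messageId)) {
            return false; // order already exists: retrying would create a duplicate
        }
        persistedMessageIds.add(messageId); // stands in for the real order insert
        return true;
    }
}
```

In the real system the same guarantee usually comes from a unique index on message_id in the order table, so a duplicate retry fails fast at the database.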
4. MySQL Group Replication (full‑sync)
Configure a 3‑node group with binary‑log row format, GTID, and enable the group‑replication plugin. Transactions are committed only after a majority of nodes acknowledge the write, guaranteeing zero loss.
# my.cnf (common to all nodes)
log_bin = /var/lib/mysql/mysql-bin
binlog_format = ROW
binlog_row_image = FULL
gtid_mode = ON
enforce_gtid_consistency = ON
log_slave_updates = ON
plugin-load-add = group_replication.so
group_replication_group_name = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
group_replication_start_on_boot = OFF

Bootstrap the first node, then start replication on the others. The majority‑confirm rule ensures that as long as a majority of nodes stays alive, no committed transaction is lost.
Pros: absolute data safety, automatic failover, suitable for financial‑grade flash sales. Cons: 20‑30% performance overhead, and groups are limited to at most nine nodes.
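The bootstrap sequence described above looks roughly like this, using the standard Group Replication commands (which node goes first is an operational choice):

```sql
-- On the first node only: seed the group
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;

-- On each remaining node: join the existing group
START GROUP_REPLICATION;

-- Verify that all members are ONLINE
SELECT member_host, member_state
FROM performance_schema.replication_group_members;
```

Leaving group_replication_bootstrap_group ON by accident is a classic split‑brain cause, which is why it is switched OFF immediately after the first node starts.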
5. Redis pre‑deduction + Kafka traffic shaping
Pre‑reserve inventory in Redis to filter invalid requests, then push the order request to Kafka for asynchronous persistence.
@Service
public class SeckillRedisService {
    @Autowired private RedisTemplate<String, Integer> redisTemplate;

    public boolean preReduceStock(String goodsId, int qty) {
        String key = "seckill:stock:" + goodsId;
        // DECRBY is atomic; a negative result means the deduction overshot
        Long remain = redisTemplate.opsForValue().decrement(key, qty);
        if (remain != null && remain >= 0) return true;
        // roll back the overshoot so later, smaller requests can still succeed
        redisTemplate.opsForValue().increment(key, qty);
        return false;
    }
}

The Kafka consumer asynchronously calls the order service (which may use solution 2 or 3). If persistence fails, the consumer rolls back the Redis stock.
@Component
public class SeckillKafkaConsumer {
    @Autowired private SeckillMessageService orderService;
    @Autowired private RedisTemplate<String, Integer> redisTemplate;

    @KafkaListener(topics = "seckill_orders")
    public void consume(SeckillRequest req) {
        try {
            orderService.createOrderWithMessage(req);
        } catch (Exception e) {
            // persistence failed: return the pre-deducted stock to Redis
            redisTemplate.opsForValue().increment(
                "seckill:stock:" + req.getGoodsId(), req.getQuantity());
        }
    }
}

Pros: reduces peak load on the master (e.g., 20 k TPS at the edge shaped down to a sustained 5 k TPS against MySQL); Redis operations are fast. Cons: added system complexity and a small consistency window.
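The decrement‑then‑rollback in preReduceStock is safe because DECRBY is atomic, but it briefly leaves the counter negative, which can reject other concurrent requests that would otherwise fit. A check‑and‑decrement done as one atomic step avoids that; the sketch below models the idea in plain Java with a CAS loop (in Redis itself this is typically done with a small Lua script, so the class here is illustrative, not part of the article's code):

```java
import java.util.concurrent.atomic.AtomicLong;

// Models atomic check-and-decrement: the stock never goes negative,
// so no compensating increment is needed afterwards.
public class StockGuard {
    private final AtomicLong stock;

    public StockGuard(long initial) {
        this.stock = new AtomicLong(initial);
    }

    public boolean tryReduce(long qty) {
        while (true) {
            long current = stock.get();
            if (current < qty) return false; // not enough stock: reject, no rollback
            if (stock.compareAndSet(current, current - qty)) return true;
            // CAS lost a race with another thread; re-read and retry
        }
    }

    public long remaining() {
        return stock.get();
    }
}
```

The same "check then decrement in one step" shape is what a server‑side Lua script gives you in Redis: the counter is never observed negative, so concurrent smaller requests are not spuriously rejected.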
Interview answer template
Question: How to ensure zero order loss at 100 k TPS when the master crashes?
Explain the gap between "user sees success" and "data persisted".
Identify two risk points: async persistence (innodb_flush_log_at_trx_commit) and async replication.
Present a layered solution:
Base layer – parameter tuning for durability.
Core layer – semi‑sync AFTER_SYNC (or full‑sync for finance).
Fallback layer – local message table with compensation.
Traffic‑shaping layer – Redis pre‑deduction + Kafka.
Choose the combination based on business:
TPS ≤ 5 k: parameters + semi‑sync.
TPS ≈ 10 k: add local message table.
TPS > 20 k: add Redis + Kafka.
Financial‑grade: group replication + message table.
Key takeaways
Performance vs consistency is the core trade‑off in flash‑sale systems.
Understanding MySQL’s flush and binlog behavior is essential.
Semi‑sync AFTER_SYNC gives the best balance for most high‑concurrency cases.
Always design a fallback (local message table or traffic shaping) for extreme scenarios.
Match the solution to the business’s tolerance for latency and data loss.