How to Design a High‑Throughput Database Architecture for a 100‑Million‑Row‑per‑Day Log System
This guide breaks down a real‑world interview scenario in which a backend engineer must design a scalable database architecture for a billing‑log or dynamic‑feed system handling 100 million daily inserts and 100k read QPS. It covers partitioning vs. sharding, sharding‑key selection, shard count, read‑write separation, multi‑level caching, consistency patterns, hot‑key mitigation, and online schema changes.
Backend engineers often face a gap between writing SQL and designing systems for high concurrency and massive data volumes; interviewers increasingly test architectural judgment rather than isolated knowledge points.
Scenario Overview
The interview presents a core system responsible for a national‑level app’s billing‑log or dynamic feed, with the following characteristics:
Write load: 100 million new rows per day in a single table.
Read load: Peak QPS can reach 100 000, with reads dominating.
Data shape: Time‑series data, continuously growing, rarely updated or deleted; queries are typically by user ID and time range.
The candidate is asked to propose a complete database architecture and justify each decision.
1. Partitioning vs Sharding
Given the volume, a single MySQL instance cannot keep up for long. The recommended answer is horizontal sharding rather than partitioning: partitioning keeps all data and indexes on the same server, so it still hits that server's I/O and storage ceilings, whereas sharding distributes both data and load across multiple machines.
2. Sharding Strategy
Sharding key: user_id is natural since most queries filter by user ID; it keeps a user’s data on the same shard and avoids cross‑shard joins, though one must consider data skew for very large users.
Shard count: Estimate from MySQL's comfort zone of roughly 50–100 million rows per table. At 100 million rows per day, one year accumulates 365 × 100 million ≈ 36.5 billion rows, so about 730 shards are needed at 50 million rows each; rounding up to 1024 (or 2048) shards provides headroom and keeps hash‑mod routing on a power of two.
Shard implementation: Choose between client‑side sharding (e.g., Sharding‑JDBC) for lower latency but tighter coupling, or middleware sharding (e.g., MyCAT) for transparency at the cost of added complexity.
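Whichever implementation is chosen, the routing logic itself is simple. A minimal sketch of hash‑mod routing on `user_id`, assuming 1024 shards and a hypothetical `billing_log_NNNN` physical‑table naming scheme (neither is prescribed by the source):

```python
import zlib

NUM_SHARDS = 1024  # power of two leaves headroom over the ~730 shards estimated above

def shard_for(user_id: int) -> int:
    """Stable hash-mod routing: the same user always lands on the same shard,
    so per-user time-range queries never need to cross shards."""
    return zlib.crc32(str(user_id).encode()) % NUM_SHARDS

def table_for(user_id: int) -> str:
    # Hypothetical naming: logical table 'billing_log' split into
    # physical tables billing_log_0000 .. billing_log_1023.
    return f"billing_log_{shard_for(user_id):04d}"
```

Using CRC32 rather than Python's built‑in `hash()` keeps the mapping stable across processes, which matters because every application server must route a given user to the same shard.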
3. Handling 100 k QPS Reads
Two classic techniques are required:
Read‑write separation: Deploy a master‑slave cluster per shard; writes go to the master, reads are served by multiple slaves, allowing horizontal scaling of read capacity.
Multi‑level caching: Implement an L1 local cache (e.g., Caffeine) in the application server, followed by an L2 distributed cache (e.g., Redis). Most reads should be satisfied by the cache; only cache misses fall through to the database.
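The read path above can be sketched as follows; plain dicts stand in for Caffeine (L1) and Redis (L2), and `db_fetch` is a hypothetical loader for the sharded MySQL cluster:

```python
# In-memory stand-ins for the real cache tiers (illustrative only).
l1_cache: dict = {}  # would be a Caffeine cache local to each app server
l2_cache: dict = {}  # would be a shared Redis cluster

def db_fetch(key):
    """Placeholder for a real query against the sharded MySQL cluster."""
    return {"key": key, "rows": []}

def get(key):
    """Read path: L1 -> L2 -> database, back-filling each level on a miss."""
    if key in l1_cache:
        return l1_cache[key]
    if key in l2_cache:
        value = l2_cache[key]
        l1_cache[key] = value        # promote to the local cache
        return value
    value = db_fetch(key)            # only cache misses reach MySQL
    l2_cache[key] = value
    l1_cache[key] = value
    return value
```

In production each tier would carry a TTL; the back‑fill on miss is what keeps the database behind the vast majority of the 100k QPS.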
4. Cache Consistency & Hot‑Key Problems
Use the Cache‑Aside pattern: on read, check the cache first and fall back to the database on a miss; on write, update the database and then delete the cache entry. Deleting rather than updating the cache avoids writing stale values when concurrent writes race; the next read simply repopulates the entry from the committed row.
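A minimal sketch of the Cache‑Aside pattern, again with dicts standing in for Redis and MySQL:

```python
cache: dict = {}  # stand-in for Redis
db: dict = {}     # stand-in for the sharded MySQL cluster

def read(key):
    """Cache-aside read: check the cache first, fall back to the DB and back-fill."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write(key, value):
    """Cache-aside write: update the DB, then DELETE (not update) the cache
    entry so the next read repopulates it with the committed value."""
    db[key] = value
    cache.pop(key, None)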
Address three common cache pitfalls:
Cache penetration: Filter nonexistent keys with a Bloom filter.
Cache breakdown (hot‑key miss): Guard cache rebuild with a distributed lock so only one request hits the DB.
Cache avalanche: Stagger TTLs with random offsets to prevent massive simultaneous expirations.
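Two of these mitigations fit in a few lines. The sketch below shows TTL jitter for the avalanche case and a lock‑guarded rebuild for the breakdown case; a `threading.Lock` stands in for a Redis distributed lock, and the base TTL and ±20% spread are illustrative values:

```python
import random
import threading

BASE_TTL = 3600  # one hour, illustrative

def jittered_ttl(base: int = BASE_TTL, spread: float = 0.2) -> int:
    """Avalanche guard: spread expirations over +/-20% of the base TTL so
    keys written together do not all expire in the same instant."""
    return int(base * (1 + random.uniform(-spread, spread)))

rebuild_lock = threading.Lock()  # stand-in for a Redis distributed lock

def rebuild_hot_key(key, cache, loader):
    """Breakdown guard: only the lock holder queries the DB; everyone else
    finds the cache already repopulated after acquiring the lock."""
    with rebuild_lock:
        if key not in cache:          # double-check after acquiring the lock
            cache[key] = loader(key)
    return cache[key]
```

The double‑check inside the lock is the essential detail: without it, every waiting request would still hit the database once the lock was released.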
5. Long‑Term Data Management & Online DDL
Cold‑data separation: Archive data older than a threshold (e.g., one year) to cheaper storage such as HBase, ClickHouse, or cloud object storage (OSS/S3), reducing load on the primary MySQL cluster.
Online schema changes: Never run ALTER TABLE directly on large sharded tables. Use tools like gh‑ost or pt‑online‑schema‑change to create a shadow table, migrate data incrementally, and switch without locking the master.
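A hedged example of what such a migration looks like with gh‑ost; the host, database, table, and column names are hypothetical, and in a sharded setup the command would be run once per physical table:

```shell
gh-ost \
  --host=replica-host --user=ghost --password="${GHOST_PASSWORD}" \
  --database=billing --table=billing_log_0000 \
  --alter="ADD COLUMN trace_id VARCHAR(64) NULL" \
  --chunk-size=1000 \
  --max-load="Threads_running=25" \
  --execute
```

Without `--execute`, gh‑ost performs a dry run, which is the safe way to validate the migration plan before touching production.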
Conclusion
Successfully answering all “levels” demonstrates a shift from being a code implementer to an architect with a holistic view, balancing performance, scalability, cost, and operational risk.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
