
When to Shard Your Database? Practical Strategies and Common Pitfalls

This article explains why and how to shard a MySQL database, covering vertical and horizontal partitioning, distributed transaction handling, unique ID generation, a real‑world e‑commerce case study, and future considerations for distributed architectures.


1 Introduction

As business traffic grows, a single MySQL instance becomes a CPU, memory, and I/O bottleneck, potentially causing service outages. Sharding (splitting databases and tables) is a key strategy to handle high concurrency by distributing data across multiple instances, improving throughput and availability. This article shares practical experience, core solutions, and implementation points.

2 First, Ask Yourself: Do You Really Need Sharding?

Sharding is not a silver bullet; before proceeding, answer the following questions:

1. Is the current bottleneck caused by the database?

Confirm via monitoring (QPS, TPS, connections, slow queries) whether the database is the performance bottleneck.

2. Have all low‑cost optimizations been exhausted?

Index tuning, SQL rewrite, caching (Redis), read/write splitting, and other low‑cost solutions should be tried first.

3. Can you accept the added complexity after sharding?

Issues such as cross‑database JOIN, distributed transactions, and unique ID generation must have viable solutions.

Conclusion: Sharding trades complexity for performance; weigh expected business growth against the team’s capacity to operate a more complex system.

3 Two Core Sharding Strategies

3.1 Vertical Partitioning (Database Splitting)

Principle: Split tables into different database instances based on business domains, e.g., separate user, product, cart, order, payment, and logistics databases.

Applicable Scenarios:

Clear business module boundaries (e.g., user DB, order DB, product DB).

A single database has accumulated so many tables that it is becoming hard to manage.

Example:

Original DB: user_orders (user table + order table)
After split:
- user_db (user table)
- order_db (order table)
- cart_db (cart table)
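After a vertical split, application code must route each query to the right instance. A minimal sketch of a per-domain routing map (all hostnames and the table-to-domain assignments are illustrative placeholders, not a real configuration):

```python
# Hypothetical DSNs for each post-split database; placeholders only.
DOMAIN_DSNS = {
    "user":  "mysql://user-db-host/user_db",
    "order": "mysql://order-db-host/order_db",
    "cart":  "mysql://cart-db-host/cart_db",
}

# Which business domain owns each table. Before the split, all three
# tables lived together in the single user_orders database.
TABLE_DOMAIN = {"user": "user", "orders": "order", "cart": "cart"}

def dsn_for_table(table: str) -> str:
    """Return the connection string for the database that owns `table`."""
    return DOMAIN_DSNS[TABLE_DOMAIN[table]]

print(dsn_for_table("orders"))  # mysql://order-db-host/order_db
```

In practice this mapping usually lives in middleware or service configuration rather than application code, but the routing decision is the same.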

Advantages:

Simple to split, low coupling.

Easy to scale per business (e.g., order DB can be expanded independently).

Disadvantages:

Cross‑database transactions require a distributed transaction framework (e.g., Seata).


3.2 Horizontal Partitioning (Table Sharding)

Principle: Distribute rows of a single table across multiple databases/tables using a shard key (e.g., user ID).

Applicable Scenarios:

A single table has grown past a practical row count (e.g., more than 5 million rows).

Hot data with frequent access (e.g., login table, cart table).

Example:

Original table: orders (1 billion rows)
After splitting into five tables (shard key: user ID):
- orders_0 (UserID % 5 = 0)
- orders_1 (UserID % 5 = 1)
- orders_2 (UserID % 5 = 2)
- orders_3 (UserID % 5 = 3)
- orders_4 (UserID % 5 = 4)
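The modulo routing above reduces to a one-line function. A sketch, with table names following the example (the shard count is the five tables chosen above):

```python
NUM_SHARDS = 5  # matches the orders_0 .. orders_4 split above

def route_order_table(user_id: int) -> str:
    """Return the physical table holding this user's orders."""
    return f"orders_{user_id % NUM_SHARDS}"

print(route_order_table(42))   # orders_2
print(route_order_table(100))  # orders_0
```

Note that changing NUM_SHARDS later re-maps almost every row, which is why the shard count (or a migration plan such as consistent hashing) should be decided up front.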

Advantages:

Data is distributed evenly, so load is spread across instances.

Horizontal scalability: more shards can be added as data grows (though resharding a modulo scheme requires migrating data).

Disadvantages:

Cross‑shard queries become complex and need aggregation.

Shard key selection must avoid data skew.


3.3 Post‑sharding Pitfalls and Guidance

3.3.1 Distributed Transactions: How to Ensure Consistency?

Solution comparison (performance loss / complexity):

Local transaction: works only when an operation stays within a single database; low loss, low complexity.

Final consistency: asynchronous compensation, typically via a message queue; medium loss, medium complexity.

TCC (Try-Confirm-Cancel): strong consistency for financial flows; high loss, high complexity.

SAGA: long-running business transactions (e.g., order refunds); medium loss, high complexity.

Recommendation: Prefer business design that avoids cross‑DB transactions; if necessary, use frameworks like Seata.
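As a concrete shape for the final-consistency option, here is a minimal local-message-table (outbox) sketch: the business write and the outgoing message commit in one local transaction, and a relay later publishes the message for the other service's database to consume. SQLite stands in for MySQL here, and all table, column, and topic names are illustrative:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INT, amount INT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         topic TEXT, payload TEXT, published INT DEFAULT 0);
""")

def create_order(user_id: int, amount: int) -> None:
    # One LOCAL transaction covers both inserts, so the order and its
    # outgoing message are atomic even without a distributed transaction.
    with conn:
        conn.execute("INSERT INTO orders (user_id, amount) VALUES (?, ?)",
                     (user_id, amount))
        conn.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                     ("order.created",
                      json.dumps({"user_id": user_id, "amount": amount})))

def relay_once(publish) -> None:
    # A background relay polls unpublished rows and hands them to the MQ;
    # consumers apply them to the other database, reaching consistency
    # eventually. Publishing must be idempotent on the consumer side.
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    conn.commit()

create_order(42, 100)
sent = []
relay_once(lambda topic, payload: sent.append((topic, payload)))
print(len(sent))  # 1
```

Frameworks like Seata automate variants of this pattern (and TCC/SAGA) so application code does not hand-roll the relay.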

3.3.2 Cross‑DB Queries: Efficient Aggregation

Redundant design: duplicate the user nickname into the order DB to avoid joining the user DB.

Data heterogeneity: sync binlog changes via Canal into ES/HBase and query there.

Distributed SQL engine: use MyCat or ShardingSphere‑JDBC as a sharding-aware proxy.
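When no redundant copy or search index covers a query, a proxy must scatter it to every shard and gather the results. A toy sketch of the merge step, with in-memory lists standing in for per-shard SQL result sets (each already sorted by creation time, as an ORDER BY on each shard would guarantee):

```python
import heapq

# (order_id, created_at) rows returned by each shard, pre-sorted per shard.
shard_results = [
    [(1, "2024-01-01"), (4, "2024-01-04")],  # from orders_0
    [(2, "2024-01-02")],                     # from orders_1
    [(3, "2024-01-03"), (5, "2024-01-05")],  # from orders_2
]

def merge_by_created_at(per_shard):
    # heapq.merge streams a globally sorted sequence out of the
    # per-shard sorted sequences without re-sorting everything.
    return list(heapq.merge(*per_shard, key=lambda row: row[1]))

merged = merge_by_created_at(shard_results)
print([order_id for order_id, _ in merged])  # [1, 2, 3, 4, 5]
```

Pagination across shards is the hard part: `LIMIT 10 OFFSET 1000` must fetch the first 1010 rows from every shard before merging, which is why proxies like ShardingSphere warn about deep-paging queries.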

3.3.3 Unique ID Generation: Say Goodbye to Auto‑Increment

Snowflake: Twitter’s distributed ID scheme combining a timestamp, a machine ID, and a per‑millisecond sequence.

UUID: universally unique but unordered; random inserts hurt InnoDB clustered‑index locality and cause page splits.

Database cluster: configure MySQL’s auto_increment_increment and auto_increment_offset so each master generates a disjoint ID sequence.
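A minimal Snowflake-style generator using the common 41/10/12 bit layout (millisecond timestamp, machine ID, per-millisecond sequence). The custom epoch is an arbitrary assumption for this sketch, and no protection against a backwards-jumping clock is included:

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # custom epoch (Sep 2020); sketch assumption


class SnowflakeGenerator:
    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024, "machine ID must fit in 10 bits"
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit seq
                if self.sequence == 0:
                    # Sequence exhausted this millisecond; spin to the next.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            # 41 bits timestamp | 10 bits machine | 12 bits sequence
            return ((now - EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence


gen = SnowflakeGenerator(machine_id=1)
a, b = gen.next_id(), gen.next_id()
print(a < b)  # IDs are strictly increasing on one machine
```

The roughly time-ordered IDs keep InnoDB inserts append-mostly, which is the main advantage over UUIDs as primary keys.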

4 Real‑World Case: Sharding an E‑commerce Platform

Background: During a major promotion, order DB QPS exceeded 20 k, latency over 80 ms, severely affecting user experience.

Solution:

Vertical split: Separate order DB from user DB.

Horizontal split: Hash user ID into four order shards.

Read‑write separation: Primary for writes, replicas for reads via ProxySQL.

Cache hot product data in Redis.

Result: QPS increased to 50 k, latency stabilized below 30 ms.

5 Future Outlook: Sharding Is Not the End

After sharding, systems evolve toward distributed architectures. Ongoing concerns include:

Automated Operations: Monitoring, alerts, and auto‑scaling for self‑healing.

Active‑Active Architecture: Cross‑region data disaster recovery.

Serverless Databases: Solutions like AWS Aurora Serverless for on‑demand scaling.

Tags: scalability, database sharding, distributed transactions, horizontal partitioning, vertical partitioning
Written by

Architecture & Thinking

🍭 Frontline tech director and chief architect at top-tier companies 🥝 Years of deep experience in internet, e‑commerce, social, and finance sectors 🌾 Committed to publishing high‑quality articles covering core technologies of leading internet firms, application architecture, and AI breakthroughs.
