
Mastering Database Sharding: Design Strategies and Practical Cases

This article introduces the fundamentals of database sharding, outlines architectural evolution, explains vertical and horizontal splitting dimensions and strategies, discusses middleware choices, and presents real‑world case studies while addressing common challenges such as ID generation, join queries, pagination, and distributed transactions.


Introduction

Sharding (splitting databases and tables) is a core technology in distributed system architecture, providing solutions for massive data storage and high‑performance queries. Designing an effective sharding strategy is a key test of an architect's expertise.

Outline

1. Fundamentals

Evolution of the data layer and data access layer

Sharding dimensions and strategies

Problems caused by sharding and their solutions

2. Middleware Selection

Mycat

TDDL

Sharding‑JDBC

Custom solutions

3. Sharding in Practice

User system splitting case

Order system splitting case

Payment system splitting case

Other examples

Data Layer Evolution

Single Database: A single database is the simplest access pattern and suits early-stage or low-traffic systems.

Read/Write Separation: When read traffic grows, read/write separation sends queries to read replicas while writes stay on the primary; the data access layer switches between them automatically.
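The routing decision made by the data access layer can be sketched as follows. This is a minimal illustration, not a production router: the class name ReadWriteRouter, the node names, and the round-robin policy are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Sends writes to the primary and spreads reads across replicas (round-robin)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Naive classification (assumption): only statements starting with SELECT are reads.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._replicas) if is_read else self.primary


router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
targets = [
    router.route(query)
    for query in (
        "SELECT * FROM users WHERE id = 1",
        "UPDATE users SET name = 'a' WHERE id = 1",
        "select count(*) from users",
    )
]
print(targets)  # ['replica-1', 'primary-db', 'replica-2']
```

A real router also has to account for replication lag (reads that must see a just-committed write usually go to the primary) and for locking reads such as SELECT ... FOR UPDATE, which this sketch would misroute to a replica.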

Hotspot Cache: To handle unevenly distributed access, combine a local cache (in the JVM or on disk) with a distributed cache. A read-to-write ratio of about 4:1 is often enough to justify adding a cache.
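The lookup order for a combined local and distributed cache might look like the sketch below. The class TwoLevelCache and the loader callback are hypothetical, and a plain dict stands in for the distributed cache; eviction, TTLs, and invalidation on write are deliberately left out.

```python
class TwoLevelCache:
    """Checks a local in-process cache, then a shared cache, then the database."""

    def __init__(self, remote, loader):
        self.local = {}       # in-process cache (per application instance)
        self.remote = remote  # stand-in for a distributed cache
        self.loader = loader  # function that reads from the database

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if key in self.remote:
            value = self.remote[key]
        else:
            # Miss on both levels: load from the database and fill the shared cache.
            value = self.loader(key)
            self.remote[key] = value
        self.local[key] = value
        return value


db_reads = []

def load_user(user_id):
    db_reads.append(user_id)  # record every database hit
    return {"id": user_id, "name": f"user-{user_id}"}

shared = {}
cache = TwoLevelCache(shared, load_user)
cache.get(42)
cache.get(42)
print(len(db_reads))  # 1
```

The second call is served from the local cache, so the database is read only once. In practice the local level needs a size bound and a short expiry, since each instance holds its own copy.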

Sharding (Database/Table Splitting): When write pressure exceeds the capacity of a single database, split the data across multiple databases, and split individual tables according to their access patterns.

Sharding Dimensions and Strategies

2.1 Splitting Dimensions

Sharding can be vertical or horizontal, and can target either databases or tables, which gives four combinations: vertical database splitting, vertical table splitting, horizontal database splitting, and horizontal table splitting.

Vertical Database Splitting: Addresses high concurrency by separating business domains (e.g., user domain, product domain) into different databases, and further splitting within a domain based on business attributes.

Vertical Table Splitting: Solves single-table high-concurrency or hot/cold data issues by separating a large table into multiple tables (e.g., basic user info vs. extended user info).

Horizontal Database Splitting: Distributes the same table's data across multiple databases to relieve pressure on a single instance.

Horizontal Table Splitting: Divides a large table into multiple tables based on a sharding key, improving write and storage capacity.
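Horizontal database and table splitting are often combined, so that one sharding key determines both the database and the table. The sketch below assumes a modulo rule, 4 databases with 8 tables each, and the naming scheme user_db_N / user_N; none of these values come from the article.

```python
def route(user_id, db_count=4, tables_per_db=8):
    """Maps a sharding key to a (database, table) pair using modulo hashing."""
    slot = user_id % (db_count * tables_per_db)  # global slot in 0..31
    db_index = slot // tables_per_db             # which database holds the slot
    table_index = slot % tables_per_db           # which table inside that database
    return f"user_db_{db_index}", f"user_{table_index}"


print(route(1001))  # ('user_db_1', 'user_1')
```

Modulo routing spreads sequential keys evenly, but changing db_count or tables_per_db remaps most keys, which is why the scaling and migration plans need to be decided together with the routing rule.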

2.2 Splitting Strategies

Choose sharding keys with business meaning (e.g., userId, orderId).

Estimate the number of databases/tables based on projected data volume and traffic.

Alibaba’s guideline: split when a single table exceeds 5 million rows or 2 GB, but adjust according to field types, length, and access patterns.

Plan scaling (expansion and contraction) strategies.

Design migration plans for moving data.

Select appropriate access‑layer code or middleware.
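The capacity estimate can be sketched as a small calculation. It uses the 5 million row guideline quoted above as the per-table ceiling; rounding up to a power of two is an extra assumption made here, chosen because doubling the table count later is a common expansion step.

```python
def estimate_table_count(projected_rows, rows_per_table=5_000_000):
    """Returns a table count that keeps each table under rows_per_table."""
    # Integer ceiling division avoids floating-point rounding.
    needed = max(1, (projected_rows + rows_per_table - 1) // rows_per_table)
    # Assumption: round up to the next power of two to simplify later doubling.
    return 1 << (needed - 1).bit_length()


print(estimate_table_count(300_000_000))  # 64
```

For 300 million projected rows the guideline calls for at least 60 tables, which rounds up to 64. The row threshold should still be adjusted for row width and access patterns, as noted in the list above.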

Problems and Solutions

3.1 ID Generation

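After sharding, per-database auto-increment keys are no longer unique across shards, so use a distributed ID generator such as UUID, Snowflake, or a custom solution. Snowflake is commonly preferred because its IDs are unique, roughly time-ordered (increasing on each node), and generated locally, which gives high availability and performance.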
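A Snowflake-style generator can be sketched as below. It follows the commonly described 64-bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence); the custom epoch value and the class name are assumptions, and assigning unique worker IDs to nodes is outside the scope of the example.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # custom epoch in milliseconds (assumed value)
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = time.time_ns() // 1_000_000
            if now < self._last_ms:
                # Refuse to issue IDs rather than risk duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = time.time_ns() // 1_000_000
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


generator = SnowflakeGenerator(worker_id=1)
ids = [generator.next_id() for _ in range(5000)]
```

IDs from a single generator are strictly increasing. Across workers they are only ordered to within clock skew, so ordering by ID approximates ordering by creation time rather than guaranteeing it.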

3.2 Join Queries

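Approaches include querying each shard and merging the results in the application, global tables (small reference tables replicated to every shard), field redundancy (copying frequently joined columns into the querying table), E-R tables (grouping related tables in one database so joins stay local), and wide tables.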
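The "query each shard, then merge" approach amounts to doing the join in application code. In the sketch below, in-memory lists and a dict stand in for the order shards and the user database; the data and the function name orders_with_user_names are invented for illustration.

```python
# Stand-ins for two order shards and a separate user database (sample data).
order_shards = [
    [{"order_id": 1, "user_id": 10, "amount": 30}],
    [
        {"order_id": 2, "user_id": 11, "amount": 45},
        {"order_id": 3, "user_id": 10, "amount": 5},
    ],
]
users = {10: {"name": "alice"}, 11: {"name": "bob"}}


def orders_with_user_names(min_amount):
    # Step 1: run the same filter on every shard and collect the rows.
    orders = [
        order
        for shard in order_shards
        for order in shard
        if order["amount"] >= min_amount
    ]
    # Step 2: fetch the referenced users in one batch instead of one query per row.
    user_ids = {order["user_id"] for order in orders}
    names = {user_id: users[user_id]["name"] for user_id in user_ids}
    # Step 3: join in memory.
    return [{**order, "user_name": names[order["user_id"]]} for order in orders]


result = orders_with_user_names(min_amount=10)
print([(row["order_id"], row["user_name"]) for row in result])  # [(1, 'alice'), (2, 'bob')]
```

This works well when the filtered result is small. Field redundancy avoids step 2 entirely by storing user_name on the order row, at the cost of keeping the copy in sync.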

3.3 Pagination and Sorting

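Pagination and sorting can be handled by querying each shard and merging the results, or by offloading them to Elasticsearch (ES) or a wide table. The ES or wide-table approach is recommended, because merging requires every shard to return its first offset + limit rows, which becomes expensive for deep pages.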
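The merge-based approach can be sketched as follows: each shard returns its own first offset + limit rows in sorted order, and the application merges those sorted lists and slices out the page. Plain integers stand in for the sort column, and the function name paginate is an assumption.

```python
import heapq
from itertools import islice


def paginate(shards, offset, limit, key=None):
    """Returns one globally sorted page from several shards."""
    # Each shard must contribute its first (offset + limit) rows, not just `limit`,
    # because any of them could fall inside the requested page.
    per_shard = [sorted(rows, key=key)[: offset + limit] for rows in shards]
    merged = heapq.merge(*per_shard, key=key)  # lazy k-way merge of sorted inputs
    return list(islice(merged, offset, offset + limit))


shards = [[5, 1, 9], [2, 8], [3, 7, 4]]
print(paginate(shards, offset=2, limit=3))  # [3, 4, 5]
```

The cost grows with the offset: page 1000 with a page size of 20 makes every shard return 20,000 rows, which is the main reason a search index or a wide table is preferred for deep pagination.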

3.4 Distributed Transactions

Common solutions are 2PC, 3PC, TCC, SAGA, and eventual consistency (via local or transactional messages, notifications, reconciliation, etc.). Avoid distributed transactions where possible; when they are unavoidable, favor eventual consistency.
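Eventual consistency via a local message table can be sketched with an in-memory SQLite database. The business row and the outgoing message are written in one local transaction, and a separate relay step publishes pending messages. The table names, the payload format, and the publish callback are all assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT,
        sent INTEGER DEFAULT 0
    );
    """
)


def create_order(order_id):
    # The order and its message commit or roll back together (one local transaction).
    with db:
        db.execute("INSERT INTO orders (id, status) VALUES (?, 'CREATED')", (order_id,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (f"order_created:{order_id}",))


def relay(publish):
    """Publishes pending messages and marks them sent; returns how many were sent."""
    pending = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for message_id, payload in pending:
        # A crash after publish but before the update causes a resend,
        # so consumers must be idempotent.
        publish(payload)
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (message_id,))
    return len(pending)


delivered = []
create_order(1)
sent_count = relay(delivered.append)
print(sent_count, delivered)  # 1 ['order_created:1']
```

The downstream service applies the message in its own local transaction, so no cross-database transaction is needed; reconciliation jobs then catch anything the relay missed.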

