How to Tackle Database Bottlenecks with Sharding and Partitioning
This article examines common database performance bottlenecks such as I/O and CPU limits, explains when to apply horizontal or vertical sharding and partitioning, lists tools such as ShardingSphere, TDDL, and Mycat, and outlines implementation steps, common problems, and scaling approaches.
1. Database Bottlenecks
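Both I/O and CPU bottlenecks drive up the number of active database connections until it approaches or reaches the maximum the database can carry. The services that depend on the database are then left with few or no available connections, which reduces throughput and can eventually cause service failures.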
1. I/O Bottleneck
A disk read I/O bottleneck occurs when there is more hot data than the database cache can hold, so queries generate heavy disk I/O and slow down; solution: database sharding and vertical partitioning.
A network I/O bottleneck occurs when the volume of requested data exceeds the available bandwidth; solution: database sharding.
2. CPU Bottleneck
SQL problems such as JOIN, GROUP BY, ORDER BY, or filtering on non-indexed columns increase CPU load; solution: SQL optimization, proper indexing, and moving business calculations to the service layer.
A single table that holds too much data forces queries to scan many rows, which lowers SQL efficiency and makes the CPU the first bottleneck; solution: horizontal partitioning.
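The effect of a missing index can be checked directly from the query plan. The sketch below uses SQLite from Python's standard library as a stand-in for whatever database you run; the `orders` table and the `idx_orders_user` index are hypothetical names chosen for illustration, and the exact plan wording varies by engine and version.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable plan lines SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the plan detail

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

sql = "SELECT * FROM orders WHERE user_id = ?"

# Without an index on user_id the engine has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# With the index in place the same query becomes an index lookup.
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
plan_after = query_plan(conn, sql, (42,))

print(plan_before)
print(plan_after)
```

If the plan still shows a scan after indexing, the query shape (for example a function applied to the column) is usually the next thing to check before reaching for partitioning.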
2. Sharding and Partitioning
1. Horizontal Sharding
Concept: using one field as the sharding key, distribute the data of one database across multiple databases according to a strategy such as hash or range.
Each database has identical schema.
Data in each database is distinct with no overlap.
The union of all databases represents the full dataset.
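Scenario: overall concurrency is high, table partitioning alone cannot fix the problem, and there are no clear business boundaries along which to shard vertically.
Analysis: With more databases, the load is spread across more instances, so the I/O and CPU pressure on each one drops, roughly in proportion when the key distributes data evenly.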
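A hash strategy for horizontal sharding can be sketched as follows. This is a minimal illustration, not any particular tool's routing logic: the shard count of 4 and the `order_db_` prefix are assumptions, and CRC32 is used only because it gives the same result in every process.

```python
import zlib

def shard_index(key, shard_count):
    """Map a sharding key to a database index using a stable hash."""
    # zlib.crc32 is stable across processes, unlike Python's built-in hash()
    # for strings, which is randomized per interpreter run.
    return zlib.crc32(str(key).encode("utf-8")) % shard_count

def database_for(key, shard_count=4, prefix="order_db_"):
    """Return the (hypothetical) name of the database that stores this key."""
    return f"{prefix}{shard_index(key, shard_count)}"

# Count how many of 1000 sample keys land on each database.
distribution = {}
for user_id in range(1000):
    name = database_for(user_id)
    distribution[name] = distribution.get(name, 0) + 1
print(distribution)
```

Every read and write for a given key goes to the same database, which is what keeps the per-database data sets disjoint.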
2. Horizontal Partitioning
Concept: using one field as the partition key, distribute the rows of one table across multiple tables according to a strategy such as hash or range.
Each table shares the same schema.
Data in each table is distinct with no overlap.
The union of all tables equals the full dataset.
Scenario: large single-table size degrades SQL efficiency and increases CPU load.
Analysis: Smaller tables improve SQL execution speed and reduce CPU usage.
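As an illustration of the range strategy for horizontal partitioning, the sketch below maps an id to a table name. The table base name `t_order` and the limit of 5,000,000 rows per table are assumptions made for the example, not recommendations from the article.

```python
def table_for(order_id, rows_per_table=5_000_000, base="t_order"):
    """Range strategy: ids 1..rows_per_table go to <base>_0, the next block to <base>_1, and so on."""
    if order_id < 1:
        raise ValueError("order_id must be a positive integer")
    return f"{base}_{(order_id - 1) // rows_per_table}"

print(table_for(1))          # first id of the first range
print(table_for(5_000_000))  # last id of the first range
print(table_for(5_000_001))  # first id of the second range
```

Range rules keep new data in the newest table, which makes adding tables easy but can concentrate traffic on that table; a hash rule spreads traffic more evenly at the cost of harder expansion.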
3. Vertical Sharding (Database)
Concept: based on business domains, split tables into different databases.
Each database has a different schema.
Data in each database is distinct with no overlap.
The union of all databases equals the full dataset.
Scenario: high concurrency with clear business modules that can be isolated.
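Analysis: Once each business domain has its own database, that database can be owned by a dedicated service, which makes a service‑oriented architecture practical.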
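Vertical sharding routes by table rather than by row, so the routing rule is a fixed lookup. The sketch below assumes three hypothetical business domains (user, order, product); all table and database names are made up for illustration.

```python
# Hypothetical mapping from each table to the database that owns its business domain.
TABLE_TO_DATABASE = {
    "user": "user_db",
    "user_address": "user_db",
    "order": "order_db",
    "order_item": "order_db",
    "product": "product_db",
}

def database_for_table(table):
    """Return the database that owns a table, failing loudly for unknown tables."""
    try:
        return TABLE_TO_DATABASE[table]
    except KeyError:
        raise KeyError(f"no database configured for table {table!r}") from None

print(database_for_table("order_item"))
```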
4. Vertical Partitioning (Table)
Concept: based on how frequently fields are accessed, split a table's columns into a main table for hot fields and extension tables for cold fields.
Each table has a different schema.
The tables share at least one column, usually the primary key, which is used to associate their rows.
The union of all tables equals the full dataset.
Scenario: tables with many columns where hot and cold data are mixed, causing large rows and cache pressure.
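Analysis: Hot data stays in the main table, where rows are smaller and more of them fit in the cache, which reduces random read I/O. To read a complete record, fetch the main and extension rows separately and combine them in the service layer on the shared key. Avoid a database‑level JOIN, which adds CPU load and couples the two tables to the same database instance.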
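Combining the main and extension tables in the service layer, rather than with a JOIN, might look like the sketch below. SQLite stands in for the real database, a single connection is used for brevity, and the `article` / `article_ext` tables are hypothetical; in practice the two lookups could go to different connections or instances.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, author TEXT);
    CREATE TABLE article_ext (id INTEGER PRIMARY KEY, body TEXT);
    INSERT INTO article VALUES (1, 'Sharding basics', 'lee');
    INSERT INTO article_ext VALUES (1, 'Long body text ...');
""")

def load_article(conn, article_id):
    """Fetch hot and cold columns with two simple lookups and merge them in code."""
    main = conn.execute(
        "SELECT id, title, author FROM article WHERE id = ?", (article_id,)
    ).fetchone()
    if main is None:
        return None
    # The extension row is only needed when the full record is requested.
    ext = conn.execute(
        "SELECT body FROM article_ext WHERE id = ?", (article_id,)
    ).fetchone()
    return {
        "id": main[0],
        "title": main[1],
        "author": main[2],
        "body": ext[0] if ext else None,
    }

print(load_article(conn, 1))
```

List pages that only need the title and author can skip the second lookup entirely, which is where the cache benefit comes from.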
3. Sharding Tools
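ShardingSphere (grew out of Sharding-JDBC, which runs inside the application as a JAR)
TDDL (Taobao Distributed Data Layer, also used as a JAR inside the application)
Mycat (proxy middleware that sits between the application and the databases)
Note: Research each tool's trade-offs yourself, starting with its official documentation and community.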
4. Sharding Implementation Steps
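Assess current capacity and expected growth to decide how many databases or tables are needed. Choose a key that distributes data evenly. Define the sharding rule (hash, range, or similar). Execute the split, usually with dual writes to the old and new layout during the transition. Plan for future scaling so that as little data as possible has to move.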
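The capacity assessment in the first step is simple arithmetic. The sketch below is one possible way to do it; the row limit per table and the growth figures are placeholder inputs, and rounding up to a power of two is an assumption made here so that a later expansion can double the count, not something the article prescribes.

```python
import math

def tables_needed(current_rows, monthly_growth_rows, months, max_rows_per_table):
    """Estimate how many tables are needed to hold the projected row count."""
    projected = current_rows + monthly_growth_rows * months
    needed = max(1, math.ceil(projected / max_rows_per_table))
    # Round up to the next power of two (assumption: simplifies later doubling).
    return 1 << (needed - 1).bit_length()

# 20M rows today, 2M new rows per month, planning 36 months ahead,
# with a target of at most 5M rows per table.
print(tables_needed(20_000_000, 2_000_000, 36, 5_000_000))
```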
5. Common Sharding Issues
1. Queries without partition key
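Approaches: the mapping method (keep a lookup from the non‑partition field to the partition key, then query by the key), the gene method (embed bits derived from the non‑partition field in the partition key so that both route to the same shard), the redundancy method (keep a second copy of the data sharded by the other field), NoSQL solutions, etc.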
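The gene method is the least obvious of these approaches, so here is a minimal sketch of one common form of it. It assumes users are sharded by `user_id` but must also be found by `user_name`: a few bits derived from the name (the "gene") are placed in the low bits of the id, so either value routes to the same shard. The field names, the choice of CRC32, and the use of 3 gene bits are all assumptions for illustration.

```python
import zlib

GENE_BITS = 3                   # 2**3 = 8 shards
SHARD_COUNT = 1 << GENE_BITS

def gene_of(user_name):
    """Derive the gene: the low GENE_BITS bits of a stable hash of the name."""
    return zlib.crc32(user_name.encode("utf-8")) & (SHARD_COUNT - 1)

def make_user_id(sequence, user_name):
    """Build an id whose low bits carry the gene and whose high bits stay unique."""
    return (sequence << GENE_BITS) | gene_of(user_name)

def shard_by_id(user_id):
    return user_id % SHARD_COUNT

def shard_by_name(user_name):
    return gene_of(user_name)

uid = make_user_id(12345, "alice")
print(uid, shard_by_id(uid), shard_by_name("alice"))
```

The gene has to be fixed when the id is generated, so this only works for fields that are known at creation time and do not change afterwards.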
2. Cross‑shard pagination
Solution: offload these queries to a NoSQL store or search engine such as Elasticsearch.
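To see why cross‑shard pagination is usually pushed out of the sharded database, consider what a naive global merge has to do. The sketch below is an illustration of the cost, not the approach the article recommends: each shard is modeled as an already sorted Python list, and the function name is hypothetical.

```python
import heapq
from itertools import islice

def global_page(shards, offset, limit):
    """Naive cross-shard pagination over shards that are each sorted ascending."""
    # Any of a shard's first offset+limit rows could belong to the requested
    # page, so every shard has to return that many rows.
    per_shard = [shard[: offset + limit] for shard in shards]
    merged = heapq.merge(*per_shard)
    return list(islice(merged, offset, offset + limit))

shards = [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
print(global_page(shards, offset=4, limit=3))
```

The rows fetched grow with the offset multiplied by the number of shards, so deep pages become expensive quickly.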
3. Scaling
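To scale databases horizontally, use the replica‑upgrade method: promote each replica to a primary so the number of shards doubles, then remove the rows each node no longer owns. To scale tables horizontally, use dual‑write migration: write to both the old and new tables, copy the historical data across, verify it, and then stop writing to the old tables.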
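Promoting replicas works without copying data when the shard count doubles under a modulo rule, because a key with `key % 2 == i` always has `key % 4` equal to `i` or `i + 2`. The sketch below checks that on a small in-memory model; the node layout and key range are assumptions made for the example.

```python
def shard_of(key, shard_count):
    return key % shard_count

def stale_keys(keys_on_node, node_index, new_count):
    """Keys a node still holds after promotion but no longer owns."""
    return [k for k in keys_on_node if shard_of(k, new_count) != node_index]

OLD_COUNT, NEW_COUNT = 2, 4
keys = list(range(1, 21))

# Before scaling: primary i and its replica both hold every key with key % 2 == i.
old_nodes = {
    i: [k for k in keys if shard_of(k, OLD_COUNT) == i] for i in range(OLD_COUNT)
}

# After scaling: old primary i serves new shard i, its promoted replica serves i + 2.
new_nodes = {i: list(old_nodes[i % OLD_COUNT]) for i in range(NEW_COUNT)}

# Every key is already present on the node that now owns it, so nothing is copied.
no_copy_needed = all(k in new_nodes[shard_of(k, NEW_COUNT)] for k in keys)
print(no_copy_needed)

# Each node only has to delete the half of its data that moved to its sibling.
print(stale_keys(new_nodes[0], 0, NEW_COUNT))
```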
6. Summary
Identify the real bottleneck before deciding how to shard.
Select keys that distribute data evenly, and plan for queries that do not include the partition key.
Simplify sharding rules as much as possible while meeting requirements.
7. Example Repository
GitHub: https://github.com/LiHaodong888/SpringBootLear
