How to Tackle Database Bottlenecks with Sharding, Horizontal & Vertical Partitioning
This article analyzes common database performance bottlenecks, explains when to apply I/O-focused or CPU-focused optimizations, and walks through horizontal and vertical sharding, partitioning strategies, tooling choices, implementation steps, and practical pitfalls for scalable systems.
Database Bottlenecks
Both I/O and CPU bottlenecks increase the number of active connections. Once the database reaches its connection limit, concurrency and throughput drop, and the service may crash.
I/O Bottleneck
Disk read I/O: Hot data exceeds cache capacity, causing heavy read operations. Solution: database sharding or vertical partitioning.
Network I/O: Excessive data transfer overwhelms bandwidth. Solution: sharding to distribute traffic.
CPU Bottleneck
SQL complexity (joins, GROUP BY, ORDER BY, non‑indexed predicates) increases CPU load. Solution: SQL optimization, proper indexing, and moving business calculations to the service layer.
Large single‑table scans cause high CPU usage. Solution: horizontal partitioning (sharding) of the table.
Sharding Strategies
Horizontal Sharding (Database Sharding)
Split data across multiple databases based on a chosen key (hash, range, etc.). All databases share the same schema; data sets are disjoint and their union equals the full dataset.
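Typical scenario: Overall concurrency is high, and there are no clear business-module boundaries along which to shard vertically.
Effect: Each added database takes a share of the load, so I/O and CPU pressure per database fall roughly in proportion, provided the key distributes data evenly.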
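The hash and range rules mentioned above can be sketched as follows. This is a minimal illustration, not the routing logic of any particular tool: the function names, the use of CRC32 for string keys, and the range boundaries are all assumptions made for the example.

```python
import zlib


def mod_shard(key: int, shard_count: int) -> int:
    """Hash-style routing for integer keys: plain modulo."""
    return key % shard_count


def hash_shard(key: str, shard_count: int) -> int:
    """Routing for string keys via a stable hash.

    CRC32 is used only because it is deterministic across processes;
    Python's built-in hash() is randomized per process and would not be.
    """
    return zlib.crc32(key.encode("utf-8")) % shard_count


def range_shard(key: int, upper_bounds: list[int]) -> int:
    """Range routing: shard i holds keys strictly below upper_bounds[i]."""
    for index, bound in enumerate(upper_bounds):
        if key < bound:
            return index
    raise ValueError(f"key {key} is outside every configured range")
```

Hash routing spreads keys evenly but scatters range scans across databases; range routing keeps neighbouring keys together but can concentrate new data on the last shard.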
Horizontal Partitioning (Table Sharding)
Split a single logical table into multiple physical tables using a key. All tables share the same schema; data sets are disjoint and together form the complete dataset.
Typical scenario: A single table grows large, degrading SQL efficiency and increasing CPU load.
Effect: Smaller tables improve query execution speed and lower CPU usage.
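As a sketch of table-level routing, the helpers below map a key to a physical table name and, when database and table sharding are combined, to a (database, table) pair. The naming pattern `t_order_N` and the slot layout are hypothetical choices for illustration only.

```python
def physical_table(logical_name: str, key: int, table_count: int) -> str:
    """Map a logical table and key to one of table_count physical tables."""
    return f"{logical_name}_{key % table_count}"


def locate(key: int, db_count: int, tables_per_db: int) -> tuple[int, int]:
    """Combine database and table sharding.

    The key is first reduced to a global slot, then the slot is split into
    a database index and a table index inside that database.
    """
    slot = key % (db_count * tables_per_db)
    return slot // tables_per_db, slot % tables_per_db
```

With two databases of four tables each, `locate(13, 2, 4)` yields database 1 and table 1, because 13 falls into global slot 5.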
Vertical Sharding (Database Vertical Partitioning)
Distribute tables belonging to different business domains into separate databases. Schemas may differ; data sets are disjoint and together form the full dataset.
Typical scenario: High concurrency with distinct business modules that can be isolated.
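Effect: Each business domain gets its own database, which takes load off the shared database and lets each module evolve into an independent service.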
Vertical Partitioning (Column Partitioning)
Split a table's columns into a main table (hot columns) and an extension table (cold columns) based on activity. The tables share at least one common column (usually the primary key) for joining.
Typical scenario: Many columns with mixed hot/cold data cause large rows, reducing cache effectiveness and generating random read I/O.
Effect: Placing hot columns together improves cache hit rate; full data is retrieved by joining the tables in the service layer. Avoid database‑side JOINs to reduce CPU load and coupling.
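A minimal sketch of the service-layer join, using an in-memory SQLite database so it runs anywhere. The table names `user_main` and `user_ext` and their columns are invented for the example; the point is only that the cold columns are fetched by a second primary-key lookup when the caller asks for them, rather than by a database-side JOIN.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Main table holds hot columns; extension table holds cold columns."""
    conn.executescript(
        """
        CREATE TABLE user_main (id INTEGER PRIMARY KEY, name TEXT, status TEXT);
        CREATE TABLE user_ext  (id INTEGER PRIMARY KEY, bio TEXT, preferences TEXT);
        """
    )


def load_user(conn: sqlite3.Connection, user_id: int, include_cold: bool = False):
    """Read hot columns always; read cold columns only on request."""
    row = conn.execute(
        "SELECT id, name, status FROM user_main WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    user = {"id": row[0], "name": row[1], "status": row[2]}
    if include_cold:
        ext = conn.execute(
            "SELECT bio, preferences FROM user_ext WHERE id = ?", (user_id,)
        ).fetchone()
        if ext is not None:
            # Merge in the service layer instead of joining in SQL.
            user.update(bio=ext[0], preferences=ext[1])
    return user
```

Most requests call `load_user(conn, user_id)` and touch only the narrow hot table; the wider extension row is read only on pages that need it.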
Sharding Tools
Apache ShardingSphere (JAR, formerly Sharding-JDBC)
TDDL – Taobao Distributed Data Layer (JAR)
Mycat – middleware solution
Evaluate each tool’s advantages and disadvantages yourself; prioritize official documentation and community support.
Sharding Implementation Steps
Assess current and projected capacity.
Select a uniformly distributed key.
Define the sharding rule (hash, range, etc.).
Execute the migration (typically with dual-write).
Plan capacity expansion while minimizing data movement.
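Whether a candidate key is "uniformly distributed" can be checked before committing to it. The sketch below assumes integer keys and modulo routing; `shard_counts` and `skew` are hypothetical helper names, and the skew metric (fullest shard divided by the mean) is just one simple choice.

```python
from collections import Counter


def shard_counts(keys, shard_count: int) -> list[int]:
    """Count how many keys land on each shard under modulo routing."""
    counts = Counter(key % shard_count for key in keys)
    return [counts.get(index, 0) for index in range(shard_count)]


def skew(counts: list[int]) -> float:
    """Ratio of the fullest shard to the mean; 1.0 means perfectly even."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0
```

Running this over a sample of production keys for each candidate shard count gives an early warning: a skew well above 1.0 means one shard will fill up or overheat long before the others.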
Common Sharding Issues
Queries on Non‑Partition Keys
When a query uses only non‑partition keys, routing requires mapping strategies such as hash mapping, gene‑based mapping, or redundancy. These methods generate a deterministic target shard based on the non‑key value.
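For example, when orders are sharded by buyer_id, embedding a few bits derived from buyer_id (the "gene") in each order_id lets a query by either order_id or buyer_id route to the same buyer-oriented database, which can feel counter-intuitive at first.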
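Gene-based mapping can be sketched as follows, under two stated assumptions: the gene is the low `GENE_BITS` bits of buyer_id, and the shard count is a power of two no larger than `2 ** GENE_BITS`. The constant and function names are hypothetical.

```python
GENE_BITS = 4                      # assumption: supports up to 2**4 = 16 shards
GENE_MASK = (1 << GENE_BITS) - 1


def gene_of(buyer_id: int) -> int:
    """The 'gene' is the low bits of buyer_id, which decide its shard."""
    return buyer_id & GENE_MASK


def make_order_id(sequence: int, buyer_id: int) -> int:
    """Build an order_id whose low bits carry the buyer's gene."""
    return (sequence << GENE_BITS) | gene_of(buyer_id)


def shard_for(id_value: int, shard_count: int) -> int:
    """Route either a buyer_id or an order_id using only the gene bits.

    Correct only when shard_count divides 2**GENE_BITS.
    """
    return (id_value & GENE_MASK) % shard_count
```

Because an order_id built this way ends in the same bits as its buyer_id, `shard_for(order_id, n)` and `shard_for(buyer_id, n)` agree, so no lookup table is needed to find an order by its own id.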
Cross‑Database Pagination
Pagination on non‑partition keys across multiple shards cannot be performed efficiently with plain SQL. A typical solution is to offload the pagination to a NoSQL store such as Elasticsearch.
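To see why plain SQL pagination scales poorly across shards, consider the naive approach sketched below: every shard must return its first `offset + limit` rows so that the application can merge them and cut out the requested page. Shards are modelled as in-memory lists, and `fetch_top` is a hypothetical stand-in for an `ORDER BY ... LIMIT` query.

```python
import heapq
from itertools import islice


def fetch_top(shard_rows, count: int):
    """Stand-in for 'SELECT ... ORDER BY sort_key LIMIT count' on one shard."""
    return sorted(shard_rows)[:count]


def global_page(shards, offset: int, limit: int):
    """Naive cross-shard page: over-fetch from every shard, merge, then slice."""
    needed = offset + limit
    partials = [fetch_top(rows, needed) for rows in shards]
    merged = heapq.merge(*partials)          # k-way merge of sorted partials
    return list(islice(merged, offset, needed))
```

The result is correct, but the work grows with the page number: fetching page 1,000 of 20 rows from 8 shards pulls roughly 160,000 rows to return 20, which is the cost that motivates moving such queries to a dedicated search store.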
Expansion Challenges
Horizontal database expansion often uses the “upgrade slave” method; horizontal table expansion typically employs a dual‑write migration approach.
Expansion is usually performed in multiples to keep data movement manageable.
Expansion Techniques
Horizontal Database Expansion (Upgrade Slave)
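Each shard's replica already holds a full copy of that shard's data, so it can be promoted to the primary of a new shard when the routing rule changes (for example, from id % 2 to id % 4); rows that no longer belong to a shard are deleted afterwards. Dual-write remains the generic fallback for data migration.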
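One way to read the "upgrade slave" method, assuming modulo routing and a doubling of the shard count, is that every row on old shard i lands on either new shard i or new shard i + old_count. A promoted replica therefore already contains everything its new shard needs. The sketch below illustrates that property only; the function names are hypothetical and no replication mechanics are modelled.

```python
def shard_of(key: int, shard_count: int) -> int:
    """Modulo routing."""
    return key % shard_count


def split_on_doubling(rows, old_shard: int, old_count: int):
    """Split one old shard's keys between the two shards that replace it.

    'stays' remains on the original primary (new shard old_shard);
    'moves' belongs to the promoted replica (new shard old_shard + old_count).
    """
    new_count = old_count * 2
    stays = [k for k in rows if shard_of(k, new_count) == old_shard]
    moves = [k for k in rows if shard_of(k, new_count) == old_shard + old_count]
    return stays, moves
```

For old shard 1 of 2 holding keys 1, 3, 5, 7, 9, 11, doubling to 4 shards leaves 1, 5, 9 on shard 1 and assigns 3, 7, 11 to shard 3. Each side can then delete the half it no longer owns, which is why expanding in multiples keeps data movement small.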
Horizontal Table Expansion (Dual‑Write Migration)
Update the code to support dual-write, enable it in the application configuration, and deploy.
Copy historical data from the old database to the new one.
Validate that the new database contains all required data.
Remove dual‑write configuration and redeploy.
Dual‑write is a universal approach for seamless migration.
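The four steps above can be sketched with two dictionaries standing in for the old and new stores. This is a single-threaded illustration under simplifying assumptions: `DualWriteStore`, `backfill`, and `validate` are hypothetical names, and a real migration would also have to handle deletes, failures between the two writes, and concurrent traffic.

```python
class DualWriteStore:
    """Writes go to the old store, and also to the new one while enabled."""

    def __init__(self, old: dict, new: dict, dual_write: bool = True):
        self.old = old
        self.new = new
        self.dual_write = dual_write

    def write(self, key, value) -> None:
        self.old[key] = value
        if self.dual_write:
            self.new[key] = value


def backfill(old: dict, new: dict) -> None:
    """Copy historical rows without overwriting fresher dual-written rows."""
    for key, value in list(old.items()):
        new.setdefault(key, value)


def validate(old: dict, new: dict) -> bool:
    """Check that every row in the old store is present and equal in the new one."""
    return all(new.get(key) == value for key, value in old.items())
```

A run would enable dual-write, call `backfill`, confirm that `validate` returns `True`, and only then switch reads to the new store and turn dual-write off. Using `setdefault` in the backfill matters: a row updated after dual-write began is already current in the new store and must not be replaced by an older copy.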
Summary
Identify the true bottleneck before deciding between database‑level or table‑level sharding, and between horizontal or vertical strategies.
Choosing the partition key is critical; it must distribute data evenly and support queries that involve non‑partition keys.
Simpler sharding rules are preferable as long as they satisfy business requirements.
Example Repository
GitHub example:
https://github.com/littlecharacter4s/study-sharding
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
