Is Sharding Ready to Retire? Why the Classic Split‑Database Approach Is Becoming Legacy
This article reviews the rise and decline of traditional database sharding, explains its technical pitfalls such as wrong shard keys, cross-database joins, and distributed transactions, compares it with emerging NewSQL solutions like TiDB, OceanBase and PolarDB-X, and offers practical criteria for choosing the right architecture.
Preface
Two days before a release, a colleague reported a painfully slow query that scanned every database because it filtered on phone instead of the shard key user_id. The middleware broadcast the request to all 16 databases and merged the results, exposing a design flaw that had gone unnoticed through three months of work.
How Sharding Came About
Around 2010, MySQL tables of tens of millions of rows became a bottleneck for fast‑growing e‑commerce, social and payment services. The only solution was to split data across multiple tables and databases, giving birth to sharding.
At that time there were no mature distributed databases; MySQL partitioning was limited and Oracle was too expensive. Large companies like Taobao, Meituan and Weibo discussed shard‑key selection and data migration extensively.
Sharding-JDBC (later ShardingSphere) and MyCat, both open-source, became the dominant middleware in 2016-2017, letting a single logical schema be spread across 16 databases and their sharded tables, absorbing write volume a single MySQL instance could not handle and keeping services online for users.
Common Pain Points Experienced by Practitioners
Choosing the wrong shard key is painful to change. A shard key such as user_id works for per‑user queries, but a new requirement like “find all orders for a merchant” forces the middleware to scan every database, resulting in severe latency. Changing the shard key requires downtime, data migration that can take days for a billion‑row table, and re‑validation.
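Under the hood, the routing logic boils down to a hash on the shard key. The sketch below is a hypothetical illustration (the class and method names are invented; real middleware such as ShardingSphere derives this behavior from declarative rules), showing why a query on user_id hits one database while a query on phone must hit all 16:

    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    // Hypothetical shard router, not a real middleware API.
    public class ShardRouter {
        private static final int DB_COUNT = 16;

        // The query carries the shard key: route to exactly one database.
        static int routeByUserId(long userId) {
            return (int) (userId % DB_COUNT); // e.g. user 42 -> database 10
        }

        // The query filters on phone, which the hash knows nothing about:
        // every database must be scanned and the results merged.
        static List<Integer> routeByPhone(String phone) {
            return IntStream.range(0, DB_COUNT).boxed().collect(Collectors.toList());
        }
    }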
Cross‑database JOINs are almost unusable. When three tables (user, order, product) reside in different databases, the middleware cannot perform a true join. Developers must issue separate queries, batch‑fetch related rows, and assemble the result in application code, which is error‑prone and cumbersome.
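In practice the join moves into service code, roughly as in the hedged sketch below (the repository interface and record types are hypothetical):

    import java.util.*;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    // Hypothetical application-side join between the order and user databases.
    public class OrderAssembler {
        record Order(long orderId, long userId) {}
        record User(long userId, String name) {}
        record OrderView(long orderId, String userName) {}

        // Issues one batched SELECT ... WHERE user_id IN (...) against the user database.
        interface UserRepo { List<User> findByIds(Set<Long> ids); }

        static List<OrderView> assemble(List<Order> orders, UserRepo userRepo) {
            // 1. Collect foreign keys from the first result set.
            Set<Long> userIds = orders.stream().map(Order::userId).collect(Collectors.toSet());
            // 2. Batch-fetch the related rows from the other database.
            Map<Long, User> users = userRepo.findByIds(userIds).stream()
                    .collect(Collectors.toMap(User::userId, Function.identity()));
            // 3. Stitch rows together in memory -- the join the middleware cannot do.
            //    (Error handling for missing users is omitted for brevity.)
            return orders.stream()
                    .map(o -> new OrderView(o.orderId(), users.get(o.userId()).name()))
                    .collect(Collectors.toList());
        }
    }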
Distributed transactions are a major trap. An order creation needs to write to both order and inventory databases. Because the writes occur on separate MySQL instances, XA or Seata must be introduced. XA incurs high performance overhead; Seata is complex to configure and hard to debug, leading many teams to abandon strong consistency in favor of eventual consistency with compensation logic.
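For teams that do adopt Seata, the business code itself stays small; a minimal sketch, assuming Seata's AT mode with its Spring integration, with both DAO interfaces invented for illustration:

    import io.seata.spring.annotation.GlobalTransactional;
    import org.springframework.stereotype.Service;

    // Hypothetical DAOs backed by two different MySQL instances.
    interface OrderDao { void insert(long userId, long productId, int quantity); }
    interface InventoryDao { void deduct(long productId, int quantity); }

    @Service
    public class OrderService {
        private final OrderDao orderDao;         // order database
        private final InventoryDao inventoryDao; // inventory database

        public OrderService(OrderDao orderDao, InventoryDao inventoryDao) {
            this.orderDao = orderDao;
            this.inventoryDao = inventoryDao;
        }

        // Seata coordinates a global transaction across both data sources;
        // if either write fails, both are rolled back via undo logs.
        @GlobalTransactional(rollbackFor = Exception.class)
        public void createOrder(long userId, long productId, int quantity) {
            inventoryDao.deduct(productId, quantity);
            orderDao.insert(userId, productId, quantity);
        }
    }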
Scaling is not elastic. Expanding from 8 to 16 databases requires a multi‑step process: stop writes, re‑hash data, migrate, adjust shard rules, verify integrity, and resume writes. This often happens overnight with manual monitoring, contradicting the notion of “elastic” scaling.
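The snippet below shows why modulo-based expansion is so disruptive: doubling from 8 to 16 databases relocates exactly half of all rows, because every key whose new slot differs from its old one has to be physically migrated.

    // Counts keys that land on a different database when hash % 8 becomes hash % 16.
    public class RehashCost {
        public static void main(String[] args) {
            long moved = 0, total = 1_000_000;
            for (long key = 0; key < total; key++) {
                if (key % 8 != key % 16) moved++; // this row must be migrated
            }
            System.out.printf("moved %d of %d rows (%.0f%%)%n", moved, total, 100.0 * moved / total);
            // Prints: moved 500000 of 1000000 rows (50%)
        }
    }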
SQL limitations. Aggregations, ORDER BY, and GROUP BY all have pitfalls: ORDER BY results must be re-sorted at the middleware layer, risking out-of-memory errors on large result sets, and deep pagination such as LIMIT 100000, 10 forces each shard to return its first 100,010 rows before merging, consuming both memory and time.
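The arithmetic behind the deep-pagination problem is easy to demonstrate; the sketch below mimics the per-shard rewrite a typical middleware performs (table and column names are illustrative):

    // Illustrates why LIMIT 100000, 10 is expensive across 16 shards.
    public class DeepPagination {
        public static void main(String[] args) {
            int shards = 16, offset = 100_000, pageSize = 10;
            // No shard knows which of its rows fall in the global window, so the
            // middleware rewrites the query to fetch offset + pageSize rows per shard:
            String perShardSql = "SELECT * FROM t_order ORDER BY create_time LIMIT " + (offset + pageSize);
            long merged = (long) shards * (offset + pageSize);
            System.out.println(perShardSql);
            System.out.println("rows pulled into the middleware for one page of 10: " + merged); // 1,600,160
            // Keyset pagination sidesteps the offset by remembering the last-seen key:
            String keysetSql = "SELECT * FROM t_order WHERE create_time > ? ORDER BY create_time LIMIT 10";
            System.out.println(keysetSql);
        }
    }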
The World Has Changed
In 2012 Google published the Spanner paper, introducing a globally distributed relational database with separated storage and compute, automatic sharding, and globally consistent transactions while exposing standard SQL.
NewSQL databases—TiDB, OceanBase, CockroachDB, YugabyteDB, and Alibaba Cloud PolarDB‑X—follow this model, making sharding transparent to applications.
Current Main Options
TiDB (PingCAP)
Open-source, MySQL-compatible, with HTAP (Hybrid Transactional/Analytical Processing) capability. Over 4,000 enterprises use it (e.g., ZaloPay, Hangzhou Bank). Suitable for large data volumes, real-time analytics, and teams capable of self-operating. Drawbacks: high resource consumption; a minimal production cluster needs at least three TiKV and three PD nodes, increasing hardware cost.
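MySQL compatibility means the stock MySQL driver works unchanged; a minimal smoke test, assuming a local TiDB listening on its default SQL port 4000 (host and credentials are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Connects to TiDB with the ordinary MySQL JDBC driver.
    public class TiDBSmokeTest {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:mysql://127.0.0.1:4000/test"; // placeholder host and schema
            try (Connection conn = DriverManager.getConnection(url, "root", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT VERSION()")) {
                if (rs.next()) {
                    System.out.println(rs.getString(1)); // reports a MySQL-style version string
                }
            }
        }
    }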
OceanBase (Ant Group)
Born from Alibaba's internal transaction systems, it emphasizes stability and strong transactional guarantees and is used in high-availability scenarios by organizations such as China Southern Airlines and Xiamen Metro. The community edition is free; the commercial edition offers full support. Drawbacks: operational complexity and less mature international documentation.
PolarDB‑X (Alibaba Cloud)
Evolved from DRDS into a true distributed database. In February 2025 it set a TPC-C world record of 2.055 billion transactions per minute (tpmC). A published case study reports a 200% increase in processing capacity, a 90% reduction in slow SQL, and 40% cost savings after migration. Suitable for teams already on Alibaba Cloud or needing commercial support; however, it is tightly coupled to the cloud and harder to deploy on-premises.
Sharding Is Not Dead—But Its Role Has Shifted
Sharding remains reasonable in several scenarios:
Legacy systems that have run stably for years; migration risk outweighs benefits.
Small teams with modest data volumes where a full NewSQL stack would be overkill.
Projects that rely on MySQL‑native behavior and cannot tolerate unknown side effects of a new engine.
Ultra‑high‑write, latency‑sensitive workloads (e.g., logging, monitoring) where local MySQL tables outperform networked distributed databases.
Thus, sharding is moving from a primary solution to a backup option.
How to Choose: Decision Dimensions
Data volume. A table under one billion rows usually does not require full sharding; focus on indexing, query optimization, and slow-SQL fixes first. From roughly five hundred million rows onward, consider archiving or table-level splitting as intermediate steps.
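Before deciding, measure. Below is a minimal sketch that pulls table sizes from MySQL's information_schema so growth can be tracked against these thresholds (connection details and schema name are placeholders; note that table_rows is only an estimate for InnoDB):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Lists the ten largest tables in a schema by data + index size.
    public class TableSizeCheck {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:mysql://127.0.0.1:3306/information_schema"; // placeholder host
            String sql = "SELECT table_name, table_rows, "
                    + "ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb "
                    + "FROM tables WHERE table_schema = ? "
                    + "ORDER BY (data_length + index_length) DESC LIMIT 10";
            try (Connection conn = DriverManager.getConnection(url, "root", ""); // placeholder credentials
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, "mydb"); // placeholder schema name
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.printf("%-30s %15d rows %10.2f GB%n",
                                rs.getString("table_name"), rs.getLong("table_rows"), rs.getDouble("size_gb"));
                    }
                }
            }
        }
    }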
Signals that indicate sharding is needed:
Table size > 1 TB or daily growth > 10 GB with no ceiling.
Frequent DDL failures, backup timeouts, or replica lag due to table size.
Write throughput approaching a single instance’s limit, not alleviated by read replicas.
Highly variable query dimensions that cannot be solved by simple indexes.
Team capability. TiDB, OceanBase, and PolarDB‑X require engineers who can read slow‑query logs, understand region hotspots, and handle MVCC‑induced write amplification. Without such expertise or commercial support, sharding may be safer.
SQL compatibility requirements. New projects can follow distributed‑database best practices. Legacy migrations must rigorously test compatibility, especially for MySQL‑specific syntax, stored procedures, and triggers, using full‑SQL replay before cut‑over.
Budget. Self‑hosted clusters need multiple machines (e.g., three TiKV + three PD + TiDB servers). Cloud‑hosted versions are pay‑as‑you‑go but may become costly over time. Remember to account for hidden engineering effort required to maintain sharding middleware.
Conclusion
Sharding solved real problems a decade ago and helped many teams survive rapid growth. However, modern NewSQL databases now provide transparent scaling, global consistency, and standard SQL without manual sharding. For new projects, evaluate TiDB, OceanBase, or PolarDB‑X. For existing projects, weigh migration cost against the benefits; keep the sharding system stable until a clear need to upgrade arises. The right solution is the one that fits the team’s skill set, workload characteristics, and budget.