When to Use Distributed vs. Centralized Databases: Analysis, Benchmarks, and Best Practices
This article examines the trade‑offs between centralized and distributed OLTP databases, presents industry usage statistics, performance benchmarks, practical questions for migration, and detailed guidance on sharding, SQL design, and operational considerations to help decide when a distributed solution is truly needed.
Choosing between a centralized and a distributed database is a common dilemma in OLTP system modernization, especially in the domestic (Chinese) database market. The article first outlines the current landscape: more than 200 domestic database vendors exist, with traditional centralized products such as Kingbase and Dameng and distributed options such as GaussDB, Kingwow, TDSQL, GoldenDB, and OceanBase, many of which support both deployment modes.
It highlights that the financial sector still heavily relies on centralized databases (about 89% overall, 80% in banking), while distributed databases account for a smaller but growing share. The discussion then questions whether every architecture truly needs distributed capabilities.
The article proposes a checklist before moving to a distributed setup: optimize the existing centralized database, consider vertical scaling, evaluate storage‑compute separation, assess application‑level sharding, and understand the operational overhead of distributed systems.
Experimental data is presented: in a benchmark on a 4‑shard distributed database, full‑table scans and joins show significant performance gains at around 5 million rows, while point queries see little difference. A separate sysbench test indicates that a centralized database reaches a practical ceiling of roughly 5,000 TPS at 75% resource utilization, which the article suggests as a threshold for considering sharding.
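The TPS threshold can be turned into a rough capacity check. The sketch below is only an illustration: the 5,000 TPS ceiling comes from the sysbench figure cited above, while the function names, the growth model (compound yearly growth), and the two-year planning horizon are assumptions, not part of the article.

```python
# Ceiling reported in the article's sysbench summary (at about 75% utilization).
PRACTICAL_TPS_CEILING = 5_000


def headroom(peak_tps, ceiling=PRACTICAL_TPS_CEILING):
    """Return the unused fraction of the ceiling (negative when over it)."""
    return 1.0 - peak_tps / ceiling


def should_evaluate_sharding(peak_tps, yearly_growth, horizon_years=2,
                             ceiling=PRACTICAL_TPS_CEILING):
    """Hypothetical heuristic: flag sharding for evaluation when the projected
    peak TPS reaches the centralized ceiling within the planning horizon."""
    projected = peak_tps * (1 + yearly_growth) ** horizon_years
    return projected >= ceiling


if __name__ == "__main__":
    print(headroom(2_500))
    print(should_evaluate_sharding(2_000, yearly_growth=0.2))
    print(should_evaluate_sharding(4_000, yearly_growth=0.2))
```

A workload well below the ceiling with modest growth would, under these assumptions, stay on the centralized database and work through the optimization checklist first.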
Practical advice follows on how to use distributed databases effectively: choose a highly selective shard key (often the primary key), prefer hash distribution, define global tables for frequently accessed small datasets, write SQL that includes the shard key to avoid cross‑node traffic, and limit distributed transactions to under 10% of total transactions.
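The shard-key advice can be illustrated with a minimal routing sketch. It assumes four shards (as in the benchmark setup), a CRC32-based hash, and a hypothetical shard key named `order_id`; real distributed databases use their own hash functions and routing layers, so this is a model of the idea rather than any product's behavior.

```python
import zlib

NUM_SHARDS = 4  # assumption: mirrors the 4-shard benchmark setup


def shard_for(key):
    """Map a shard-key value to a shard using a stable hash (CRC32, an assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_SHARDS


def target_shards(predicates, shard_key="order_id"):
    """Return the shards a query must visit.

    An equality predicate on the shard key pins the query to one shard;
    without it, the query fans out to every shard.
    """
    if shard_key in predicates:
        return [shard_for(predicates[shard_key])]
    return list(range(NUM_SHARDS))


def distributed_txn_share(txns):
    """Fraction of transactions touching more than one shard.

    `txns` is a list of sets, each holding the shard ids one transaction touched.
    """
    if not txns:
        return 0.0
    multi = sum(1 for shards in txns if len(shards) > 1)
    return multi / len(txns)


if __name__ == "__main__":
    print(target_shards({"order_id": 42}))      # single shard
    print(target_shards({"status": "paid"}))    # fans out to all shards
    print(distributed_txn_share([{0}, {1}, {0, 2}, {3}]))
```

Tracking something like `distributed_txn_share` against the article's 10% guideline is one way to notice when a shard-key choice is pushing too much work across nodes.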
A comparison table summarizes key differences between centralized and distributed databases in terms of data sharding, transaction support, scalability, stored‑procedure compatibility, and migration costs.
Finally, the article concludes that while distributed databases offer high availability and scalability, they introduce complexity and higher operational costs; therefore, teams should carefully evaluate business growth, performance requirements, and development capabilities before deciding.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade open‑source MySQL tools and services, releases a high‑quality open‑source component each year on October 24 (1024), and continues to operate and maintain them.