Choosing Between NewSQL Databases and Middleware‑Based Sharding: A Comparative Analysis
This article compares NewSQL distributed databases with middleware‑based sharding solutions, examining their architectures, distributed transaction handling, scalability, performance, high availability, and operational considerations. It also offers guidance on choosing between them based on workload, consistency requirements, and organizational constraints.
Author introduction: Wenbin Wen works in the Information Technology Department of China Minsheng Bank, focusing on distributed data technologies.
Many colleagues have recently asked how to choose between sharding (middleware plus traditional relational databases) and NewSQL distributed databases. This article compares the two approaches as objectively as possible by analyzing their key characteristics, advantages, disadvantages, and suitable scenarios.
1. What Makes NewSQL Databases Advanced?
The SIGMOD Record 2016 paper (pavlo‑newsql‑sigmodrec) groups NewSQL systems into three categories: new architectures built from scratch, transparent sharding middleware, and database‑as‑a‑service offerings. This article compares the first two: (1) new‑architecture databases such as Spanner, TiDB, and OceanBase, and (2) middleware‑based sharding solutions such as Sharding‑Sphere, Mycat, and DRDS. The second group is sometimes called “pseudo‑distributed” because SQL parsing and execution planning happen twice, once in the middleware and again in the underlying database, which is argued to be inefficient.
Key advantages of NewSQL over middleware‑based sharding:
Better utilization of memory‑centric storage and concurrency control.
Avoids redundant SQL parsing and execution‑plan generation.
Optimized distributed transaction protocols (e.g., per‑transaction timestamps) provide higher performance than traditional XA.
Multi‑replica storage based on Paxos or Raft offers true high‑availability (RTO < 30 s, RPO = 0).
Built‑in automatic sharding, data migration, and online scaling are transparent to applications.
However, each of these promised advantages needs to be examined against how the systems behave in practice.
2. Distributed Transactions
NewSQL databases do not bypass the CAP theorem. Google describes Spanner as technically a CP system that is “effectively CA”, because its private global network makes network partitions extremely rare. Most NewSQL systems still rely on two‑phase commit (2PC) or variants of it, such as the Percolator model (timestamp oracle + MVCC + Snapshot Isolation). These variants improve performance but cannot eliminate the inherent overhead of 2PC: extra network round trips, and locks held until the commit decision arrives.
Fully supporting distributed transactions is difficult: the implementation must remain correct under network failures and hardware faults, and it needs rigorous testing to demonstrate that. Many NewSQL products still have gaps, and strong‑consistency transactions often remain a performance bottleneck.
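The two round trips that make up 2PC can be sketched as follows. This is a minimal in‑memory illustration, not the protocol of any particular product: the `Participant` class and `two_phase_commit` function are hypothetical names, and real implementations add durable logging, timeouts, and coordinator recovery.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    """One shard taking part in a distributed transaction (hypothetical)."""
    name: str
    can_commit: bool = True  # simulates a local conflict or constraint failure
    state: str = "init"

    def prepare(self) -> bool:
        # Phase 1: a real shard would persist its pending writes before voting.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Return True if the transaction committed on every participant."""
    # Phase 1 (prepare): one round trip to every participant to collect votes.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (commit): a second round trip to apply the decision.
        for p in participants:
            p.commit()
        return True
    # Any "no" vote aborts the transaction everywhere.
    for p in participants:
        p.rollback()
    return False
```

Even in this toy version, every participant stays in the `prepared` state, holding its locks, until the coordinator's decision arrives. That waiting period is the overhead that timestamp‑based variants reduce but do not remove.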
For high‑throughput OLTP workloads, many practitioners prefer flexible (BASE) transaction models such as Saga, TCC, and reliable messaging over strict ACID, accepting eventual consistency in exchange for throughput.
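As an illustration of the flexible model, a Saga can be sketched as a list of local transactions, each paired with a compensating action. The code below is a simplified sketch under that assumption. `run_saga` and the account functions are hypothetical names, and a production Saga would also persist its progress so that it can resume after a crash.

```python
from typing import Callable, Dict, List, Tuple

# Each step is (action, compensation); both are local transactions on one shard.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run the steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Hypothetical transfer of 30 units between accounts on different shards.
balances: Dict[str, int] = {"alice": 100, "bob": 0}


def debit_alice() -> None:
    balances["alice"] -= 30


def refund_alice() -> None:
    balances["alice"] += 30


def credit_bob_unavailable() -> None:
    raise RuntimeError("shard holding bob's account is unreachable")


def no_op() -> None:
    pass
```

Calling `run_saga([(debit_alice, refund_alice), (credit_bob_unavailable, no_op)])` fails at the second step and refunds the first. Between the debit and the refund, other readers can observe the intermediate balance. That lack of isolation is the price paid for avoiding a global lock.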
3. High Availability and Multi‑Active Deployments
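Traditional master‑slave replication, especially when asynchronous, can lose recently committed data if the master fails before the slave has received it. NewSQL databases adopt Paxos/Raft multi‑replica designs, in which a write commits only after a majority of replicas acknowledge it, and which provide automatic leader election and fast failover. Some vendors claim geo‑distributed active‑active setups, but a commit that must wait for a replica in another datacenter pays the cross‑datacenter round trip, so these setups are practical only when inter‑datacenter latency is low.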
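The latency argument can be made concrete with a back‑of‑the‑envelope calculation. The sketch below assumes the leader counts as one vote and waits only for the fastest followers needed to reach a majority. It ignores disk flushes and processing time, and the function names and round‑trip figures are illustrative assumptions.

```python
from typing import List


def quorum_size(replicas: int) -> int:
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def commit_latency_ms(follower_rtts_ms: List[float]) -> float:
    """Approximate commit latency as seen by the leader.

    The leader's own vote is free, so it needs acknowledgements from
    quorum_size(n) - 1 followers and waits for the slowest of the fastest ones.
    """
    replicas = len(follower_rtts_ms) + 1
    needed = quorum_size(replicas) - 1
    if needed == 0:
        return 0.0
    return sorted(follower_rtts_ms)[needed - 1]
```

With three replicas and one follower in the same datacenter (`commit_latency_ms([1.0, 30.0])`), the commit waits only for the 1 ms neighbor. If both followers are remote (`commit_latency_ms([30.0, 35.0])`), every commit pays at least the 30 ms round trip.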
Hybrid approaches (e.g., Ant Financial’s dual‑write with distributed cache) are used to achieve limited geo‑replication while keeping latency acceptable.
4. Horizontal Scaling and Sharding Mechanisms
Paxos addresses consistency and availability but not scaling, so NewSQL databases also embed automatic sharding. TiDB, for example, divides data into Regions, splits a Region once it grows past a size threshold (~64 MiB), and rebalances Regions across nodes automatically. In contrast, middleware‑based sharding requires the application team to define shard keys and routing rules and to carry out scaling procedures manually, which increases complexity.
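The difference between the two sharding styles can be sketched in a few lines. The example below is a simplified illustration, not the routing algorithm of any specific middleware nor TiDB's actual split logic. `route`, `moved_fraction`, and `split_if_oversized` are hypothetical helpers, and the CRC32 hash is an arbitrary choice.

```python
import zlib
from typing import Dict, List


def route(shard_key: str, shard_count: int) -> int:
    """Middleware-style routing rule: stable hash of the shard key, modulo shard count."""
    return zlib.crc32(shard_key.encode("utf-8")) % shard_count


def moved_fraction(keys: List[str], old_count: int, new_count: int) -> float:
    """Share of keys whose target shard changes when the shard count changes."""
    moved = sum(1 for k in keys if route(k, old_count) != route(k, new_count))
    return moved / len(keys)


def split_if_oversized(region: Dict[str, int], max_bytes: int) -> List[Dict[str, int]]:
    """Range-based auto-sharding: split a key range at its middle key once it is too large.

    `region` maps each key to the size of its value in bytes.
    """
    if sum(region.values()) <= max_bytes or len(region) < 2:
        return [region]
    keys = sorted(region)
    mid = len(keys) // 2
    left = {k: region[k] for k in keys[:mid]}
    right = {k: region[k] for k in keys[mid:]}
    return [left, right]
```

With modulo routing and a uniform hash, growing from 4 to 5 shards relocates about 80% of keys in expectation, since a key stays put only when its hash modulo 20 falls in 0..3. This is why manual resharding tends to be a planned migration. A range split, by contrast, touches only the one oversized range.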
5. Distributed SQL Support
NewSQL systems typically support a broader range of distributed SQL, including cross‑shard joins and aggregations, thanks to global statistics and cost‑based optimizers (CBO). Middleware solutions often rely on rule‑based optimization (RBO) and may restrict cross‑shard queries or execute them inefficiently.
Both approaches usually speak the MySQL or PostgreSQL protocol, but NewSQL products may lack features such as stored procedures, views, or foreign keys.
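A cross‑shard aggregation shows why distributed SQL needs more than forwarding the statement to every shard. The sketch below uses made‑up order amounts on three hypothetical shards. Each shard returns a partial aggregate, and a coordinator merges them. Averaging the per‑shard averages would give the wrong answer when shards hold different row counts.

```python
from typing import Dict, List, Tuple

# Hypothetical per-shard data: order amounts stored on three shards.
SHARDS: Dict[str, List[float]] = {
    "shard-0": [10.0, 20.0],
    "shard-1": [30.0],
    "shard-2": [40.0, 50.0, 60.0],
}


def partial_avg(rows: List[float]) -> Tuple[float, int]:
    """Runs on each shard: return (sum, count) rather than the local average."""
    return sum(rows), len(rows)


def global_avg(shards: Dict[str, List[float]]) -> float:
    """Runs on the coordinator: merge the partial aggregates into AVG."""
    partials = [partial_avg(rows) for rows in shards.values()]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


def naive_avg_of_avgs(shards: Dict[str, List[float]]) -> float:
    """Incorrect shortcut, shown for contrast: averages the per-shard averages."""
    local = [sum(rows) / len(rows) for rows in shards.values()]
    return sum(local) / len(local)
```

Here `global_avg(SHARDS)` returns 35.0, the true average of the six rows, while the naive shortcut gives a different value. Whichever layer executes the query, optimizer or middleware, has to know how to rewrite each aggregate into mergeable parts.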
6. Storage Engine
Traditional relational databases use B‑Tree engines, which update pages in place and therefore issue random writes. NewSQL databases often employ LSM‑tree engines, which turn random writes into sequential writes. This improves write throughput at the cost of a more complex read path: a read may have to check several sorted files, so the engine relies on background compaction and Bloom filters to keep reads fast.
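The write and read paths of an LSM tree can be sketched with a toy in‑memory model. This is a deliberately simplified illustration with a hypothetical `TinyLSM` class. Real engines add a write‑ahead log, on‑disk sorted files, Bloom filters, and leveled compaction.

```python
from typing import Dict, List, Optional


class TinyLSM:
    """Toy LSM tree: one in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[str, str] = {}
        self.runs: List[Dict[str, str]] = []  # oldest run first, newest last
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        # Writes only touch memory until the memtable fills up.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Flush: stands in for one sequential write of a sorted, immutable file.
            self.runs.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        # Reads check the memtable, then every run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            if key in run:
                return run[key]
        return None

    def compact(self) -> None:
        """Merge all runs into one, keeping the newest value for each key."""
        merged: Dict[str, str] = {}
        for run in self.runs:  # oldest first, so newer values overwrite older ones
            merged.update(run)
        self.runs = [dict(sorted(merged.items()))] if merged else []
```

A lookup for a missing key has to visit every run before giving up, which is the read cost that compaction and Bloom filters are meant to contain.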
7. Maturity and Ecosystem
NewSQL is still evolving; its ecosystem, tooling, and community are younger than those of mature relational databases. Traditional RDBMS benefit from decades of stability, extensive tooling, and a larger talent pool.
8. Conclusion
When deciding between NewSQL and middleware‑based sharding, consider the following questions:
Do you need strong‑consistent distributed transactions?
Is data growth unpredictable?
Would frequent manual scaling exceed your operations team's capacity?
Do you prioritize throughput over latency?
Must the solution be completely transparent to applications?
Do you have DBAs experienced with NewSQL?
If several answers are “yes,” NewSQL may be worth the investment despite a steeper learning curve and potential risks. Otherwise, middleware‑based sharding remains a lower‑cost, lower‑risk option that leverages existing relational database ecosystems.
Ultimately, the choice depends on workload characteristics, organizational expertise, and risk tolerance.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
