
Why Do NewSQL Databases Outperform Middleware Sharding? A Deep Comparison

This article objectively compares NewSQL databases with middleware‑based sharding solutions. It examines architecture, distributed transactions, CAP constraints, performance, high availability, scaling, SQL support, storage engines, and maturity, and offers practical guidance for choosing the right approach.

Java Backend Technology

Apr 6, 2020

What Makes NewSQL Databases Advanced?

When discussing sharding versus distributed databases, many wonder whether middleware plus a traditional relational database counts as NewSQL. Academic papers distinguish two types: Spanner, TiDB, and OceanBase belong to the first, new‑architecture type, while Sharding‑Sphere, Mycat, DRDS, and similar products belong to the second, middleware type.

Middleware plus sharding does distribute storage and enable horizontal scaling, but SQL parsing and execution‑plan generation happen twice, once in the middleware and again in each database, which is why it is sometimes called a “pseudo‑distributed” system.

Architecture Comparison

The structural differences usually cited in favor of NewSQL are as follows:

Traditional databases are disk‑oriented; NewSQL leverages memory‑centric designs for higher efficiency.

Middleware repeats SQL parsing and optimization, reducing overall efficiency.

NewSQL’s distributed transactions are optimized beyond XA, offering higher performance.

NewSQL replicates data across multiple replicas using Paxos or Raft, achieving true high availability (recovery time objective, RTO, under 30 s; recovery point objective, RPO, of 0) compared with traditional master‑slave setups.

Built‑in sharding in NewSQL automates data migration and scaling, relieving DBA workload and remaining transparent to applications.

Distributed Transactions and CAP

The CAP theorem still applies: when a network partition occurs, a system must give up either strong consistency or availability. NewSQL does not break CAP. Systems such as Google Spanner are technically CP, but they come close to CA in practice because a private global network makes partitions rare.

In a distributed system you can know where work is done or when it is done, but not both at once. Two‑phase commit waits on every participant, which makes it inherently an anti‑availability protocol.

Completeness

Two‑phase commit (2PC) cannot guarantee strict ACID under all failure scenarios; recovery mechanisms can eventually restore consistency, but true atomicity may be temporarily compromised.
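The failure window described above can be illustrated with a minimal in-memory sketch. It is not how any particular database implements 2PC: the `Participant` class, the `crash_after` parameter, and the `recover` helper are hypothetical names introduced only for this example, and a real coordinator would persist its decision to a log before phase 2.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """A hypothetical resource manager holding one branch of a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = State.INIT

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = State.PREPARED if self.can_commit else State.ABORTED
        return self.can_commit

    def commit(self):
        self.state = State.COMMITTED

    def abort(self):
        self.state = State.ABORTED


def two_phase_commit(participants, crash_after=None):
    """Run 2PC; `crash_after` simulates the coordinator dying after
    delivering that many commit messages in phase 2."""
    votes = [p.prepare() for p in participants]
    if not all(votes):
        for p in participants:
            p.abort()
        return "aborted"
    for delivered, p in enumerate(participants):
        if crash_after is not None and delivered >= crash_after:
            return "coordinator crashed"
        p.commit()
    return "committed"


def recover(participants):
    """Recovery: the commit decision was already made, so finish it."""
    for p in participants:
        if p.state is State.PREPARED:
            p.commit()


nodes = [Participant("db1"), Participant("db2"), Participant("db3")]
print(two_phase_commit(nodes, crash_after=1))   # coordinator crashed
print([p.state.value for p in nodes])           # ['committed', 'prepared', 'prepared']
recover(nodes)
print([p.state.value for p in nodes])           # ['committed', 'committed', 'committed']
```

Between the crash and the recovery, one branch is visible as committed while the others are still only prepared and holding their locks. That is the temporary loss of atomicity, and the blocking, that the text refers to.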

Performance

Traditional relational databases support XA, but its high network overhead and blocking behavior make it unsuitable for high‑throughput OLTP. NewSQL systems often use optimized 2PC variants (e.g., Google Percolator) that combine a timestamp oracle, MVCC, and snapshot isolation. This reduces lock contention and improves throughput, though cross‑node commits still incur overhead.

Snapshot isolation (SI) is optimistic: in hotspot scenarios, where many transactions update the same rows, it may cause many aborts. Its guarantees also differ from those of repeatable read.
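To make the optimistic behavior concrete, here is a single-process sketch of MVCC with snapshot reads and first-committer-wins conflict detection. It is not Percolator and not any vendor's implementation: `Store`, `Txn`, and `ConflictError` are hypothetical names, and a local counter stands in for the timestamp oracle.

```python
import itertools


class ConflictError(Exception):
    """Raised when a concurrent transaction already committed the same key."""


class Store:
    """Single-process MVCC store; a stand-in for a distributed KV layer."""

    def __init__(self):
        self._clock = itertools.count(1)   # plays the role of a timestamp oracle
        self._versions = {}                # key -> list of (commit_ts, value)

    def next_ts(self):
        return next(self._clock)

    def begin(self):
        return Txn(self, self.next_ts())

    def read(self, key, ts):
        # Newest version committed at or before the snapshot timestamp.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None

    def last_commit_ts(self, key):
        versions = self._versions.get(key)
        return versions[-1][0] if versions else 0

    def install(self, key, commit_ts, value):
        self._versions.setdefault(key, []).append((commit_ts, value))


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}                   # buffered until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.start_ts)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if a written key changed after our snapshot.
        for key in self.writes:
            if self.store.last_commit_ts(key) > self.start_ts:
                raise ConflictError(key)
        commit_ts = self.store.next_ts()
        for key, value in self.writes.items():
            self.store.install(key, commit_ts, value)
        return commit_ts
```

If two transactions both read a hot row such as a stock counter and then write it, the first commit succeeds and the second raises `ConflictError` and must retry. Under heavy contention on one key, most attempts end this way, which is the abort storm the paragraph above warns about.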

HA and Multi‑Active Deployment

Paxos/Raft‑based multi‑replica designs provide strong HA, but real‑world multi‑active deployments are limited by network latency: a write commits only after a majority of replicas acknowledge it, and distant data centers cannot achieve the sub‑10 ms round‑trip times that synchronous commits require.
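The latency limit follows from simple arithmetic on majority acknowledgement. The sketch below assumes a leader-based protocol in which the leader counts as one vote and ignores local disk time; the function names and the round-trip figures in the example are illustrative assumptions, not measurements.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def commit_latency_ms(follower_rtts_ms):
    """Time until a leader's write is acknowledged by a majority.

    The leader counts as one vote, so it waits for the fastest
    (majority - 1) followers; the slowest of those sets the latency.
    """
    replicas = len(follower_rtts_ms) + 1
    acks_needed = majority(replicas) - 1
    if acks_needed == 0:
        return 0.0
    return float(sorted(follower_rtts_ms)[acks_needed - 1])


# Three replicas: one follower in the same city (1 ms), one far away (30 ms).
print(commit_latency_ms([1, 30]))          # 1.0
# Three replicas spread over three distant sites: every commit pays the WAN hop.
print(commit_latency_ms([30, 60]))         # 30.0
```

With a nearby follower, the remote replica stays off the commit path. Once every other replica is remote, each commit costs at least one wide-area round trip, which is why true multi-active deployment across distant sites is hard.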

Scale and Sharding Mechanism

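NewSQL inherently supports automatic sharding, hotspot detection, and region splitting (e.g., TiDB splits a region once it exceeds a configurable size threshold, cited here as 64 MiB; the default has varied across versions). Middleware‑based sharding requires upfront design of split keys and routing rules, and scaling is a manual operation, all of which adds significant complexity.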
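One reason manual scaling is painful is that a hash-modulo routing rule, a common choice in middleware sharding, remaps most keys when the shard count changes. The sketch below measures that effect. The `user:<n>` key format and the shard counts are made up for illustration, and `zlib.crc32` stands in for whatever hash a given middleware actually uses.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Route a key with a stable hash modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
print(f"4 -> 5 shards: {moved_fraction(keys, 4, 5):.0%} of keys move")
print(f"4 -> 8 shards: {moved_fraction(keys, 4, 8):.0%} of keys move")
```

With a uniform hash, going from 4 to 5 shards relocates roughly four keys out of five, while doubling to 8 relocates about half. This is why middleware deployments tend to scale by doubling and plan the migration in advance, whereas range-based region splitting moves only the data in the region being split.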

Distributed SQL Support

NewSQL offers full‑stack SQL support, including cross‑shard joins and aggregations, thanks to built‑in statistics and cost‑based optimization (CBO). Middleware relies on rule‑based optimization (RBO) and often cannot efficiently handle cross‑database joins.
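Cross-shard aggregation shows why the query layer needs to understand the operation rather than just forward SQL. The sketch below assumes rows are plain dictionaries with a hypothetical `amount` column and computes a global average by pushing a partial aggregate to each shard; the function names are illustrative only.

```python
def partial_avg(rows):
    """Runs on each shard: return (sum, count) rather than a local average."""
    values = [r["amount"] for r in rows]
    return sum(values), len(values)


def merge_avg(partials):
    """Runs on the coordinator: combine per-shard partials into a global AVG."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


shard_a = [{"amount": 10}, {"amount": 20}]
shard_b = [{"amount": 60}]

print(merge_avg([partial_avg(shard_a), partial_avg(shard_b)]))   # 30.0
# Averaging the per-shard averages would give (15 + 60) / 2 = 37.5, which is wrong.
```

Forwarding `AVG(amount)` unchanged to each shard and averaging the results is incorrect whenever shards hold different row counts, so the query has to be rewritten into sum and count. Joins across shards need the same kind of rewriting, plus decisions about which side to ship over the network.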

Storage Engine

Traditional engines use B+‑tree structures optimized for disk reads but suffer from random‑write penalties. NewSQL typically adopts LSM‑tree storage, converting random writes into sequential writes for higher write throughput, while employing Bloom filters and SSD caching to mitigate read performance loss.
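The write and read trade-off of an LSM tree can be seen in a toy version. This is a teaching sketch, not the design of RocksDB or any production engine: `TinyLSM` and `memtable_limit` are hypothetical names, sorted in-memory lists stand in for on-disk files, and the Bloom filters mentioned above are omitted.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []   # newest run last; each is a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # One sequential write of sorted data instead of in-place page updates.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Reads may have to probe several runs, newest first.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping only the newest value per key."""
        merged = {}
        for run in self.runs:        # oldest to newest, so later values win
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(k, v)
print(len(db.runs), db.get("a"), db.get("b"))   # 2 3 2
db.compact()
print(len(db.runs), db.get("a"))                # 1 3
```

Writes only touch the memtable and append whole sorted runs, which is the source of the write throughput. A lookup, by contrast, may probe every run until it finds the key, which is the read penalty that Bloom filters and compaction are meant to reduce.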

Maturity and Ecosystem

NewSQL is still evolving, with strong adoption in internet companies but limited penetration in risk‑averse industries like banking. Traditional relational databases boast decades of stability, extensive tooling, and broader talent pools.

Conclusion

When deciding between NewSQL and middleware plus sharding, ask the following questions. Is strong consistency across shards a hard requirement? Is data growth hard to predict? Will you need to scale frequently? Does throughput matter more than latency? Must sharding be transparent to the application? Do you have DBAs with NewSQL expertise? If several answers are yes, NewSQL may be worth the learning curve; otherwise, middleware sharding remains a lower‑risk, mature solution.


