
Why NewSQL? Classification, Technical Challenges, and Discussion

This article reviews the motivations behind NewSQL, outlines its three main categories, examines technical challenges such as main‑memory storage, sharding, concurrency control, secondary indexes, replication, and crash recovery, and discusses its scalability and architectural trade‑offs.

Architecture Digest

Why NewSQL?

Database workloads grew dramatically with the rise of the Internet around 2000, creating scalability bottlenecks that traditional relational databases could not handle. Two primary scaling approaches emerged: vertical scaling (more powerful hardware) and horizontal scaling (sharding via middleware). Vertical scaling incurs high cost and downtime, while middleware‑based sharding struggles with distributed transactions and complex joins, prompting the need for a new solution.

NewSQL Definition and Benefits

NewSQL targets OLTP workloads, offering the scalability and performance of NoSQL while preserving ACID‑compliant transactions and the relational model. Its advantages include easy scalability and simplified application logic thanks to native transaction support.

NewSQL Classification

NewSQL systems fall into three categories:

New architectures built from the ground up.

Transparent sharding middleware.

Database‑as‑a‑Service (DBaaS) offerings.

New Architectures

Key characteristics of new‑architecture NewSQL include:

No shared storage.

Multi‑node concurrency control.

Replication for high availability and disaster recovery.

Flow control.

Distributed query processing.

Advantages are distributed‑aware query optimization, data‑local query routing, and flexible multi‑replica storage. The main drawback is a shortage of skilled operators. Representative products: Google Spanner, CockroachDB.

Transparent Sharding Middleware

Middleware responsibilities:

Routing queries to appropriate shards.

Coordinating distributed transactions.

Managing data placement, replication, and partitioning.

Pros: applications need little or no changes. Cons: underlying nodes still run traditional disk‑based DBMSs, leading to inefficient use of large‑memory, multi‑core servers and duplicated query planning/optimization.

Representative products: MariaDB MaxScale, ScaleArc.

Database‑as‑a‑Service (DBaaS)

Features:

On‑demand usage.

Cloud‑native storage enabling easy scalability.

Representative products: Amazon Aurora, ClearDB.

Technical Challenges of NewSQL

Main Memory Storage

Traditional databases use disk‑oriented designs that cache pages in memory. On modern large‑memory servers, much of an OLTP data set can often stay resident in memory, which calls for memory management techniques beyond simple page caching.

Partitioning/Sharding

Data is partitioned by hash or range on selected columns, requiring the DBMS to execute SQL across partitions, merge results, support online node addition/removal, and enable online migration or replication of partitions.
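The routing and merging steps described above can be sketched as follows. This is a minimal illustration, not any product's implementation: the function names, the `upper_bounds` layout, and the in-memory list-of-rows partitions are all assumptions made for the example.

```python
import bisect
import hashlib


def hash_partition(key: str, num_partitions: int) -> int:
    """Pick a partition from a stable hash of the key.

    hashlib is used because Python's built-in hash() for strings is
    salted per process and would route the same key differently on
    different nodes.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def range_partition(key: int, upper_bounds: list) -> int:
    """Pick a partition by range.

    upper_bounds[i] is the exclusive upper bound of partition i; keys at
    or above the last bound fall into one extra, final partition.
    """
    return bisect.bisect_right(upper_bounds, key)


def scatter_gather_sum(partitions: list, column: str) -> int:
    """Run SUM(column) on every partition, then merge the partial sums."""
    partials = [sum(row[column] for row in part) for part in partitions]  # scatter
    return sum(partials)  # gather
```

With `upper_bounds = [100, 200]`, keys below 100 go to partition 0, keys from 100 to 199 to partition 1, and everything else to partition 2. Online rebalancing and partition migration, which the text also lists, are out of scope for this sketch.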

Concurrency Control

Concurrency control provides the atomicity and isolation guarantees of ACID. Atomicity often uses variants of two‑phase commit (2PC), coordinated either by a central coordinator or in a decentralized fashion that depends on synchronised clocks. For example, Spanner relies on TrueTime, which is backed by GPS receivers and atomic clocks, while CockroachDB uses a hybrid clock combining loosely synchronised hardware clocks with logical counters.
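To make the hybrid clock idea concrete, here is a minimal sketch of a generic hybrid logical clock: a wall-clock component plus a logical counter that breaks ties and absorbs clock skew. It follows the general published algorithm and is not CockroachDB's actual code; the class and method names (`HybridLogicalClock`, `tick`, `observe`) are made up for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True, order=True)
class Timestamp:
    wall: int      # physical component (e.g. nanoseconds)
    logical: int   # counter that orders events sharing the same wall value


class HybridLogicalClock:
    def __init__(self, physical_clock: Callable[[], int] = time.time_ns):
        self._now = physical_clock
        self._wall = 0
        self._logical = 0

    def tick(self) -> Timestamp:
        """Timestamp a local event or an outgoing message."""
        pt = self._now()
        if pt > self._wall:
            self._wall, self._logical = pt, 0
        else:
            # Physical clock has not advanced (or went backwards):
            # keep the wall value and bump the counter instead.
            self._logical += 1
        return Timestamp(self._wall, self._logical)

    def observe(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp received from another node."""
        pt = self._now()
        prev_wall = self._wall
        new_wall = max(prev_wall, remote.wall, pt)
        if new_wall == prev_wall and new_wall == remote.wall:
            logical = max(self._logical, remote.logical) + 1
        elif new_wall == prev_wall:
            logical = self._logical + 1
        elif new_wall == remote.wall:
            logical = remote.logical + 1
        else:
            logical = 0  # local physical clock is strictly ahead
        self._wall, self._logical = new_wall, logical
        return Timestamp(new_wall, logical)
```

The property that matters is that a timestamp returned by `observe` is always greater than the remote timestamp it merged, even when the receiving node's hardware clock lags behind the sender's.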

Isolation Techniques

2PL (Two‑Phase Locking)

MVCC (Multiversion Concurrency Control)

OCC (Optimistic Concurrency Control)

Most NewSQL systems adopt MVCC (e.g., CockroachDB) or combine 2PL with MVCC (e.g., Spanner; MySQL's InnoDB storage engine takes the same approach).
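The core of MVCC is that writers append new versions instead of overwriting, so a reader sees the newest version committed at or before its snapshot. The sketch below illustrates only that read rule under strong simplifying assumptions: a single process, writes that commit immediately, and no write-write conflict detection or garbage collection. `MVCCStore` and its methods are hypothetical names.

```python
from collections import defaultdict
from itertools import count


class MVCCStore:
    def __init__(self):
        # key -> list of (commit_ts, value), in ascending commit order
        self._versions = defaultdict(list)
        self._clock = count(1)  # stand-in for a real timestamp source

    def begin(self) -> int:
        """Start a read-only transaction and return its snapshot timestamp."""
        return next(self._clock)

    def write(self, key, value) -> int:
        """Commit a new version of key and return its commit timestamp."""
        ts = next(self._clock)
        self._versions[key].append((ts, value))
        return ts

    def read(self, key, snapshot_ts: int):
        """Return the newest version visible at snapshot_ts, or None."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None
```

A reader that took its snapshot before a later write keeps seeing the older value, without blocking the writer. Systems that combine 2PL with MVCC typically add locks for read-write transactions on top of this kind of versioned storage.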

Secondary Indexes

Two implementation styles exist:

Local indexes: each partition holds a fragment of the index; updates affect a single node, but reads may span many nodes.

Global indexes: every node stores a complete copy of the index; updates require a distributed transaction across all copies, but a read can be served by a single node.
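The trade-off between the two styles can be sketched with a toy cluster that counts how many nodes each operation touches. Everything here is an assumption for illustration: the `Cluster` class, the modulo placement of rows, and the `city` column being indexed. Real systems would wrap the multi-node index update in a distributed transaction, which this sketch omits.

```python
from collections import defaultdict


class Cluster:
    """Toy cluster: rows are partitioned by primary key and indexed by city."""

    def __init__(self, num_nodes: int):
        self.rows = [dict() for _ in range(num_nodes)]
        # Fragment style: each node indexes only its own rows.
        self.local_idx = [defaultdict(set) for _ in range(num_nodes)]
        # Full-copy style: each node holds the whole index.
        self.full_idx = [defaultdict(set) for _ in range(num_nodes)]

    def insert(self, pk: int, city: str) -> tuple:
        """Insert a row; return (nodes written for local, nodes written for full)."""
        home = pk % len(self.rows)
        self.rows[home][pk] = city
        self.local_idx[home][city].add(pk)   # one node touched
        for idx in self.full_idx:            # every node touched
            idx[city].add(pk)
        return 1, len(self.full_idx)

    def lookup_local(self, city: str) -> tuple:
        """Fan out to every node's fragment; return (pks, nodes read)."""
        result = set()
        for idx in self.local_idx:
            result |= idx.get(city, set())
        return result, len(self.local_idx)

    def lookup_full(self, city: str, node: int = 0) -> tuple:
        """Answer from a single node's full copy; return (pks, nodes read)."""
        return set(self.full_idx[node].get(city, set())), 1
```

On a three-node cluster, an insert writes one index fragment versus three full copies, while a lookup reads three fragments versus one copy, which mirrors the update and read costs described above.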

Replication

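The two key concerns are how replicas are kept consistent (for example with Paxos, or with 2PC across partitions) and the replication mode: either every replica executes the same commands synchronously, or one replica executes them and ships the resulting state to the others asynchronously.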

Crash Recovery

Minimising downtime typically involves failing over from the primary to a secondary, and using checkpoints so that a recovered or newly added node can be brought up to date quickly.
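Why a checkpoint speeds things up can be shown with a small sketch: a node that starts from a checkpoint only has to replay the log entries written after it, rather than the whole history. The `Node` class, the log sequence numbers (`lsn`), and the in-memory log are assumptions for the example, not a description of any particular system.

```python
class Node:
    def __init__(self):
        self.state = {}
        self.log = []                # list of (lsn, key, value)
        self.checkpoint = ({}, 0)    # (state snapshot, last lsn it covers)

    def apply(self, key, value) -> None:
        """Log the write first, then apply it to the in-memory state."""
        lsn = len(self.log) + 1
        self.log.append((lsn, key, value))
        self.state[key] = value

    def take_checkpoint(self) -> None:
        """Snapshot the current state together with the lsn it reflects."""
        self.checkpoint = (dict(self.state), len(self.log))


def recover(checkpoint, log):
    """Rebuild state from a checkpoint plus the log entries after it.

    Returns (state, number of log entries replayed).
    """
    snapshot, covered_lsn = checkpoint
    state = dict(snapshot)
    replayed = 0
    for lsn, key, value in log:
        if lsn > covered_lsn:
            state[key] = value
            replayed += 1
    return state, replayed
```

Recovering from an empty checkpoint yields the same state but replays every entry; the same routine can seed a newly added replica from a peer's checkpoint and log suffix.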

Discussion

Scalability is a core NewSQL attribute, yet middleware‑based solutions face inherent scalability limits due to routing metadata. Additional challenges include multi‑tenant isolation and load balancing. NewSQL also brings few groundbreaking theoretical innovations; it mainly represents an architectural integration of existing techniques to meet the demands of modern distributed systems.
