
What Makes TiDB a NewSQL Powerhouse? A Deep Dive into Architecture, Features, and Use Cases

This article analyzes TiDB as a distributed NewSQL database, explaining the evolution from traditional SQL to NoSQL and NewSQL, detailing TiDB's core components, elastic scaling, ACID transactions, HTAP capabilities, high‑availability design, compatibility with MySQL, real‑world use cases, and its limitations compared to conventional databases.


What is NewSQL

Database development can be divided into three generations: (1) traditional SQL databases such as MySQL, (2) NoSQL systems like MongoDB and Redis, and (3) NewSQL, which aims to combine the scalability of NoSQL with the relational model and ACID guarantees of SQL.

Problems with Traditional SQL

With the rapid growth of Internet applications, user scale and data volume have exploded, demanding 24/7 availability. Conventional relational databases become bottlenecks. Two typical remedies are upgrading hardware—still limited by a performance ceiling—and sharding data across cheap commodity nodes, which introduces middleware complexity, especially for cross‑shard joins and transactions.

NoSQL Issues

NoSQL prioritizes high availability and horizontal scalability, often sacrificing strong consistency and full SQL compatibility. While it offers flexible data models, it lacks many relational features and requires custom APIs for each system.

NewSQL Characteristics

NewSQL provides NoSQL‑level scalability while retaining a relational model and mature SQL query language, preserving ACID transaction semantics. In the cloud era, NewSQL systems are born as distributed architectures rather than retrofitted monoliths.

Main Features of TiDB (NewSQL)

Full SQL support with MySQL 5.7 protocol compatibility.

ACID‑compliant transactions and configurable isolation levels.

Elastic horizontal scaling—adding or removing nodes transparently expands capacity.

Financial‑grade high availability via Raft‑based majority election.

Rich ecosystem of tools for data migration, synchronization, and backup.

TiDB Core Features

Horizontal Elastic Scaling

Adding nodes expands TiDB's throughput and storage online, handling high-concurrency and massive-data scenarios without service interruption.

The storage‑compute separation allows independent scaling of compute (TiDB Server) and storage (TiKV), with PD orchestrating the process transparently to operators.
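PD's role in elastic scaling can be pictured with a toy rebalancer: when an empty TiKV store joins the cluster, Regions migrate from the most-loaded stores until counts even out. This is an illustrative sketch only (store and Region names are made up), not PD's actual scheduler, which also weighs leader counts, disk usage, and placement labels.

```python
# Toy PD-style rebalancer: move Regions from the fullest store to the
# emptiest until Region counts differ by at most one.

def rebalance(stores: dict[str, list[int]]) -> dict[str, list[int]]:
    while True:
        most = max(stores, key=lambda s: len(stores[s]))
        least = min(stores, key=lambda s: len(stores[s]))
        if len(stores[most]) - len(stores[least]) <= 1:
            return stores
        # Migrate one Region (here just an ID) to the underloaded store.
        stores[least].append(stores[most].pop())

# A new, empty node "tikv-3" joins a cluster holding 8 Regions.
stores = {"tikv-1": [1, 2, 3, 4], "tikv-2": [5, 6, 7, 8], "tikv-3": []}
rebalance(stores)
print({s: len(r) for s, r in stores.items()})  # counts end up 3 / 3 / 2
```

The real scheduler moves replicas via Raft membership changes rather than popping list entries, but the balancing loop has the same shape.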

Distributed Transaction Support

TiDB fully supports standard ACID transactions. Internally it uses a two‑phase commit protocol across TiKV regions to guarantee atomicity and consistency.
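The commit flow can be sketched in miniature. TiDB's real protocol is Percolator-derived and far richer (timestamps from PD, primary-key commit points, lock resolution); the toy version below keeps only the two-phase shape — prewrite-lock every key, then commit all or roll all back — over hypothetical in-memory stores.

```python
# Minimal two-phase-commit sketch (illustrative, not TiDB's actual code).

class Store:
    """One participant: holds committed data plus in-flight locks."""
    def __init__(self):
        self.data, self.locks = {}, {}

    def prewrite(self, key, value, txn_id):
        if key in self.locks:          # another transaction holds the lock
            return False
        self.locks[key] = (txn_id, value)
        return True

    def commit(self, key, txn_id):
        owner, value = self.locks.pop(key)
        assert owner == txn_id
        self.data[key] = value

    def rollback(self, key, txn_id):
        if self.locks.get(key, (None,))[0] == txn_id:
            del self.locks[key]

def run_txn(writes, txn_id):
    # Phase 1: prewrite every key; any failure aborts the whole transaction.
    done = []
    for store, key, value in writes:
        if not store.prewrite(key, value, txn_id):
            for s, k, _ in done:
                s.rollback(k, txn_id)
            return False
        done.append((store, key, value))
    # Phase 2: all locks acquired, so every write can safely commit.
    for store, key, _ in writes:
        store.commit(key, txn_id)
    return True

a, b = Store(), Store()
ok = run_txn([(a, "alice", 50), (b, "bob", 150)], txn_id=1)
print(ok, a.data, b.data)  # True {'alice': 50} {'bob': 150}
```

Atomicity comes from the all-or-nothing prewrite phase: either every key is locked and phase 2 commits everywhere, or the transaction rolls back and no store sees a partial write.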

Financial‑Grade High Availability

Based on Raft, TiDB’s majority‑vote mechanism ensures strong data consistency and automatic failover without manual intervention.

Data is stored as multiple replicas of each Region; PD schedules replicas and handles leader election. If a TiKV node fails, its Raft groups elect new leaders, and after a configurable timeout (PD’s max-store-down-time, 30 minutes by default) PD re‑replicates the lost Regions onto healthy nodes.
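The majority rule behind this is simple arithmetic: a Raft group of n replicas needs a quorum of ⌊n/2⌋ + 1 votes to elect a leader or commit a log entry, so it tolerates n minus quorum failures. A quick check for the common 3- and 5-replica configurations:

```python
# Raft availability arithmetic: a group survives as long as a majority
# of its replicas are healthy.

def quorum(n: int) -> int:
    """Votes needed for leader election / log commit."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """Replicas that can fail while the group keeps serving."""
    return n - quorum(n)

for n in (3, 5):
    print(f"{n} replicas: quorum={quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
# 3 replicas: quorum=2, tolerates 1 failure(s)
# 5 replicas: quorum=3, tolerates 2 failure(s)
```

This is why TiKV defaults to three replicas per Region and why adding a fourth replica buys no extra fault tolerance — an even-sized group needs the same majority as the next odd size down plus one.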

Real‑Time HTAP

TiDB + TiSpark enables simultaneous OLTP and OLAP on the same dataset, eliminating the need for cumbersome ETL pipelines.

TiKV provides a row‑store engine, while TiFlash offers a column‑store engine that replicates data in real time via a Multi‑Raft Learner protocol, ensuring strong consistency between the two stores.
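The row-versus-column trade-off can be illustrated with plain Python lists (no TiKV or TiFlash involved): an aggregate over one column touches a single contiguous array in the columnar layout, but every full row in the row layout.

```python
# Illustrative sketch of why a columnar layout (as in TiFlash) suits
# analytics: summing one field scans only that field's array.

rows = [{"id": i, "name": f"u{i}", "amount": i * 10} for i in range(5)]

# Row store: every row must be visited to read a single field.
row_sum = sum(r["amount"] for r in rows)

# Column store: the same table kept as one array per column.
columns = {"id": [r["id"] for r in rows],
           "name": [r["name"] for r in rows],
           "amount": [r["amount"] for r in rows]}
col_sum = sum(columns["amount"])   # touches only the 'amount' array

print(row_sum, col_sum)  # 100 100
```

Both layouts hold identical data — which is exactly the point of the Raft-learner replication: OLTP point reads and writes stay on the row store while OLAP scans hit the column store, with the same answers from either side.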

Overall Architecture

TiDB consists of three core components plus optional extensions:

TiDB Server : Stateless SQL layer that parses queries, generates execution plans, and fetches data from TiKV via PD. It can be scaled horizontally behind a load balancer (e.g., LVS, HAProxy).

PD (Placement Driver) : Central metadata service that stores cluster information, schedules Regions, balances load, and allocates globally unique, monotonically increasing transaction IDs. PD runs a Raft cluster (odd number of nodes recommended).

TiKV Server : Distributed key‑value store organized into Regions (default split size 96 MB; a Region that grows past 144 MB splits automatically). Each Region is replicated (default three replicas) using Raft for consistency. PD manages Region splitting, merging, and migration.

TiSpark : Spark plugin that runs Spark SQL directly on TiKV, providing large‑scale analytical capabilities.

TiFlash : Columnar storage node that accelerates analytical queries while staying strongly consistent with TiKV.
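PD's globally unique, monotonically increasing transaction IDs can be sketched as a TSO (timestamp oracle) that packs a physical clock with a logical counter. This is a simplified model — the real PD preallocates timestamp windows and persists its high-water mark — but it shows how IDs stay strictly increasing even within a single millisecond.

```python
# Toy TSO: timestamp = (physical milliseconds << LOGICAL_BITS) | logical counter.
# All constants and names here are illustrative, not PD's actual layout.

LOGICAL_BITS = 18  # low bits reserved for the logical counter

class Tso:
    def __init__(self):
        self.last = 0

    def next(self, physical_ms: int) -> int:
        ts = physical_ms << LOGICAL_BITS
        if ts <= self.last:        # same millisecond (or clock stall):
            ts = self.last + 1     # bump the logical part instead
        self.last = ts
        return ts

tso = Tso()
a = tso.next(1000)
b = tso.next(1000)   # same millisecond -> logical counter increments
c = tso.next(1001)   # clock advanced -> physical part jumps ahead
assert a < b < c
```

Because one leader hands out all timestamps, every transaction in the cluster gets a total order — the foundation the distributed transaction layer builds on.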

TiDB Server

Receives SQL requests, performs parsing and planning, locates required TiKV nodes via PD, and returns results. Because it holds no state, it can be added or removed without affecting the cluster.

PD Server

Manages metadata, performs Region scheduling, and ensures high availability through Raft. Only the leader handles client requests; followers provide redundancy.

TiKV Server

Stores data as Regions; each Region is a Raft group with a leader and followers. PD balances Regions across nodes, and Region splitting occurs automatically when a Region exceeds the size limit (default 144 MB). Small adjacent Regions are merged when they shrink.
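Splitting can be sketched as follows. This is a toy model — real TiKV picks an approximate middle key from its storage-engine metadata, and PD coordinates the new Region's Raft group — but the shape is the same: a Region owns a left-closed, right-open key range and splits at a middle key once it exceeds the size limit.

```python
# Toy Region split: a Region covering [start, end) splits at a middle
# key once it grows past region-max-size (default 144 MB).

REGION_MAX_SIZE_MB = 144

def split_if_needed(region):
    """region: dict with 'start', 'end', 'size_mb', and sorted 'keys'."""
    if region["size_mb"] <= REGION_MAX_SIZE_MB:
        return [region]
    mid = region["keys"][len(region["keys"]) // 2]
    half = region["size_mb"] / 2  # crude even split for illustration
    left = {"start": region["start"], "end": mid, "size_mb": half,
            "keys": [k for k in region["keys"] if k < mid]}
    right = {"start": mid, "end": region["end"], "size_mb": half,
             "keys": [k for k in region["keys"] if k >= mid]}
    return [left, right]

r = {"start": "a", "end": "z", "size_mb": 200, "keys": ["b", "g", "m", "s"]}
parts = split_if_needed(r)
print([(p["start"], p["end"]) for p in parts])  # [('a', 'm'), ('m', 'z')]
```

Note that the two child ranges abut exactly (`left.end == right.start`), so every key in the original range still belongs to exactly one Region — the invariant PD relies on when routing requests.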

TiSpark & TiFlash

TiSpark runs Spark SQL on TiKV, while TiFlash stores data in columnar format for fast OLAP queries. Both components inherit TiKV’s strong consistency guarantees.

High‑Availability Design

TiDB servers are stateless; deploying at least two instances behind a load balancer ensures request continuity. PD should run an odd number of nodes (typically three) to avoid split‑brain scenarios; leader election takes about three seconds. TiKV’s Raft groups tolerate node failures, and PD migrates data from permanently failed nodes.

Typical Use Cases

MySQL sharding replacement : TiDB can act as a MySQL slave using PingCAP’s Syncer, allowing real‑time cross‑shard queries without complex middleware.

Direct MySQL replacement : For applications that started with a single MySQL instance, TiDB provides out‑of‑the‑box horizontal scaling and strong consistency, eliminating the need for manual sharding.

Data warehouse : TiDB 2.0 completes most TPC‑H benchmark queries in about 10 seconds, and TiSpark extends this to larger analytical workloads.

Component of other systems : TiKV can replace HBase as a distributed key‑value store, and its Raw API offers high‑performance single‑row operations (e.g., a Redis‑compatible layer built on TiKV).

Compatibility with MySQL

TiDB supports the MySQL 5.7 protocol and most syntax, allowing existing MySQL clients, tools (phpMyAdmin, Navicat, MySQL Workbench), and backup utilities (mysqldump, Mydumper) to work unchanged. However, several MySQL features are not supported or behave differently:

Stored procedures, functions, triggers, events, and custom functions.

Foreign key constraints and certain index types (full‑text, spatial).

System schemas (e.g., sys), the X Protocol, and some optimizer hints.

SQL statements such as SELECT ... INTO @var, GROUP BY ... WITH ROLLUP, and CREATE TABLE ... AS SELECT are unsupported.

Auto‑increment columns guarantee uniqueness but not continuity across multiple TiDB servers; mixing default and custom values may cause duplicate‑key errors.
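The batching behavior behind this caveat can be sketched: each TiDB server grabs a contiguous batch of IDs from a shared allocator (30,000 is the default cache size), so IDs handed out by different servers are unique but not consecutive. The classes below are illustrative only.

```python
# Sketch of TiDB's cached AUTO_INCREMENT allocation: servers fetch ID
# batches from a shared counter (conceptually persisted in TiKV).

BATCH = 30000  # default AUTO_INCREMENT cache size

class Allocator:
    def __init__(self):
        self.next_base = 1

    def grab_batch(self):
        base = self.next_base
        self.next_base += BATCH
        return iter(range(base, base + BATCH))

class TiDBServer:
    def __init__(self, allocator):
        self.allocator, self.batch = allocator, iter(())

    def next_id(self):
        try:
            return next(self.batch)
        except StopIteration:           # local batch exhausted:
            self.batch = self.allocator.grab_batch()
            return next(self.batch)

shared = Allocator()
s1, s2 = TiDBServer(shared), TiDBServer(shared)
print(s1.next_id(), s2.next_id(), s1.next_id())  # 1 30001 2
```

Inserts routed to `s1` get 1, 2, 3, … while inserts routed to `s2` get 30001, 30002, … — unique across the cluster, but with a visible gap, which is why applications must not assume continuity.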

Configuration Differences

Default charset: TiDB uses utf8mb4, while MySQL 5.7 defaults to latin1 and MySQL 8.0 to utf8mb4.

Default collation: TiDB uses utf8mb4_bin; MySQL 5.7 uses utf8mb4_general_ci and MySQL 8.0 uses utf8mb4_0900_ai_ci.

lower_case_table_names is fixed at 2 in TiDB (names are stored as given but compared case‑insensitively).

explicit_defaults_for_timestamp defaults to ON in TiDB (TIMESTAMP columns do not auto‑update), whereas MySQL 5.7 defaults to OFF and MySQL 8.0 to ON.

Foreign key checks are OFF by default in TiDB, while MySQL 5.7 enables them.

Limitations

TiDB does not support view‑based write operations (UPDATE/INSERT/DELETE on a view). Certain DDL statements such as CREATE TABLE ... ENGINE=... are parsed but ignored. Advanced MySQL features like XA transactions, CHECK TABLE, CHECKSUM TABLE, GET_LOCK/RELEASE_LOCK, and column‑level privileges are unavailable.

Overall, TiDB offers a compelling NewSQL solution that blends the scalability of NoSQL with the transactional guarantees of traditional relational databases, making it suitable for HTAP workloads, cloud‑native deployments, and seamless migration from existing MySQL environments.

Tags: high availability, distributed database, TiDB, HTAP, NewSQL, MySQL compatibility
Written by

Java Web Project

Focused on Java backend technologies, trending internet tech, and the latest industry developments. The platform serves over 200,000 Java developers, inviting you to learn and exchange ideas together. Check the menu for Java learning resources.
