
TiDB Overview: Architecture, Core Features, and Performance Compared with MySQL

The article explains why traditional MySQL sharding is discouraged, introduces the distributed database TiDB, details its architecture, core capabilities such as HTAP and Raft‑based consistency, compares feature support and resource usage with MySQL, and presents benchmark results demonstrating TiDB’s advantages at large scale.

IT Services Circle

Why Sharding Is Not Recommended

When MySQL reaches a certain scale, performance degrades, and many teams turn to sharding middleware such as Mycat, ShardingSphere, or TDDL. These approaches, however, bring problems of their own: cross-shard pagination and sorting, distributed-transaction complexity, difficult data migration and capacity expansion, changed development patterns, awkward cross-database queries, and business-level trade-offs.
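The pagination problem in particular is worth making concrete. A hypothetical sketch (the `global_page` helper is illustrative, not part of any middleware): to serve a global `LIMIT offset, size` over sharded data, every shard must ship its first `offset + size` rows to a coordinator, which merge-sorts and re-slices them, so over-fetch grows with page depth.

```python
import heapq

def global_page(shards, offset, size, key=lambda r: r):
    """Naive cross-shard pagination: each shard must return its first
    offset+size rows; the coordinator merge-sorts them and re-slices.
    Over-fetch therefore grows linearly with page depth."""
    per_shard = offset + size                      # rows each shard must ship
    candidates = [sorted(s, key=key)[:per_shard] for s in shards]
    merged = list(heapq.merge(*candidates, key=key))
    return merged[offset:offset + size], per_shard * len(shards)

# Three shards of ids; fetch rows 30..39 of the global sort order.
shards = [list(range(0, 300, 3)), list(range(1, 300, 3)), list(range(2, 300, 3))]
page, rows_shipped = global_page(shards, offset=30, size=10)
# 10 rows are returned, but 3 * 40 = 120 rows crossed the network.
```

Deep pages make this progressively worse, which is one reason middleware-based sharding is discouraged above.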

TiDB Introduction

TiDB is an open-source distributed relational database that PingCAP has been developing since September 2015. It handles both OLTP and OLAP workloads and offers horizontal scalability, high availability, real-time HTAP, cloud-native deployment, and compatibility with the MySQL 5.7 protocol, making it a strong fit for scenarios that demand high availability, strong consistency, and large data volumes.

Core Features

Financial‑grade high availability

Online horizontal scaling with compute‑storage separation

Cloud‑native, deployable on public, private, or hybrid clouds

Real‑time HTAP with TiKV row store and TiFlash column store

MySQL protocol and ecosystem compatibility

Strongly consistent distributed transactions

Seamless migration from MySQL with minimal code changes

In CAP-theorem terms, the PD cluster is a CP system: it prioritizes consistency and partition tolerance

Use Cases

Financial systems requiring strong consistency, high reliability, and disaster recovery

High‑concurrency OLTP scenarios with massive data and scalability needs

Data aggregation and secondary processing pipelines

Architecture

TiDB Server : Stateless SQL layer exposing MySQL endpoints, parses queries, contacts PD for region location, interacts with TiKV, and returns results; horizontally scalable behind load balancers.

PD Server : Manages cluster metadata, performs scheduling, leader election, and allocates globally unique, monotonically increasing transaction IDs; typically deployed as an odd‑numbered quorum (minimum three nodes).
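PD's timestamp allocation can be sketched in a few lines. This is a toy illustration assuming TiDB's published TSO layout (physical milliseconds in the high bits, an 18-bit logical counter in the low bits); batching, Raft-backed persistence, and overflow handling are omitted, and the `TSO` class name is ours:

```python
import threading, time

class TSO:
    """Minimal sketch of a PD-style timestamp oracle: 64-bit timestamps
    composed of physical milliseconds (high bits) and a logical counter
    (low 18 bits), handed out strictly monotonically under a lock."""
    LOGICAL_BITS = 18

    def __init__(self):
        self._lock = threading.Lock()
        self._physical = 0
        self._logical = 0

    def next_ts(self):
        with self._lock:
            now = int(time.time() * 1000)
            if now > self._physical:
                self._physical, self._logical = now, 0
            else:
                self._logical += 1  # same millisecond: bump the logical part
            return (self._physical << self.LOGICAL_BITS) | self._logical

tso = TSO()
a, b, c = tso.next_ts(), tso.next_ts(), tso.next_ts()
assert a < b < c  # strictly increasing even within one millisecond
```

Because every transaction timestamp comes from this single logical service, TiDB gets a global ordering of transactions without coordination between TiKV nodes.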

TiKV Server : Distributed transactional key‑value store handling OLTP data; data is partitioned into Regions, each covering a key range, and replicated (default three replicas) using the Raft consensus algorithm.

How TiKV Guarantees No Data Loss

Data is replicated across multiple nodes; writes are committed via the Raft protocol, requiring a majority of replicas (usually two of three) to acknowledge before confirming success.
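The majority rule is simple arithmetic. A small sketch (helper names are ours) of why the default three replicas tolerate the loss of any single node:

```python
def quorum(replicas: int) -> int:
    """A Raft write commits once a majority of replicas persist it."""
    return replicas // 2 + 1

def write_committed(acks: int, replicas: int = 3) -> bool:
    """With TiKV's default of three replicas, two acknowledgements
    (leader plus one follower) confirm a write, so losing any single
    node cannot lose committed data."""
    return acks >= quorum(replicas)

assert quorum(3) == 2 and quorum(5) == 3
assert write_committed(2) and not write_committed(1)
```

This is also why PD clusters are deployed in odd numbers: five nodes tolerate two failures, but four nodes still only tolerate one.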

Distributed Transaction Support

TiKV supports multi‑key transactions across regions, following Google Percolator’s model; see the original Percolator paper for details.
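The Percolator model can be illustrated with a toy in-memory store. This is a deliberately simplified sketch (class and method names are ours; timestamps would come from PD's TSO, and lock-resolution on failure is omitted): prewrite locks every key, with each lock naming a primary key, and commit clears the primary lock first, which is the transaction's atomic commit point.

```python
class PercolatorKV:
    """Toy sketch of Percolator-style two-phase commit as used by TiKV."""

    def __init__(self):
        self.data = {}    # key -> {commit_ts: value}
        self.locks = {}   # key -> (primary, start_ts, value)

    def prewrite(self, writes, primary, start_ts):
        for key in writes:
            if key in self.locks:                        # key already locked
                return False
            if any(ts > start_ts for ts in self.data.get(key, {})):
                return False                             # newer committed write
        for key, value in writes.items():
            self.locks[key] = (primary, start_ts, value)
        return True

    def commit(self, keys, primary, start_ts, commit_ts):
        # Clearing the primary lock is the atomic commit point; secondaries
        # can be rolled forward from it after a crash.
        for key in [primary] + [k for k in keys if k != primary]:
            _, _, value = self.locks.pop(key)
            self.data.setdefault(key, {})[commit_ts] = value

kv = PercolatorKV()
ok = kv.prewrite({"a": 1, "b": 2}, primary="a", start_ts=10)
kv.commit(["a", "b"], primary="a", start_ts=10, commit_ts=11)
```

The key property is that readers who find a dangling secondary lock can consult the primary to decide whether the transaction committed, which is what makes multi-region transactions safe without a central lock manager.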

Comparison with MySQL

Supported Features

Distributed transactions based on Google Percolator (which itself builds on Bigtable)

TiDB uses optimistic locking plus MVCC, whereas MySQL (InnoDB) uses pessimistic locking plus MVCC; under the optimistic model, TiDB detects write-write conflicts only at commit time.
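The difference is easiest to see in miniature. A hedged sketch (the `OptimisticStore` class is illustrative, not TiDB's implementation) of commit-time conflict detection: transactions record the versions of keys they intend to write, and the first committer wins while later committers must retry, instead of blocking each other up front as pessimistic locks would.

```python
class OptimisticStore:
    """Sketch of commit-time write-write conflict detection."""

    def __init__(self):
        self.versions = {}  # key -> commit version
        self.values = {}

    def begin(self, keys):
        # Snapshot the versions of the keys this transaction will write.
        return {k: self.versions.get(k, 0) for k in keys}

    def commit(self, snapshot, writes):
        # The conflict check happens only now, at commit time.
        if any(self.versions.get(k, 0) != v for k, v in snapshot.items()):
            return False  # another transaction committed first: retry
        for k, val in writes.items():
            self.values[k] = val
            self.versions[k] = self.versions.get(k, 0) + 1
        return True

store = OptimisticStore()
t1 = store.begin(["balance"])
t2 = store.begin(["balance"])
assert store.commit(t1, {"balance": 100})      # first writer wins
assert not store.commit(t2, {"balance": 200})  # second must retry
```

Optimistic concurrency pays off when conflicts are rare; highly contended rows see more retries than they would under MySQL's pessimistic locks.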

Unsupported Features

No stored procedures, functions, or triggers

Auto-increment IDs are unique across the cluster but continuous and monotonic only within a single TiDB server, since each server caches its own ID range

No foreign key constraints, temporary tables, or MySQL optimizer trace

XA syntax is not exposed via SQL (TiDB uses two‑phase commit internally)

Resource Usage Comparison

TiDB compresses data heavily: 10.8 TB in MySQL becomes 3.2 TB in TiDB (3.4:1 space ratio). For comparable workloads, TiDB uses far fewer nodes, CPU cores, and storage than MySQL.
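The quoted ratio checks out with the figures given:

```python
mysql_tb, tidb_tb = 10.8, 3.2       # storage footprints from the comparison
ratio = mysql_tb / tidb_tb          # 10.8 / 3.2 = 3.375
assert round(ratio, 1) == 3.4       # matches the quoted ~3.4:1 space ratio
```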

Performance Tests

Test Report 1

Benchmarks on various AWS instance types (t2.medium to m5.24xlarge) with a 70 GB MySQL dataset vs. a 30 GB TiDB dataset (compressed). Sample queries include simple counts, group‑by, filtered scans, and complex aggregations. Results show TiDB’s response times improve relative to MySQL as CPU resources increase.

SELECT COUNT(*) FROM ontime;
SELECT COUNT(*), Year FROM ontime GROUP BY Year ORDER BY Year;
SELECT * FROM ontime WHERE UniqueCarrier = 'DL' AND TailNum = 'N317NB' AND FlightNum = '2' AND Origin = 'JFK' AND Dest = 'FLL' LIMIT 10;
SELECT SQL_CALC_FOUND_ROWS FlightDate, UniqueCarrier AS carrier, FlightNum, Origin, Dest FROM ontime WHERE DestState NOT IN ('AK','HI','PR','VI') AND OriginState NOT IN ('AK','HI','PR','VI') AND FlightDate > '2015-01-01' AND ArrDelay < 15 AND Cancelled = 0 AND Diverted = 0 AND DivAirportLandings = '0' ORDER BY DepDelay DESC LIMIT 10;

System Benchmark

Using Sysbench on an m4.16xlarge instance, TiDB achieved higher transactions per second than MySQL for point‑select workloads with thread counts up to 128.

Test Report 2

Another benchmark with 1 M and 13 M records, followed by a JMeter load of 100 k operations, shows MySQL outperforms TiDB on small datasets due to TiDB’s distributed overhead, but TiDB scales better as data volume grows.

Conclusion

TiDB provides a mature, cloud‑native, HTAP‑capable distributed database that eliminates the need for traditional sharding; it is advantageous for large‑scale workloads, while smaller datasets may not justify its deployment.

Tags: Distributed Database, MySQL Comparison, TiDB, HTAP, Raft