What Drives TiDB’s Architecture? The Philosophy Behind a NewSQL Database

This article explores the philosophy behind TiDB’s evolution: its core beliefs, early user stories, and the three‑step "Make it work, make it right, make it fast" approach. It covers technical choices such as cloud‑first design, hardware independence, formal verification, massive testing, performance tuning, and cost‑effective scaling.

dbaplus Community

Foundational Beliefs

TiDB was designed from the start around three core principles: (1) the cloud is the future, so Kubernetes was adopted early; (2) the system must be hardware‑agnostic, able to run on any public cloud, private cloud, or on‑premises environment; (3) it must support multiple hardware architectures, including x86, ARM, and MIPS (e.g., Loongson), with GPU acceleration via TiFlash planned for the future.

Product Development Philosophy

The team follows the classic software mantra “Make it work, make it right, make it fast”.

Make it work

The first prototype was built on top of an existing distributed KV store (initially HBase) to demonstrate a working system quickly. A layered architecture—SQL layer above a key‑value layer—was chosen so that early feedback could be gathered from the SQL interface, which is closest to users. The first open‑source release was in September 2015, after which early adopters validated the design and guided further iteration.
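The layering described above can be illustrated with a minimal sketch of how a SQL layer might map table rows onto an ordered key-value store. This is a simplified illustration, not TiDB's actual encoding: the class `ToyKV`, the helper names, the JSON value format, and the exact key layout are all assumptions made for the example.

```python
import json
import struct


def encode_row_key(table_id: int, row_id: int) -> bytes:
    """Build a sortable key for one row (hypothetical layout)."""
    # Big-endian unsigned 64-bit integers compare bytewise in numeric order,
    # so all rows of one table form a contiguous key range.
    return b"t" + struct.pack(">Q", table_id) + b"_r" + struct.pack(">Q", row_id)


class ToyKV:
    """In-memory stand-in for the distributed key-value layer (hypothetical)."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def scan(self, start: bytes, end: bytes):
        """Return (key, value) pairs with start <= key < end, in key order."""
        return [(k, self._data[k]) for k in sorted(self._data) if start <= k < end]


def insert_row(kv: ToyKV, table_id: int, row_id: int, columns: dict) -> None:
    """SQL-layer INSERT: one row becomes one key-value pair."""
    kv.put(encode_row_key(table_id, row_id), json.dumps(columns).encode())


def select_all(kv: ToyKV, table_id: int) -> list:
    """SQL-layer full table scan: a range scan over the table's key range."""
    start = encode_row_key(table_id, 0)
    end = encode_row_key(table_id + 1, 0)
    return [json.loads(value) for _, value in kv.scan(start, end)]
```

Because the SQL layer only needs `put` and `scan` here, the store underneath can be swapped out without changing the SQL-facing code, which is the property that makes it practical to start on an existing KV store and replace it later.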

Make it right

Correctness is essential, especially for financial workloads. All core algorithms are formally verified with TLA+; the verification repository is https://github.com/pingcap/tla-plus. Testing runs at massive scale: the test suite contains tens of millions of cases, which MySQL compatibility makes it possible to extend with existing MySQL tests, and a fault‑injection framework runs 24/7 on clusters of hundreds of servers. Together, verification and testing underpin reliability despite the system’s complexity.

Make it fast

Performance improvements have been introduced as user workloads grew:

Execution‑plan tuning: index hints were added, followed by a migration from a rule‑based optimizer (RBO) to a cost‑based optimizer (CBO) for better plan selection.

Hotspot scheduler: a novel scheduler monitors hot regions and automatically migrates them to less loaded nodes, reducing contention.

Raft multithreading: in TiDB 3.0 the Raft store processing was off‑loaded to additional threads, eliminating the single‑thread bottleneck and lowering latency.

Bulk import and analytical queries: parallel scan parameters and TiDB Lightning accelerate data loading and large‑scale analytical workloads.

Mixed OLTP/OLAP support: TiSpark enables simultaneous transactional and analytical processing on the same data set.
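To make the first item in the list above concrete, here is a minimal sketch of cost-based plan selection with an optional hint override. It is not TiDB's optimizer: the statistics fields, the two candidate plans, and the per-row cost constants are hypothetical values chosen only to show the idea of comparing estimated costs instead of applying fixed rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableStats:
    row_count: int      # total rows in the table
    matching_rows: int  # estimated rows that satisfy the predicate


# Hypothetical relative costs per row; a real optimizer calibrates these.
SEQ_ROW_COST = 1.0    # reading one row during a full table scan
INDEX_ROW_COST = 4.0  # one index lookup plus fetching the matching row


def full_scan_cost(stats: TableStats) -> float:
    """A full scan touches every row regardless of selectivity."""
    return stats.row_count * SEQ_ROW_COST


def index_scan_cost(stats: TableStats) -> float:
    """An index scan touches only matching rows, at a higher per-row cost."""
    return stats.matching_rows * INDEX_ROW_COST


def choose_plan(stats: TableStats, hint: Optional[str] = None) -> str:
    """Pick the cheaper plan, unless the user forces one with a hint."""
    if hint in ("full_scan", "index_scan"):
        return hint
    costs = {
        "full_scan": full_scan_cost(stats),
        "index_scan": index_scan_cost(stats),
    }
    return min(costs, key=costs.get)
```

A rule-based optimizer would always prefer the index when one exists. The cost comparison instead falls back to a full scan when most rows match, since per-row index lookups then cost more than reading the table sequentially.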

Cost‑Efficiency via Hot‑Data Isolation

Customers reported that all‑SSD clusters were expensive. Because TiDB stores data in blocks (64 MiB each), it can detect which blocks are hot. Dedicated nodes are spun up to serve only those hot blocks and are released after peak periods, so the entire dataset does not have to be provisioned for peak load.
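The sizing logic behind this approach can be sketched as follows. This is an illustrative calculation only: the function names, the QPS threshold, and the per-node capacity figure are assumptions for the example and do not describe TiDB's actual scheduler.

```python
import math


def find_hot_blocks(qps_by_block: dict, hot_qps_threshold: float) -> list:
    """Return the ids of blocks whose observed QPS meets the threshold."""
    return sorted(
        block for block, qps in qps_by_block.items() if qps >= hot_qps_threshold
    )


def dedicated_nodes_needed(
    qps_by_block: dict, hot_blocks: list, node_qps_capacity: float
) -> int:
    """Size the temporary node pool for the hot blocks only."""
    hot_qps = sum(qps_by_block[block] for block in hot_blocks)
    # Round up: a partially used node still has to be provisioned.
    return math.ceil(hot_qps / node_qps_capacity)


# Example: two of four blocks carry nearly all of the traffic.
observed = {"b1": 50, "b2": 9000, "b3": 12000, "b4": 10}
hot = find_hot_blocks(observed, hot_qps_threshold=5000)
extra_nodes = dedicated_nodes_needed(observed, hot, node_qps_capacity=10000)
```

In this example only the two hot blocks drive the size of the temporary pool, while the cold blocks stay on the existing nodes. When the peak ends and no block exceeds the threshold, the computed pool size drops to zero and the dedicated nodes can be released.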

Horizontal vs. Vertical Development

Instead of focusing on a single industry, TiDB pursued a “horizontal first” strategy, targeting large‑scale internet companies with massive data, high QPS, and strict stability requirements. This broad validation ensured robustness before expanding to traditional industries. The roadmap prioritises stability, ease of use, performance, and finally new features.

Future Directions

Ongoing work includes further latency reductions, deeper integration with public and private clouds, self‑driving hot‑data scaling, and extending TiFlash to support GPU acceleration for analytical workloads.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
