
TiDB Architecture Explained: TiKV, PD, and Raft in Distributed Databases

TiDB is a distributed, MySQL-compatible database built from three core components: TiDB Server for stateless SQL processing, PD for global scheduling and metadata management, and TiKV for distributed key-value storage. TiKV replicates data with the Raft consensus algorithm to provide strong consistency and fault tolerance.

Programmer DD

Preface

After studying TiDB, the author summarizes its architecture.

Overall Framework

TiDB consists of three core components: TiDB Server, PD Server, and TiKV Server. TiSpark is an optional addition for complex OLAP workloads. Even a single-node deployment requires all three core components; production clusters are deployed with Ansible (newer TiDB releases recommend TiUP instead).

A complete TiDB cluster diagram is shown below:

TiDB cluster architecture

TiKV Server

TiKV Server stores data and must provide:

Cross‑data‑center disaster recovery

High write speed

Fast reads

Support for data modification and concurrent updates

Atomicity for multi‑record modifications

TiKV uses a key‑value model with ordered traversal; keys are stored in binary order, enabling range scans via successive Next calls.
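The ordered-traversal idea can be illustrated with a small in-memory sketch. This is not TiKV's API: the class `OrderedKV` and its `seek`, `next`, and `scan` methods are hypothetical names chosen for illustration, and a sorted Python list stands in for the on-disk ordered structure.

```python
import bisect


class OrderedKV:
    """Toy ordered map over byte-string keys (hypothetical, not TiKV's API)."""

    def __init__(self):
        self._keys = []    # keys kept sorted in bytewise order
        self._values = {}  # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def seek(self, start: bytes) -> int:
        """Return a cursor positioned at the first key >= start."""
        return bisect.bisect_left(self._keys, start)

    def next(self, cursor: int):
        """Return ((key, value), new_cursor), or (None, cursor) at the end."""
        if cursor >= len(self._keys):
            return None, cursor
        key = self._keys[cursor]
        return (key, self._values[key]), cursor + 1

    def scan(self, start: bytes, end: bytes):
        """Collect pairs with start <= key < end via seek plus repeated next."""
        out = []
        cursor = self.seek(start)
        while True:
            item, cursor = self.next(cursor)
            if item is None or item[0] >= end:
                break
            out.append(item)
        return out


kv = OrderedKV()
for k in (b"user:3", b"user:1", b"order:9", b"user:2"):
    kv.put(k, b"v-" + k)

# b";" sorts immediately after b":", so this range covers every "user:" key.
result = kv.scan(b"user:", b"user;")
```

Because keys are compared as raw bytes, a range scan only needs a starting position and a stop condition; no per-key lookup is involved.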

TiKV’s storage model is independent of SQL tables; it is a high‑performance, highly reliable distributed map.

Data is persisted to disk through RocksDB, a high‑performance single‑node engine maintained by Facebook.

To prevent data loss and support cross-data-center disaster recovery, TiKV replicates data across multiple machines using the Raft consensus algorithm. PingCAP has heavily optimized its Raft implementation.

Raft provides leader election, membership changes, and log replication.

TiKV writes data via Raft; each change becomes a Raft log entry replicated to a majority of nodes in the Raft group, ensuring safety even if a node fails.
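The majority rule can be sketched as follows. This is a deliberately simplified model, not TiKV's implementation: it omits terms, elections, retries, and log matching, and the names `Replica` and `replicate` are hypothetical.

```python
class Replica:
    """One member of a Raft group, reduced to an append-only log."""

    def __init__(self, name: str, alive: bool = True):
        self.name = name
        self.alive = alive
        self.log = []

    def append(self, entry) -> bool:
        """Store the entry and acknowledge it, unless this replica is down."""
        if not self.alive:
            return False
        self.log.append(entry)
        return True


def replicate(leader: Replica, followers: list, entry) -> bool:
    """Return True if the entry reached a majority of the group.

    Simplified: a real Raft leader also tracks terms and retries
    followers that are lagging behind.
    """
    group_size = 1 + len(followers)
    majority = group_size // 2 + 1
    if not leader.append(entry):
        return False
    acks = 1  # the leader's own copy counts toward the majority
    for follower in followers:
        if follower.append(entry):
            acks += 1
    return acks >= majority


# Three replicas, one follower down: 2 of 3 acknowledge, so the write commits.
ok_one_down = replicate(
    Replica("a"), [Replica("b"), Replica("c", alive=False)], "put k1=v1"
)

# Three replicas, both followers down: only 1 of 3, so the write cannot commit.
ok_two_down = replicate(
    Replica("a"), [Replica("b", alive=False), Replica("c", alive=False)], "put k1=v1"
)
```

With three replicas, the group tolerates the loss of any single node; with five, any two.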

Data storage flow is illustrated below:

TiKV storage flow

TiKV stores data in Regions, each a contiguous key range (default ≤64 MB). Regions are balanced across nodes, and a metadata component tracks which Region resides on which node.
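Since Regions are contiguous, non-overlapping key ranges, locating the Region for a key reduces to a binary search over Region start keys. The sketch below assumes a hypothetical `RegionRouter` class and made-up store names; an empty end key is used here to mean "unbounded", which is a convention of this example.

```python
import bisect


class RegionRouter:
    """Maps a key to the Region whose range [start, end) contains it."""

    def __init__(self, regions):
        # regions: iterable of (start_key, end_key, store_id);
        # end_key == b"" means the range has no upper bound (assumption).
        self._regions = sorted(regions, key=lambda r: r[0])
        self._starts = [r[0] for r in self._regions]

    def locate(self, key: bytes):
        # Find the last Region whose start key is <= key.
        idx = bisect.bisect_right(self._starts, key) - 1
        if idx < 0:
            raise KeyError(key)
        start, end, store_id = self._regions[idx]
        if end != b"" and key >= end:
            raise KeyError(key)  # key falls in a gap between Regions
        return start, end, store_id


router = RegionRouter([
    (b"", b"g", "store-1"),
    (b"g", b"n", "store-2"),
    (b"n", b"", "store-3"),
])

first = router.locate(b"apple")[2]
boundary = router.locate(b"g")[2]   # start keys are inclusive
last = router.locate(b"zebra")[2]
```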

Each Region has multiple Replicas that form a Raft group; one Replica acts as the Leader and the others as Followers. By default, all reads and writes go through the Leader, which replicates writes to the Followers.

TiKV replication architecture

Summary: TiKV is a distributed key‑value storage system, a massive ordered map.

PD Server

Placement Driver (PD) is the global control node of TiDB, responsible for cluster scheduling.

PD collects information such as node status, Raft group metrics, and operation statistics to make scheduling decisions.

Information Collection

PD relies on two heartbeat sources:

1. TiKV node heartbeats

Each TiKV node (Store) sends periodic heartbeats containing its total and available disk capacity, the number of Regions it hosts, its write speed, snapshot counts, whether it is overloaded, and its label information.

2. Raft group leader heartbeats

The Leader of each Raft group reports the locations of the Leader and Followers, the number of offline Replicas, and read/write speeds.

PD uses this data to schedule replicas, balance load, and handle node failures.
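One way to picture the two heartbeat streams is as records that update an in-memory cluster view. The sketch below is illustrative only: the field names, the `ClusterView` class, and its methods are assumptions and do not mirror PD's actual message definitions.

```python
from dataclasses import dataclass, field


@dataclass
class StoreHeartbeat:
    """Per-node report (hypothetical field names)."""
    store_id: int
    capacity_bytes: int
    available_bytes: int
    region_count: int
    labels: dict = field(default_factory=dict)


@dataclass
class RegionHeartbeat:
    """Per-Raft-group report sent by the group's Leader (hypothetical)."""
    region_id: int
    leader_store: int
    follower_stores: list
    down_replicas: int = 0


class ClusterView:
    """Keeps only the latest heartbeat from each Store and each Region."""

    def __init__(self):
        self.stores = {}
        self.regions = {}

    def on_store_heartbeat(self, hb: StoreHeartbeat) -> None:
        self.stores[hb.store_id] = hb

    def on_region_heartbeat(self, hb: RegionHeartbeat) -> None:
        self.regions[hb.region_id] = hb

    def regions_needing_repair(self) -> list:
        """Regions whose Leader reported at least one offline Replica."""
        return sorted(
            r.region_id for r in self.regions.values() if r.down_replicas > 0
        )

    def store_usage(self, store_id: int) -> float:
        """Fraction of disk capacity in use on a Store."""
        hb = self.stores[store_id]
        return 1 - hb.available_bytes / hb.capacity_bytes


view = ClusterView()
view.on_store_heartbeat(StoreHeartbeat(1, 100, 25, 2, {"zone": "z1"}))
view.on_region_heartbeat(RegionHeartbeat(10, 1, [2, 3]))
view.on_region_heartbeat(RegionHeartbeat(11, 2, [1, 3], down_replicas=1))
```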

Scheduling Strategies

PD applies several policies:

Ensure each Region has the correct number of Replicas.

Distribute Replicas of a Raft group across different locations.

Balance Replica count across Stores.

Evenly distribute Leaders among Stores.

Spread access hotspots evenly across Stores.

Keep Store storage usage roughly equal.

Control scheduling speed to avoid impacting online services.

Support manual node decommissioning.
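Two of these policies, balancing Leaders across Stores and limiting scheduling speed, can be sketched together. This is a toy greedy balancer, not PD's scheduler: the function names, the `tolerance` parameter, and the `max_steps` cap are all assumptions made for illustration.

```python
def pick_leader_transfer(leader_counts: dict, tolerance: int = 1):
    """Return (source_store, target_store) for one transfer, or None.

    Moves a Leader from the Store with the most Leaders to the Store with
    the fewest, unless they already differ by at most `tolerance`.
    """
    if not leader_counts:
        return None
    stores = sorted(leader_counts)  # stable order so ties are deterministic
    source = max(stores, key=lambda s: leader_counts[s])
    target = min(stores, key=lambda s: leader_counts[s])
    if leader_counts[source] - leader_counts[target] <= tolerance:
        return None
    return source, target


def balance_leaders(leader_counts: dict, max_steps: int):
    """Plan at most `max_steps` transfers; the cap models rate limiting."""
    counts = dict(leader_counts)
    plan = []
    for _ in range(max_steps):
        move = pick_leader_transfer(counts)
        if move is None:
            break
        source, target = move
        counts[source] -= 1
        counts[target] += 1
        plan.append(move)
    return plan, counts


plan, counts = balance_leaders({"s1": 5, "s2": 1, "s3": 0}, max_steps=10)
throttled_plan, throttled_counts = balance_leaders(
    {"s1": 5, "s2": 1, "s3": 0}, max_steps=1
)
```

Capping the number of transfers per round keeps rebalancing from competing with foreground traffic, at the cost of converging more slowly.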

TiDB Server

TiDB Server receives SQL requests over the MySQL protocol. It parses the protocol packets, performs syntax analysis, builds and optimizes a query plan, and executes the plan by fetching data from TiKV. It is stateless and can be scaled horizontally behind a load balancer.

TiDB server architecture

TiSpark

TiSpark provides Spark SQL on TiKV, enabling HTAP (Hybrid Transactional/Analytical Processing) by running Spark directly on TiDB’s storage layer. It requires a Spark cluster and the presence of TiKV and PD.

Conclusion

TiKV handles storage, PD handles scheduling, TiDB Server handles computation, and the Raft protocol ensures strong consistency and data safety across the distributed TiDB database.
