
Overview of TiDB Architecture: TiKV, PD, TiDB Server, and TiSpark

This article provides a comprehensive overview of TiDB's architecture, detailing the roles of TiKV Server, Placement Driver (PD), TiDB Server, and the TiSpark component, and explains how Raft ensures data consistency across the distributed database system.

Architect

Introduction

This article summarizes the core concepts and components of the TiDB distributed database.

Overall Framework

TiDB consists of three core components—TiDB Server, PD Server, and TiKV Server—plus an optional TiSpark component for complex OLAP workloads. In a single‑node deployment all three components must be started, while production clusters are typically deployed with Ansible.

TiKV Server

TiKV is the storage layer responsible for persisting data. It provides cross‑data‑center disaster recovery, high write/read throughput, concurrent modifications, and atomic multi‑record updates.

TiKV stores data as an ordered key‑value map backed by RocksDB. To guarantee consistency across replicas, TiKV uses the Raft consensus algorithm, which handles leader election, membership changes, and log replication.

Data is partitioned into Regions (default size 64 MiB). Each Region is replicated across multiple nodes as a Raft group, with one replica acting as the leader. Reads and writes are routed through the leader, which replicates changes to followers.
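Because Regions hold contiguous, sorted key ranges, routing a request is essentially a range lookup: find the Region whose range contains the key, then talk to that Region's leader. The sketch below illustrates the idea with a sorted slice and a binary search; the `Region` type and `locate` function are illustrative assumptions, not TiKV's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// Region describes one contiguous key range [StartKey, EndKey) and the
// store currently holding its Raft leader. Field names are illustrative.
type Region struct {
	StartKey, EndKey string
	LeaderStore      int
}

// locate finds the Region whose range contains key, assuming regions are
// sorted by StartKey and together cover the whole keyspace.
func locate(regions []Region, key string) Region {
	// Binary-search for the first Region whose StartKey is greater than
	// key, then step back one: that Region's range contains key.
	i := sort.Search(len(regions), func(i int) bool {
		return regions[i].StartKey > key
	})
	return regions[i-1]
}

func main() {
	regions := []Region{
		{"", "g", 1},
		{"g", "p", 2},
		{"p", "", 3}, // empty EndKey = unbounded upper range
	}
	fmt.Println(locate(regions, "k").LeaderStore) // key "k" falls in ["g","p")
}
```

In the real system this routing table is served by PD and cached on the client side, which is why PD must track every Raft group's leader location.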

In summary, TiKV is a distributed, globally ordered key‑value store that functions as a massive map.

Placement Driver (PD) Server

PD acts as the central control plane for the TiDB cluster, collecting status information from TiKV nodes and Raft group leaders, and making scheduling decisions based on that data.

PD gathers two types of heartbeat information: (1) periodic reports from each TiKV store containing disk capacity, region count, write speed, snapshot activity, overload status, and tags; (2) reports from each Raft group leader describing leader/follower locations, missing replicas, and read/write rates.

Based on this information, PD applies several scheduling strategies, such as ensuring the correct number of replicas per Region, distributing replicas and leaders evenly across stores, balancing storage usage, handling hot spots, and controlling the pace of migrations to avoid impacting online services.
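The simplest of these strategies, keeping each Region at its configured replica count, can be sketched as a pure decision function. This is an illustrative assumption about the shape of the logic, not PD's real scheduler:

```go
package main

import "fmt"

// replicaAction decides what a scheduler like PD might do for one
// Region, comparing its live replica count against the configured
// target (typically 3). Purely illustrative.
func replicaAction(liveReplicas, target int) string {
	switch {
	case liveReplicas < target:
		// A store failed or a replica is missing: restore redundancy.
		return "add-replica"
	case liveReplicas > target:
		// A previously failed store rejoined: trim the extra copy.
		return "remove-replica"
	default:
		return "none"
	}
}

func main() {
	fmt.Println(replicaAction(2, 3)) // a replica was lost
	fmt.Println(replicaAction(4, 3)) // a stale replica came back
	fmt.Println(replicaAction(3, 3)) // healthy, nothing to do
}
```

The other strategies (leader balancing, storage balancing, hot-spot spreading) follow the same pattern: compare observed heartbeat state against a target distribution and emit a small corrective operation, rate-limited so migrations do not disturb online traffic.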

PD also manages global ID generation, timestamps (TSO), and provides routing information to clients.
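A TSO timestamp is commonly described as a single integer packing a physical clock reading (milliseconds) with a logical counter that disambiguates timestamps issued within the same millisecond; 18 bits for the logical part is the layout usually cited for PD. A minimal sketch of that composition, under that assumption:

```go
package main

import "fmt"

// PD's TSO is described as a physical millisecond clock plus a logical
// counter packed into one int64, with 18 bits reserved for the logical
// part. The constants and helpers here assume that layout.
const logicalBits = 18

func composeTS(physicalMs, logical int64) int64 {
	return physicalMs<<logicalBits | logical
}

func extractTS(ts int64) (physicalMs, logical int64) {
	return ts >> logicalBits, ts & ((1 << logicalBits) - 1)
}

func main() {
	ts := composeTS(1700000000000, 5)
	p, l := extractTS(ts)
	fmt.Println(p, l) // round-trips to the original pair
}
```

Because every transaction start and commit timestamp comes from this single allocator, comparing two TSOs gives a global ordering of events across the whole cluster.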

TiDB Server

TiDB Server is a stateless SQL layer that receives MySQL‑compatible queries, parses them, generates execution plans, and interacts with TiKV to fetch data. It can be horizontally scaled behind load balancers (e.g., LVS, HAProxy, F5).

Because TiKV stores data as key‑value pairs, TiDB includes a mapping layer that translates SQL operations into KV reads/writes.
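TiDB's documentation describes this mapping with a logical key layout of `t{tableID}_r{rowID}` for rows and `t{tableID}_i{indexID}_{indexedValue}` for secondary index entries. The real encoding is a binary, memcomparable format; the string form below is only a readable sketch of that scheme:

```go
package main

import "fmt"

// rowKey sketches the documented logical layout of a TiDB row key:
// t{tableID}_r{rowID}. The value stored under this key holds the
// row's column data.
func rowKey(tableID, rowID int64) string {
	return fmt.Sprintf("t%d_r%d", tableID, rowID)
}

// indexKey sketches a secondary index entry: t{tableID}_i{indexID}_{value}.
// Its stored value points back to the rowID, so an index lookup becomes
// one KV get for the index entry plus one for the row.
func indexKey(tableID, indexID int64, indexedValue string) string {
	return fmt.Sprintf("t%d_i%d_%s", tableID, indexID, indexedValue)
}

func main() {
	// A row (10, "alice") in a table with ID 33 might map to:
	fmt.Println(rowKey(33, 10))           // row key   -> column values
	fmt.Println(indexKey(33, 1, "alice")) // index key -> rowID 10
}
```

Because keys sharing a table ID are contiguous in TiKV's globally sorted keyspace, a full-table scan becomes a single range scan, and point lookups by primary key or index become individual KV reads.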

TiSpark

TiSpark integrates Spark SQL with TiKV, enabling HTAP (Hybrid Transactional/Analytical Processing) by running analytical queries directly on the TiKV storage layer. It requires a separate Spark cluster and depends on TiKV and PD.

Conclusion

TiKV provides distributed storage, PD handles scheduling and metadata, TiDB Server performs stateless SQL computation, and Raft ensures strong consistency across the cluster. Together they form a highly available, scalable distributed database system.
