Design and Implementation of the NewSQL Distributed Database TiDB
This article presents a comprehensive technical overview of TiDB, a NewSQL distributed database, covering its architecture, SQL layer, KV engine, distributed transaction mechanisms, code implementation in Go, open‑source practices, and future roadmap.
The speaker, Liu Qi (goroutine), founder and CEO of PingCAP, introduces TiDB, an open‑source NewSQL distributed database, and the related distributed Redis solution Codis, describing his background in infrastructure at JD.com and Wandoujia.
He explains the motivation behind NewSQL: combining the scalability of NoSQL with the strong consistency and transactional guarantees of traditional relational databases, citing examples such as Google Spanner, F1, FoundationDB, CockroachDB, and TiDB itself.
The article outlines common approaches to scaling relational databases—master‑slave replication, sharding middleware (Cobar, TDDL, Vitess, MyCat, etc.)—and their limitations, especially regarding dynamic scaling and transaction support.
It then contrasts NoSQL solutions (HBase, Cassandra, MongoDB) and discusses why pure NoSQL often lacks expressive SQL interfaces and robust transaction semantics.
Moving to TiDB’s architecture, the author presents a layered view: a SQL layer on top of a distributed KV layer. The SQL layer handles lexical analysis and parsing using Go tools such as cznic/goyacc and cznic/ebnf2y, generating an abstract syntax tree (AST) and a plan tree for query execution.
Example code snippets illustrate the execution flow:
func (s *session) Execute(sql string) ([]rset.Recordset, error) {
	// Parse the SQL text into a list of statements.
	statements, err := Compile(sql)
	if err != nil {
		return nil, errors.Trace(err)
	}
	var rs []rset.Recordset
	// Run each statement in turn, collecting its result set.
	for _, st := range statements {
		r, err := runStmt(s, st)
		if err != nil {
			return nil, errors.Trace(err)
		}
		rs = append(rs, r)
	}
	return rs, nil
}

func Compile(src string) ([]stmt.Statement, error) {
	l := parser.NewLexer(src)
	// YYParse returns a non-zero code when the goyacc-generated parser fails.
	if parser.YYParse(l) != 0 {
		return nil, errors.Trace(l.Errors()[0])
	}
	return l.Stmts(), nil
}

The plan generation process is described step by step (FROM → WHERE → LOCK → GROUP BY → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT → FINAL), with examples of how a simple SELECT statement is transformed into a series of plan nodes.
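To make the step‑by‑step ordering concrete, here is a minimal sketch of how clause‑by‑clause plan nodes could be chained, with each later stage wrapping the one before it. The `Plan` type and `buildPlan` function are illustrative names, not TiDB's actual plan types.

```go
package main

import "fmt"

// Plan is a hypothetical plan node: each node processes the rows
// produced by its child, so FROM sits at the bottom of the tree.
type Plan struct {
	Kind  string
	Child *Plan
}

// buildPlan chains plan nodes in the clause order the article describes
// (FROM -> WHERE -> ... -> LIMIT); each stage wraps the previous one,
// leaving the last stage as the root of the plan tree.
func buildPlan(steps []string) *Plan {
	var root *Plan
	for _, s := range steps {
		root = &Plan{Kind: s, Child: root}
	}
	return root
}

func main() {
	// A simple SELECT with a filter and a limit:
	p := buildPlan([]string{"FROM", "WHERE", "SELECT", "LIMIT"})
	for n := p; n != nil; n = n.Child {
		fmt.Println(n.Kind) // printed root-first: LIMIT, SELECT, WHERE, FROM
	}
}
```

Execution then proceeds bottom‑up: the FROM node produces rows, and each enclosing node filters, projects, or truncates them.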
TiDB’s KV mapping is explained using a table‑row‑column key format (TableID:RowID:ColumnID). Sample key‑value pairs illustrate how a row is stored and how indexes are represented, including unique and non‑unique index layouts.
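The row and index layouts can be sketched with simple string keys. The encoding below is only illustrative of the TableID:RowID:ColumnID scheme the article describes; TiDB's real keys use a binary, memcomparable encoding, and the function names here are invented for the example.

```go
package main

import "fmt"

// encodeRowKey builds a key in the TableID:RowID:ColumnID layout.
// The KV value stored under this key would be that column's data.
func encodeRowKey(tableID, rowID, colID int64) string {
	return fmt.Sprintf("t%d_r%d_c%d", tableID, rowID, colID)
}

// encodeIndexKey sketches a unique-index entry: table ID, index ID, and
// the indexed column value. The KV value would hold the RowID, letting a
// point lookup on the index recover the row.
func encodeIndexKey(tableID, indexID int64, value string) string {
	return fmt.Sprintf("t%d_i%d_%s", tableID, indexID, value)
}

func main() {
	// Row 1 of table 10, with two columns:
	fmt.Println(encodeRowKey(10, 1, 1)) // t10_r1_c1
	fmt.Println(encodeRowKey(10, 1, 2)) // t10_r1_c2
	// Unique index 1 on a column whose value is "alice";
	// the stored value would be RowID 1:
	fmt.Println(encodeIndexKey(10, 1, "alice")) // t10_i1_alice
}
```

Because all columns of a row share the same TableID:RowID prefix, a full‑row read becomes a short range scan over adjacent keys, which is why an ordered KV store matters. For a non‑unique index, the RowID is typically appended to the key itself to keep entries distinct.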
The transaction interface is shown, highlighting required operations (Get, Set, Seek, Delete, Commit, Rollback) and the need for ordered KV stores to support scans. Distributed transaction handling follows a two‑phase commit (2PC) model, with discussion of coordinator selection, transaction status tables, MVCC, and conflict resolution strategies.
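The shape of that interface can be sketched as follows. The signatures are simplified guesses rather than TiDB's actual `kv` package API, and the in‑memory implementation is a toy that only buffers writes locally, standing in for the staging step that precedes a real two‑phase commit.

```go
package main

import (
	"fmt"
	"sort"
)

// Txn mirrors the operations the article lists (Get, Set, Seek, Delete,
// Commit, Rollback); signatures are illustrative, not TiDB's exact API.
type Txn interface {
	Get(k string) (string, bool)
	Set(k, v string)
	Delete(k string)
	Seek(k string) (string, bool) // first key >= k; needs an ordered store
	Commit() error
	Rollback()
}

// memTxn buffers writes and deletes in memory and applies them on Commit,
// roughly how an optimistic transaction stages mutations before 2PC.
type memTxn struct {
	store   map[string]string // the shared "database"
	writes  map[string]string
	deletes map[string]bool
}

var _ Txn = (*memTxn)(nil)

func newTxn(store map[string]string) *memTxn {
	return &memTxn{store: store, writes: map[string]string{}, deletes: map[string]bool{}}
}

func (t *memTxn) Get(k string) (string, bool) {
	if t.deletes[k] {
		return "", false
	}
	if v, ok := t.writes[k]; ok {
		return v, true
	}
	v, ok := t.store[k]
	return v, ok
}

func (t *memTxn) Set(k, v string) {
	delete(t.deletes, k)
	t.writes[k] = v
}

func (t *memTxn) Delete(k string) {
	delete(t.writes, k)
	t.deletes[k] = true
}

// Seek scans keys in sorted order -- this is exactly why the article notes
// that the underlying KV store must be ordered to support scans.
func (t *memTxn) Seek(k string) (string, bool) {
	var keys []string
	for key := range t.store {
		keys = append(keys, key)
	}
	for key := range t.writes {
		keys = append(keys, key)
	}
	sort.Strings(keys)
	for _, key := range keys {
		if key >= k && !t.deletes[key] {
			return key, true
		}
	}
	return "", false
}

func (t *memTxn) Commit() error {
	for k, v := range t.writes {
		t.store[k] = v // a real engine would run 2PC against remote nodes here
	}
	for k := range t.deletes {
		delete(t.store, k)
	}
	return nil
}

func (t *memTxn) Rollback() {
	t.writes = map[string]string{}
	t.deletes = map[string]bool{}
}

func main() {
	db := map[string]string{"t1_r1": "alice"}
	txn := newTxn(db)
	txn.Set("t1_r2", "bob")
	txn.Delete("t1_r1")
	if err := txn.Commit(); err != nil {
		panic(err)
	}
	fmt.Println(db["t1_r2"], len(db)) // bob 1
}
```

In a distributed setting, Commit is where the 2PC machinery the article discusses takes over: a coordinator prewrites the buffered mutations, records transaction status, and then finalizes or rolls back.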
Implementation details of the storage engine abstraction are provided, noting support for LevelDB, RocksDB, LMDB, BoltDB, and a planned HBase‑based engine. An example of a lightweight LMDB engine implementation (~200 lines) is referenced.
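A pluggable‑engine design like this usually boils down to a small driver interface plus a registry keyed by engine name. The sketch below shows that pattern under assumed names (`Driver`, `Register`, `memDriver`); it is not TiDB's actual engine API.

```go
package main

import (
	"fmt"
	"sync"
)

// Driver abstracts a storage backend. A real driver would return a KV store
// handle; here Open returns a descriptive string to keep the sketch small.
type Driver interface {
	Open(path string) (string, error)
}

var (
	mu      sync.Mutex
	drivers = map[string]Driver{}
)

// Register installs an engine under a scheme name, mirroring how a DSN such
// as "boltdb:///tmp/tidb" could select among LevelDB, BoltDB, LMDB, etc.
func Register(name string, d Driver) {
	mu.Lock()
	defer mu.Unlock()
	drivers[name] = d
}

// memDriver is a stand-in for a lightweight engine, in the spirit of the
// ~200-line LMDB engine the article references.
type memDriver struct{}

func (memDriver) Open(path string) (string, error) {
	return "mem:" + path, nil
}

func main() {
	Register("memory", memDriver{})
	h, err := drivers["memory"].Open("/tmp/db")
	if err != nil {
		panic(err)
	}
	fmt.Println(h) // mem:/tmp/db
}
```

The appeal of this shape is that each backend only has to satisfy the small `Driver` contract, which is what keeps a new engine implementation to a couple of hundred lines.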
The author shares practical insights on open‑source project management: community building, contribution guidelines, PR handling, and the importance of English documentation for global collaboration.
Future roadmap items include improving SQL compatibility, asynchronous schema changes (referencing Google’s research), developing an HBase engine, eventually building a custom KV layer, multi‑tenant support, and containerization.
The article concludes with a Q&A covering transaction status tables, differences between TiDB and MySQL, roadmap details, distributed transaction support, language choice (Go), and comparisons with other distributed databases.