
Scaling Zhihu's Moneta Service with TiDB: Architecture, Performance, and Lessons Learned

This article describes how Zhihu's Moneta service, which stores over a trillion rows of user‑read data, migrated from MySQL sharding to the distributed NewSQL database TiDB to achieve high availability, horizontal scalability, millisecond‑level query latency, and improved overall system performance.


Zhihu's Moneta application stores about 1.3 trillion rows of data, with roughly 100 billion new rows accruing each month and projected to reach 3 trillion rows within two years, posing severe scalability challenges for a backend that must still deliver a good user experience. (Note: a growth rate of one trillion rows per month would far exceed the stated two-year projection; the monthly figure is on the order of 100 billion.)

The main pain points were: high availability for the data, massive write throughput (over 40 k records per second at peak), long-term storage of historical data, serving high-throughput queries across millions of posts, and keeping query response times under 90 ms.
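A quick back-of-envelope sketch (illustrative, not from the article) shows why the write-throughput and data-volume requirements above are two views of the same problem: a sustained rate near the 40 k rows/s peak accumulates on the order of 10^11 rows per month.

```python
# Back-of-envelope sizing: sustained writes at the stated peak rate.
PEAK_WRITES_PER_SEC = 40_000
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.59 million seconds

rows_per_month_at_peak = PEAK_WRITES_PER_SEC * SECONDS_PER_MONTH
print(f"{rows_per_month_at_peak:,} rows/month")  # 103,680,000,000 rows/month
```

This is roughly 100 billion rows per month, consistent with the reported trillion-row scale of the dataset.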

Initial attempts with MySQL sharding plus Master High Availability (MHA) proved inadequate: sharding complicated the application code, made shard keys hard to change, and offered no load balancing for reads, while the MHA setup raised security concerns.
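To illustrate why manual sharding complicates application code, here is a minimal sketch (hypothetical schema and names, not Zhihu's actual code) of the routing logic every query path must carry. Changing the shard count or the shard key means rewriting this logic and migrating data.

```python
# Illustrative application-level shard routing for a manually sharded MySQL setup.
import zlib

NUM_SHARDS = 16  # changing this requires code changes plus a data migration

def shard_for(user_id: int) -> str:
    """Route a row to a physical MySQL shard by hashing the shard key."""
    bucket = zlib.crc32(str(user_id).encode()) % NUM_SHARDS
    return f"moneta_shard_{bucket:02d}"

def insert_read_record(user_id: int, post_id: int) -> str:
    # Every data-access helper must know the topology. Worse, any query not
    # keyed by user_id (e.g. "all readers of a post") must fan out to all shards.
    table = shard_for(user_id)
    return (f"INSERT INTO {table}.read_log (user_id, post_id) "
            f"VALUES ({user_id}, {post_id})")
```

A distributed SQL database moves this routing below the SQL layer, which is the core of the appeal described next.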

TiDB, an open‑source MySQL‑compatible NewSQL HTAP database, was adopted. Its architecture consists of stateless TiDB servers (the SQL layer), distributed TiKV key‑value stores replicated via Raft consensus, PD (Placement Driver) servers for metadata management and scheduling, and TiSpark for OLAP workloads.
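The split between the SQL layer and the key-value layer can be sketched as follows. TiDB's documented key layout maps each row to a key of the shape `t{table_id}_r{row_id}` in TiKV's ordered keyspace (the real keys use a binary memcomparable encoding that preserves numeric order, which this string sketch does not).

```python
# Simplified sketch of TiDB's row-to-key mapping onto TiKV.

def row_key(table_id: int, row_id: int) -> str:
    # Rows of one table are contiguous in TiKV's sorted keyspace, so they
    # split naturally into Regions that Raft replicates and PD rebalances.
    return f"t{table_id}_r{row_id}"

def index_key(table_id: int, index_id: int, value) -> str:
    # Secondary indexes live in the same keyspace under an index prefix.
    return f"t{table_id}_i{index_id}_{value}"
```

Because sharding happens at the Region level underneath the SQL layer, the application keeps writing plain MySQL-protocol SQL with no routing code of its own.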

After migrating to TiDB, the system achieved horizontal scalability, strong consistency, and cloud‑native resilience. Performance metrics showed peak write throughput of 40 k rows/s, query processing of 30 k queries/s on 12 million posts, 99th‑percentile response time around 25 ms, and 99.9th‑percentile around 50 ms.

Key lessons learned include the importance of separating latency‑sensitive queries into dedicated TiDB instances, using SQL hints and low‑precision timestamps to stabilize execution plans and reduce read latency, and leveraging TiDB Lightning and TiDB Data Migration (DM) for fast data import (1.1 trillion rows imported in four days).
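The two query-side optimizations above can be sketched concretely. The snippet below uses TiDB's documented optimizer-hint comment syntax and the `tidb_low_resolution_tso` session variable; verify both names against your TiDB version's documentation, and treat the table and column names as illustrative.

```python
# Sketch of query-side tuning for latency-sensitive reads on TiDB.

# Pin a join algorithm with an optimizer hint so a hot query keeps a
# stable execution plan instead of drifting with statistics changes:
hinted_query = (
    "SELECT /*+ HASH_JOIN(r, p) */ r.user_id, p.title "
    "FROM read_log r JOIN posts p ON p.id = r.post_id "
    "WHERE r.user_id = %s"
)

# Trade a little read freshness for latency: reuse a cached timestamp
# rather than fetching a fresh one from PD for every transaction:
session_setup = "SET SESSION tidb_low_resolution_tso = 1"
```

Running the session setting only on the dedicated, latency-sensitive TiDB instances keeps the freshness trade-off away from workloads that need strictly current reads.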

TiDB 3.0 introduces features such as the Titan key‑value engine, multi‑threaded Raftstore, batch Raft messages, SQL plan management, and TiFlash columnar storage, all of which further reduce write and query latency and simplify cluster management.

Future work involves expanding TiDB 3.0 capabilities in Moneta and related anti‑spam services, taking advantage of table partitioning for time‑range queries, and continuing open‑source contributions to strengthen the TiDB ecosystem.
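The table partitioning mentioned above maps directly onto standard MySQL/TiDB RANGE-partition DDL. A minimal sketch, with illustrative table, column, and partition names:

```python
# RANGE partitioning by time, so time-bounded queries prune partitions
# instead of scanning the whole trillion-row table.
partition_ddl = """
CREATE TABLE read_log (
    user_id BIGINT NOT NULL,
    post_id BIGINT NOT NULL,
    read_at DATETIME NOT NULL
)
PARTITION BY RANGE (TO_DAYS(read_at)) (
    PARTITION p2019_06 VALUES LESS THAN (TO_DAYS('2019-07-01')),
    PARTITION p2019_07 VALUES LESS THAN (TO_DAYS('2019-08-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);
"""
```

A query with a bound on `read_at` then touches only the matching partitions, which is exactly the access pattern of time-range queries over read history.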

Tags: distributed systems · Performance Optimization · TiDB · HTAP · Database Scalability · large-scale data
Written by

Java Architect Essentials

Committed to sharing quality articles and tutorials to help Java programmers progress from junior to mid-level to senior architect. We curate high-quality learning resources, interview questions, videos, and projects from across the internet to help you systematically improve your Java architecture skills. Follow and reply '1024' to get Java programming resources. Learn together, grow together.
