Databases 16 min read

How Zhihu Scaled to Trillions of Rows with TiDB: Lessons from Moneta

Zhihu’s Moneta service, handling over 1.3 trillion rows and billions of daily writes, migrated from MySQL sharding to TiDB, achieving millisecond query latency, high availability, and horizontal scalability, while sharing architectural choices, performance metrics, migration challenges, and future expectations for TiDB 3.0.

Java Backend Technology

Mar 21, 2020

How Zhihu Scaled to Trillions of Rows with TiDB: Lessons from Moneta

Background

Zhihu, China’s largest knowledge‑sharing platform, serves 220 million registered users, 30 million questions and over 130 billion answers. Its Moneta service stores the posts each user has already read, amounting to roughly 1.3 trillion rows and growing by about 100 billion rows per month. Within two years the dataset is expected to exceed 3 trillion rows.

Key Pain Points

High availability: Users must not see already‑read posts on the recommendation feed.

Massive write volume: Peak traffic exceeds 40 k writes per second, adding nearly 30 billion new rows daily.

Long‑term storage: Historic data must be retained for years.

High‑throughput queries: Up to 12 million post‑level queries per second during peaks.

Strict latency: Query response time must stay under 90 ms, even for long‑tail queries.

Tolerate false positives: The system may occasionally surface an already‑read post.

Architecture Requirements

High availability to avoid a poor user experience.

Outstanding system performance to meet high throughput and latency goals.

Easy scalability as business and data volume grow.

Exploration of Solutions

We evaluated three core components for the ideal architecture:

Proxy: Routes user requests to healthy nodes, ensuring HA.

Cache: Handles frequent reads in memory, reducing database load.

Storage: Initially a standalone MySQL cluster, later replaced by TiDB.

Drawbacks of MySQL Sharding & MHA

MySQL Sharding

Application code becomes complex and hard to maintain.

Changing the shard key is cumbersome.

Upgrading logic impacts availability.

MHA

Requires custom scripts or third‑party tools for virtual IP configuration.

Only monitors the primary node.

Needs password‑less SSH, introducing security risks.

No read‑load‑balancing for replicas.

Only checks primary server health.

Why TiDB?

TiDB is an open‑source, MySQL‑compatible NewSQL database with HTAP capabilities. Its main components are:

TiDB Server: Stateless SQL layer that processes queries and talks to TiKV.

TiKV: Distributed transactional key‑value store using Raft for strong consistency and HA.

TiSpark: Apache Spark plugin for OLAP queries on TiKV data.

PD (Placement Driver): Metadata service (etcd‑backed) that schedules TiKV regions.

Additional tools include Ansible deployment scripts, Syncer for MySQL‑to‑TiDB migration, and TiDB Binlog for incremental backup.

Key TiDB features:

Horizontal scalability

MySQL‑compatible syntax

Strongly consistent distributed transactions

Cloud‑native architecture

HTAP for ETL‑free analytics

Fault tolerance with Raft recovery

Online schema changes

TiDB in Moneta’s Architecture

The new stack is divided into three layers:

Top layer: Stateless, horizontally scalable client API and proxy.

Middle layer: Soft‑state services and a layered Redis cache; can self‑recover using data persisted in TiDB.

Bottom layer: TiDB cluster storing all stateful data; highly available and self‑healing.

Kubernetes orchestrates the entire system, providing global fault monitoring and ensuring high availability.

Performance Improvements

After deploying TiDB, the system showed dramatic gains:

Write throughput > 40 k rows/s.

During peaks, 30 k queries and 1.2 M post checks per second.

99th‑percentile latency ≈ 25 ms; 999th‑percentile ≈ 50 ms.

Relevant charts:

Lessons Learned

Fast data import: Using TiDB DM and Lightning, 1.1 trillion rows were imported in four days (versus a month with naive writes).

Reducing query latency: Separate TiDB instances for latency‑sensitive queries; added SQL hints; employed low‑precision TSO and prepared statements to cut round‑trips.

Resource evaluation: TiDB’s Raft replication needs at least three replicas, requiring more hardware than a single‑master MySQL setup, but the extra resources enable easier scaling.

Expectations for TiDB 3.0

TiDB 3.0 introduces several enhancements that we have already tested in our anti‑spam service:

Titan storage engine: Reduces write amplification for large values, cutting both write and query latency (see comparison chart).

Table partitioning: Allows time‑range partitions, dramatically improving query performance.

Batch Raft messages & multi‑threaded Raftstore: Increases write throughput and reduces node count (from seven to five nodes for similar load).

SQL plan management: Binds queries to optimal execution plans without modifying SQL text.

TiFlash: Columnar analytics engine enabling fast OLAP on massive data without separate ETL pipelines.

Next Steps

TiDB’s MySQL compatibility lets us adopt it without rewriting existing code. Its horizontal scalability prepares us for future growth beyond a trillion records. We plan to continue contributing to the open‑source ecosystem and work with PingCAP to further strengthen TiDB.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

performance optimization TiDB HTAP database scaling MySQL Migration large-scale data

Written by

Java Backend Technology

Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.

Background

Key Pain Points

Architecture Requirements

Exploration of Solutions

Drawbacks of MySQL Sharding & MHA

MySQL Sharding

MHA

Why TiDB?

TiDB in Moneta’s Architecture

Performance Improvements

Lessons Learned

Expectations for TiDB 3.0

Next Steps

Java Backend Technology

How this landed with the community

Was this worth your time?

0 Comments

Expectations for TiDB 3.0