How Zhihu Scales Read Filtering: A Deep Dive into High‑Performance Backend Architecture
This article explains how Zhihu built a highly available, low‑latency read‑filtering service for its homepage, detailing the system’s design goals, architecture components such as proxy, cache and storage, massive data scale, migration from MySQL to TiDB, and performance results after adopting TiDB 3.0.
1. Business Scenario
Zhihu has grown from a Q&A platform to a massive knowledge content platform with 30 million questions, over 130 million answers, 2.2 billion registered users, and a huge amount of articles and paid content. Efficiently delivering the most interesting high‑quality content to users is critical.
The homepage is a key traffic distribution entry, and the read‑filtering service helps avoid recommending content that users have already seen. It records all deeply read or skimmed items for long‑term storage and uses this data to filter already‑read items in homepage recommendation and personalized push.
When a user opens the recommendation page, the homepage fetches candidate items from multiple recall queues based on the user profile; some candidates may have been seen before, so they are passed through the read‑filtering service before being further processed and returned to the client.
The service must meet very high availability because the homepage is the most important traffic channel, handle a write peak of over 40 k records per second and about 30 billion new records daily, and retain data for three years. Currently it stores roughly 13 trillion records and is expected to reach 30 trillion in two years.
Query requirements are also demanding: each homepage refresh triggers at least one read query, with peak query volume around 30 k independent queries per second, each touching 400–1 000 documents, resulting in a peak of about 12 million document reads per second. End‑to‑end response time must stay under 90 ms, tolerating false positives as long as the false‑positive rate is acceptable.
2. Architecture Design
The system is designed around three goals: high availability, high performance, and easy scalability.
2.1 High Availability
Fault detection and self‑healing mechanisms are built into each component, isolating failures so the business experiences minimal impact.
2.2 High Performance
Requests are buffered in slots, each containing multiple cache replicas to spread read load. Multi‑replica slots increase availability and reduce pressure on the underlying database. Compression further reduces I/O on storage.
2.3 Easy Scalability
Stateless components are easily scaled, while weak‑state components can recover state from TiDB. Reducing the proportion of strong‑state services and using weak‑state services improves overall scalability.
2.4 Final Architecture
The top‑level client API and proxy are stateless and horizontally scalable. TiDB stores all state data. Between them lies a layered Redis cache with additional components ensuring cache consistency.
Kubernetes orchestrates the entire system, providing global fault monitoring and self‑healing capabilities.
3. Key Components
3.1 Proxy
The proxy is stateless and splits the cache into slots per user dimension. Each slot has multiple cache replicas for availability. Session consistency binds a user to a specific replica for the duration of a session.
If a replica fails, the proxy switches to another replica within the slot; if all replicas fail, it falls back to another slot, sacrificing performance for availability.
3.2 Cache
Cache utilization is improved by buffering more data using Bloom filters, which densify the data and reduce memory consumption. Write‑through caching avoids invalidation, and read‑through design merges concurrent queries into a single cache miss, reducing pressure on the database.
Cold caches are warmed quickly by transferring active cache state from peers when new nodes start, enabling rapid hot‑standby operation.
Multi‑layered caches (L1, L2) address space‑level and time‑level hotness, and cache‑tag isolation separates different business tenants to prevent interference.
3.3 Storage
Initially MySQL with sharding and MHA was used, but the massive data volume (≈1 trillion records per month) required migration to TiDB, which is MySQL‑compatible and offers better scalability.
3.4 Performance Metrics
The service now handles 40 k writes per second, 30 k independent queries, and 12 million document reads per second, with P99 and P999 latencies stable at 25 ms and 50 ms respectively.
4. All About TiDB
4.1 MySQL to TiDB
Data migration used TiDB DM for incremental binlog capture and TiDB Lightning for fast bulk import (≈11 trillion records in four days). After migration, latency issues were tuned with query isolation, SQL hints, low‑precision TSO, and prepared‑statement reuse.
Binlog adaptation was required to split global ordering constraints, and Drainer optimizations improved throughput and latency.
Resource planning highlighted the need for more machines due to TiDB’s Raft replication (minimum three replicas) and non‑clustered indexes on large composite primary keys.
4.2 TiDB 3.0
TiDB 3.0’s gRPC batch messages, multi‑threaded Raft store, and Plan Management simplify latency‑sensitive query tuning. TiFlash enables massive analytical workloads on billions of rows. The new Titan storage engine improves write performance for large records, benefiting the anti‑fraud service.
Table Partitioning on time dimensions directs queries to recent partitions, further reducing latency.
5. Summary
Designing sustainable systems requires deep understanding of business characteristics and building architectures that are highly available, performant, and easily extensible. Open‑source contributions and cloud‑native principles, such as native high availability and scalability, are essential for modern backend services.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITFLY8 Architecture Home
ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
