Design and Implementation of a Multi-Level Storage Architecture for Bilibili Comment Service
The paper proposes a multi‑level storage architecture for Bilibili's comment service that removes TiDB as a single point of failure by adding a custom KV store (Taishan) alongside Redis caching. It introduces unstructured indexes, CAS‑based write consistency, real‑time and offline reconciliation, and a hedged degradation strategy to improve reliability, read throughput, and scalability during traffic spikes.
The comment system is a core component of Bilibili's ecosystem, enabling interaction between creators and users, influencing content recommendation, community culture, and user retention. During hot events, comment traffic spikes dramatically, making service stability and cache hit rates critical. A cache miss forces requests to TiDB, risking service outage if TiDB fails.
To improve reliability, a multi‑level storage architecture is proposed, replacing TiDB as the single point of failure with a custom KV store (Taishan) and leveraging Redis for caching and sorted‑set indexes.
Architecture Design
The existing architecture relies on Redis for sorted‑set indexes (likes, time, heat) and TiDB for persistent storage. When Redis misses, TiDB queries become slow and consume excessive CPU and memory. Example SQL for like‑order index:
SELECT id FROM reply WHERE ... ORDER BY like_count DESC LIMIT m,n

The new design introduces a multi‑level storage system based on Taishan KV, with three core ideas:
Convert structured index storage to unstructured.
Replace SQL queries with higher‑performance NoSQL operations.
Trade write‑side complexity for read‑side performance.
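The three ideas above can be sketched as replacing the `ORDER BY like_count DESC LIMIT m,n` query with a sorted‑set index lookup. A minimal sketch, assuming the production system uses Redis ZADD/ZREVRANGE; here a plain dict plus an in‑memory sort stands in for the sorted set, and the class name `LikeIndex` is illustrative:

```python
# Sketch: a like-order secondary index kept as a sorted set, replacing
# `SELECT id FROM reply ... ORDER BY like_count DESC LIMIT m,n`.
# A real deployment would use Redis ZADD / ZREVRANGE; a dict stands in here.

class LikeIndex:
    def __init__(self):
        self.scores = {}  # comment_id -> like_count (sorted-set member -> score)

    def update(self, comment_id, like_count):
        # Equivalent of ZADD: upsert the member's score.
        self.scores[comment_id] = like_count

    def page(self, offset, limit):
        # Equivalent of ZREVRANGE offset..offset+limit-1: descending by score.
        ordered = sorted(self.scores, key=lambda c: self.scores[c], reverse=True)
        return ordered[offset:offset + limit]

idx = LikeIndex()
for cid, likes in [(1, 10), (2, 50), (3, 30)]:
    idx.update(cid, likes)
print(idx.page(0, 2))  # → [2, 3]
```

The read path thus pays only an index range scan per page, with the sorting cost moved to write time, which is the "trade write‑side complexity for read‑side performance" idea in concrete form.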
Storage Design
Two abstract data models are defined:
Index: secondary index stored as a Redis Sorted Set.
KV: primary key‑row data stored as key‑value pairs containing metadata and comment content.
The model enables efficient pagination, sorting, and scanning by using Redis Sorted Set for indexes and a KV cache (Redis or Memcache) for comment material.
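A hedged sketch of this two‑level read path, where the key formats (`reply:sub:time:{subject}`, `reply:content:{id}`) are assumptions for illustration, not the production schema, and dicts stand in for Redis and the KV cache:

```python
# Sketch: the Index model yields comment IDs for one page; the KV model
# then returns each comment's material (metadata + content).

index = {"reply:sub:time:1001": [(3, 1700000300), (2, 1700000200), (1, 1700000100)]}
kv = {
    "reply:content:1": {"mid": 42, "message": "first!"},
    "reply:content:2": {"mid": 43, "message": "nice video"},
    "reply:content:3": {"mid": 44, "message": "came from the hot list"},
}

def read_page(subject_id, offset, limit):
    # Step 1: index scan — members are pre-sorted by score (time, likes, heat).
    ids = [cid for cid, _ in index[f"reply:sub:time:{subject_id}"]][offset:offset + limit]
    # Step 2: batch KV fetch of the comment material for those IDs.
    return [kv[f"reply:content:{cid}"] for cid in ids]
```

Pagination and re‑sorting never touch the row store: a different sort order is just a different index key, while the KV material is fetched the same way.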
Data Consistency
Switching from TiDB (structured) to Taishan (unstructured) requires custom synchronization, which may cause data loss, write failures, conflicts, out‑of‑order writes, and latency. To mitigate these issues, a retry queue is introduced for failed writes, and CAS (Compare‑and‑Swap) is used to ensure atomic read‑modify‑write operations.
Version numbers are added to each comment change to detect and discard stale updates. Example update statement:
UPDATE reply SET like_count=like_count+1, version=version+1 WHERE id = xxx

During CAS writes, the version from the binlog is compared with the current version in Taishan; only newer or equal versions are applied.
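The version comparison can be sketched as follows. This is a minimal in‑memory model, assuming a dict in place of Taishan and ignoring the retry queue; Taishan's actual CAS API is not shown:

```python
# Sketch: apply a binlog event only if its row version is not older than
# the version already stored, discarding stale out-of-order updates.

store = {}  # key -> {"version": int, "like_count": int, ...}

def cas_apply(key, event):
    current = store.get(key)
    if current is not None and event["version"] < current["version"]:
        return False  # stale update from an out-of-order binlog event; discard
    store[key] = event  # newer or equal version wins (equal = idempotent replay)
    return True
```

In production this compare‑and‑swap must itself be atomic in the store; a failed CAS is retried via the retry queue rather than applied blindly.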
Reconciliation System
Because the CAP theorem prevents simultaneously guaranteeing availability and consistency, a reconciliation mechanism is needed. Real‑time reconciliation consumes TiDB binlog events through a delayed queue, compares them against Taishan data, and triggers alerts on mismatches. Offline reconciliation uses T+1 data from both the TiDB and Taishan warehouses to verify eventual consistency.
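The real‑time check can be sketched as below; the event shape, the `alert` callback, and the whole‑row comparison are assumptions for illustration:

```python
# Sketch of real-time reconciliation: binlog events are held back in a
# delayed queue (so in-flight writes can settle), then each row is compared
# against the KV store; mismatches raise an alert instead of silently diverging.

def reconcile(delayed_events, kv_store, alert):
    mismatches = 0
    for ev in delayed_events:        # only events older than the delay window
        row = kv_store.get(ev["key"])
        if row != ev["row"]:         # field-by-field comparison in practice
            alert(ev["key"], expected=ev["row"], actual=row)
            mismatches += 1
    return mismatches
```

The delay window is the key design choice: comparing too early flags rows that a retried write is about to fix, while the T+1 offline pass catches anything the real‑time window misses.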
Degradation Strategy
High availability is achieved by automatically degrading from TiDB to Taishan when the primary store fails or times out. Instead of simple serial or parallel fallback, a hedging policy is employed: after a configurable delay, a backup request is sent to the secondary store; the faster successful response is returned, balancing latency and resource usage.
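The hedging policy can be sketched with two callables standing in for the primary and secondary stores. Function names and the timeout handling are assumptions; full error handling (e.g. both stores failing) is elided:

```python
# Sketch of a hedged read: fire the primary request, and only if it has not
# answered within `delay` fire the backup; return the first successful result.

import threading

def hedged_read(primary, backup, delay):
    result = {}
    done = threading.Event()

    def run(fn):
        try:
            value = fn()
        except Exception:
            return  # a failed store never wins the race; retries/timeouts elided
        result.setdefault("value", value)  # first success wins
        done.set()

    threading.Thread(target=run, args=(primary,)).start()
    # Spend resources on the backup only after the hedging delay expires.
    if not done.wait(timeout=delay):
        threading.Thread(target=run, args=(backup,)).start()
    done.wait()
    return result["value"]
```

With a delay near the primary's typical latency, the backup fires only for slow or failed primary requests, which is how hedging balances tail latency against the extra load of always querying both stores.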
In production, TiDB may serve as the primary for latency‑sensitive queries, while Taishan acts as primary for heavy‑weight analytical workloads. An incident example shows TiKV node failure leading to seamless degradation to Taishan without user impact.
Summary and Outlook
The multi‑level storage architecture significantly improves the stability and performance of Bilibili's comment service, providing reliable fallback, higher read throughput, and better scalability. Ongoing work includes refining synchronization, expanding NoSQL capabilities, and further automating degradation mechanisms.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.