
Design and Implementation of a Multi‑Level Comment Storage System for Bilibili

This article presents a comprehensive design of Bilibili's comment service architecture, detailing the transition from TiDB to a multi‑level storage system based on Taishan KV, the data models, consistency mechanisms, retry and versioning strategies, and a hedging‑based degradation policy to ensure high availability under heavy traffic.

High Availability Architecture

1. Background

The comment system is a core component of Bilibili's ecosystem, influencing user interaction, content recommendation, community culture, and overall platform stability. During hot events, comment traffic spikes dramatically, stressing the service and making high cache hit rates essential; cache misses lead to direct TiDB queries, risking service outages.

2. Architecture Design

The existing architecture relies on Redis for caching and TiDB for storage, with the various sorted indexes (likes, time, hotness) held in Redis Sorted Sets. On a cache miss, the request falls through to TiDB, where these sorted queries are slow and consume significant CPU and memory, degrading overall performance.

To avoid making TiDB a single point of failure, a new multi‑level storage system built on Bilibili's self‑developed Taishan KV is introduced. It converts structured indexes to unstructured storage and SQL queries to high‑performance NoSQL operations. A typical sorted, paginated query of the kind being replaced looks like this:

SELECT id FROM reply WHERE ... ORDER BY like_count DESC LIMIT m,n
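
As a rough illustration of that conversion, the sketch below maps SQL `LIMIT m,n` onto the inclusive rank range that a descending sorted-set read (ZREVRANGE-style) expects. The function names, the in-memory dict standing in for the sorted set, and the tie-breaking rule are all assumptions made for this example, not Taishan's actual API.

```python
def limit_to_rank_range(offset: int, count: int) -> tuple[int, int]:
    """Translate SQL `LIMIT offset, count` into inclusive (start, stop) ranks
    for a descending sorted-set range read."""
    if offset < 0 or count <= 0:
        raise ValueError("offset must be >= 0 and count must be > 0")
    return offset, offset + count - 1


def top_by_likes(index: dict[int, int], offset: int, count: int) -> list[int]:
    """Toy sorted set: member = comment id, score = like_count.

    Returns the ids that `ORDER BY like_count DESC LIMIT offset, count`
    would return. Ties are broken by larger id first, an arbitrary choice
    made only for this sketch.
    """
    start, stop = limit_to_rank_range(offset, count)
    ranked = sorted(index, key=lambda rid: (index[rid], rid), reverse=True)
    return ranked[start:stop + 1]


like_index = {1: 5, 2: 9, 3: 7, 4: 1}      # comment id -> like_count
second_page = top_by_likes(like_index, 1, 2)
```

A real sorted set keeps members ordered on write, so the range read does not re-sort on every request as this toy version does.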

3. Storage Design

Two abstract data models are defined: Index (sorted indexes) and KV (comment material). The table below compares TiDB and Taishan representations:

| Abstract Data Model | TiDB Model | Taishan Model | Description |
| --- | --- | --- | --- |
| Index | Secondary Index | Sorted Set | Sorted indexes such as likes or time order |
| KV | Primary Key & Row | Key‑Value | Metadata and content of a comment |

The Index+KV model enables efficient pagination and real‑time updates using Redis Sorted Sets for indexes and KV stores for comment data.
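
The two-step read implied by the Index+KV model can be sketched as follows: a page of ids comes from the sorted index, then the comment bodies are fetched from the KV side. `CommentStore` and its methods are hypothetical names, and plain dicts stand in for the real stores.

```python
class CommentStore:
    """In-memory stand-in for the Index + KV model (illustrative only)."""

    def __init__(self) -> None:
        self.kv: dict[int, dict] = {}     # KV: comment id -> comment record
        self.likes: dict[int, int] = {}   # Index: comment id -> like_count score

    def put(self, comment_id: int, content: str, like_count: int = 0) -> None:
        self.kv[comment_id] = {
            "id": comment_id, "content": content, "like_count": like_count,
        }
        self.likes[comment_id] = like_count

    def like(self, comment_id: int) -> None:
        # A like updates the record and its score in the index together,
        # so the next page read already reflects the new order.
        record = self.kv[comment_id]
        record["like_count"] += 1
        self.likes[comment_id] = record["like_count"]

    def page_by_likes(self, offset: int, count: int) -> list[dict]:
        # Step 1: read one page of ids from the index, highest score first.
        ranked = sorted(self.likes, key=lambda c: (self.likes[c], c), reverse=True)
        ids = ranked[offset:offset + count]
        # Step 2: batch-fetch the records, skipping ids missing from the KV side.
        return [self.kv[c] for c in ids if c in self.kv]


store = CommentStore()
store.put(1, "first", like_count=2)
store.put(2, "second", like_count=2)
store.put(3, "third")
for _ in range(3):
    store.like(3)
page = store.page_by_likes(0, 2)
```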

4. Data Consistency

Because TiDB stores structured rows and Taishan stores unstructured values, no ready‑made synchronization tool exists for moving data between them. Synchronization can therefore suffer from data loss, write failures, conflicts, out‑of‑order writes, and latency. To mitigate these issues, failed writes are placed on a retry queue, and a version number is attached to each record to enforce ordering.
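
A minimal sketch of the retry-queue idea, assuming an in-process queue and a caller-supplied `write` function; the article does not say which queue technology or retry limits are used, so `RetryQueue`, `max_attempts`, and the dead-letter list are hypothetical.

```python
from collections import deque
from typing import Callable


class RetryQueue:
    """Re-attempts failed writes a bounded number of times (illustrative only)."""

    def __init__(self, write: Callable[[dict], None], max_attempts: int = 3) -> None:
        self._write = write
        self._max_attempts = max_attempts
        self._pending: deque = deque()    # (record, attempts made so far)
        self.dead: list[dict] = []        # records that exhausted their attempts

    def submit(self, record: dict) -> None:
        self._attempt(record, attempts=0)

    def _attempt(self, record: dict, attempts: int) -> None:
        try:
            self._write(record)
        except Exception:
            attempts += 1
            if attempts >= self._max_attempts:
                self.dead.append(record)            # give up, keep for inspection
            else:
                self._pending.append((record, attempts))

    def drain_once(self) -> int:
        """Retry everything queued right now; return how many are still pending."""
        for _ in range(len(self._pending)):
            record, attempts = self._pending.popleft()
            self._attempt(record, attempts)
        return len(self._pending)
```

A retried write can land after a newer write for the same comment, which is why retries alone are not enough and an ordering check on the version number is also needed.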

UPDATE reply SET like_count=like_count+1, version=version+1 WHERE id = xxx

During CAS operations, the version carried in the binlog is compared with the stored version; the update proceeds only if the incoming version is greater than or equal to the stored one, and stale data is discarded.
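
The version check can be sketched as below. `VersionedKV` is a hypothetical in-memory store, and a lock stands in for whatever atomic compare-and-set primitive the real store provides.

```python
import threading
from typing import Any, Optional


class VersionedKV:
    """Accepts a write only if its version is not older than the stored one."""

    def __init__(self) -> None:
        self._data: dict[Any, tuple[int, Any]] = {}   # key -> (version, value)
        self._lock = threading.Lock()

    def cas_put(self, key: Any, version: int, value: Any) -> bool:
        # The compare and the write happen under one lock so that no other
        # writer can slip in between them.
        with self._lock:
            current = self._data.get(key)
            if current is not None and version < current[0]:
                return False                      # stale update, discard
            self._data[key] = (version, value)    # greater or equal: apply
            return True

    def get(self, key: Any) -> Optional[tuple[int, Any]]:
        with self._lock:
            return self._data.get(key)


kv = VersionedKV()
kv.cas_put("reply:1", 2, {"like_count": 2})
late = kv.cas_put("reply:1", 1, {"like_count": 1})   # arrives late, rejected
```

Accepting an equal version makes a replayed write idempotent: applying the same binlog event twice leaves the stored value unchanged.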

5. Degradation Strategy

Given the strict availability requirements, a hedging policy is adopted: if the primary store (TiDB or Taishan) has not responded within a configurable delay, a backup request is sent to the secondary store, and the first successful response is used. This balances response time against resource consumption better than a simple serial or parallel fallback, whose trade‑offs are compared below.

| Degradation Strategy | Advantages | Disadvantages |
| --- | --- | --- |
| Serial | Simple | Long latency, may exceed upstream timeout |
| Parallel | Short latency | Consumes roughly double the request load |
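
A hedged read can be sketched with `asyncio` as follows. This is an illustration under stated assumptions rather than the production implementation: `hedged_read`, its parameters, and the choice to hedge immediately when the primary fails fast are all hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable

Fetch = Callable[[], Awaitable[Any]]


async def hedged_read(primary: Fetch, secondary: Fetch, hedge_delay: float) -> Any:
    """Ask the primary first; if it is slow or fails, also ask the secondary
    and return whichever succeeds first."""
    primary_task = asyncio.create_task(primary())
    done, _ = await asyncio.wait({primary_task}, timeout=hedge_delay)
    if done and primary_task.exception() is None:
        return primary_task.result()          # fast path: no backup request sent

    # Primary is late or has failed: send the delayed backup request.
    last_exc = primary_task.exception() if done else None
    pending = {asyncio.create_task(secondary())}
    if not done:
        pending.add(primary_task)             # the primary may still win

    while pending:
        finished, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        winner = None
        for task in finished:
            exc = task.exception()
            if exc is None and winner is None:
                winner = task
            elif exc is not None:
                last_exc = exc
        if winner is not None:
            for task in pending:
                task.cancel()                 # stop the request that lost
            return winner.result()
    raise last_exc                            # both stores failed
```

Compared with the serial strategy, the backup starts after `hedge_delay` rather than after a full timeout; compared with the parallel strategy, the secondary store only sees traffic when the primary is slow or failing.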

In production, TiDB is set as primary for latency‑sensitive, lightweight queries, while Taishan serves as primary for heavy analytical workloads. An incident where TiKV nodes failed demonstrated seamless automatic downgrade to Taishan, keeping the comment service stable.

6. Summary and Outlook

The comment service is vital for community engagement on Bilibili. Continuous improvements in storage reliability, consistency mechanisms, and degradation policies aim to deliver a smoother user experience and foster stronger community ties.

Written by

High Availability Architecture
