Design and Architecture of Bilibili Comment System
The article details Bilibili's modular, platformized comment-system architecture: the core publishing, reading, and administration modules; the layered reply-interface, reply-service, and reply-job design; multi-level caching, sharded storage, hotspot detection, redundancy, and security measures; and the hot-comment ranking that sustains massive user traffic.
Background
The article opens with Wikipedia's definition of a comment, then describes how Bilibili's comment system evolved into a modular, platformized architecture to handle massive user traffic and increasingly complex features.
Basic Functional Modules
Core functionalities include publishing comments (unlimited nested replies), reading comments (sorted by time or popularity), deleting comments, interaction (like, dislike, report), and management (pinning, featuring, admin operations). Additional features cover rich text, tags, decorations, and AI-assisted hot comment management.
Architecture Overview
3.1 Architecture – Overview
The comment system is designed as an independent subsystem. The high‑level overview is shown in the diagram (image omitted).
3.2 Architecture – reply‑interface
The reply‑interface serves as the entry layer for both client comment components and downstream services. For mobile and web clients, it provides view‑model APIs through a BFF (backend‑for‑frontend) layer; for server‑side callers, it offers clear service boundaries, minimal data exposure, security checks, and traffic control.
3.3 Architecture – reply‑admin
Provides admin services for complex queries, using Elasticsearch for search and a hybrid approach with a relational database for real‑time fields.
3.4 Architecture – reply‑service
Implements atomic comment operations (list, delete, etc.) with multi‑level caching, Bloom filters, and hot‑spot detection to ensure high availability and throughput.
3.5 Architecture – reply‑job
Handles asynchronous tasks such as cache rebuilding and binlog consumption. It uses a Cache‑Aside pattern for cache updates and a message queue to ensure a single cache rebuild per comment list, avoiding cache stampede.
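The Cache‑Aside read path with a deduplicated rebuild can be sketched as follows. This is a minimal in‑process illustration, not the real implementation: the class and field names are hypothetical, a dict stands in for Redis, and an in‑process lock stands in for the message queue that, per the article, guarantees a single cache rebuild per comment list.

```python
import threading

class ReplyListCache:
    """Cache-Aside sketch: serve from cache, rebuild at most once per key.

    `store` stands in for Redis and `db_load` for the database query; the
    real system deduplicates rebuilds through a message queue, here an
    in-process lock plays that role.
    """
    def __init__(self, db_load):
        self.store = {}            # fake Redis
        self.db_load = db_load     # fallback loader (one DB hit per miss)
        self._inflight = {}        # key -> Event for a rebuild in progress
        self._lock = threading.Lock()

    def get(self, key):
        if key in self.store:
            return self.store[key]          # cache hit
        with self._lock:
            ev = self._inflight.get(key)
            if ev is None:                  # first miss: this caller rebuilds
                ev = self._inflight[key] = threading.Event()
                leader = True
            else:
                leader = False
        if leader:
            try:
                self.store[key] = self.db_load(key)   # single rebuild
            finally:
                ev.set()
                with self._lock:
                    del self._inflight[key]
        else:
            ev.wait()                       # followers wait, then read cache
        return self.store[key]
```

Concurrent misses on the same comment list trigger one `db_load`; followers block until the leader has repopulated the cache, which is the stampede-avoidance property the article describes.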
4 Storage Design
4.1 Database Design
At least three tables are required: a comment table (primary key comment_id, indexed by comment_area_id), a comment area table (primary key area_id, with a type field), and a separate content table for large comment bodies.
Key field categories include relational data (author, parent comment), counters (total, root, child counts), status/attributes (enum status, bitmap attributes), and meta information.
Example SQL queries (simplified):
SELECT * FROM subject WHERE obj_id=? AND obj_type=?;
SELECT id FROM reply_index WHERE obj_id=? AND obj_type=? AND root=0 AND state=0 ORDER BY floor LIMIT 0,20;
SELECT * FROM reply_index, reply_content WHERE rpid IN (?,?,...);
SELECT id FROM reply_index WHERE obj_id=? AND obj_type=? AND root=? ORDER BY like_count DESC LIMIT 0,3;
SELECT * FROM reply_index, reply_content WHERE rpid IN (?,?,...);

Due to the high write volume, MySQL sharding was used initially; the system later migrated to TiDB for horizontal scalability.
4.2 Cache Design
Redis is used with three main caches: subject (JSON string), reply_index (sorted set with comment IDs as members and ordering fields as scores), and reply_content (combined data from index and content tables). Consistency is maintained via binlog‑driven cache invalidation and single‑flight mechanisms to prevent cache stampede.
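The reply_index sorted set described above can be sketched in miniature. This is an in‑memory stand‑in, not redis‑py: members are comment IDs (rpid) and scores are the ordering field (floor number for time order, like count for hot order), mirroring Redis ZADD/ZREVRANGE semantics; the sample IDs are invented.

```python
import bisect

class ReplyIndex:
    """In-memory sketch of the reply_index cache as a Redis-style sorted
    set: members are comment IDs (rpid), scores are the ordering field."""
    def __init__(self):
        self._items = []   # kept sorted ascending as (score, rpid)

    def zadd(self, score, rpid):
        bisect.insort(self._items, (score, rpid))

    def zrevrange(self, start, stop):
        """Page of rpids in descending score order (ZREVRANGE semantics:
        stop index is inclusive)."""
        return [rpid for _, rpid in self._items[::-1][start:stop + 1]]

idx = ReplyIndex()
for floor, rpid in [(1, 101), (2, 102), (3, 103), (4, 104)]:
    idx.zadd(floor, rpid)

# newest-first page of size 2, like the first page of a comment list
page1 = idx.zrevrange(0, 1)   # -> [104, 103]
```

Paging a sorted set this way avoids the large-OFFSET cost of the equivalent SQL query, which is why the ID list lives in Redis while full comment bodies are fetched separately by rpid.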
5 Availability Design
5.1 Write and Read Hotspots
During a traffic surge (e.g., “Tencent’s chili sauce not fragrant” incident), comment count updates were merged in‑memory before DB writes, and batch inserts were used. Non‑DB logic was split into pre‑ and post‑processing threads, increasing TPS by over tenfold.
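The in-memory merge of counter updates can be sketched as below. The class name, batch size, and the list standing in for executed SQL are all illustrative; the point is that N `count = count + 1` statements against the same hot row collapse into one `count = count + N` write.

```python
from collections import Counter

class CountMerger:
    """Sketch of merging hot-subject counter updates in memory before a
    single DB write (names and flush threshold are illustrative)."""
    def __init__(self, flush_size=100):
        self.pending = Counter()     # area_id -> accumulated delta
        self.flushed = []            # stands in for executed UPDATEs
        self.flush_size = flush_size

    def incr(self, area_id):
        self.pending[area_id] += 1
        if sum(self.pending.values()) >= self.flush_size:
            self.flush()

    def flush(self):
        for area_id, delta in self.pending.items():
            # one UPDATE per area instead of `delta` single-row increments
            self.flushed.append((area_id, delta))
        self.pending.clear()
```

Combined with batch inserts of the comment rows themselves, this is the kind of write coalescing that let the system absorb the surge described above.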
Read hotspots arise from frequent area‑info queries and batch downstream requests. A BFF layer caches hotspot data locally, and a sliding‑window LFU algorithm detects hot keys.
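The hot-key detection can be illustrated with a simple sliding-window counter; the article's production version is a sliding-window LFU, so treat this as a simplified sketch with invented thresholds: a key whose access count within the window crosses the threshold is flagged hot and promoted to the BFF's local cache.

```python
import time
from collections import deque, defaultdict

class HotKeyDetector:
    """Simplified sliding-window hot-key sketch: a key accessed at least
    `threshold` times within the last `window` seconds is reported hot
    (window and threshold values are illustrative)."""
    def __init__(self, window=1.0, threshold=100):
        self.window = window
        self.threshold = threshold
        self.hits = defaultdict(deque)   # key -> access timestamps in window

    def record(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        q.append(now)
        while q and now - q[0] > self.window:   # evict hits outside window
            q.popleft()
        return len(q) >= self.threshold          # True -> treat key as hot

d = HotKeyDetector(window=1.0, threshold=3)
```

A real LFU variant would additionally decay counts so yesterday's hot comment area does not stay pinned in the local cache forever.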
5.2 Redundancy and Degradation
Multi‑level caches degrade gracefully; if a higher level fails, the system falls back to the next level. The system is deployed in dual‑datacenter active‑active mode with failover and cross‑region retry mechanisms. Critical dependencies (e.g., moderation) are treated as strong dependencies, while non‑critical ones (e.g., fan badges) are weak and can be throttled or circuit‑broken.
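The level-by-level fallback can be sketched as a tiered lookup where a failing level is skipped rather than failing the request. The level names and fake outage below are illustrative, not the production topology.

```python
class TieredCache:
    """Sketch of multi-level degradation: try the local (BFF) cache, then
    the distributed cache, then the database; an erroring level is skipped
    so the request still succeeds (level names illustrative)."""
    def __init__(self, levels):
        self.levels = levels   # ordered list of (name, lookup_fn)

    def get(self, key):
        for name, lookup in self.levels:
            try:
                value = lookup(key)
            except Exception:
                continue        # level is down -> degrade to the next one
            if value is not None:
                return value, name
        return None, None

local = {}
def redis_lookup(key):
    raise ConnectionError("redis down")   # simulate a cache-tier outage
db = {"area:1": ["c1", "c2"]}

cache = TieredCache([
    ("local", local.get),
    ("redis", redis_lookup),
    ("db", db.get),
])
```

With the middle tier down, reads still resolve from the database; the same shape applies across datacenters, where the "next level" is the peer region.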
6 Security Design
6.1 Data Security
Ensures compliance, consistent state, and priority handling so that harmful comments are permanently removed and never exposed via APIs or caches.
6.2 Opinion Security
Minimizes user‑visible errors, optimizes comment count consistency, and uses transaction locking or serialization to avoid write‑skew anomalies.
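The count-consistency idea can be sketched with a single serialized transaction: the index row and the visible counter are updated together, so readers never see a count that disagrees with the list. The schema is a simplified stand-in; SQLite's `BEGIN IMMEDIATE` plays the role that a row lock (e.g. `SELECT ... FOR UPDATE`) would play on MySQL/TiDB.

```python
import sqlite3

# Simplified, hypothetical schema: one subject row with a denormalized
# reply_count, plus the per-comment index rows it must agree with.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE subject (obj_id INTEGER PRIMARY KEY, reply_count INTEGER);
    CREATE TABLE reply_index (rpid INTEGER PRIMARY KEY, obj_id INTEGER, state INTEGER);
    INSERT INTO subject VALUES (1, 0);
""")

def publish_reply(rpid, obj_id):
    # BEGIN IMMEDIATE takes the write lock up front, serializing writers
    # so the insert and the counter bump commit (or roll back) as a unit.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("INSERT INTO reply_index VALUES (?, ?, 0)", (rpid, obj_id))
        conn.execute(
            "UPDATE subject SET reply_count = reply_count + 1 WHERE obj_id = ?",
            (obj_id,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")   # failed insert leaves the count untouched
        raise
```

If the insert fails (here, a duplicate rpid), the rollback keeps the counter honest, which is exactly the anomaly the article's locking/serialization is meant to prevent.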
7 Hot Comment Design
7.1 Definition
Initially based on like count, later incorporates weighted voting, time decay, length, user level, and other factors.
7.2 Challenges and Solutions
Sorting by absolute likes suffers from large OFFSET costs; therefore Redis sorted sets are used. For weighted scores (e.g., Wilson score), a dedicated feed‑service/feed‑job computes and stores scores. Real‑time like‑rate sorting requires massive exposure logs and a dedicated pipeline.
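The Wilson score mentioned above is the standard lower bound of the Wilson confidence interval for the like ratio; a sketch (the like/total numbers are invented) shows why it beats raw ratios: a small perfect sample ranks below a large near-perfect one.

```python
import math

def wilson_lower_bound(likes, total, z=1.96):
    """Lower bound of the Wilson score interval for the like ratio.
    z=1.96 corresponds to ~95% confidence; small samples are pulled
    toward 0 because their ratio is less trustworthy."""
    if total == 0:
        return 0.0
    p = likes / total
    denom = 1 + z * z / total
    center = p + z * z / (2 * total)
    spread = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (center - spread) / denom

# 10/10 likes ranks below 95/100 likes despite the higher raw ratio
scores = sorted([("10/10", wilson_lower_bound(10, 10)),
                 ("95/100", wilson_lower_bound(95, 100))],
                key=lambda t: t[1], reverse=True)
```

Because the score changes with every like, a feed-service/feed-job pair recomputing and re-storing it asynchronously (as the article describes) is a natural fit.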
Large sorted sets and real‑time feature updates create pressure on Redis pipelines; multi‑level caching and bloom‑filter checks mitigate this.
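The Bloom-filter check can be sketched as follows; sizes and hash counts are illustrative, not production-tuned. A negative answer proves a comment ID was never written, so lookups for nonexistent IDs can skip Redis and the database entirely.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter sketch: `might_contain` returning False is a
    guaranteed miss, so empty lookups never reach the cache or DB
    (m and k are illustrative, not production-tuned)."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0          # bitset packed into one big int

    def _positions(self, item):
        # k independent positions derived from salted SHA-256 digests
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("rpid:101")
```

False positives are possible (a stray hit still falls through to the cache), but false negatives are not, which is what makes the filter safe as a front gate.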
7.3 Vision
The goal is to provide a harmonious, engaging comment environment while leveraging the commercial potential of high‑traffic comment areas. Continuous optimization of hot‑comment algorithms and user‑experience balancing is planned.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.