Designing Scalable Comment Systems: From Nested Trees to Flat Floors
This article examines how to design a high‑performance comment system by comparing nested and flat (cover‑floor) database models, evaluating adjacency list, path enumeration, and closure table approaches, and outlining write‑asynchronous, cache‑first read strategies for millions of users.
Background
Comments are rarely the core feature of an e-commerce platform, but once comment volume grows large they can become a performance bottleneck. This article explores how to design a comment system that scales.
Problem
The main challenge is deeply nested replies, which make a thread slow to query. The goal is to solve the "cover-floor" (flat) display problem, where replies stack under a top-level comment like floors in a building.
Database: Nested Model
Common ways to store nested comments include:
Adjacency List: each record stores its parent ID.
Path Enumeration: each record stores the full path from the root (e.g., 1/2/3).
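Adjacency List advantages: fast writes and a simple structure. Disadvantages: reading a subtree takes one query per node visited (the N+1 problem) unless the database supports recursive queries, and deleting an intermediate node is costly because its children must be re-parented or removed.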
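The N+1 cost can be made concrete with a small sketch. It assumes an in-memory SQLite database and a hypothetical comment table with id, parent_id, and content columns; none of these names come from the original design. The first function walks the tree with one query per node. The second uses a recursive common table expression, which is an alternative the article does not discuss and which only works on engines that support WITH RECURSIVE.

```python
import sqlite3


def build_adjacency_db():
    """Create a tiny adjacency-list table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE comment (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER,            -- NULL marks a top-level comment
            content   TEXT NOT NULL
        )""")
    conn.executemany(
        "INSERT INTO comment (id, parent_id, content) VALUES (?, ?, ?)",
        [
            (1, None, "root"),
            (2, 1, "reply to 1"),
            (3, 2, "reply to 2"),
            (4, 1, "second reply to 1"),
        ],
    )
    return conn


def subtree_n_plus_one(conn, root_id):
    """Collect all descendants by issuing one query per visited node."""
    found, queries, stack = [], 0, [root_id]
    while stack:
        node = stack.pop()
        rows = conn.execute(
            "SELECT id FROM comment WHERE parent_id = ?", (node,)
        ).fetchall()
        queries += 1
        for (child_id,) in rows:
            found.append(child_id)
            stack.append(child_id)
    return sorted(found), queries


def subtree_recursive_cte(conn, root_id):
    """Collect the same descendants in a single recursive query."""
    rows = conn.execute("""
        WITH RECURSIVE descendants(id) AS (
            SELECT id FROM comment WHERE parent_id = ?
            UNION ALL
            SELECT c.id FROM comment c
            JOIN descendants d ON c.parent_id = d.id
        )
        SELECT id FROM descendants ORDER BY id""", (root_id,)).fetchall()
    return [row[0] for row in rows]
```

For the four sample rows, subtree_n_plus_one(conn, 1) issues four queries to find three descendants, while the recursive version issues one.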
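Path Enumeration advantages: a subtree query is a simple prefix match such as LIKE '1/2/%' (the slash before the wildcard keeps paths like 1/20 from matching), which can use an index on the path column under the leftmost-prefix rule; the hierarchy is also easy to read. Disadvantages: path length is limited by the column type and size, and moving a node means rewriting the path of every descendant.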
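The prefix match is easy to get subtly wrong, so here is a minimal sketch. It assumes an in-memory SQLite database and a hypothetical comment table with a path column; the index is included only to mirror the leftmost-prefix idea, and whether a given engine actually uses it for LIKE depends on that engine and its collation settings.

```python
import sqlite3


def build_path_db():
    """Create a tiny path-enumeration table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE comment (
            id      INTEGER PRIMARY KEY,
            path    TEXT NOT NULL,        -- ids from the root, joined by '/'
            content TEXT NOT NULL
        )""")
    conn.execute("CREATE INDEX idx_comment_path ON comment(path)")
    conn.executemany(
        "INSERT INTO comment (id, path, content) VALUES (?, ?, ?)",
        [
            (1, "1", "root"),
            (2, "1/2", "reply to 1"),
            (3, "1/2/3", "reply to 2"),
            (20, "1/20", "another reply to 1"),
        ],
    )
    return conn


def descendants(conn, path):
    """Descendants of `path`: the '/' before '%' excludes sibling ids such as 20."""
    rows = conn.execute(
        "SELECT id FROM comment WHERE path LIKE ? ORDER BY id", (path + "/%",)
    ).fetchall()
    return [row[0] for row in rows]


def naive_prefix(conn, path):
    """The bare prefix pattern, shown only to illustrate the over-match."""
    rows = conn.execute(
        "SELECT id FROM comment WHERE path LIKE ? ORDER BY id", (path + "%",)
    ).fetchall()
    return [row[0] for row in rows]
```

With the sample rows, descendants(conn, "1/2") returns only comment 3, whereas naive_prefix(conn, "1/2") also returns comment 2 itself and the unrelated comment 20.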
Closure Table: a separate table stores every ancestor-descendant pair together with the depth between them. Advantages: high query performance and fast hierarchical moves. Disadvantages: higher storage overhead (a second table that holds one row per ancestor-descendant pair) and more complex write logic. This approach often balances performance and maintainability.
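The extra write logic is easier to judge from an example. The sketch below assumes an in-memory SQLite database and two hypothetical tables, comment and comment_closure; the table and column names are illustrative. Each insert adds a self-pair at depth 0 and then copies every ancestor of the parent with the depth increased by one.

```python
import sqlite3


def build_closure_db():
    """Create the comment table and its closure table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE comment (
            id      INTEGER PRIMARY KEY,
            content TEXT NOT NULL
        );
        CREATE TABLE comment_closure (
            ancestor   INTEGER NOT NULL,
            descendant INTEGER NOT NULL,
            depth      INTEGER NOT NULL,   -- 0 for the self-pair
            PRIMARY KEY (ancestor, descendant)
        );
    """)
    return conn


def add_comment(conn, content, parent_id=None):
    """Insert a comment and the closure rows that link it to its ancestors."""
    new_id = conn.execute(
        "INSERT INTO comment (content) VALUES (?)", (content,)
    ).lastrowid
    # Every node is its own ancestor at depth 0.
    conn.execute(
        "INSERT INTO comment_closure (ancestor, descendant, depth) VALUES (?, ?, 0)",
        (new_id, new_id),
    )
    if parent_id is not None:
        # Each ancestor of the parent (the parent included) gains the new node.
        conn.execute("""
            INSERT INTO comment_closure (ancestor, descendant, depth)
            SELECT ancestor, ?, depth + 1
            FROM comment_closure
            WHERE descendant = ?""", (new_id, parent_id))
    return new_id


def subtree(conn, root_id):
    """Return (descendant_id, depth) pairs below root_id in one query."""
    return conn.execute("""
        SELECT descendant, depth
        FROM comment_closure
        WHERE ancestor = ? AND depth > 0
        ORDER BY depth, descendant""", (root_id,)).fetchall()
```

A three-level chain already produces six closure rows for three comments, which is the storage overhead mentioned above; in exchange, reading a whole subtree is a single indexed lookup.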
Database: Flat (Cover‑Floor) Model
The flat model uses a two-level structure: every row is either a top-level comment or a reply filed directly under one, which eliminates recursion. A simple table contains id, top_comment_id, reply_id, and content. Fetching a thread is then a single WHERE top_comment_id = ? lookup.
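A minimal sketch of the flat table follows, using an in-memory SQLite database. The column names come from the text, but their exact semantics are assumptions made for illustration: top_comment_id is taken to be 0 for a top-level comment and otherwise the id of the top-level comment the row belongs to, and reply_id is taken to be the id of the comment being answered.

```python
import sqlite3


def build_flat_db():
    """Create the two-level table with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE comment (
            id             INTEGER PRIMARY KEY,
            top_comment_id INTEGER NOT NULL DEFAULT 0,  -- 0: row is top-level (assumption)
            reply_id       INTEGER NOT NULL DEFAULT 0,  -- comment being answered (assumption)
            content        TEXT NOT NULL
        )""")
    conn.execute("CREATE INDEX idx_comment_top ON comment(top_comment_id, id)")
    conn.executemany(
        "INSERT INTO comment (id, top_comment_id, reply_id, content) VALUES (?, ?, ?, ?)",
        [
            (1, 0, 0, "first top-level comment"),
            (2, 1, 1, "reply to #1"),
            (3, 1, 2, "reply to #2, still filed under #1"),
            (4, 0, 0, "second top-level comment"),
        ],
    )
    return conn


def comments_under(conn, top_comment_id):
    """One non-recursive query: pass 0 for top-level comments, or a top-level id for its replies."""
    return conn.execute(
        "SELECT id, reply_id, content FROM comment"
        " WHERE top_comment_id = ? ORDER BY id",
        (top_comment_id,),
    ).fetchall()
```

Comment 3 answers comment 2 but is still stored under top-level comment 1, so the whole thread comes back from one query however long the reply chain grows.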
Write Strategy
Writes are performed asynchronously and are eventually consistent: the application publishes a message to a message queue (MQ), and a consumer processes the message and inserts the data into the database. Because the insert lags behind the request, front-end JavaScript caches the new comment locally and renders it immediately, so the author sees it before the consumer has written it.
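The publish-then-consume flow can be sketched in a single process. This is an illustration under loud assumptions: queue.Queue stands in for a real message queue, a worker thread stands in for the consumer service, SQLite stands in for the database, and the function names and the STOP sentinel are hypothetical. A production MQ would add durability, retries, and acknowledgements that are not shown here.

```python
import queue
import sqlite3
import threading

STOP = None  # sentinel that tells the consumer to exit (hypothetical convention)


def publish_comment(mq, top_comment_id, reply_id, content):
    """Request path: enqueue the comment and return without touching the database."""
    mq.put((top_comment_id, reply_id, content))


def consume(mq, conn):
    """Consumer: take messages off the queue and insert them one by one."""
    while True:
        message = mq.get()
        if message is STOP:
            break
        conn.execute(
            "INSERT INTO comment (top_comment_id, reply_id, content) VALUES (?, ?, ?)",
            message,
        )
        conn.commit()


def run_demo():
    mq = queue.Queue()  # in-process stand-in for the MQ
    # check_same_thread=False lets the worker thread use this connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE comment (id INTEGER PRIMARY KEY, top_comment_id INTEGER,"
        " reply_id INTEGER, content TEXT NOT NULL)"
    )
    worker = threading.Thread(target=consume, args=(mq, conn))
    worker.start()
    publish_comment(mq, 0, 0, "first!")
    publish_comment(mq, 1, 1, "welcome")
    mq.put(STOP)
    worker.join()  # demo only: wait until the consumer has drained the queue
    return conn.execute("SELECT id, content FROM comment ORDER BY id").fetchall()
```

publish_comment returns as soon as the message is queued, which is why the UI has to show the comment from its own cache until the consumer catches up.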
Read Strategy
To serve millions of users without overloading the database, the system separates hot and cold data:
Hot Cache: Frequently accessed recent comments are stored in Redis in a sorted set (zset).
On-Demand Loading: Older or less-popular comments are lazily loaded. Pagination uses cursor-based queries such as WHERE id < ? ORDER BY id DESC LIMIT ? instead of offset-based pagination, because a large offset forces the database to scan and discard every skipped row.
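The cursor-based query can be sketched as follows, again with an in-memory SQLite database and a hypothetical comment table; the Redis hot cache is left out. The cursor is simply the smallest id on the previous page, and the first request passes a value larger than any existing id.

```python
import sqlite3
import sys


def build_paged_db(count):
    """Create a comment table holding ids 1..count (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comment (id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO comment (id, content) VALUES (?, ?)",
        [(i, f"comment {i}") for i in range(1, count + 1)],
    )
    return conn


def page_before(conn, cursor_id, limit):
    """Return up to `limit` comments with id < cursor_id, newest first, plus the next cursor."""
    rows = conn.execute(
        "SELECT id, content FROM comment WHERE id < ? ORDER BY id DESC LIMIT ?",
        (cursor_id, limit),
    ).fetchall()
    # A short page means there is nothing older left to load.
    next_cursor = rows[-1][0] if len(rows) == limit else None
    return rows, next_cursor


def all_pages(conn, limit):
    """Follow the cursor until it runs out; returns the ids on each page."""
    cursor_id, pages = sys.maxsize, []
    while cursor_id is not None:
        rows, cursor_id = page_before(conn, cursor_id, limit)
        if rows:
            pages.append([row[0] for row in rows])
    return pages
```

Each page is a range scan on the primary key that starts at the cursor, so the cost of loading page 1,000 is the same as loading page 1.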
End-to-End Considerations
A complete comment system also requires rate limiting, graceful degradation, circuit breaking, and service decomposition, which makes it an end-to-end engineering effort that goes beyond data storage and access.
ITFLY8 Architecture Home
