Designing Scalable Feed Stream Systems: Architecture, Storage, and Sync Strategies
This article explains how to design a high‑performance feed‑stream system—covering product definition, data categories, storage options, synchronization modes, metadata handling, commenting, likes, search, sorting, deletion, and update—so you can build a solution that scales to millions or billions of users.
Introduction
About a decade ago, the rise of smartphones ushered in the mobile‑internet era, represented by products such as Weibo, WeChat, Toutiao, and Kuaishou. These applications are feed‑stream products where information flows from top to bottom, making them ideal for mobile browsing.
Feed‑Stream System Characteristics
A feed‑stream is essentially a data flow that delivers "N" publishers' information units to "M" receivers through follow relationships.
Data Types
Publisher data: Content generated by publishers, stored and retrieved per publisher.
Follow relationships: One‑way (e.g., Weibo) or two‑way (e.g., WeChat friends) connections that determine how information propagates.
Receiver data: Items aggregated from the publishers a receiver follows, ordered by time or heat, with newer items placed first.
Core Data Stores
Repository: Permanent storage of publisher data.
Follow table: Permanent storage of user relationships.
Sync store: Holds the recent, hot feed data that each receiver actually reads.
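As a rough illustration, the three stores can be pictured as three row types with different keys. This is a minimal sketch, not a schema from any particular database: the class names, field names, and the choice to keep only a reference (not the body) in the sync store are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositoryRow:
    """Permanent publisher data, keyed by (publisher_id, item_id)."""
    publisher_id: str
    item_id: int
    created_at: int
    body: str


@dataclass(frozen=True)
class FollowRow:
    """Permanent relationship data, keyed by (publisher_id, follower_id)."""
    publisher_id: str
    follower_id: str


@dataclass(frozen=True)
class SyncRow:
    """Recent receiver data, keyed by (receiver_id, created_at, item_id).

    Assumption: the row stores a reference to the repository item,
    not a copy of its body.
    """
    receiver_id: str
    created_at: int
    publisher_id: str
    item_id: int


def sync_rows_for(item, follows):
    """Build one SyncRow per follower of the item's publisher."""
    return [
        SyncRow(f.follower_id, item.created_at, item.publisher_id, item.item_id)
        for f in follows
        if f.publisher_id == item.publisher_id
    ]
```

Keeping the body only in the repository means an edit or deletion touches one row, at the cost of an extra lookup when the feed is rendered.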
Product Definition
Typical feed‑stream products fall into four categories: Weibo type, friend‑circle type, recommendation type (Toutiao, Douyin), and private‑message type. They differ in relationship model (one‑way follows, two‑way friendships, or no explicit follows) and therefore in how they scale.
Storage Design
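The key requirements are reliability (no data loss) and horizontal scalability, because feed data only grows. Options include distributed NoSQL stores (e.g., Tablestore, Bigtable) and relational databases (e.g., MySQL). Relational databases are workable at smaller scale but typically need application‑level sharding as data grows, so distributed NoSQL is usually preferred for large systems.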
Synchronization Modes
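Push (write fan‑out): Each new post is written to every follower's sync store at publish time. Reads are cheap, but the system needs high write throughput.
Pull (read fan‑out): Receivers read from the outboxes of the publishers they follow at read time. Writes are cheap, but read load is high and tracking each receiver's read position is complex.
Push‑pull hybrid: Ordinary users are handled by push, while "big V" accounts (users with very large follower counts) are handled by pull, which avoids writing copies that many followers never read.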
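The hybrid mode can be sketched in a few lines. This is an in‑memory illustration under stated assumptions, not a production design: `HybridFeed`, `BIG_V_THRESHOLD`, and the follower‑count rule for deciding who is a "big V" are hypothetical, and the sketch assumes a publisher's classification stays stable over time.

```python
from collections import defaultdict

BIG_V_THRESHOLD = 3  # hypothetical cutoff; a real system would tune this


class HybridFeed:
    def __init__(self):
        self.followers = defaultdict(set)  # publisher -> receivers
        self.following = defaultdict(set)  # receiver -> publishers
        self.outbox = defaultdict(list)    # publisher -> [(ts, publisher, text)]
        self.inbox = defaultdict(list)     # receiver -> [(ts, publisher, text)]

    def follow(self, receiver, publisher):
        self.followers[publisher].add(receiver)
        self.following[receiver].add(publisher)

    def is_big_v(self, publisher):
        return len(self.followers[publisher]) >= BIG_V_THRESHOLD

    def publish(self, publisher, ts, text):
        post = (ts, publisher, text)
        self.outbox[publisher].append(post)  # every post is stored once
        if not self.is_big_v(publisher):
            # Push: copy the post into each follower's inbox at write time.
            for receiver in self.followers[publisher]:
                self.inbox[receiver].append(post)

    def read(self, receiver, limit=10):
        posts = set(self.inbox[receiver])
        for publisher in self.following[receiver]:
            if self.is_big_v(publisher):
                # Pull: merge the big V's outbox at read time.
                posts.update(self.outbox[publisher])
        # The set removes duplicates; tuples sort by timestamp first.
        return sorted(posts, reverse=True)[:limit]
```

With this split, a post from an account with millions of followers costs one write, and the merge cost is paid only by followers who actually open their feed.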
Metadata
User profile and list tables.
Follow/friend relationship tables with indexing.
Push session pool to track online users and avoid query storms.
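One possible shape for the push session pool is a map from user to last heartbeat time, so that a publish event notifies only followers who are currently online. The article does not specify how the pool is implemented; the heartbeat‑with‑TTL approach and the names `SessionPool`, `heartbeat`, and `online` below are assumptions for illustration.

```python
class SessionPool:
    """Tracks which users are online using heartbeats with a time-to-live."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.last_seen = {}  # user_id -> timestamp of the last heartbeat

    def heartbeat(self, user_id, now):
        self.last_seen[user_id] = now

    def online(self, user_ids, now):
        """Return the subset of user_ids seen within the last ttl seconds."""
        return [
            u for u in user_ids
            if now - self.last_seen.get(u, float("-inf")) <= self.ttl
        ]
```

Offline followers are simply skipped and catch up from their sync store the next time they open the app, which keeps a burst of posts from turning into a burst of queries.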
Comments and Likes
Both are stored similarly to feed items, with comments requiring an extra reference to the parent message. Distributed NoSQL is suitable; relational databases can be used if already available.
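A minimal sketch of these two record types follows. The names (`Comment`, `Interactions`) and fields are hypothetical; the point is only that a comment carries its parent's `message_id`, and that likes can be kept as a set per message so a repeated like from the same user is harmless.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    comment_id: int
    message_id: int  # reference to the parent message
    author: str
    created_at: int
    text: str


class Interactions:
    def __init__(self):
        self.comments = defaultdict(list)  # message_id -> [Comment]
        self.likes = defaultdict(set)      # message_id -> {user_id}

    def add_comment(self, comment):
        self.comments[comment.message_id].append(comment)

    def like(self, message_id, user_id):
        # A set makes repeated likes from the same user idempotent.
        self.likes[message_id].add(user_id)

    def summary(self, message_id):
        ordered = sorted(self.comments[message_id], key=lambda c: c.created_at)
        return {
            "likes": len(self.likes[message_id]),
            "comments": [c.text for c in ordered],
        }
```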
Search
Simple keyword search for users, posts, or friends can be implemented via a search engine or a database with full‑text capabilities. Multi‑field indexes are added to the relevant tables.
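To make the idea of a multi‑field index concrete, here is a toy inverted index keyed by (field, term). It is a stand‑in for a real search engine or a database's full‑text index, not a description of either; `KeywordIndex` and its whitespace tokenization are assumptions for the example.

```python
from collections import defaultdict


class KeywordIndex:
    """Toy inverted index: (field, term) -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, fields):
        # fields maps a field name (e.g. "author", "body") to its text.
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[(field, term)].add(doc_id)

    def search(self, field, query):
        """Return ids of documents whose field contains every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(
            *(self.postings.get((field, t), set()) for t in terms)
        )
        return sorted(hits)
```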
Sorting
There are two primary sorting strategies: time‑based (used by Weibo, friend circles, and private messages) and score‑based (used by recommendation‑driven feeds such as Toutiao).
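The difference between the two strategies comes down to the sort key. In the sketch below, each item is a plain dict with a `ts` timestamp and a precomputed `score`; both field names are assumptions, and how the score is computed (the job of a recommendation system) is out of scope here.

```python
def by_time(items):
    """Time-based ordering: newest first."""
    return sorted(items, key=lambda item: item["ts"], reverse=True)


def by_score(items):
    """Score-based ordering: highest score first, newer items break ties."""
    return sorted(items, key=lambda item: (item["score"], item["ts"]), reverse=True)
```

Time‑based feeds can rely on the storage layer's key order, whereas score‑based feeds generally need the ranking step to run before or during the read.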
Deletion and Update
Deletion can be physical (remove from repository) or logical (mark as deleted). Updates follow the same path; versioned stores like Tablestore support edit histories.
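The two deletion styles and a versioned update can be sketched with an in‑memory repository. This does not reproduce Tablestore's API; `VersionedRepository` and its methods are hypothetical and only illustrate the behavior described above.

```python
class VersionedRepository:
    def __init__(self):
        # item_id -> {"versions": [(ts, body), ...], "deleted": bool}
        self.rows = {}

    def put(self, item_id, ts, body):
        """Create or update an item; every write is kept as a new version."""
        row = self.rows.setdefault(item_id, {"versions": [], "deleted": False})
        row["versions"].append((ts, body))

    def delete(self, item_id, physical=False):
        if physical:
            self.rows.pop(item_id, None)         # physical: the row is gone
        elif item_id in self.rows:
            self.rows[item_id]["deleted"] = True  # logical: mark only

    def get(self, item_id):
        """Return the latest body, or None if missing or marked deleted."""
        row = self.rows.get(item_id)
        if row is None or row["deleted"]:
            return None
        return max(row["versions"])[1]

    def history(self, item_id):
        """Return all versions, oldest first, even if logically deleted."""
        row = self.rows.get(item_id)
        return sorted(row["versions"]) if row else []
```

Logical deletion keeps the edit history available for audit or recovery, while physical deletion reclaims the space.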
Overall Architecture
The system can be built either with a single cloud service (Tablestore) or a combination of open‑source components (MySQL, Redis, HBase). The choice depends on team expertise, scaling needs, and operational preferences.
Practical Scenarios
Friend circle: two‑way relationships, time‑based sorting, push model.
Weibo: one‑way relationships, big‑V handling, push‑pull hybrid.
Toutiao: recommendation‑driven, no explicit follows, score‑based sorting.
Private messages: one‑to‑one communication, simple feed model.
Conclusion
By understanding product requirements, data categories, storage choices, synchronization strategies, and auxiliary features such as metadata, comments, likes, search, and sorting, you can design a feed‑stream system that supports hundreds of millions to billions of users.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
