
Designing Scalable Feed Stream Systems: Architecture, Storage, and Sync Strategies

This article explains how to design a high-performance feed-stream system. It covers product definition, data categories, storage options, synchronization modes, metadata handling, comments, likes, search, sorting, deletion, and updates, so that you can build a solution that scales to millions or billions of users.

Programmer DD

Oct 11, 2020


Introduction

About a decade ago, the rise of smartphones ushered in the mobile‑internet era, represented by products such as Weibo, WeChat, Toutiao, and Kuaishou. These applications are feed‑stream products where information flows from top to bottom, making them ideal for mobile browsing.

Feed‑Stream System Characteristics

A feed‑stream is essentially a data flow that delivers "N" publishers' information units to "M" receivers through follow relationships.

Data Types

Publisher data: Content generated by publishers, which must be stored and retrieved per publisher.

Follow relationships: One-way (e.g., Weibo) or two-way (e.g., WeChat friends) connections that determine how information propagates.

Receiver data: Items aggregated from the publishers a receiver follows, ordered by "time heat" (recency), with newer items placed first.

Core Data Stores

Repository: Permanent storage of publisher data.

Follow table: Permanent storage of user relationships.

Sync store: Holds the recent, time-sensitive feed data that each receiver reads.
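The three stores can be pictured with a small in-memory model. This is a minimal sketch under stated assumptions, not the article's implementation: the names `FeedItem` and `Stores`, the per-publisher `outbox` index, and the use of Python dictionaries in place of a real database are all illustrative.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class FeedItem:
    item_id: int
    publisher: str
    body: str
    created_at: float  # Unix timestamp


class Stores:
    """In-memory stand-ins for the repository, follow table, and sync store."""

    def __init__(self) -> None:
        # Repository: permanent storage, item_id -> item.
        self.repository: Dict[int, FeedItem] = {}
        # Hypothetical per-publisher index so items can be read per publisher.
        self.outbox: Dict[str, List[int]] = defaultdict(list)
        # Follow table: publisher -> set of followers.
        self.followers: Dict[str, Set[str]] = defaultdict(set)
        # Sync store: receiver -> recent item ids (filled by the sync mode).
        self.sync_store: Dict[str, List[int]] = defaultdict(list)

    def follow(self, follower: str, publisher: str, mutual: bool = False) -> None:
        """Record a one-way follow, or a two-way friendship if mutual=True."""
        self.followers[publisher].add(follower)
        if mutual:
            self.followers[follower].add(publisher)

    def save(self, item: FeedItem) -> None:
        """Persist an item and index it under its publisher."""
        self.repository[item.item_id] = item
        self.outbox[item.publisher].append(item.item_id)
```

How the sync store gets populated depends on the synchronization mode, so it is deliberately left empty here.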

Product Definition

Typical feed-stream products fall into four categories: Weibo type, friend-circle (WeChat Moments) type, recommendation-driven (Toutiao/Douyin) type, and private-message type. Each has a distinct relationship model and its own scaling considerations.

Storage Design

Key requirements are reliability (no data loss) and horizontal scalability for ever‑growing data. Options include distributed NoSQL (e.g., Tablestore, Bigtable) and relational databases (e.g., MySQL). For large‑scale systems, distributed NoSQL is preferred.

Synchronization Modes

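Push (write fan-out): When a publisher posts, the item is written immediately to each receiver's sync store. Reads are cheap, but the system needs high write throughput.

Pull (read fan-out): Receivers fetch items from the publishers' outboxes at read time. Writes are cheap, but read load is high and tracking each receiver's read position is complex.

Push-pull hybrid: Ordinary users use push, while high-fan-out "big V" accounts (verified users with very large follower counts) use pull, which avoids copying each of their posts into a huge number of sync stores that may never be read.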
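The hybrid mode can be sketched as follows. This is an illustrative in-memory model, not a production design: the class name `HybridFeed`, the `big_v_threshold` parameter, and its tiny default value are assumptions chosen so the behavior is easy to see.

```python
from collections import defaultdict


class HybridFeed:
    """Push for ordinary publishers, pull for high-fan-out ("big V") ones."""

    def __init__(self, big_v_threshold=3):
        # Hypothetical cutoff; a real system would use a far larger number.
        self.threshold = big_v_threshold
        self.followers = defaultdict(set)  # publisher -> followers
        self.following = defaultdict(set)  # follower -> publishers
        self.outbox = defaultdict(list)    # publisher -> [(ts, item_id)]
        self.inbox = defaultdict(list)     # receiver sync store: [(ts, item_id)]

    def follow(self, follower, publisher):
        self.followers[publisher].add(follower)
        self.following[follower].add(publisher)

    def is_big_v(self, publisher):
        return len(self.followers[publisher]) >= self.threshold

    def publish(self, publisher, item_id, ts):
        # Every post goes to the publisher's outbox.
        self.outbox[publisher].append((ts, item_id))
        # Push: ordinary publishers fan out on write to each follower's inbox.
        if not self.is_big_v(publisher):
            for follower in self.followers[publisher]:
                self.inbox[follower].append((ts, item_id))

    def read(self, receiver, limit=10):
        # Start from what was pushed into the receiver's inbox.
        entries = set(self.inbox[receiver])
        # Pull: merge in the outboxes of followed big-V publishers.
        # The set removes duplicates if a publisher became big V after
        # some of its posts had already been pushed.
        for publisher in self.following[receiver]:
            if self.is_big_v(publisher):
                entries.update(self.outbox[publisher])
        newest_first = sorted(entries, reverse=True)
        return [item_id for _, item_id in newest_first[:limit]]
```

With the threshold set to 3, a publisher with one follower is pushed and a publisher with three followers is pulled; a reader who follows both sees a single merged, newest-first list.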

Metadata

User profile and list tables.

Follow/friend relationship tables with indexing.

Push session pool to track online users and avoid query storms.
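One way to picture the push session pool is a table of recent heartbeats. The sketch below assumes a heartbeat-plus-TTL scheme, which the article does not specify; `SessionPool`, `ttl_seconds`, and the injectable `clock` are hypothetical names.

```python
import time


class SessionPool:
    """Tracks which users are online, based on their last heartbeat."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock       # injectable so the pool can be tested
        self._last_seen = {}     # user -> time of last heartbeat

    def heartbeat(self, user):
        """Called whenever a connected client checks in."""
        self._last_seen[user] = self.clock()

    def is_online(self, user):
        seen = self._last_seen.get(user)
        return seen is not None and self.clock() - seen <= self.ttl

    def online_among(self, users):
        """Return only the users worth notifying right now."""
        return {user for user in users if self.is_online(user)}
```

When a new item arrives, the server can notify only `online_among(followers)` instead of having every client poll for updates, which is the query storm the pool is meant to avoid.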

Comments and Likes

Both are stored similarly to feed items, with comments requiring an extra reference to the parent message. Distributed NoSQL is suitable; relational databases can be used if already available.
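As a rough illustration of the extra parent reference, the sketch below keys both comments and likes by the message they belong to. The names `Comment`, `Interactions`, and the optional `reply_to` field are assumptions, and dictionaries stand in for whichever database is used.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Comment:
    comment_id: int
    message_id: int                 # reference to the parent feed message
    author: str
    body: str
    created_at: float
    reply_to: Optional[int] = None  # hypothetical: id of the comment replied to


class Interactions:
    def __init__(self):
        self.comments = defaultdict(list)  # message_id -> [Comment]
        self.likes = defaultdict(set)      # message_id -> {user}

    def add_comment(self, comment):
        self.comments[comment.message_id].append(comment)

    def like(self, message_id, user):
        # A set makes repeated likes from the same user idempotent.
        self.likes[message_id].add(user)

    def unlike(self, message_id, user):
        self.likes[message_id].discard(user)

    def like_count(self, message_id):
        return len(self.likes[message_id])
```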

Search

Simple keyword search for users, posts, or friends can be implemented via a search engine or a database with full‑text capabilities. Multi‑field indexes are added to the relevant tables.
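To make the idea concrete, here is a toy inverted index of the kind a search engine or full-text index maintains internally. It is only a sketch: `KeywordIndex` is a hypothetical name, tokenization is a simple lowercase word split, and a real deployment would rely on the search engine's or database's own indexing.

```python
import re
from collections import defaultdict


class KeywordIndex:
    """Maps each word to the ids of the documents that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> {doc_id}

    @staticmethod
    def _tokens(text):
        return re.findall(r"\w+", text.lower())

    def add(self, doc_id, text):
        for token in self._tokens(text):
            self._postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every word in the query."""
        tokens = self._tokens(query)
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result
```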

Sorting

Two primary sorting strategies: time‑based (used by Weibo, friend circles, private messages) and score‑based (used by recommendation‑driven feeds like Toutiao).
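The difference between the two strategies shows up clearly on a two-item feed. In this sketch the scoring formula (like count decayed by an exponential half-life) is purely an assumption for illustration; recommendation-driven products use their own ranking models, and the field names are hypothetical.

```python
def sort_by_time(items):
    """Time-based ordering: newest first."""
    return sorted(items, key=lambda item: item["created_at"], reverse=True)


def score(item, now, half_life=3600.0):
    """Hypothetical score: engagement that halves every `half_life` seconds."""
    age = max(0.0, now - item["created_at"])
    return (1 + item["likes"]) * 0.5 ** (age / half_life)


def sort_by_score(items, now, half_life=3600.0):
    """Score-based ordering: highest score first."""
    return sorted(items, key=lambda item: score(item, now, half_life), reverse=True)


items = [
    {"id": "old_hot", "created_at": 0.0, "likes": 99},
    {"id": "new_cold", "created_at": 3600.0, "likes": 0},
]
by_time = [item["id"] for item in sort_by_time(items)]
by_score = [item["id"] for item in sort_by_score(items, now=3600.0)]
```

Here `by_time` puts `new_cold` first, while `by_score` puts `old_hot` first because its decayed score (100 × 0.5 = 50) still exceeds the new item's score of 1.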

Deletion and Update

Deletion can be physical (remove from repository) or logical (mark as deleted). Updates follow the same path; versioned stores like Tablestore support edit histories.
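The sketch below contrasts the two deletion styles and keeps an edit history per item. It is a simplified in-memory model with hypothetical names (`Repository`, `render_feed`) and does not reflect the API of Tablestore or any other store.

```python
class Repository:
    """Keeps every version of an item; supports logical and physical delete."""

    def __init__(self):
        self._versions = {}    # item_id -> [body, ...], oldest first
        self._deleted = set()  # ids that are logically deleted

    def put(self, item_id, body):
        """Create an item, or record an edit as a new version."""
        self._versions.setdefault(item_id, []).append(body)

    def get(self, item_id):
        """Return the latest version, or None if deleted or missing."""
        if item_id in self._deleted or item_id not in self._versions:
            return None
        return self._versions[item_id][-1]

    def history(self, item_id):
        return list(self._versions.get(item_id, []))

    def logical_delete(self, item_id):
        # Data stays in storage; readers simply stop seeing it.
        self._deleted.add(item_id)

    def physical_delete(self, item_id):
        # Data is removed from storage entirely.
        self._versions.pop(item_id, None)
        self._deleted.discard(item_id)


def render_feed(repo, item_ids):
    """Skip ids in a sync store whose items no longer resolve."""
    return [body for body in (repo.get(i) for i in item_ids) if body is not None]
```

Because sync stores hold only item ids in this model, neither kind of deletion has to touch them: deleted items are filtered out when the feed is rendered.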

Overall Architecture

The system can be built either with a single cloud service (Tablestore) or a combination of open‑source components (MySQL, Redis, HBase). The choice depends on team expertise, scaling needs, and operational preferences.

Practical Scenarios

Friend circle: two‑way relationships, time‑based sorting, push model.

Weibo: one‑way relationships, big‑V handling, push‑pull hybrid.

Toutiao: recommendation‑driven, no explicit follows, score‑based sorting.

Private messages: one‑to‑one communication, simple feed model.

Conclusion

By understanding product requirements, data categories, storage choices, synchronization strategies, and auxiliary features such as metadata, comments, likes, search, and sorting, you can design a feed-stream system that supports user bases ranging from hundreds of millions to billions.


Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
