Designing a Scalable Feed Stream System for Billions of Users

This article explains how to design a high‑performance feed‑stream architecture—including product definition, data modeling, storage choices, synchronization modes, metadata handling, commenting, likes, sorting, search, and deletion—so that a system can support tens of millions to billions of users while remaining reliable and scalable.

21CTO

Introduction

About ten years ago, the rise of smartphones moved the Internet into the mobile era and gave birth to feed‑stream products such as Weibo, WeChat Moments, Toutiao, and Kuaishou. These applications present continuously updated content units (feeds) in a list that users scroll from top to bottom, which makes them well suited to mobile browsing.

Feed Stream System Characteristics

A feed stream is essentially a data flow that delivers N publishers' content units to M receivers through follow relationships.

Data Model

The system handles three core data types:

Publisher data – the original posts or media generated by users.

Follow relationships – either one‑way (e.g., Weibo) or two‑way (e.g., WeChat friends).

Receiver data – the aggregated timeline for each user, usually ordered by recency.

These map to three storage concepts:

Repository: permanent storage of publisher data.

Follow table: permanent storage of relationship data.

Sync store: short‑term storage of receiver‑side, time‑sorted data.
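
The three storage concepts above can be sketched as plain in-memory structures. This is only an illustration under assumed names (`Post`, `FeedStores`, `follow`, `followers_of` are hypothetical and do not come from any particular product); a real deployment would back each structure with a durable store.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Post:
    """One content unit produced by a publisher (hypothetical shape)."""
    post_id: int
    author_id: str
    body: str
    created_at: float


@dataclass
class FeedStores:
    # Repository: permanent storage of publisher data, keyed by post id.
    repository: dict = field(default_factory=dict)
    # Follow table: follower id -> set of publisher ids they follow.
    follow_table: dict = field(default_factory=lambda: defaultdict(set))
    # Sync store: receiver id -> list of (created_at, post_id) references.
    sync_store: dict = field(default_factory=lambda: defaultdict(list))

    def follow(self, follower_id, publisher_id, mutual=False):
        """Record a one-way follow, or a two-way friendship if mutual=True."""
        self.follow_table[follower_id].add(publisher_id)
        if mutual:
            self.follow_table[publisher_id].add(follower_id)

    def followers_of(self, publisher_id):
        """Return the sorted ids of everyone who follows publisher_id."""
        return sorted(
            follower
            for follower, publishers in self.follow_table.items()
            if publisher_id in publishers
        )
```

Note that the sync store holds only references to posts, so the full content stays in one place (the repository) and short-term timeline data can be discarded without losing anything.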

Product Definition

Typical feed products fall into four categories: Weibo‑style, Moments‑style, short‑video (Douyin/TikTok) style, and private‑message style. The choice determines the follow relationship type (one‑way vs. two‑way) and the sorting method (time‑based vs. recommendation‑based).

Storage Selection

For reliable, horizontally scalable storage, distributed NoSQL (e.g., Alibaba Cloud Tablestore, Bigtable) is preferred for large‑scale systems; MySQL can be used for small prototypes. The repository must guarantee durability and support linear scaling.

Synchronization Modes

Push mode (write‑fan‑out): the publisher’s message is immediately pushed to all followers’ sync stores; requires extremely high write throughput.

Pull mode (read‑fan‑out): followers read from publishers’ outboxes on demand; demands strong read capacity and per‑follower offset tracking.

Push‑pull hybrid: most users use push, while “big V” users (verified accounts with very large followings) use pull to avoid wasteful pushes to inactive followers.

Never rely solely on pull mode for large systems.
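
A minimal in-memory sketch of the push‑pull hybrid may make the trade-off concrete. The class name `HybridFeed`, its methods, and the `BIG_V_THRESHOLD` cutoff are assumptions made for illustration; a real system would use a much larger threshold and persistent stores.

```python
from collections import defaultdict

# Hypothetical cutoff: publishers with at least this many followers are
# treated as "big V" and served by pull instead of push.
BIG_V_THRESHOLD = 3


class HybridFeed:
    def __init__(self):
        self.followers = defaultdict(set)  # publisher -> followers
        self.following = defaultdict(set)  # follower -> publishers
        self.outbox = defaultdict(list)    # publisher -> [(ts, post_id)]
        self.inbox = defaultdict(list)     # receiver -> [(ts, post_id)]

    def follow(self, follower, publisher):
        self.followers[publisher].add(follower)
        self.following[follower].add(publisher)

    def is_big_v(self, publisher):
        return len(self.followers[publisher]) >= BIG_V_THRESHOLD

    def publish(self, publisher, ts, post_id):
        # Every post lands in the publisher's outbox.
        self.outbox[publisher].append((ts, post_id))
        # Push mode: fan out on write, but only for ordinary publishers.
        if not self.is_big_v(publisher):
            for follower in self.followers[publisher]:
                self.inbox[follower].append((ts, post_id))

    def timeline(self, user, limit=10):
        # Start from what was pushed into the user's sync store.
        entries = set(self.inbox[user])
        # Pull mode: read big-V outboxes on demand. Using a set removes
        # duplicates if a publisher crossed the threshold after pushing.
        for publisher in self.following[user]:
            if self.is_big_v(publisher):
                entries.update(self.outbox[publisher])
        newest_first = sorted(entries, reverse=True)[:limit]
        return [post_id for _, post_id in newest_first]
```

In this sketch a big‑V post costs one write regardless of follower count, and the cost moves to readers who actually open their timeline.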

Metadata Services

Additional metadata includes user profiles, follow/friend lists, and a push‑session pool that tracks online users to avoid query storms caused by periodic client polling.
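
One way to picture the push‑session pool is as a table of recent heartbeats: the server checks it to decide which followers are online and can be notified directly, so clients do not need to poll. The sketch below assumes a heartbeat-with-timeout scheme; `SessionPool`, `heartbeat`, and the 60-second default are hypothetical choices, not details from the source.

```python
import time


class SessionPool:
    """Tracks which users currently hold a live session (illustrative)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the pool can be tested
        self._last_seen = {}    # user_id -> time of last heartbeat

    def heartbeat(self, user_id):
        """Called when a client connects or sends a keep-alive."""
        self._last_seen[user_id] = self.clock()

    def disconnect(self, user_id):
        self._last_seen.pop(user_id, None)

    def is_online(self, user_id):
        seen = self._last_seen.get(user_id)
        return seen is not None and self.clock() - seen <= self.ttl

    def online_subset(self, user_ids):
        """Filter a follower list down to users worth notifying now."""
        return [u for u in user_ids if self.is_online(u)]
```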

Comments and Likes

Both are stored similarly to feed content, with an extra reference to the parent message. Distributed NoSQL is suitable; relational databases can be used for smaller deployments.
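
As a rough illustration of the "extra reference to the parent message", the sketch below keys both comments and likes by a parent id. All names (`Comment`, `InteractionStore`, `parent_id`) are assumptions for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    comment_id: int
    parent_id: int   # id of the message this comment belongs to
    author_id: str
    body: str


class InteractionStore:
    def __init__(self):
        self._comments = defaultdict(list)  # parent_id -> [Comment]
        self._likes = defaultdict(set)      # parent_id -> {user_id}

    def add_comment(self, comment):
        self._comments[comment.parent_id].append(comment)

    def comments_for(self, parent_id):
        return list(self._comments.get(parent_id, []))

    def like(self, parent_id, user_id):
        # A set makes liking idempotent: repeated likes count once.
        self._likes[parent_id].add(user_id)

    def unlike(self, parent_id, user_id):
        self._likes[parent_id].discard(user_id)

    def like_count(self, parent_id):
        return len(self._likes.get(parent_id, ()))
```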

Sorting

Two common sorting strategies are time‑based (used by Weibo, Moments, private messages) and score‑based (used by recommendation‑driven feeds). This article focuses on time‑based sorting.
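
For time‑based sorting, a timeline assembled from several outboxes can be produced by merging lists that are already in time order. The sketch below assumes each outbox is a list of `(timestamp, post_id)` tuples appended oldest first; the function name `newest_first` is hypothetical.

```python
import heapq
from itertools import islice


def newest_first(outboxes, limit):
    """Merge time-ordered outboxes and return the `limit` newest entries."""
    # Each outbox is oldest-first, so reversing it yields newest-first,
    # which is the input order heapq.merge expects when reverse=True.
    streams = [reversed(box) for box in outboxes]
    merged = heapq.merge(*streams, reverse=True)
    # islice stops early, so only the needed entries are consumed.
    return list(islice(merged, limit))
```

Because the merge is lazy, reading one page of a timeline does not require sorting every entry in every outbox.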

Deletion and Update

Deletion can be physical (removing the record from the repository) or logical (marking it as deleted). Updates follow the same path; versioned stores like Tablestore can keep edit histories.
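
The difference between physical and logical deletion, and the idea of keeping edit history, can be sketched as follows. `VersionedRepository` and its methods are hypothetical names for illustration and do not reflect the Tablestore API.

```python
class VersionedRepository:
    def __init__(self):
        self._versions = {}    # post_id -> list of bodies, oldest first
        self._deleted = set()  # post ids that are logically deleted

    def put(self, post_id, body):
        """Create a post or record an edit as a new version."""
        self._versions.setdefault(post_id, []).append(body)

    def get(self, post_id):
        """Return the latest version, or None if missing or deleted."""
        if post_id in self._deleted or post_id not in self._versions:
            return None
        return self._versions[post_id][-1]

    def history(self, post_id):
        return list(self._versions.get(post_id, []))

    def delete_logical(self, post_id):
        # The record stays in storage; readers simply stop seeing it.
        self._deleted.add(post_id)

    def delete_physical(self, post_id):
        # The record and its history are removed from storage.
        self._versions.pop(post_id, None)
        self._deleted.discard(post_id)
```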

Search

Simple keyword search for users, posts, or friends can be implemented with a search engine or a full‑text capable database. Multi‑field indexes are added to the repository and user tables as needed.
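
To illustrate what a multi‑field index does, here is a toy inverted index keyed by field name and token. It is a teaching sketch only (`FieldIndex` is a hypothetical name, and tokenization is a simple regular expression); a production system would rely on a search engine or the database's own index.

```python
import re
from collections import defaultdict


def _tokens(text):
    return re.findall(r"\w+", text.lower())


class FieldIndex:
    def __init__(self):
        self._index = defaultdict(set)  # (field, token) -> {doc_id}

    def add(self, doc_id, fields):
        """Index a document given as a mapping of field name -> text."""
        for name, text in fields.items():
            for token in _tokens(text):
                self._index[(name, token)].add(doc_id)

    def search(self, field, query):
        """Return ids of documents whose field contains every query token."""
        tokens = _tokens(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self._index.get((field, token), set())
            result = set(ids) if result is None else result & ids
        return result
```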

System Architecture Overview

The complete architecture combines the core feed pipeline with metadata services, comment/like stores, search, and sorting modules. Two implementation paths are presented:

Open‑source stack (MySQL, Redis, HBase, etc.) for teams comfortable with operating multiple components.

Single‑system solution using Alibaba Cloud Tablestore, which provides built‑in support for all required features and automatic horizontal scaling.

Practical Scenarios

Specific feed types—Moments, Weibo, Toutiao, and private messages—are briefly described, each with its own relationship model and scaling considerations. Future articles will dive deeper into each variant.

Conclusion

The article outlines the essential building blocks for designing a billion‑user feed stream system, emphasizing product definition, storage, synchronization, metadata, interaction features, sorting, and search, and offers guidance on choosing between open‑source composites and a managed NoSQL service.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
