Designing a Billion-Scale Feed Stream System: Architecture & Best Practices

This article explains how to design a high‑performance, billion‑user feed stream system, covering product definition, data modeling, storage choices, synchronization modes, metadata handling, commenting, likes, search, sorting, deletion, updates, and practical architecture examples for different feed‑type applications.

Java Backend Technology

Sep 10, 2020

Introduction

About a decade ago, with the rise of smartphones, the internet entered the mobile era, giving birth to feed‑type products such as Weibo, WeChat Moments, Toutiao, and Kuaishou. These applications present continuously updated information units (feeds) in a top‑down flow, making them ideal for mobile browsing.

Feed Stream System Characteristics

A feed stream is essentially a data flow that delivers "N" publishers' information units to "M" receivers through follow relationships.

Data Classification

Publisher data: content generated by users, organized per publisher.

Follow relationships: one‑way (e.g., Weibo) or two‑way (e.g., Moments) connections.

Receiver data: aggregated feeds ordered by time or relevance.
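
The three kinds of data above can be pictured as three keyed collections. The following is a minimal in-memory sketch, not a production schema: the names `Feed`, `FeedData`, `outbox`, `followers`, and `inbox` are hypothetical, and a real system would keep each collection in a durable store.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Feed:
    feed_id: int       # assumed to increase with publish time
    publisher_id: str
    content: str


class FeedData:
    """Hypothetical in-memory stand-in for the three kinds of feed data."""

    def __init__(self):
        self.outbox = defaultdict(list)    # publisher data: publisher_id -> [Feed]
        self.followers = defaultdict(set)  # follow relationships: publisher_id -> {follower_id}
        self.inbox = defaultdict(list)     # receiver data: receiver_id -> [Feed]

    def follow(self, follower_id, publisher_id, mutual=False):
        # One-way follow by default; mutual=True models a two-way friendship.
        self.followers[publisher_id].add(follower_id)
        if mutual:
            self.followers[follower_id].add(publisher_id)

    def publish(self, feed):
        # Publisher data is organized per publisher.
        self.outbox[feed.publisher_id].append(feed)
```

How receiver data gets filled from publisher data is the job of the synchronization mode chosen later in the design.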

System Design Overview

1. Product Definition

Identify the product type (Weibo‑like, Moments‑like, TikTok‑like, private‑message). Each type differs in follow relationship (single‑direction vs. bi‑directional) and sorting (time vs. recommendation).

2. Storage

The storage layer must guarantee reliability, durability, and horizontal scalability. Options include distributed NoSQL (Tablestore, Bigtable) for large scale or MySQL/Redis for smaller deployments.

3. Synchronization

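Push mode (write fan‑out): when a publisher posts, the message is written to every follower's inbox immediately. Reads are cheap, but write volume grows with follower count, so the storage layer needs high write throughput.

Pull mode (read fan‑out): each receiver reads the outboxes of the publishers they follow and merges the results at read time. Writes are cheap, but the read load is high.

Push‑pull hybrid: pushes for ordinary publishers and pulls for publishers with very large follower counts; suited to one‑way follow products with a massive user base.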
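
The three modes can be compared in one small sketch. This is an illustrative in-memory model, not the article's implementation: the class `HybridFeed`, the cutoff `BIG_PUBLISHER_THRESHOLD`, and the global sequence number used as a time order are all assumptions made for the example. Setting the threshold very high gives pure push; setting it below zero gives pure pull.

```python
import heapq
from collections import defaultdict
from itertools import islice

BIG_PUBLISHER_THRESHOLD = 2  # hypothetical cutoff; a real system would tune this far higher


class HybridFeed:
    def __init__(self):
        self.followers = defaultdict(set)  # publisher -> followers
        self.following = defaultdict(set)  # follower -> publishers
        self.outbox = defaultdict(list)    # publisher -> [(seq, publisher, text)], ascending seq
        self.inbox = defaultdict(list)     # receiver -> [(seq, publisher, text)], ascending seq
        self._seq = 0                      # stands in for a timestamp

    def follow(self, follower, publisher):
        self.followers[publisher].add(follower)
        self.following[follower].add(publisher)

    def _is_big(self, publisher):
        return len(self.followers[publisher]) > BIG_PUBLISHER_THRESHOLD

    def publish(self, publisher, text):
        self._seq += 1
        item = (self._seq, publisher, text)
        self.outbox[publisher].append(item)
        if not self._is_big(publisher):
            # Push path: copy the item into every follower's inbox.
            for follower in self.followers[publisher]:
                self.inbox[follower].append(item)
        return item

    def read(self, receiver, limit=10):
        # Pull path: merge the inbox with the outboxes of big publishers, newest first.
        big = [p for p in self.following[receiver] if self._is_big(p)]
        sources = [reversed(self.inbox[receiver])]
        sources += [reversed(self.outbox[p]) for p in big]
        merged = heapq.merge(*sources, key=lambda item: item[0], reverse=True)
        result = []
        for item in merged:
            # An item pushed before its publisher became "big" also sits in the outbox.
            if result and result[-1][0] == item[0]:
                continue
            result.append(item)
            if len(result) == limit:
                break
        return result
```

The trade-off is visible in where the loops sit: `publish` loops over followers on the push path, while `read` loops over followed big publishers on the pull path.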

4. Metadata

User details and lists.

Follow/friend relationships.

Push session pool: tracks which users are currently online so that they can be notified of new messages.
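
A push session pool can be sketched as a map from user ID to that user's live sessions. The class `SessionPool` and its method names are hypothetical, and the callback stands in for whatever long-lived connection a real deployment would hold.

```python
from collections import defaultdict


class SessionPool:
    """Hypothetical registry of online sessions, keyed by user."""

    def __init__(self):
        self._sessions = defaultdict(dict)  # user_id -> {session_id: callback}

    def connect(self, user_id, session_id, on_notify):
        self._sessions[user_id][session_id] = on_notify

    def disconnect(self, user_id, session_id):
        sessions = self._sessions.get(user_id)
        if sessions is None:
            return
        sessions.pop(session_id, None)
        if not sessions:
            # Drop the user entry once the last session is gone.
            del self._sessions[user_id]

    def notify(self, user_ids, event):
        # Only users with a live session are notified; offline users are skipped.
        delivered = 0
        for user_id in user_ids:
            for callback in list(self._sessions.get(user_id, {}).values()):
                callback(event)
                delivered += 1
        return delivered
```

Offline users lose nothing in this model, because the notification is only a hint to refresh; the messages themselves stay in the storage layer.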

5. Comments and Likes

Both are stored similarly to feed content, usually in a distributed NoSQL table, with optional full‑text indexing for search.
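
One possible key layout for comments and likes, sketched in memory: comments keyed by `(feed_id, comment_seq)` so that one feed's comments sort together, and likes keyed by `(feed_id, user_id)` so that liking twice is idempotent. The class `Interactions` and this key design are assumptions for illustration, not a layout the article prescribes.

```python
class Interactions:
    """Hypothetical comment and like store using composite keys."""

    def __init__(self):
        self.comments = {}  # (feed_id, comment_seq) -> (user_id, text)
        self.likes = set()  # {(feed_id, user_id)}
        self._seq = 0

    def add_comment(self, feed_id, user_id, text):
        self._seq += 1
        self.comments[(feed_id, self._seq)] = (user_id, text)
        return self._seq

    def list_comments(self, feed_id):
        # Equivalent to a range scan over the feed_id key prefix, oldest first.
        keys = sorted(k for k in self.comments if k[0] == feed_id)
        return [self.comments[k] for k in keys]

    def like(self, feed_id, user_id):
        # The composite key makes a repeated like a no-op.
        self.likes.add((feed_id, user_id))

    def unlike(self, feed_id, user_id):
        self.likes.discard((feed_id, user_id))

    def like_count(self, feed_id):
        return sum(1 for key in self.likes if key[0] == feed_id)
```

In a real store the count would usually be maintained as a counter rather than recomputed by scanning.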

6. Search

Simple keyword matching can be achieved via a search engine or a full‑text capable database (MySQL, MongoDB, Tablestore) by building appropriate indexes.
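
The index behind simple keyword matching is typically an inverted index: a map from each term to the set of feeds that contain it. The sketch below is a toy version with a hypothetical `KeywordIndex` class. Its regex tokenizer only splits on word characters, so it does not segment languages such as Chinese; a real search engine handles that with a proper analyzer.

```python
import re
from collections import defaultdict


class KeywordIndex:
    """Toy inverted index: term -> set of feed IDs."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokens(text):
        return set(re.findall(r"\w+", text.lower()))

    def add(self, feed_id, text):
        for term in self._tokens(text):
            self._postings[term].add(feed_id)

    def search(self, query):
        # AND semantics: a feed matches only if it contains every query term.
        terms = self._tokens(query)
        if not terms:
            return set()
        postings = [self._postings.get(term, set()) for term in terms]
        return set.intersection(*postings)
```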

7. Sorting

Time‑based sorting is used for classic feed products; recommendation‑based sorting requires a different architecture and is covered in separate articles.

8. Deletion & Update

Delete by removing the content from the storage layer or marking it as logically deleted; updates follow the same path, leveraging multi‑version storage when available.
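
Logical deletion and versioned updates can share one write path if every change is appended as a new version. The sketch below assumes a hypothetical `VersionedStore` in which a delete appends a tombstone instead of removing data; stores with native multi-version support provide this behavior themselves.

```python
class VersionedStore:
    """Hypothetical append-only store: each feed keeps a list of versions."""

    def __init__(self):
        self._rows = {}  # feed_id -> [(content, deleted)], oldest first

    def put(self, feed_id, content):
        # Create and update follow the same path: append a new version.
        self._rows.setdefault(feed_id, []).append((content, False))

    def delete(self, feed_id):
        # Logical delete: append a tombstone, keep the older versions.
        versions = self._rows.get(feed_id)
        if versions and not versions[-1][1]:
            versions.append((None, True))

    def get(self, feed_id):
        versions = self._rows.get(feed_id)
        if not versions or versions[-1][1]:
            return None
        return versions[-1][0]

    def version_count(self, feed_id):
        return len(self._rows.get(feed_id, []))
```

Readers only ever look at the latest version, so a deleted feed disappears from the stream while its history remains available for audit or physical cleanup later.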

Summary

A feed stream system consists of product definition, storage, synchronization, metadata, comment/like handling, search, sorting, and deletion/update. Two main implementation paths exist: a single‑system solution built on Alibaba Cloud Tablestore, or a combination of open‑source components (MySQL, Redis, HBase) for teams comfortable operating them.

Architecture Practice

Moments

Bi‑directional relationships, time‑based sorting, push mode.

Weibo

Single‑direction relationships, the "big V" effect (a few verified accounts with very large follower counts), push‑pull hybrid.

Toutiao

Recommendation‑driven feed, no explicit follow relationships.

Private Messages

One‑to‑one feed, simple storage and push.

Extended Reading

Related articles on Tablestore Timeline 2.0, modern IM message architecture, and Tablestore guide are linked for deeper exploration.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
