How We Scaled Feed Push: From Simple Push to Lazy Loading and Fan Filtering

This article explains the architecture and evolution of a feed push system, covering the basic push model, its early implementation, performance trade‑offs, and a series of optimizations—including lazy loading, length control, delayed push, active‑inactive fan filtering, and hot‑cold separation—to improve read efficiency, reduce storage costs, and handle massive fan bases.

Push Model Overview

The feed system delivers structured content to users based on relationships, interests, or scenarios, ordered by time or by algorithm. Three implementation styles exist: push (write diffusion), pull (read diffusion), and hybrid. This article focuses on the early push-mode implementation and its subsequent optimizations.

Early Push Implementation

Core Caches (Redis, no expiration)

User follow-list cache: stores brief feed entries of followed users in reverse-chronological order.

Feed detail cache: caches the full content of each feed.

User-follow-people cache: records the list of accounts a user follows.

Fan list cache: records the followers of a user.

Other auxiliary caches (e.g., “do-not-show” flags).
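
The original post does not show the concrete key layout, but a minimal sketch of how these caches might map onto Redis (key names, JSON payloads, and the Jedis client are all assumptions) could look like this:

import redis.clients.jedis.Jedis;

// Minimal sketch of the cache layout described above; key names and the
// Jedis client are assumptions, not the original implementation.
public class FeedCaches {
    private final Jedis redis = new Jedis("localhost", 6379);

    // follow-list cache: sorted set of brief feed entries, scored by publish time
    public void addToFollowList(long userId, String briefFeedJson, long publishTs) {
        redis.zadd("feed:follow_list:" + userId, publishTs, briefFeedJson);
    }

    // feed detail cache: hash keyed by feed id holding the full content
    public void putFeedDetail(long feedId, String fullFeedJson) {
        redis.hset("feed:detail", String.valueOf(feedId), fullFeedJson);
    }

    // user-follow-people cache: set of account ids the user follows
    public void addFollowee(long userId, long followeeId) {
        redis.sadd("feed:followees:" + userId, String.valueOf(followeeId));
    }

    // fan list cache: set of follower ids for a publisher
    public void addFan(long publisherId, long fanId) {
        redis.sadd("feed:fans:" + publisherId, String.valueOf(fanId));
    }
}

Modelling the follow-list cache as a sorted set scored by publish time is what makes the paged, reverse-chronological reads described later a single Redis call.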

Follow / Unfollow Module

When a user follows another, an asynchronous job fetches the recent feeds of the followed user and writes them into the follower’s follow‑list cache. The follow‑relation and fan caches are updated synchronously. Unfollow removes the corresponding entries. Asynchronous writes avoid time‑outs under high feed volume; synchronous writes are possible but risky.
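
As a sketch of that flow, assuming the FeedCaches helper above and a hypothetical FeedStore interface for database access, the follow path might look like this (unfollow would mirror it with removals):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the follow flow. FeedCaches is the cache sketch above; FeedStore
// and FeedEntry are hypothetical stand-ins for the database layer.
public class FollowService {

    public static class FeedEntry {
        public long feedId;
        public long publishTs;
        public String briefJson;
    }

    public interface FeedStore {
        List<FeedEntry> recentFeeds(long publisherId);
    }

    private final FeedCaches caches;
    private final FeedStore store;
    private final ExecutorService asyncPool = Executors.newFixedThreadPool(4);

    public FollowService(FeedCaches caches, FeedStore store) {
        this.caches = caches;
        this.store = store;
    }

    public void follow(long followerId, long followeeId) {
        // follow-relation and fan caches are updated synchronously
        caches.addFollowee(followerId, followeeId);
        caches.addFan(followeeId, followerId);

        // recent feeds are backfilled asynchronously so that a followee with a
        // large backlog does not make the follow request time out
        asyncPool.submit(() -> {
            for (FeedEntry feed : store.recentFeeds(followeeId)) {
                caches.addToFollowList(followerId, feed.briefJson, feed.publishTs);
            }
        });
    }
}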

Feed Publishing

After persisting a feed, the service writes the feed detail to the cache, reads the publisher’s fan list, and pushes a brief feed entry into each fan’s follow‑list cache.
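
A condensed sketch of that publish-time fan-out, reusing the same assumed key names and cache helper, might be:

import redis.clients.jedis.Jedis;

// Sketch of publish-time fan-out; key names and FeedCaches are the same
// assumptions used in the earlier cache sketch.
public class FeedPublisher {
    private final Jedis redis = new Jedis("localhost", 6379);
    private final FeedCaches caches = new FeedCaches();

    public void publish(long publisherId, long feedId, String fullJson,
                        String briefJson, long publishTs) {
        // 1. the feed is persisted to the DB first (omitted), then cached in full
        caches.putFeedDetail(feedId, fullJson);

        // 2. read the publisher's fan list and push a brief entry to every fan
        for (String fanId : redis.smembers("feed:fans:" + publisherId)) {
            caches.addToFollowList(Long.parseLong(fanId), briefJson, publishTs);
        }
    }
}

This is the space-for-time trade the article returns to later: every fan pays a little Redis storage so that reads stay a single sorted-set lookup.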

Reading a Follow Feed

Each user’s follow‑list cache holds feed IDs sorted by timestamp. The read path is:

Check the follow‑list cache. If healthy, fetch a page of feed IDs.

Retrieve feed details from the detail cache.

Filter deleted or hidden feeds and return the page.

If the cache fails, a database fallback is triggered:

Obtain the list of followees (from cache or DB).

For each followee, read one page of feeds from the DB.

Merge, re‑sort, parse, filter, and return the result.

The fallback prioritizes performance over strict correctness when the followee count is large.
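
Putting the cached path and the fallback together, a rough sketch (page-size handling, JSON parsing, and the followee lookup are simplified placeholders, and the types come from the earlier sketches) might look like this:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import redis.clients.jedis.Jedis;

// Sketch of the read path with DB fallback; key names and helper types are
// assumptions carried over from the earlier sketches.
public class FollowFeedReader {
    private final Jedis redis = new Jedis("localhost", 6379);
    private final FollowService.FeedStore store;

    public FollowFeedReader(FollowService.FeedStore store) {
        this.store = store;
    }

    public List<String> readPage(long userId, int pageSize) {
        String listKey = "feed:follow_list:" + userId;
        try {
            // healthy cache: one page of brief entries, newest first
            List<String> page = new ArrayList<>();
            for (String brief : redis.zrevrange(listKey, 0, pageSize - 1)) {
                // fetch full detail, then drop deleted or hidden feeds
                String detail = redis.hget("feed:detail", feedIdOf(brief));
                if (detail != null && !isHidden(detail)) {
                    page.add(detail);
                }
            }
            return page;
        } catch (Exception cacheFailure) {
            return fallbackFromDb(userId, pageSize);
        }
    }

    private List<String> fallbackFromDb(long userId, int pageSize) {
        // read one page per followee, merge, re-sort by time, trim to one page;
        // the real implementation parses and filters these rows before returning
        List<FollowService.FeedEntry> merged = new ArrayList<>();
        for (long followeeId : followeesOf(userId)) {
            merged.addAll(store.recentFeeds(followeeId));
        }
        merged.sort(Comparator.comparingLong((FollowService.FeedEntry f) -> f.publishTs).reversed());
        List<String> page = new ArrayList<>();
        for (FollowService.FeedEntry f : merged.subList(0, Math.min(pageSize, merged.size()))) {
            page.add(f.briefJson);
        }
        return page;
    }

    // placeholders for JSON parsing and relation lookup
    private String feedIdOf(String briefJson) { return briefJson; /* parse id in real code */ }
    private boolean isHidden(String detailJson) { return false;   /* check do-not-show flag */ }
    private List<Long> followeesOf(long userId) { return new ArrayList<>(); /* cache or DB */ }
}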

Push Model Optimizations

Lazy Loading

To reduce initial push latency, only a limited number of recent feeds are pushed synchronously when a user follows someone. When the cached list falls below a threshold during scrolling, a lazy‑loading module marks the request, reads older feeds from the DB starting from the smallest timestamp, and pushes them back into the cache via a Kafka “degradation channel”.
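
A minimal sketch of the trigger side, assuming a Kafka topic named feed-degradation-channel and a simple "userId:timestamp" message format (neither is specified in the original), might be:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import redis.clients.jedis.Jedis;

// Sketch of the lazy-loading trigger. Topic name, threshold, and message
// format are assumptions; the actual degradation-channel schema is not given.
public class LazyLoadTrigger {
    private static final long MIN_CACHED_FEEDS = 20;   // illustrative threshold

    private final Jedis redis = new Jedis("localhost", 6379);
    private final KafkaProducer<String, String> producer;

    public LazyLoadTrigger(Properties kafkaProps) {
        this.producer = new KafkaProducer<>(kafkaProps);
    }

    // called on each page read while the user scrolls
    public void maybeRequestOlderFeeds(long userId, long smallestCachedTs) {
        if (redis.zcard("feed:follow_list:" + userId) < MIN_CACHED_FEEDS) {
            // ask the lazy-loading consumer to read older feeds from the DB,
            // starting from the smallest cached timestamp, and push them back
            String msg = userId + ":" + smallestCachedTs;
            producer.send(new ProducerRecord<>("feed-degradation-channel", msg));
        }
    }
}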

Length Controller

A length‑controller trims each user’s follow‑list cache to keep its size within a configurable range, preventing unbounded Redis growth caused by lazy loading.
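
Because the follow-list cache is a sorted set scored by time, trimming reduces to one ZREMRANGEBYRANK call; the concrete limit below is an assumption:

import redis.clients.jedis.Jedis;

// Sketch of the length controller: keep only the newest MAX_FEEDS entries in
// each follow-list sorted set. The limit itself is an illustrative assumption.
public class FollowListTrimmer {
    private static final int MAX_FEEDS = 500;   // illustrative upper bound

    private final Jedis redis = new Jedis("localhost", 6379);

    public void trim(long userId) {
        // entries are scored by publish time, so rank 0 is the oldest;
        // remove everything except the newest MAX_FEEDS entries
        redis.zremrangeByRank("feed:follow_list:" + userId, 0, -(MAX_FEEDS + 1));
    }
}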

Delayed Push (Fold‑back Strategy)

For users prone to spamming, feeds are delayed before being pushed. The delay interval (e.g., 10 minutes) is configurable. If a delayed feed is the last one within the interval, it is pushed immediately; otherwise it is queued, written to the cache so it survives restarts, and pushed after the delay.
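
A simplified sketch of the queue-and-flush part (the immediate-push shortcut for the last feed in an interval is omitted, and the Redis backup key is an assumption) could be:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

// Sketch of the delayed-push (fold-back) idea: feeds from spam-prone users are
// parked for a configurable interval before fan-out. The Redis backup key and
// the delay default are assumptions based on the description above.
public class DelayedPusher {
    private final long delayMinutes;
    private final Jedis redis = new Jedis("localhost", 6379);
    private final FeedPublisher publisher;   // fan-out sketch from earlier
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

    public DelayedPusher(FeedPublisher publisher, long delayMinutes) {
        this.publisher = publisher;
        this.delayMinutes = delayMinutes;    // e.g. 10 minutes
    }

    public void submit(long publisherId, long feedId, String fullJson,
                       String briefJson, long publishTs) {
        // back the pending push up in Redis so it survives a restart
        redis.hset("feed:delayed:" + publisherId, String.valueOf(feedId), briefJson);

        scheduler.schedule(() -> {
            // after the delay, perform the normal fan-out and drop the backup
            publisher.publish(publisherId, feedId, fullJson, briefJson, publishTs);
            redis.hdel("feed:delayed:" + publisherId, String.valueOf(feedId));
        }, delayMinutes, TimeUnit.MINUTES);
    }
}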

Convergence of Modules

All cache‑expiration, Redis‑exception, and reload handling are consolidated into the follow module. Lazy loading, length control, and partial‑follow handling are routed through the degradation channel (Kafka). This unifies functionality, reduces error surface, and prepares for inactive‑fan filtering.

Inactive Fan Filtering System

Fans are divided into active and inactive groups. A full copy of the fan cache serves as the inactive-fan cache; the original cache remains for active fans. Daily logs mark a user as active when they browse their follow feed. Users who only follow but never browse for N consecutive days are classified as inactive, and their follow-list caches are removed.

When a high‑profile (“big V”) user publishes a feed, the system first pushes to online users, then filters the big V’s fans, and finally performs a slower push to the remaining fans, keeping latency for active users under one minute.
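
A sketch of how activity marking and the tiered fan-out might fit together, assuming an expiring per-user activity key and a background pool for the slow pass (none of which is specified in the original), is shown below:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import redis.clients.jedis.Jedis;

// Sketch of activity marking and the tiered big-V fan-out. Key names, the use
// of key expiry for the N-day window, and the slow-push pool are assumptions.
public class FanActivityFilter {
    private final Jedis redis = new Jedis("localhost", 6379);
    private final ExecutorService slowPushPool = Executors.newFixedThreadPool(2);

    // called by the daily log job when a user browses their follow feed
    public void markActive(long userId, int inactiveDays) {
        String key = "feed:active:" + userId;
        int ttlSeconds = inactiveDays * 24 * 3600;
        redis.set(key, "1");
        redis.expire(key, ttlSeconds);   // lapses after N idle days without a browse
    }

    public boolean isActive(long userId) {
        return redis.exists("feed:active:" + userId);
    }

    // tiered fan-out for a big-V publisher: active fans first, the rest later
    public void fanOut(Iterable<String> fanIds, String briefJson, long publishTs,
                       FeedCaches caches) {
        for (String fan : fanIds) {
            long fanId = Long.parseLong(fan);
            if (isActive(fanId)) {
                caches.addToFollowList(fanId, briefJson, publishTs);          // fast path
            } else {
                slowPushPool.submit(() ->
                        caches.addToFollowList(fanId, briefJson, publishTs)); // slow path
            }
        }
    }
}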

Design Trade‑offs and Outlook

Push mode excels when follow‑counts are high but fan counts are low, offering fast reads at the cost of higher write overhead and increased storage (each feed is duplicated across fan caches). As big‑V users with massive fan bases emerged, push mode faced latency and scalability challenges, prompting exploration of pull mode and custom SNS feed indexing.

Key design ideas:

Sacrifice write performance for read efficiency.

Use space‑to‑time trade‑offs by duplicating feeds across fan caches.

Distribute complexity across publishing and reading paths.

Converge feed‑push entry points to improve maintainability.

Cold‑hot separation and active‑inactive fan zones to reduce latency and storage.

Factory pattern for feed parsing.

Delayed push to prevent feed flooding.

Deferred unfollow handling via queues.

Bloom filter for fan shard existence checks.

Kryo serialization to shrink cache payloads.

Lazy loading and length trimming to control cache size.

Database fallback to guarantee availability despite cache failures.

Feed Parser Factory (example)

class FeedParserFactory {
    // action is the feed type identifier carried on each feed
    public FeedParser createParser(int action) {
        if (action == OriginalFeed.Action.getActionCode()) {
            return new OriginalFeedParser();
        }
        // ... branches for the other feed types go here ...
        // fall back to a default parser for unrecognized types
        // (DefaultFeedParser is an illustrative name)
        return new DefaultFeedParser();
    }
}
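
A caller on the read path would then obtain a parser per feed while assembling the page, along the lines of parserFactory.createParser(feed.getAction()).parse(feed); the getAction and parse method names here are assumptions, not part of the original code.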

Complete Push‑Mode Architecture

The full implementation consists of the following modules, all interacting to provide scalable, low‑latency feed delivery:

Follow / unfollow handling.

Feed generation and caching.

Delayed‑push (fold‑back) module.

Lazy‑loading module.

Length‑controller.

Database fallback strategy.

These modules are coordinated via Redis for fast cache operations and Kafka for degradation‑channel messaging, ensuring both performance and resilience.
