Design and Implementation of Feed Stream Architecture for NetEase Open Courses
The article details NetEase Open Courses’ feed‑stream architecture, describing how content ingestion, multi‑level filtering, vertically and horizontally split storage, Elasticsearch indexing, two‑tier caching, and micro‑service orchestration combine to deliver personalized, high‑availability course feeds while addressing scalability, consistency, and operational challenges.
Introduction
The term “feed stream” refers to the flow of subscribed information that users obtain either by pull or by push. Common examples include Weibo super topics, WeChat official‑account subscription messages, and Douyin video streams. Most mainstream apps use a feed‑stream design on the first screen. This article explains, from a business perspective, how the feed‑stream architecture for NetEase Open Courses is designed and efficiently implemented.
Business Architecture
The feed stream can be vertically divided into three parts:
Source content ingestion
Transition from source content to feed content
Feed content delivery
Requirements and Challenges
When abstracting the technical architecture from the business architecture, the design converges on several key technical points:
How to implement effective filtering strategies for content ingestion
How to ensure high availability of the content storage architecture
How to achieve precise, personalized content distribution
Overall Content Architecture Diagram
Design and Implementation
(1) Content Ingestion Model Design and Iteration
Version 1.0 of the ingestion model is shown in Figure 2.
NetEase Open Courses obtains content from two sources: self‑operated premium content and contracted third‑party PGC users. The content passes through machine review, manual first‑review, manual re‑review, spot checks, and patrols, filtering out illegal or sensitive material before entering the content pool. This simple model was sufficient at early stages but could not keep up with the rapid growth of user consumption.
Version 2.0 introduces “contracted self‑media” to increase content volume, similar to Baidu’s Baijiahao, Toutiao’s Toutiaohao, or WeChat public accounts. A funnel model then filters out content that does not match the business characteristics. Example filtering rules for music videos include:
Likes > 10,000
Title contains keywords such as “music”
Effective comment count exceeds a threshold
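The funnel rules above can be sketched as a simple predicate. This is a minimal illustration, not the production filter: the `Content` fields, thresholds, and the `min_comments` default are assumptions (only the 10,000-like and “music” keyword rules come from the article).

```python
from dataclasses import dataclass

@dataclass
class Content:
    """Illustrative content record; fields are assumptions, not the real schema."""
    title: str
    likes: int
    effective_comments: int

def passes_music_funnel(c: Content,
                        min_likes: int = 10_000,
                        keywords: tuple = ("music",),
                        min_comments: int = 50) -> bool:
    """Apply the example funnel rules for music videos: likes above a
    threshold, a keyword in the title, and enough effective comments."""
    return (c.likes > min_likes
            and any(k in c.title.lower() for k in keywords)
            and c.effective_comments >= min_comments)

clip = Content(title="Street music session", likes=12_000, effective_comments=80)
```

In production each rule would be data-driven per category rather than hard-coded, so operations can tune the funnel without a release.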
To handle the inevitable false positives/negatives of machine review, a “grade pool” concept is added (Figure 4), separating self‑operated content, contracted PGC content, and contracted self‑media content into premium, quality, and normal pools respectively. The grade pool is used for personalized recommendation algorithms and service degradation fallback.
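The source-to-pool routing can be captured in a small mapping. The pool names and source types follow the article; the function and dictionary names are illustrative, and the fallback to the normal pool is an assumed default for unknown sources.

```python
# Grade-pool routing sketch: pool names follow the article,
# source-type identifiers and the fallback policy are assumptions.
GRADE_POOLS = {
    "self_operated": "premium",
    "contracted_pgc": "quality",
    "contracted_self_media": "normal",
}

def assign_pool(source_type: str) -> str:
    """Map a content source to its grade pool; unknown sources fall
    back to the normal pool so delivery can still degrade gracefully."""
    return GRADE_POOLS.get(source_type, "normal")
```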
(2) Content Storage Architecture Design
Based on user behavior and content attributes, several attribute‑classification buckets are derived (Figure 6). Content is automatically mapped to specific buckets, enabling finer‑grained user behavior extraction and more accurate user profiling for recommendation algorithms.
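One way such automatic bucket mapping might look is a function that combines content attributes into a bucket key. The field names, category values, and duration bands here are all assumptions for illustration; the article does not specify the bucket scheme.

```python
def bucket_for(content: dict) -> str:
    """Illustrative attribute-classification bucketing: combine the
    first-level category with a duration band. Field names and band
    cutoffs are assumptions, not the real scheme."""
    category = content.get("category", "misc")
    duration = content.get("duration_sec", 0)
    band = "short" if duration < 300 else "medium" if duration < 1800 else "long"
    return f"{category}:{band}"
```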
After preprocessing, the underlying persistent storage is built (Figure 7).
2.1 Data Vertical Splitting
Each data record consists of three basic attributes: course content (title, description), user actions (likes, favorites), and personalization metadata (bucket, classification). The data is vertically split into three independent tables, allowing each part to be stored and accessed according to its own access patterns (Figure 8).
Vertical splitting avoids redundancy, reduces storage waste, and simplifies data aging and hot‑cold separation.
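The three-way split can be illustrated with an in-memory SQLite schema standing in for MySQL. Table and column names are assumptions; only the three attribute groups come from the article.

```python
import sqlite3

# In-memory stand-in for the MySQL store; names are illustrative.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE course_content (       -- course content: title, description
    content_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    description TEXT
);
CREATE TABLE user_action (          -- user actions: likes, favorites
    content_id INTEGER,
    user_id INTEGER,
    action TEXT,                    -- 'like' | 'favorite'
    PRIMARY KEY (content_id, user_id, action)
);
CREATE TABLE personalization_meta ( -- personalization metadata: bucket, classification
    content_id INTEGER PRIMARY KEY,
    bucket TEXT,
    classification TEXT
);
""")

# Each table can now age, cache, and scale on its own access pattern;
# they are joined by content_id only when a full record is needed.
db.execute("INSERT INTO course_content VALUES (1, 'Intro to ML', 'Basics')")
db.execute("INSERT INTO personalization_meta VALUES (1, 'cs:short', 'ml')")
row = db.execute("""
    SELECT c.title, m.bucket
    FROM course_content c JOIN personalization_meta m USING (content_id)
""").fetchone()
```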
2.2 Data Horizontal Splitting
Horizontal splitting (sharding) is applied to massive user‑behavior tables (likes, favorites, watch history) to support both analytical workloads and recommendation models.
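A minimal sketch of the shard routing, assuming hash-based sharding on user id (the article does not state the shard key or count). `crc32` is used because it is stable across processes, unlike Python's salted built-in `hash()`.

```python
import zlib

N_SHARDS = 16  # illustrative shard count, not from the article

def shard_for(user_id: int, n_shards: int = N_SHARDS) -> str:
    """Route a user's behavior rows (likes, favorites, watch history)
    to a physical table by a stable hash of the user id."""
    return f"user_behavior_{zlib.crc32(str(user_id).encode()) % n_shards:02d}"
```

Keeping one user's rows on one shard makes per-user reads (profile building, recommendation features) single-shard, while analytical jobs fan out across all shards.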
2.3 Building Elasticsearch Indexes
Elasticsearch is introduced to satisfy strong search requirements: a front‑end ES for C‑end basic search and a back‑end ES for B‑end operational needs (e.g., finding and taking down all “Honor of Kings” content from millions of records).
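A back-end takedown query might look like the following standard Query DSL body, built as a plain dict. The index layout and field names (`title`, `description`, `content_id`) are assumptions; only the DSL shapes (`bool`, `match_phrase`) are standard Elasticsearch.

```python
# Query-DSL body a B-end takedown tool might send to the back-end ES;
# field names and the phrase are illustrative.
takedown_query = {
    "query": {
        "bool": {
            "should": [  # match the phrase in title OR description
                {"match_phrase": {"title": "Honor of Kings"}},
                {"match_phrase": {"description": "Honor of Kings"}},
            ],
            "minimum_should_match": 1,
        }
    },
    "_source": ["content_id"],  # only ids are needed to drive takedowns
    "size": 1000,
}
```

Separating the front-end (C-end) and back-end (B-end) clusters keeps heavy operational scans like this off the user-facing search path.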
2.4 Business Cache Construction
A two‑level cache is used: the first level caches business‑specific data (e.g., classification relationships), and the second level caches fundamental data (e.g., course details). This design reduces data redundancy and simplifies consistency management between MySQL and Redis.
Cache‑DB synchronization scenarios include:
update DB → delete Redis key → Redis expiration + query‑back write‑through
update DB → message queue → asynchronous eviction/update
Consistency requirements vary by use case (absolute, minute‑level, hour‑level, etc.). For content publishing, a minute‑level delay is acceptable.
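The delete-then-query-back flow above can be sketched with dicts standing in for Redis and MySQL. The class and TTL value are illustrative; only the pattern (update DB, delete key, write back on the next read with an expiration) follows the article.

```python
import time

class CacheAside:
    """Minimal cache-aside sketch: dicts stand in for Redis and MySQL.
    Updates delete the cached key; the next read queries the DB and
    writes the value back with a TTL, bounding staleness to the TTL
    (minute-level, per the publishing requirement)."""

    def __init__(self, ttl: float = 60.0):
        self.db, self.cache, self.ttl = {}, {}, ttl

    def read(self, key):
        hit = self.cache.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]                  # cache hit, not yet expired
        value = self.db.get(key)           # miss: query back from the DB
        if value is not None:
            self.cache[key] = (value, time.monotonic() + self.ttl)
        return value

    def update(self, key, value):
        self.db[key] = value               # 1) update DB
        self.cache.pop(key, None)          # 2) delete the Redis key

store = CacheAside()
store.update("course:1", "Intro to ML")
```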
2.5 Storage Service Selection
MongoDB: insert/select‑only data that can tolerate degradation, failure, or loss
DDB: large tables with massive incremental growth; vertically split large fields (e.g., article body text)
MySQL: core relational business data
Redis: two‑level cache for core exposure data and top/hot items
Elasticsearch: business‑specific text and search data
2.6 Service Decomposition and Load Balancing
The overall call chain is long, spanning ingestion, persistence, caching, and recommendation. Each functional module is isolated as a microservice and communicates via middleware. To handle traffic spikes (e.g., mass takedown of politically sensitive content), two strategies are used: (1) queue DB traffic and consume it periodically, giving the DB “breathing room”; (2) scale out by adding DB instances.
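Strategy (1) can be sketched as a bounded batch drain: spikes land in a queue, and a periodic tick applies at most a fixed batch to the DB. The batch size, class name, and list-as-DB are assumptions for illustration.

```python
from collections import deque

class SmoothedWriter:
    """Sketch of queue-then-consume smoothing: writes go into a queue,
    and a periodic tick flushes a bounded batch, so a spike (e.g., a
    mass takedown) drains gradually instead of hammering the database."""

    def __init__(self, batch_size: int = 100):
        self.queue, self.batch_size, self.db = deque(), batch_size, []

    def submit(self, op):
        self.queue.append(op)      # the spike lands here, not on the DB

    def tick(self):
        """Called on a schedule; applies at most batch_size ops."""
        for _ in range(min(self.batch_size, len(self.queue))):
            self.db.append(self.queue.popleft())

w = SmoothedWriter(batch_size=100)
for i in range(250):               # simulate a 250-op takedown spike
    w.submit(("takedown", i))
w.tick()                           # first tick drains only 100 ops
```

In production the queue would be a message-queue topic and the tick a scheduled consumer, which also makes strategy (2), adding consumers/DB instances, a configuration change rather than a code change.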
(3) Personalized Content Delivery
The final step is to pull personalized content for each user (Figure 11).
Enhancements include finer‑grained buckets, the grade‑pool fallback for cold‑start or service outage, and both passive (automatic) and active (user‑initiated) feedback mechanisms.
Architecture Pain Points
Content synchronization: long data flow makes it hard to recover lost updates or deletions.
Dispersed storage: retrieving a complete content record requires joining many tables, degrading performance at scale.
Service proliferation: excessive microservices increase call‑chain length and operational overhead.
Conclusion
Maintaining the underlying business architecture is an ongoing process. A mature technical architecture must be tightly coupled with business characteristics, continuously iterated, and adapted as technology evolves. There is no “best” architecture—only the one that best fits the specific business needs.
Author
Cheng Jian – Joined NetEase Open Courses in 2018 as a senior backend R&D engineer, focusing on version iteration, recommendation systems, and service operation optimization.
NetEase Media Technology Team