How Huajiao Live Built a From‑Scratch Personalized Recommendation System

This article analyzes Huajiao Live's end‑to‑end recommendation pipeline, covering basic concepts, recall and ranking algorithms—including collaborative filtering, matrix factorization, deep learning models—and multi‑objective optimization, while detailing the engineering workflow for training, deployment, and real‑time serving in a live‑streaming environment.

Huajiao Technology

Apr 7, 2020

Introduction

Live‑streaming platforms need to recommend relevant streams from massive content pools. A typical recommendation pipeline consists of three stages: Recall (filter millions of items to a few thousand using low‑cost models), Feature‑based Ranking (refine to hundreds with richer features), and Final Ranking (select the top items for display).
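
The funnel described above can be sketched in a few lines. This is an illustrative outline only: the function name `funnel`, the three scorer callables, and the stage sizes are assumptions for the example, not details from Huajiao's system.

```python
from typing import Callable, List, TypeVar

Item = TypeVar("Item")

def funnel(
    candidates: List[Item],
    cheap_score: Callable[[Item], float],   # low-cost recall model
    rich_score: Callable[[Item], float],    # feature-rich ranking model
    final_score: Callable[[Item], float],   # final ordering for display
    n_recall: int = 1000,
    n_rank: int = 100,
    n_show: int = 10,
) -> List[Item]:
    """Each stage keeps fewer items, so it can afford a costlier scorer."""
    def top(items: List[Item], score: Callable[[Item], float], n: int) -> List[Item]:
        return sorted(items, key=score, reverse=True)[:n]

    recalled = top(candidates, cheap_score, n_recall)
    ranked = top(recalled, rich_score, n_rank)
    return top(ranked, final_score, n_show)
```

The point of the structure is cost control: the expensive scorer only ever sees the survivors of the cheaper stage before it.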

Recall Algorithms

Neighborhood‑based Collaborative Filtering

Item‑based collaborative filtering builds a similarity matrix between streamers (items). Because there are far fewer streamers than users, the matrix, whose size grows as O(n²) in the number of streamers n, stays tractable. The approach gives interpretable recommendations and can start serving a new user after only a few interactions. However, it is a heuristic rather than a model trained against an optimization objective, and the similarity matrix can still be memory‑intensive.
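
As a minimal sketch of the item‑based approach, the code below computes cosine similarity between streamers from binary co‑watch data and scores unseen streamers for a user. The input shape (`watch_log` as a user‑to‑set mapping) and the function names are assumptions for illustration; the article does not say which similarity measure Huajiao uses.

```python
import math
from collections import defaultdict

def item_similarity(watch_log):
    """watch_log: {user: set of streamers watched}.
    Returns cosine similarity on binary data: |U_a & U_b| / sqrt(|U_a| * |U_b|)."""
    count = defaultdict(int)                       # viewers per streamer
    co = defaultdict(lambda: defaultdict(int))     # co-watch counts
    for streamers in watch_log.values():
        for a in streamers:
            count[a] += 1
            for b in streamers:
                if a != b:
                    co[a][b] += 1
    return {
        a: {b: c / math.sqrt(count[a] * count[b]) for b, c in neighbors.items()}
        for a, neighbors in co.items()
    }

def recommend(user, watch_log, sim, k=3):
    """Score each unseen streamer by summed similarity to the user's history."""
    seen = watch_log[user]
    scores = defaultdict(float)
    for a in seen:
        for b, s in sim.get(a, {}).items():
            if b not in seen:
                scores[b] += s
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Because each recommendation is a sum of similarities to streamers the user already watched, the contributing streamers can be shown as the explanation.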

Latent‑Factor Collaborative Filtering

Matrix Factorization (MF) decomposes the user‑item interaction matrix into a low‑dimensional user matrix X and item matrix Y. Implicit feedback (e.g., watch time, click count) is binarized into a preference (1 if above a threshold, else 0), and each entry is weighted in the loss function by a confidence that reflects the strength of the feedback. Offline training yields X and Y; online inference scores any user‑item pair (i, j) with the dot product X_i·Y_j.

Advantages: simple, fast online, low storage. Limitations: limited expressiveness and difficulty handling very sparse data.
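
The confidence‑weighted objective can be written out as a short sketch. It assumes the common implicit‑feedback formulation in which confidence is c = 1 + alpha * r; the values `alpha=40.0`, `lam=0.1`, and the zero threshold are placeholder assumptions, not settings reported in the article.

```python
def preference_and_confidence(r, threshold=0.0, alpha=40.0):
    """Binarize raw feedback r into a preference and derive a confidence weight."""
    p = 1.0 if r > threshold else 0.0
    c = 1.0 + alpha * r          # assumed linear confidence
    return p, c

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def weighted_loss(R, X, Y, lam=0.1, threshold=0.0, alpha=40.0):
    """sum over (u, i) of c_ui * (p_ui - X_u . Y_i)^2, plus L2 regularization.
    R is a dense list-of-lists of raw feedback, for illustration only."""
    loss = 0.0
    for u, row in enumerate(R):
        for i, r in enumerate(row):
            p, c = preference_and_confidence(r, threshold, alpha)
            loss += c * (p - dot(X[u], Y[i])) ** 2
    reg = sum(v * v for vec in X + Y for v in vec)
    return loss + lam * reg
```

Online scoring needs only `dot(X[u], Y[i])`, which is why serving is cheap once X and Y are trained offline.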

Neural Collaborative Filtering (NCF)

NCF replaces the inner product with a deep neural network that learns a non‑linear interaction function from one‑hot encoded user and item IDs. The model consists of an embedding layer followed by multiple fully‑connected layers.

NeuMF

NeuMF combines a Generalized Matrix Factorization (GMF) branch with a Multi‑Layer Perceptron (MLP) branch, capturing both linear and non‑linear feature interactions.
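
A forward pass through the two branches might look like the sketch below. It uses plain Python lists in place of a deep‑learning framework, and all weights are passed in by the caller; the function names and the separate embeddings per branch are assumptions for illustration.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neumf_score(p_gmf, q_gmf, p_mlp, q_mlp, mlp_layers, h):
    """p_* are user embeddings, q_* are item embeddings (one pair per branch).
    mlp_layers is a list of (W, b) pairs; h weights the fused vector."""
    # GMF branch: element-wise product of user and item embeddings.
    gmf = [a * b for a, b in zip(p_gmf, q_gmf)]
    # MLP branch: concatenate embeddings, then apply fully connected layers.
    hidden = p_mlp + q_mlp
    for W, b in mlp_layers:
        hidden = relu([z + bi for z, bi in zip(matvec(W, hidden), b)])
    # Fuse both branches and map to a probability-like score.
    fused = gmf + hidden
    return sigmoid(sum(w * x for w, x in zip(h, fused)))
```

If `h` is all ones on the GMF part and zero on the MLP part, the score reduces to a sigmoid over the plain inner product, which shows how matrix factorization sits inside the model as a special case.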

Ranking Algorithms

Feature Engineering

Effective ranking relies on diverse features, including:

User demographics and profile

Contextual signals (time of day, device)

Historical behavior (clicks, watch time, gifts, comments)

Real‑time metrics (current view count, live gifts, chat activity)

Model Choices

Logistic Regression (LR): linear model, fast training, limited to linear interactions.

Factorization Machines (FM): adds second‑order feature cross terms automatically; can handle sparse high‑dimensional data.

GBDT + LR: Gradient Boosted Decision Trees generate high‑order feature combinations; the leaf indices are fed to a linear model.
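
For the Factorization Machines entry above, the second‑order term can be made concrete with a small sketch. It uses the standard identity that the sum of pairwise interactions equals half of (square of sum minus sum of squares) per latent dimension, which avoids the explicit double loop. The function name and argument layout are assumptions for the example.

```python
def fm_predict(x, w0, w, V):
    """x: feature values, w0: bias, w: linear weights,
    V[i]: k-dimensional latent vector of feature i."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        # 0.5 * ((sum)^2 - sum of squares) equals the sum over all pairs i < j
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise
```

Because the weight of a cross term is the inner product of two latent vectors, a pair of features that never co‑occurred in training still receives a usable weight, which is what helps on sparse data.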

Deep Models

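Wide & Deep, DeepFM, and DIN combine feature‑crossing components with deep neural networks. Wide & Deep pairs a linear model over hand‑crafted cross features with a deep network; DeepFM replaces that wide part with an FM so that second‑order crosses are learned automatically; DIN adds an attention mechanism that weights the embeddings of a user's historical behaviors by their relevance to the candidate item.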
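
The attention idea in DIN can be illustrated with a simplified sketch. Note the substitution: DIN scores each behavior with a small learned network, whereas this example swaps in a plain dot product with the candidate embedding so the code stays self‑contained. The weights are left unnormalized, and the function name is an assumption.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def attention_pool(behaviors, candidate):
    """behaviors: embeddings of streamers the user interacted with.
    candidate: embedding of the streamer being scored.
    Returns (weights, pooled user-interest vector)."""
    # Stand-in relevance score; a trained model would learn this function.
    weights = [dot(b, candidate) for b in behaviors]
    dim = len(candidate)
    pooled = [sum(w * b[d] for w, b in zip(weights, behaviors)) for d in range(dim)]
    return weights, pooled
```

The pooled vector changes with the candidate, so the same history yields a different user representation for each streamer being ranked.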

Multi‑Objective Optimization

Live‑streaming recommendation often optimizes several metrics simultaneously (click‑through, watch time, gifts, comments, follows, shares). Multi‑task models share a common embedding layer and add task‑specific towers.

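ESMM (Entire Space Multi‑Task Model): jointly predicts click‑through rate (CTR) and conversion rate (CVR). It models pCTCVR = pCTR × pCVR, the probability that an impression leads to a click followed by a conversion, and trains on all impressions rather than only on clicked samples, which mitigates sample‑selection bias.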

MMoE (Multi‑gate Mixture‑of‑Experts): splits the shared bottom layer into multiple expert sub‑networks and learns a gating function for each task, allowing flexible sharing.
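
For the MMoE entry above, the per‑task gating step can be sketched as follows. The expert outputs and gate logits are passed in directly; in a trained model both would be computed from the shared input by learned layers. The task names and function names are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def mmoe_mix(expert_outputs, gate_logits_per_task):
    """expert_outputs: list of expert output vectors of equal length.
    gate_logits_per_task: {task name: one logit per expert}.
    Returns the input to each task-specific tower."""
    dim = len(expert_outputs[0])
    mixed = {}
    for task, logits in gate_logits_per_task.items():
        gate = softmax(logits)   # each task gets its own weighting of experts
        mixed[task] = [
            sum(g * e[d] for g, e in zip(gate, expert_outputs)) for d in range(dim)
        ]
    return mixed
```

Tasks whose gates favor the same experts end up sharing representation, while conflicting tasks can drift toward different experts.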

Model Training and Deployment

Data collection gathers user‑streamer interactions and stores full‑snapshot samples in HDFS. Offline training runs daily or weekly on the entire dataset; incremental updates are processed via Flink streams. Trained models are served with TensorFlow Serving. A Go‑based recommendation service transforms incoming requests into feature vectors and calls the TensorFlow Serving endpoint for inference.
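
The serving call can be illustrated with TensorFlow Serving's REST predict endpoint. The production service described above is written in Go, and the article does not say whether it uses REST or gRPC, so this Python sketch is only an assumption‑laden illustration of the request shape; the host, the model name `live_ranker`, and the feature names are hypothetical.

```python
import json
from urllib import request

def build_predict_request(host, model_name, feature_rows):
    """Build (but do not send) a TensorFlow Serving REST predict request.
    feature_rows: one dict of named features per candidate streamer."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": feature_rows}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical usage: one row per (user, streamer) pair to be scored.
req = build_predict_request(
    "localhost:8501",
    "live_ranker",
    [{"user_id": 1, "streamer_id": 42, "hour_of_day": 21}],
)
```

Sending the request with `urllib.request.urlopen(req)` would return a JSON body whose `predictions` field holds one score per row.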

Conclusion

There is no universally “best” model for live‑streaming recommendation. The optimal solution depends on domain characteristics such as multimodal content, real‑time dynamics, and hotspot effects. Understanding the scene, extracting representative features, and selecting models that align with those features are essential for achieving superior recommendation performance.
