Push Precision Recommendation System: Overview, Iteration, and Design

This article gives an overview of the push precision recommendation system: its data processing pipeline, its machine‑learning algorithms, and its modular architecture with offline, near‑real‑time, and push layers. It then covers system iterations and optimizations, the visual monitoring platforms, and future development directions.

HomeTech

Aug 2, 2023

Preface

In an era of information overload, getting the right information to users quickly is a key challenge. As the user base and content volume of the AutoHome (ZhiJia) platform grow, traditional pull‑based methods, in which users must open the app and look for content themselves, no longer satisfy personalized information needs. A push system can deliver relevant content even when the app is not active, using machine‑learning models to make those recommendations precise and personalized.

1. System Overview

Push is an effective way to bring users back to the app (user recall in the re‑engagement sense). It covers two kinds of pushes: operational pushes (activities, notifications, hot topics) and algorithmic precision pushes (timed, personalized pushes). Its core modules are:

Data Processing: user data, content data, and historical behavior data.

Flow Prediction: recall, ranking, intervention, and filtering, which together generate push data that is ready to render.

JOB: scheduling and triggering push tasks.

Push Channel: device filtering, protocol and message packaging, app identification, and delivery.

Terminal: message aggregation, requests to device‑manufacturer or third‑party channels, and delivery.

APP: display via notification bar or pop‑up, reporting arrival and click metrics.

The system is described from three perspectives: data, algorithms, and architecture.

Data: The foundation of the system, comprising offline user profiles and real‑time behavior profiles.

Algorithms: Evolved from simple rule‑based strategies to tree models and then to deep neural networks, in order to handle high complexity and massive data volumes.

Architecture: Ensures near‑real‑time, fully automated operation, covering user behavior collection, feature extraction, data storage, and result generation.

2. System Iteration and Optimization

2.1 Chain Tasks

Push uses chained tasks: each batch passes in sequence through feature extraction, recall, ranking, shuffling, re‑ranking, and recommendation generation. This design has three drawbacks: feature data is hard to reuse, the execution chain is long, and integrating a new recall function is costly.

2.2 Asynchronous Services

Data Partitioning: Store feature data and intermediate results in partitioned tables to retain data after failures.

Flow Splitting: Divide the main prediction flow into independent sub‑processes (recall, ranking, re‑ranking, result fusion) executed asynchronously by tail number, improving stability and fault tolerance.

Platform Configuration: Dynamic experiment integration via a configuration platform reduces the cost of adding new strategies.
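
The data partitioning and flow splitting ideas above can be illustrated with a minimal sketch. It assumes the tail number is the last decimal digit of a numeric user ID, and it uses an in‑memory dict as a stand‑in for the partitioned tables. All names here (tail_number, run_partition, PARTITIONS, the toy recall and rank stages) are hypothetical and not taken from the actual system.

```python
import asyncio
from collections import defaultdict
from typing import Dict, List, Tuple

# Stand-in for partitioned tables: (stage name, tail number) -> rows.
# Hypothetical storage; a real system would persist this outside the process.
PARTITIONS: Dict[Tuple[str, int], List[dict]] = {}


def tail_number(user_id: int) -> int:
    """Assumed definition: the last decimal digit of the user ID."""
    return user_id % 10


async def recall(rows: List[dict]) -> List[dict]:
    # Toy recall stage: attach a fixed candidate list to every user.
    return [{**r, "candidates": [f"item{r['user_id'] % 3}", "hot1"]} for r in rows]


async def rank(rows: List[dict]) -> List[dict]:
    # Toy ranking stage: order the candidates alphabetically.
    return [{**r, "ranked": sorted(r["candidates"])} for r in rows]


STAGES = [("recall", recall), ("rank", rank)]


async def run_partition(tail: int, users: List[dict]) -> None:
    """Run every stage for one tail-number bucket, saving each stage's output."""
    rows = users
    for name, stage in STAGES:
        key = (name, tail)
        if key in PARTITIONS:
            # Output survived an earlier run, so reuse it instead of recomputing.
            rows = PARTITIONS[key]
            continue
        rows = await stage(rows)
        PARTITIONS[key] = rows


async def run_all(user_ids: List[int]) -> list:
    by_tail: Dict[int, List[dict]] = defaultdict(list)
    for uid in user_ids:
        by_tail[tail_number(uid)].append({"user_id": uid})
    # return_exceptions=True keeps one failing bucket from cancelling the others.
    return await asyncio.gather(
        *(run_partition(t, u) for t, u in by_tail.items()),
        return_exceptions=True,
    )
```

Calling asyncio.run(run_all([11, 21, 32])) processes buckets 1 and 2 independently. Because each stage's output is stored per bucket, a rerun after a failure only recomputes the stages that are missing.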

3. Overall Design

3.1 Business Architecture

The push system consists of three layers: offline, near‑line, and push.

Offline Layer: Heavy‑weight batch processing for data handling, feature computation, and offline prediction using Spark models.

Near‑Line Layer: Near‑real‑time processing of Kafka streams to compute real‑time user features, which are fused with long‑term features and fed into TensorFlow models for instant predictions.

Push Layer: Merges offline and real‑time predictions (preferring real‑time), batches them, and schedules delivery to users.
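
As a rough illustration of the push layer's merge step, the sketch below combines per‑user offline and real‑time predictions, prefers the real‑time result whenever one exists, and then splits users into batches. The function names, the dict‑based data shapes, and the batch size are assumptions made for this example only.

```python
from typing import Dict, Iterator, List


def merge_predictions(
    offline: Dict[str, List[str]],
    realtime: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Merge per-user predictions, preferring a non-empty real-time result."""
    merged = dict(offline)
    for user_id, items in realtime.items():
        if items:  # an empty real-time result falls back to offline, if any
            merged[user_id] = items
    return merged


def in_batches(user_ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` users for scheduled delivery."""
    for start in range(0, len(user_ids), size):
        yield user_ids[start:start + size]
```

For example, merging {"u1": ["a"], "u2": ["b"]} with the real‑time result {"u2": ["c"]} keeps the offline prediction for u1 and replaces the one for u2.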

3.2 Technical Architecture

The design follows a layered modeling and filtering approach: each layer narrows the massive data pool further, until only the content a user is genuinely interested in remains.

3.2.1 User Features

User features consist of raw attributes (from profile tables, behavior logs, content data) and computed attributes used for recall.

3.2.2 Material Features

Materials include original articles and videos, posts, reviews, and Q&A. Each carries attributes such as the car series of interest, tags, author follow status, click/view/favorite counts, and interaction rates.

3.2.3 Prediction Process

The prediction pipeline includes:

Recall: Reduce tens of millions of items to thousands using popular, tag‑based, and collaborative recall methods.

Filtering: Exclude already exposed or clicked items and items irrelevant to the user's city.

Fine Ranking: Score recalled items with a model and sort by score.

Re‑ranking: Adjust the sorted list (e.g., control frequency of a specific car series, boost resources from high‑performing strategies).
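
The four stages above can be sketched as a small pipeline. This is a simplified illustration, not the production logic: the Item fields, the rule that an empty city means "not city‑specific", the scoring function passed in by the caller, and the per‑series cap used for re‑ranking are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Item:
    item_id: str
    car_series: str
    city: str  # assumption: "" means the item is not tied to any city


def recall(sources: List[List[Item]], limit: int) -> List[Item]:
    """Union the candidates of several recall sources, de-duplicated and capped."""
    seen: Set[str] = set()
    out: List[Item] = []
    for source in sources:
        for item in source:
            if item.item_id not in seen:
                seen.add(item.item_id)
                out.append(item)
    return out[:limit]


def filter_items(items: List[Item], seen_ids: Set[str], user_city: str) -> List[Item]:
    """Drop items already exposed or clicked, and items tied to another city."""
    return [
        i for i in items
        if i.item_id not in seen_ids and i.city in ("", user_city)
    ]


def fine_rank(items: List[Item], score: Callable[[Item], float]) -> List[Item]:
    """Sort by model score, highest first."""
    return sorted(items, key=score, reverse=True)


def rerank(items: List[Item], max_per_series: int) -> List[Item]:
    """Cap how many items of the same car series may appear, keeping rank order."""
    counts: Dict[str, int] = {}
    out: List[Item] = []
    for item in items:
        n = counts.get(item.car_series, 0)
        if n < max_per_series:
            counts[item.car_series] = n + 1
            out.append(item)
    return out
```

A caller would chain the stages as rerank(fine_rank(filter_items(recall(...), ...), score), max_per_series). Keeping each stage a plain function over a list makes it easy to swap in a new recall source or ranking model without touching the others.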

4. Visualization Platform

The push system relies on scheduled tasks that run multiple times daily, pushing the latest material to users. Real‑time monitoring is preferred over post‑mortem analysis.

Unified Scheduling Platform: Supports main‑process execution, failure rerun, timeout alerts, and log inspection.

Report Platform: Provides statistics on push strategy open rates, experiment open rates, tail‑number open rates, content pool material stats, and recall result alerts.

Configuration Platform: Enables dynamic AB testing, recall strategy switching, ranking model updates, operational rules, and filtering logic for personalized time‑slot configurations.
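
To make the report platform's statistics concrete, here is a minimal sketch that computes an open rate per push strategy and flags strategies whose recall result is suspiciously small. It assumes that open rate means clicks divided by arrivals and that events arrive as (strategy, event type) pairs. The function names and the alert threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def open_rates(events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute clicks / arrivals per strategy (assumed definition of open rate).

    Each event is (strategy, kind), where kind is "arrival" or "click".
    """
    arrivals: Dict[str, int] = defaultdict(int)
    clicks: Dict[str, int] = defaultdict(int)
    for strategy, kind in events:
        if kind == "arrival":
            arrivals[strategy] += 1
        elif kind == "click":
            clicks[strategy] += 1
    # Strategies with no arrivals are skipped to avoid dividing by zero.
    return {s: clicks[s] / n for s, n in arrivals.items() if n > 0}


def low_recall_alerts(recall_counts: Dict[str, int], min_items: int) -> List[str]:
    """Return the strategies whose recall produced fewer than `min_items` items."""
    return sorted(s for s, n in recall_counts.items() if n < min_items)
```

The same aggregation works for experiment or tail‑number open rates by changing what the first element of each event identifies.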

5. Conclusion

The push system is a core feature of the AutoHome app, delivering timely and engaging content to tens of millions of users daily and thereby boosting active usage and user stickiness. Future work will focus on re‑activating silent users, increasing user activity, and improving the overall user experience.
