
Deep Learning Practices for Multi‑Business Integrated Recommendation: From Dual‑Channel to Multi‑Channel Interest Models and Multi‑Scenario Adaptation

The article details how 58.com tackled the challenges of multi‑business recommendation by evolving its ranking models from a dual‑channel deep interest architecture to a 1+N multi‑channel deep interest model, incorporating customized feature cross layers, scenario‑adaptation mechanisms, and extensive engineering optimizations that yielded significant CTR and conversion gains.

58 Tech

Background: The "You Might Like" homepage of 58.com is the platform's largest recommendation scenario, handling tens of millions of daily active users, billions of candidate posts, and over a hundred million training samples. It faces two prominent challenges: multi‑business fusion (covering housing, recruitment, second‑hand goods, local services, etc.) and multi‑objective optimization (balancing connection efficiency, revenue, user experience, retention, and operational activities).

Traditional models such as XGBoost (XGB) and factorization machines (FM) struggle in this setting: features are hard to align across heterogeneous business posts, and each business line demands its own costly feature engineering.

Model evolution: To address these issues, the ranking system progressed from a dual‑channel deep interest model to a 1+N multi‑channel deep interest model, and is now exploring multi‑scenario adaptation. The dual‑channel model introduced behavior‑sequence interest modeling, while the multi‑channel model adds separate channels for click, search, conversion, and content behaviors.

Deep interest model development: Early experiments compared DIN, DIEN, and Transformer for sequence modeling. Transformer was ultimately chosen for its scalability and comparable accuracy, since user interest in 58's scenarios evolves only modestly and DIEN's interest-evolution machinery offered little extra benefit. Pure sequence models, however, could not surpass well-engineered XGB models on their own.
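To make the sequence-modeling choice concrete, here is a minimal numpy sketch of Transformer-style interest extraction: single-head scaled dot-product self-attention over a user's behavior sequence, mean-pooled into one interest vector. Dimensions, pooling choice, and the absence of learned projection matrices are simplifications for illustration, not 58's actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_interest(seq_emb):
    """Single-head scaled dot-product self-attention over a behavior
    sequence, mean-pooled into a single user-interest vector."""
    d = seq_emb.shape[-1]
    scores = seq_emb @ seq_emb.T / np.sqrt(d)   # (T, T) pairwise relevance
    attn = softmax(scores, axis=-1)             # row-normalized weights
    contextualized = attn @ seq_emb             # (T, d) re-weighted items
    return contextualized.mean(axis=0)          # (d,) pooled interest

rng = np.random.default_rng(0)
seq = rng.normal(size=(8, 16))     # 8 clicked posts, 16-dim embeddings
interest = self_attention_interest(seq)
print(interest.shape)              # (16,)
```

A production model would add learned query/key/value projections, positional information, and multiple heads; the pooled vector here stands in for the "expressive interest" the later sections refer to.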

Feature‑engineering integration: To reduce engineering effort, a customized cross‑feature layer was added, embedding high‑level business features (e.g., matching cross features) directly into the model. The architecture consists of four layers: Input (raw post, user, context features), Vectorization (embedding or pretrained vectors), Cross (operations such as Cosine similarity, DNN, Multiply, GAP), and Concatenation.
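The four cross operations named above can be sketched as follows. This is an assumed, toy rendering of the Cross layer: the weight matrix `W`, the 16-dim inputs, and the single ReLU layer are illustrative stand-ins, not the article's real architecture.

```python
import numpy as np

rng = np.random.default_rng(42)
DIM = 16
W = rng.normal(scale=0.1, size=(2 * DIM, 8))   # toy weight for the DNN cross

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def cross_layer(user_vec, post_vec):
    """Apply the four cross operations the article names, then concatenate."""
    cos = np.array([cosine(user_vec, post_vec)])              # Cosine: match score
    mul = user_vec * post_vec                                 # Multiply: elementwise
    gap = np.array([np.concatenate([user_vec, post_vec]).mean()])  # GAP
    dnn = np.maximum(0.0, np.concatenate([user_vec, post_vec]) @ W)  # DNN cross
    return np.concatenate([cos, mul, gap, dnn])               # Concatenation layer

u = rng.normal(size=DIM)
p = rng.normal(size=DIM)
out = cross_layer(u, p)
print(out.shape)   # 1 + 16 + 1 + 8 -> (26,)
```

The design point is that hand-crafted match signals (Cosine) live side by side with learned interactions (DNN) in one concatenated vector, which is what lets the model absorb prior feature-engineering work.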

The dual-channel architecture fuses the customized channel with the pure sequence channel via an MLP, achieving a stable 3% CTR lift and 5% conversion lift over the XGB baseline while simplifying the feature-engineering pipeline.

The multi-channel architecture expands the model to five behavior channels: Click (post ID and key attributes), Search (query-post semantic matching via word vectors), Conversion (cluster-ID representation to handle sparsity), Content (behavior-window collaborative representation, still under exploration), and a Customized channel that injects scene-specific cross features.
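A minimal sketch of the 1+N fusion: each behavior channel is pooled into its own interest vector, then concatenated with the customized channel's cross features before the shared MLP. The per-channel mean pooling, dimensions, and random data here are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
D = 8

def channel_interest(seq):
    """Placeholder pooling; each channel would use its own interest model."""
    return seq.mean(axis=0)

channels = {
    "click":      rng.normal(size=(10, D)),  # post-ID & key-attribute embeddings
    "search":     rng.normal(size=(4, D)),   # query-post word-vector matches
    "conversion": rng.normal(size=(2, D)),   # cluster-ID embeddings (sparse)
    "content":    rng.normal(size=(6, D)),   # behavior-window representation
}
customized = rng.normal(size=D)              # scene-specific cross features

fused = np.concatenate(
    [channel_interest(s) for s in channels.values()] + [customized]
)
print(fused.shape)   # 4 channels * 8 + 8 -> (40,)
```

Keeping the channels separate until this late concatenation is what lets each behavior type (click vs. conversion vs. search) keep a representation suited to its own sparsity and semantics.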

Multi‑scenario adaptation: A scenario‑adaptation layer uses attention to align customized, basic, and expressive interests with a learned scene representation (scene ID embedded in the click channel). Richer cross‑feature logic is added to the customized channel to support diverse scene requirements.
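The scenario-adaptation attention described above can be sketched like this: the learned scene embedding acts as the query, scores each interest vector (customized, basic, expressive), and the weighted sum becomes the scene-adapted representation. All shapes and the scoring function are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def scene_adapt(scene_emb, interests):
    """Attention: the scene embedding scores each interest vector and
    the weighted sum yields the scene-adapted interest."""
    d = scene_emb.shape[-1]
    scores = interests @ scene_emb / np.sqrt(d)   # one score per interest
    weights = softmax(scores)                     # attention distribution
    return weights @ interests                    # (d,) adapted interest

rng = np.random.default_rng(1)
interests = rng.normal(size=(3, 16))  # customized, basic, expressive interests
scene = rng.normal(size=16)           # learned scene-ID embedding
adapted = scene_adapt(scene, interests)
print(adapted.shape)                  # (16,)
```

Because the scene ID is embedded from the click channel, scenes with similar click behavior end up with similar attention patterns, which is the transferability property the article is aiming for.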

Engineering practice: Training time was cut from five days to five hours through parallelism, data-format changes, and batch-size tuning (15-20). The online timeout rate dropped from 10% to 0.3% by decoupling the vectorization layer into Redis, sharing user data across batches, and optimizing request batch sizes. Model tuning focused on behavior representation (post ID, cluster ID, word-vector semantics) and introduced clustering IDs for better generalization.

Key operational issue: A log‑delay mismatch caused training data to be more up‑to‑date than online data, harming performance. The fix shifted the training window back by two minutes, restoring alignment and improving click‑through rates.
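The fix is simple but easy to get wrong, so here is a minimal sketch of the idea: shift the training cutoff back by the observed log lag so training samples never contain features fresher than what the online service would have seen. The two-minute constant comes from the article; the sample format is a hypothetical illustration.

```python
from datetime import datetime, timedelta

LOG_DELAY = timedelta(minutes=2)  # observed feature-log lag (from the article)

def training_cutoff(now):
    """Shift the training window back so training data is no fresher
    than the features the online service actually had at serving time."""
    return now - LOG_DELAY

def select_samples(samples, now):
    cutoff = training_cutoff(now)
    return [s for s in samples if s["ts"] <= cutoff]

now = datetime(2021, 6, 1, 12, 0, 0)
samples = [
    {"id": 1, "ts": datetime(2021, 6, 1, 11, 57)},
    {"id": 2, "ts": datetime(2021, 6, 1, 11, 59)},  # too fresh: inside the lag
]
print([s["id"] for s in select_samples(samples, now)])  # [1]
```

Without this shift the model is trained on features the serving path cannot reproduce, a classic training/serving skew that silently degrades online CTR.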

Future directions: Optimize the content‑behavior channel (behavior‑window collaborative representation), incorporate negative‑feedback sequences, enhance scenario‑transferability, and pursue multi‑objective optimization (ESSM, reinforcement learning) to balance business goals.

Overall impact: The multi‑channel deep interest model has become the primary online model for the homepage recommendation, delivering over 10% improvement in exposure conversion and contributing to a 50% uplift in overall system metrics across recall, ranking, and presentation layers.

References:

Covington et al., Deep Neural Networks for YouTube Recommendations, ACM RecSys 2016.

Zhou et al., Deep Interest Network for Click‑Through Rate Prediction, 2017.

Zhou et al., Deep Interest Evolution Network for Click‑Through Rate Prediction, 2018.

Mikolov et al., Distributed Representations of Words and Phrases and their Compositionality, NIPS 2013.

Feng et al., Deep Session Interest Network for Click‑Through Rate Prediction, 2019.

Chen et al., Behavior Sequence Transformer for E-commerce Recommendation in Alibaba, 2019.

Pi et al., Practice on Long Sequential User Behavior Modeling for Click‑Through Rate Prediction, 2019.

Pi et al., Search-based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction, 2020.

Tags: feature engineering, recommendation, deep learning, ranking, multi-channel, interest modeling, scenario adaptation
Written by

58 Tech

Official tech channel of 58, a platform for tech innovation, sharing, and communication.
