
Deep Interest Modeling and Multi‑Channel Recommendation for 58.com Home Page

This article presents the challenges of large‑scale home‑page recommendation at 58.com, describes how behavior‑sequence models such as DIN, DIEN and Transformer are applied and evolved into double‑channel and multi‑channel deep interest architectures, and details offline and online performance optimizations that yielded significant gains in click‑through and conversion rates.

58 Tech

The home‑page recommendation scenario of 58.com serves tens of millions of users with billions of candidate posts and billions of training samples daily, spanning multiple business domains such as housing, jobs, second‑hand goods, and local services.

Key challenges include feature alignment across heterogeneous business attributes, complex feature‑engineering pipelines, and high‑variance feedback signals, which limit the effectiveness of traditional models.

To address these issues, the team adopted behavior‑sequence modeling as the core input, constructing user sequences from click, conversion, and search actions, and experimented with three deep models: DIN (attention‑based), DIEN (interest extraction + evolution), and Transformer (multi‑head attention with residual connections).
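The core idea shared by all three models is to pool a user's behavior-sequence embeddings into a single interest vector, weighted by relevance to the candidate post. A minimal numpy sketch of DIN-style attention pooling is below; the embeddings and the dot-product scorer are illustrative simplifications (DIN itself scores with a small MLP over the behavior, the candidate, and their interaction).

```python
import numpy as np

def din_attention_pool(candidate, behaviors):
    """DIN-style interest pooling: weight each behavior embedding by its
    relevance to the candidate post, then sum.

    candidate: (d,) embedding of the post being scored
    behaviors: (T, d) embeddings of the user's recent behavior sequence
    """
    # Relevance scores via dot product; DIN proper uses a small MLP over
    # [behavior, candidate, behavior * candidate].
    scores = behaviors @ candidate            # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over the sequence
    return weights @ behaviors                # (d,) pooled interest vector

# Toy example: a 3-step behavior sequence in a 4-dim embedding space.
rng = np.random.default_rng(0)
cand = rng.normal(size=4)
seq = rng.normal(size=(3, 4))
interest = din_attention_pool(cand, seq)
```

DIEN and the Transformer replace this single attention step with, respectively, a GRU-based interest-evolution layer and multi-head self-attention with residual connections, but the output consumed by the ranking network is still a fixed-size interest vector.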

Initial experiments showed that both DIEN and the Transformer outperform DIN in online metrics, with DIEN consuming noticeably more compute than the Transformer; because most users exhibit only limited dynamic behavior, the final production model adopted a Transformer‑based architecture.

The double‑channel deep interest model combines a customized feature channel (static post and user attributes, cross‑features) with a serialized sequence channel (Transformer‑derived interest), feeding both into an MLP. Offline optimizations such as negative sampling, MapReduce‑based feature preprocessing, TFRecord storage, and TensorFlow Dataset API raised GPU utilization from <30% to >90% and reduced training time from five days to five hours.
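Structurally, the double-channel model reduces to concatenating the two channel outputs before the shared MLP. The numpy sketch below shows that wiring under assumed dimensions; the feature sizes, weights, and function names are illustrative, not the production configuration.

```python
import numpy as np

def mlp(x, layers):
    """Plain ReLU MLP; the final layer emits a single ranking logit."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

def double_channel_score(static_feats, interest_vec, layers):
    """Concatenate the customized feature channel (static post/user
    attributes and cross-features) with the serialized sequence channel
    (Transformer-derived interest vector), then score with an MLP."""
    x = np.concatenate([static_feats, interest_vec])
    return mlp(x, layers)

rng = np.random.default_rng(1)
static = rng.normal(size=8)      # customized feature channel
interest = rng.normal(size=4)    # sequence channel output
layers = [(rng.normal(size=(12, 16)), np.zeros(16)),
          (rng.normal(size=(16, 1)), np.zeros(1))]
logit = double_channel_score(static, interest, layers)
```

The multi-channel variant described later extends the same pattern: each behavior stream contributes its own vector to the concatenation.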

Online optimizations included parallel feature extraction and scoring via batch requests to TensorFlow Serving, batch‑level sharing of user features, model decoupling to handle large protobuf (PB) model files, and request‑size compression, achieving sub‑10 ms latency for 95% of requests.
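The batch-scoring call can be sketched as building one TensorFlow Serving REST predict payload per candidate batch, with the user features computed once and replicated across every instance; TF Serving's REST API accepts a JSON body of the form `{"instances": [...]}`. The feature names below are hypothetical, since the real model's signature defines them.

```python
import json

def build_batch_request(user_feats, post_feats_list):
    """Build one TensorFlow Serving REST predict payload that scores many
    candidate posts in a single call. User features are computed once per
    request and shared across every instance in the batch."""
    instances = [{**user_feats, **post_feats} for post_feats in post_feats_list]
    return json.dumps({"instances": instances})

# Hypothetical feature names for illustration only.
user = {"user_city": 1, "seq_ids": [101, 102, 103]}
posts = [{"post_id": 7, "post_cate": 2},
         {"post_id": 9, "post_cate": 5}]
payload = build_batch_request(user, posts)
# POST payload to http://<serving-host>/v1/models/<model-name>:predict
```

Sharing the user features at the batch level avoids recomputing (and retransmitting) the behavior sequence for each of the hundreds of candidates scored per request, which is one source of the request-size compression mentioned above.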

Further refinements aligned offline and online behavior sequences, introduced Word2Vec embeddings for posts and user actions, and added a multi‑channel architecture that separates click, conversion, and search behavior streams, each with tailored representations (e.g., clustered IDs for conversion, averaged word vectors for search).
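For the search channel, a query can be represented as the average of its tokens' Word2Vec vectors. A minimal sketch, assuming a pre-trained word-vector table (the toy vectors below are illustrative); out-of-vocabulary tokens are skipped and an empty query maps to the zero vector.

```python
import numpy as np

def search_query_vector(tokens, word_vectors, dim=4):
    """Represent a search query as the mean of its tokens' word vectors,
    skipping out-of-vocabulary tokens; empty queries map to zeros."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# Toy word-vector table; in production these come from a Word2Vec model
# trained on post and query corpora.
wv = {"rent": np.array([1.0, 0.0, 0.0, 0.0]),
      "apartment": np.array([0.0, 1.0, 0.0, 0.0])}
q = search_query_vector(["rent", "apartment", "cheap"], wv)
```

The resulting query vector feeds the search stream of the multi-channel model alongside the clustered-ID representation used for the conversion stream.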

Extensive A/B tests demonstrated up to 14% lift in click‑through rate and 17% increase in conversion rate compared to baseline rule‑based ranking, with the multi‑channel model delivering an additional ~20% improvement over the double‑channel version.

Future work includes incorporating content‑behavior channels (e.g., article browsing) and extending the framework to support multiple scene‑specific adaptations through customized feature cross‑engineering.

AI, deep learning, Transformer, large-scale systems, user interest modeling, sequence modeling
Written by 58 Tech

Official tech channel of 58, a platform for tech innovation, sharing, and communication.