Exploring Multi-Objective Recommendation Algorithms for 58 Community: Cross-Domain Embedding and Online Optimization
This article details how 58 Community improved content value share, click‑through, and user retention by designing a generalized multi‑objective recommendation algorithm that leverages cross‑domain embeddings, DeepFM‑DIN models, EGES‑inspired pre‑training, and online CEM‑based parameter optimization.
01 58 Community Business and Background
58 Community is a content platform for local users of 58.com, offering PGC and UGC feeds with rich formats such as images, videos, and audio. Its mission is to connect all 58 services, acting as a bridge between real‑estate, recruitment, automotive, and local life domains.
02 Evolution of Business Goals
Initially the goal was single‑objective ranking to boost click‑through rate (CTR). As the product matured, multiple objectives such as interaction rate, like rate, and user retention became important, leading to multi‑task ranking models like the shared‑bottom ESMM and MMoE.
Key targets include increasing the proportion of "value content" (content tightly related to 58’s core services) while keeping CTR stable, and improving retention without sacrificing CTR or interaction metrics.
03 Exploration of a Generalized Multi‑Objective Algorithm
Value content is defined as items closely linked to 58’s core business (housing, cars, recruitment, local life). The team first attempted a naive approach by adding cross‑domain behavior sequences to the DIN model, but faced issues such as sparse cross‑domain IDs, slow training, and negligible weight for new features.
To address this, they adopted an EGES‑style cross‑domain embedding pre‑training: extracting 2‑3 core attributes per business line as edge information, compressing IDs, and constructing a weighted user‑behavior graph. DeepWalk‑style random walks generated sequences for word2vec‑like embedding training, producing joint embeddings for items and edge attributes.
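The walk‑and‑embed step above can be sketched as follows. This is a minimal illustration, not the team's implementation: the toy graph, function names, and weights are assumptions, and in practice the walks would feed a word2vec/skip‑gram trainer to produce the joint item/attribute embeddings.

```python
import random

def weighted_random_walk(graph, start, walk_length, rng):
    """One DeepWalk-style walk: each next node is chosen with probability
    proportional to edge weight (the weighted-graph sampling EGES uses)."""
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break
        nodes, weights = zip(*neighbors)
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk

def generate_corpus(graph, walks_per_node=10, walk_length=20, seed=42):
    """Walks become 'sentences' for word2vec-like skip-gram training."""
    rng = random.Random(seed)
    corpus = []
    for node in graph:
        for _ in range(walks_per_node):
            corpus.append(weighted_random_walk(graph, node, walk_length, rng))
    return corpus

# Toy behaviour graph linking items across business lines; edge weights
# stand in for cross-domain co-click counts (hypothetical values).
graph = {
    "house_1": [("job_7", 3), ("house_2", 1)],
    "house_2": [("house_1", 1)],
    "job_7":   [("house_1", 3)],
}
corpus = generate_corpus(graph, walks_per_node=2, walk_length=5)
print(len(corpus))  # 2 walks per node, 3 nodes -> 6 walks
```

Training skip‑gram over such walks yields embeddings in which items frequently co‑visited across domains end up close together, which is what lets the sparse cross‑domain IDs share signal.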
These pre‑trained embeddings replaced the original pooling embeddings in the DIN sequence, resulting in higher value‑content share (12% → 28%) and a modest CTR lift.
04 Retention Optimization and Online Parameter Tuning
Retention analysis identified four influential factors: interaction rate, first‑visit content type weight, last‑visit content type weight, and diversity. A re‑ranking score was defined as CTR + a·interaction + b·first‑type + c·last‑type, with a diversity adjustment factor θ.
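The re‑ranking score reads directly as a weighted sum; the sketch below follows the formula in the article, though how θ is applied (here, as a multiplicative demotion for repeated categories) is an assumption.

```python
def rerank_score(ctr, interaction, first_type_w, last_type_w,
                 a, b, c, theta, is_duplicate_category):
    """score = CTR + a*interaction + b*first_type + c*last_type,
    scaled by the diversity factor theta when the item repeats a
    category already shown nearby (theta < 1 demotes repeats)."""
    score = ctr + a * interaction + b * first_type_w + c * last_type_w
    if is_duplicate_category:
        score *= theta
    return score

# Example with hypothetical parameter values:
s = rerank_score(0.10, 0.05, 1.0, 0.0,
                 a=0.5, b=0.02, c=0.02,
                 theta=0.8, is_duplicate_category=True)
print(round(s, 4))  # 0.8 * (0.10 + 0.025 + 0.02) = 0.116
```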
To find optimal hyper‑parameters (a, b, c, θ), the team used the Cross‑Entropy Method (CEM) for online automatic tuning: sampling parameter sets from a Gaussian distribution, allocating ~10% traffic for each, evaluating rewards (weighted retention, CTR, interaction gains), selecting top‑k, and iteratively updating the distribution.
The CEM process converged after ~10 iterations, yielding a 1% lift in next‑day retention while keeping CTR and interaction stable. The approach highlighted the importance of first/last visit content weights for new users.
In summary, the cross‑domain embedding and online CEM optimization successfully balanced multiple objectives, improving both short‑term CTR and long‑term retention, and pointing toward future reinforcement‑learning‑based multi‑objective strategies.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.