Exploring Multi-Objective Recommendation Algorithms for 58 Community: Cross-Domain Embedding and Online Optimization
This article details how 58 Community improved content value share, click‑through, and user retention by designing a generalized multi‑objective recommendation algorithm that leverages cross‑domain embeddings, DeepFM‑DIN models, EGES‑inspired pre‑training, and online CEM‑based parameter optimization.
01 58 Community Business and Background
58 Community is a content platform for local users of 58.com, offering PGC and UGC feeds with rich formats such as images, videos, and audio. Its mission is to connect all 58 services, acting as a bridge between real‑estate, recruitment, automotive, and local life domains.
02 Evolution of Business Goals
Initially the goal was single‑objective ranking to boost click‑through rate (CTR). As the product matured, multiple objectives such as interaction rate, like rate, and user retention became important, leading to multi‑task ranking models like the shared‑bottom ESMM and MMoE.
Key targets include increasing the proportion of "value content" (content tightly related to 58’s core services) while keeping CTR stable, and improving retention without sacrificing CTR or interaction metrics.
03 Exploration of a Generalized Multi‑Objective Algorithm
Value content is defined as items closely linked to 58’s core business (housing, cars, recruitment, local life). The team first attempted a naive approach by adding cross‑domain behavior sequences to the DIN model, but faced issues such as sparse cross‑domain IDs, slow training, and negligible weight for new features.
To address this, they adopted an EGES‑style cross‑domain embedding pre‑training: extracting 2‑3 core attributes per business line as edge information, compressing IDs, and constructing a weighted user‑behavior graph. DeepWalk‑style random walks generated sequences for word2vec‑like embedding training, producing joint embeddings for items and edge attributes.
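The walk‑and‑embed step above can be sketched as follows. This is a minimal illustration, not the team's implementation: the toy graph, function names, and weights are assumptions, and in practice the walks would feed a word2vec/skip‑gram trainer to produce the joint item/attribute embeddings.

```python
import random

def weighted_random_walk(graph, start, walk_length, rng):
    """One DeepWalk-style walk: each next node is chosen with probability
    proportional to edge weight (the weighted-graph sampling EGES uses)."""
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break
        nodes, weights = zip(*neighbors)
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk

def generate_corpus(graph, walks_per_node=10, walk_length=20, seed=42):
    """Walks become 'sentences' for word2vec-like skip-gram training."""
    rng = random.Random(seed)
    corpus = []
    for node in graph:
        for _ in range(walks_per_node):
            corpus.append(weighted_random_walk(graph, node, walk_length, rng))
    return corpus

# Toy behaviour graph linking items across business lines; edge weights
# stand in for cross-domain co-click counts (hypothetical values).
graph = {
    "house_1": [("job_7", 3), ("house_2", 1)],
    "house_2": [("house_1", 1)],
    "job_7":   [("house_1", 3)],
}
corpus = generate_corpus(graph, walks_per_node=2, walk_length=5)
print(len(corpus))  # 2 walks per node, 3 nodes -> 6 walks
```

Training skip‑gram over such walks yields embeddings in which items frequently co‑visited across domains end up close together, which is what lets the sparse cross‑domain IDs share signal.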
These pre‑trained embeddings replaced the original pooling embeddings in the DIN sequence, resulting in higher value‑content share (12% → 28%) and a modest CTR lift.
04 Retention Optimization and Online Parameter Tuning
Retention analysis identified four influential factors: interaction rate, first‑visit content type weight, last‑visit content type weight, and diversity. A re‑ranking score was defined as CTR + a·interaction + b·first‑type + c·last‑type, with a diversity adjustment factor θ.
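The re‑ranking score reads directly as a weighted sum; the sketch below follows the formula in the article, though how θ is applied (here, as a multiplicative demotion for repeated categories) is an assumption.

```python
def rerank_score(ctr, interaction, first_type_w, last_type_w,
                 a, b, c, theta, is_duplicate_category):
    """score = CTR + a*interaction + b*first_type + c*last_type,
    scaled by the diversity factor theta when the item repeats a
    category already shown nearby (theta < 1 demotes repeats)."""
    score = ctr + a * interaction + b * first_type_w + c * last_type_w
    if is_duplicate_category:
        score *= theta
    return score

# Example with hypothetical parameter values:
s = rerank_score(0.10, 0.05, 1.0, 0.0,
                 a=0.5, b=0.02, c=0.02,
                 theta=0.8, is_duplicate_category=True)
print(round(s, 4))  # 0.8 * (0.10 + 0.025 + 0.02) = 0.116
```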
To find optimal hyper‑parameters (a, b, c, θ), the team used the Cross‑Entropy Method (CEM) for online automatic tuning: sampling parameter sets from a Gaussian distribution, allocating ~10% traffic for each, evaluating rewards (weighted retention, CTR, interaction gains), selecting top‑k, and iteratively updating the distribution.
The CEM process converged after ~10 iterations, yielding a 1% lift in next‑day retention while keeping CTR and interaction stable. The approach highlighted the importance of first/last visit content weights for new users.
In summary, the cross‑domain embedding and online CEM optimization successfully balanced multiple objectives, improving both short‑term CTR and long‑term retention, and pointing toward future reinforcement‑learning‑based multi‑objective strategies.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.