
Commercial Recommendation System for 58 Recruitment: Architecture, Recall, and Ranking Techniques

This talk presents the design and implementation of 58's commercial recruitment recommendation system, covering the business scenario, system architecture, regional and behavior‑based recall methods, various ranking models—including coarse‑ranking, dual‑tower, DIN‑bias, and multitask W3DA—and future optimization directions.

58 Tech

The presentation introduces the commercial recommendation scenario of 58 recruitment: a large audience of job seekers (the C side) and a diverse set of job postings (the B side) must be matched efficiently to improve traffic monetization.

Scenario Overview: The "Find Job" entry at the top left of the app drives 70% of traffic. Its recommendations are split into non‑precise (hot, nearby) and precise (tag‑matched) types. Job postings are abundant (6 × 10⁷ normal, 3 × 10⁵ commercial) and spread over many fine‑grained categories, which leads to cross‑category posting challenges.

System Architecture: User and item data feed a machine‑learning pipeline that runs multi‑channel recall, then filtering, then ranking. The architecture includes a strategy layer for content understanding, the recall‑filter‑ranking mechanism, and a final style‑optimization step before display.
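The recall, filter, and rank flow described above can be sketched roughly as follows. This is only an illustration: the function name `recommend`, its parameters, and the idea of passing each stage in as a callable are assumptions made for the example, not details from the talk.

```python
def recommend(user, recall_channels, keep, coarse_score, fine_score,
              coarse_k, final_k):
    """Hypothetical end-to-end flow: recall, then filter, then two ranking stages."""
    # Multi-channel recall: union of all channels, de-duplicated in first-seen order.
    seen, candidates = set(), []
    for channel in recall_channels:
        for post in channel(user):
            if post not in seen:
                seen.add(post)
                candidates.append(post)

    # Filtering: drop posts that should not be shown to this user.
    candidates = [p for p in candidates if keep(user, p)]

    # Coarse ranking: a cheap score cuts the pool down to coarse_k posts.
    coarse = sorted(candidates, key=lambda p: coarse_score(user, p),
                    reverse=True)[:coarse_k]

    # Fine ranking: a more expensive score orders the survivors.
    return sorted(coarse, key=lambda p: fine_score(user, p),
                  reverse=True)[:final_k]
```

Keeping each stage behind a plain callable lets a recall channel or ranking model be swapped without touching the rest of the flow.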

Recall Methods:

Regional recall using DBSCAN clustering on user latitude/longitude to identify dense user regions; top‑clicked job titles in each region become hot‑post recommendations.
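A minimal sketch of the regional recall idea, assuming user positions are (latitude, longitude) pairs in degrees and that neighborhood distance is measured with the haversine formula. The `eps_km` and `min_pts` parameters, the function names, and the shape of the click data are illustrative assumptions rather than values from the talk; a production system would more likely use a library DBSCAN with a spatial index than this quadratic version.

```python
import math
from collections import Counter, defaultdict

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_pts):
    """Return one label per point: a cluster id >= 0, or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query; includes the point itself.
        return [j for j in range(len(points))
                if haversine_km(points[i], points[j]) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1            # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # only core points expand the cluster
                queue.extend(nbrs)
    return labels


def hot_titles_by_region(labels, clicks, top_n=1):
    """clicks[i] lists the job titles clicked by user i; noise users are skipped."""
    counts = defaultdict(Counter)
    for label, titles in zip(labels, clicks):
        if label != -1:
            counts[label].update(titles)
    return {c: [t for t, _ in cnt.most_common(top_n)]
            for c, cnt in counts.items()}
```

Users labeled -1 fall outside every dense region, so they would need a different recall channel.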

Behavior‑based recall employing the EGES model: click sessions (3‑hour windows) are cleaned, a graph of clicked posts is built, random walks over the graph generate sequences, and Word2Vec‑style training produces post embeddings enriched with side information (tags, category, salary). Each vector is unit‑normalized and concatenated with a 16‑bit binary city encoding in which every 0 bit is mapped to -1; retrieval then uses cosine similarity.
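The city-encoding step can be illustrated with a small sketch. It assumes the city is identified by an integer that fits in 16 bits and that the code is simply appended to the unit-normalized embedding; the function names and the bit order are hypothetical. One plausible motivation for mapping 0 to -1, not stated in the source, is that every code then has the same norm, so two posts in the same city always receive the maximum city contribution of 16 to the dot product, while each differing bit lowers it by 2.

```python
import math


def unit_normalize(vec):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def city_code(city_id, bits=16):
    """Binary code of city_id, with each 0 bit mapped to -1 and each 1 bit kept as +1."""
    return [1.0 if (city_id >> i) & 1 else -1.0 for i in range(bits)]


def retrieval_vector(embedding, city_id):
    """Unit-normalized embedding followed by the +1/-1 city code."""
    return unit_normalize(embedding) + city_code(city_id)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

With this layout every retrieval vector has squared norm 1 + 16 = 17, so the cosine between two posts equals (embedding cosine + city-code dot product) / 17. A same-city post therefore scores at least 15/17, which is also the best a post from another city can reach.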

Additional recall channels: content‑based (title Word2Vec), rule‑based (category expansion), and other heuristics.

Ranking:

Coarse Ranking aims to select 200‑300 candidates within 15 ms. It evolved from rules to logistic regression (LR), then to a dual‑tower model in which the user tower and item tower are decoupled and item vectors are stored in Redis. Feature selection reduced the feature dimensions from 450 to 150.
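The decoupling of the two towers might look roughly like the sketch below. Everything concrete here is an assumption made for illustration: each tower is reduced to a single linear layer with made-up weights, and a plain dict stands in for the Redis store of precomputed item vectors.

```python
import math

# Hypothetical tower weights; a trained model would supply these.
USER_W = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
ITEM_W = [[1.0, 0.0], [0.0, 1.0]]


def tower(features, weights):
    """One-layer stand-in for a tower: linear projection, then L2 normalization."""
    out = [sum(w * x for w, x in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in out)) or 1.0
    return [v / norm for v in out]


def build_item_store(items):
    """Offline step: precompute every item vector (the dict stands in for Redis)."""
    return {item_id: tower(feats, ITEM_W) for item_id, feats in items.items()}


def coarse_rank(user_features, candidate_ids, item_store, k):
    """Online step: run the user tower once, then score candidates by dot product."""
    user_vec = tower(user_features, USER_W)
    scored = [(sum(u * v for u, v in zip(user_vec, item_store[i])), i)
              for i in candidate_ids if i in item_store]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]
```

Because item vectors are computed ahead of time, the per-request cost is one user-tower pass plus one dot product per candidate, which is what makes a tight latency budget feasible.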

Knowledge distillation uses the fine‑ranking model as a teacher: the student is trained on soft targets (teacher predictions) combined with hard targets (click labels), which raised AUC from 0.68 to 0.70.
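One common way to combine hard and soft targets is a weighted sum of two cross-entropy terms, sketched below. The mixing weight `alpha` and the choice of binary cross-entropy for the soft term are assumptions; the source does not say how the two targets were weighted.

```python
import math


def bce(target, pred, eps=1e-7):
    """Binary cross-entropy for one example; target may be a soft value in [0, 1]."""
    pred = min(max(pred, eps), 1 - eps)
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))


def distillation_loss(click, teacher_prob, student_prob, alpha=0.5):
    """Weighted sum of the hard-target and soft-target losses (alpha is hypothetical)."""
    hard = bce(click, student_prob)          # fit the observed click label
    soft = bce(teacher_prob, student_prob)   # fit the teacher's prediction
    return alpha * hard + (1 - alpha) * soft
```

The soft term is smallest when the student reproduces the teacher's probability, so it pulls the cheap model toward the fine-ranking model's ordering, while the hard term keeps it anchored to real clicks.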

Fine Ranking addresses position bias and combines CTR and CVR. Models progressed from a baseline DIN to DIN‑bias (which adds a position sub‑network), then to W3DA (Wide‑Deep‑Attention), which adds two wide sub‑networks for first‑ and second‑order feature crosses, and finally to a multitask W3DA that jointly predicts CTR, CVR, and CTCVR (CTR × CVR) following the ESMM paradigm. The multitask model achieved a 3.55% lift in CTCVR while keeping CTR stable.
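The ESMM-style objective can be sketched for a single impression as follows. In the ESMM formulation, losses are defined on CTR and CTCVR, both over all impressions, and CVR is learned only through the product. The unweighted sum of the two losses and the function names are assumptions here, since the source does not give the exact loss used for the multitask W3DA.

```python
import math


def bce(target, pred, eps=1e-7):
    """Binary cross-entropy for one example."""
    pred = min(max(pred, eps), 1 - eps)
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))


def ctcvr(p_ctr, p_cvr):
    """Probability of click and conversion: pCTCVR = pCTR * pCVR."""
    return p_ctr * p_cvr


def esmm_loss(click, conversion, p_ctr, p_cvr):
    """CTR loss plus CTCVR loss for one impression; there is no direct CVR loss."""
    ctr_loss = bce(click, p_ctr)
    # The CTCVR label is 1 only when the impression was clicked and converted.
    ctcvr_loss = bce(click * conversion, ctcvr(p_ctr, p_cvr))
    return ctr_loss + ctcvr_loss
```

Since both terms use every impression rather than only clicked ones, the CVR head avoids the sample-selection bias of training on clicks alone.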

Evaluation & Q&A: Offline evaluation focuses on AUC; online performance is measured by CTR, CVR, and ACP in A/B tests. The Q&A covers DBSCAN evaluation, EGES embedding validation, negative sample selection for coarse ranking, hard vs. soft targets in distillation, and future plans such as edge deployment.
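For reference, AUC can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below computes it directly from that definition, counting ties as half a win. It is quadratic in the number of examples, so it suits small checks rather than full offline evaluation; the function name is an assumption.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```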

Future Work: Improve cold‑start recall by dynamically exploring user intent, adopt more expressive ranking models such as DCNv2, and enhance traffic allocation strategies to foster a healthier commercial ecosystem.

Tags: machine learning, recommendation system, ranking, recall, online advertising, DBSCAN, EGES