
Efficiency Optimization Practices for 58.com Search Ranking

This article surveys 58.com's search efficiency optimization practice: the business background; the data, algorithm, and engineering components of the ranking framework; the three-stage ranking process; strategy and platform optimizations; feature engineering and model upgrades; and the resulting performance improvements.

DataFunTalk

The talk begins with an introduction to 58.com’s diverse business scenarios—housing, second‑hand goods, recruitment, local services—and explains how these varied product forms (top, premium, normal) create a complex ranking problem.

The ranking algorithm is divided into three stages: coarse ranking (reducing model load by filtering candidates using timeliness and quality‑factor de‑weighting), fine ranking (optimizing click‑through rate, conversion, personalization, relevance), and re‑ordering (applying business‑related and filter strategies). These stages deliver three core capabilities: quality governance, efficiency optimization, and traffic control.
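As an illustration, the three stages can be sketched as a simple pipeline. All names, scoring formulas, and cut-off values below are hypothetical stand-ins, not 58.com's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Post:
    post_id: int
    quality: float      # quality factor in [0, 1]
    age_hours: float    # hours since posting (timeliness)
    model_score: float  # fine-ranking model output, e.g. predicted CTR
    is_premium: bool = False

def coarse_rank(posts, keep=100):
    """Cheap candidate filtering: de-weight stale and low-quality posts
    so the fine-ranking model scores far fewer items."""
    def score(p):
        timeliness = 1.0 / (1.0 + p.age_hours / 24.0)  # decays with age
        return p.quality * timeliness
    return sorted(posts, key=score, reverse=True)[:keep]

def fine_rank(posts):
    """Order survivors by the model score (CTR/conversion/relevance)."""
    return sorted(posts, key=lambda p: p.model_score, reverse=True)

def re_rank(posts):
    """Business re-ordering: premium listings float above normal ones;
    Python's stable sort preserves fine-rank order within each group."""
    return sorted(posts, key=lambda p: not p.is_premium)
```

Chaining the three stages (`re_rank(fine_rank(coarse_rank(posts)))`) mirrors the funnel: each stage sees fewer candidates but applies a more expensive or more business-specific criterion.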

Efficiency optimization is built on three pillars—data, algorithm, and engineering. The framework iterates on strategy paths and an efficiency‑optimization platform, which includes log merging/cleaning, feature engineering, model training, and online validation.

Strategy optimization follows four phases: feedback strategy (early‑stage smoothing, position bias removal, time decay), basic model (LR, GBDT, personalization, structured posts), feature upgrade (timeliness, combinatorial, text‑image features), and model upgrade (fusion of LR and GBDT, deep learning with Wide&Deep and DeepFM).
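The feedback-strategy phase lends itself to a short sketch: Beta-prior smoothing for low-exposure posts, exponential time decay, and a simple position-bias correction. The prior values, half-life, and bias curve below are illustrative assumptions, not the talk's actual parameters:

```python
def smoothed_ctr(clicks, impressions, alpha=3.0, beta=100.0):
    """Beta-prior smoothing: posts with little exposure shrink toward
    the prior CTR alpha / (alpha + beta) instead of a noisy raw ratio."""
    return (clicks + alpha) / (impressions + alpha + beta)

def time_decayed(value, age_days, half_life_days=7.0):
    """Exponential time decay: feedback loses half its weight every half-life."""
    return value * 0.5 ** (age_days / half_life_days)

def position_debiased_ctr(ctr, position, bias_curve=None):
    """Remove position bias by dividing out an (assumed known) exposure
    probability per rank, estimated offline from click logs."""
    bias_curve = bias_curve or {1: 1.0, 2: 0.7, 3: 0.5}
    return ctr / bias_curve.get(position, 0.3)
```

With zero impressions, `smoothed_ctr` falls back to the prior rather than dividing by zero, which is exactly the early-stage smoothing problem this phase addresses.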

The efficiency platform refines logs (IP and user anti‑fraud), defines multi‑metric thresholds (exposure, CTR, conversion), and generates raw samples from exposure, click, and conversion logs. Samples are enriched on a feature‑open platform covering post structure, feedback, image features, and user personalization.
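A minimal sketch of the sample-generation step: join exposure, click, and conversion logs on a (user, post) key, and gate a post's feedback on multi-metric thresholds. The log shapes and threshold values are assumptions for illustration only:

```python
def passes_thresholds(impressions, clicks, conversions,
                      min_impressions=50, min_ctr=0.01, min_cvr=0.001):
    """Multi-metric gate: feedback only enters the sample pool once the
    post has enough exposure and a plausible CTR and conversion rate."""
    if impressions < min_impressions:
        return False
    return (clicks / impressions >= min_ctr
            and conversions / impressions >= min_cvr)

def build_raw_samples(exposure_log, click_log, conversion_log):
    """Join the three logs on (user_id, post_id) to label raw samples:
    0 = exposed only, 1 = clicked, 2 = converted."""
    clicked, converted = set(click_log), set(conversion_log)
    return [
        (key, 2 if key in converted else 1 if key in clicked else 0)
        for key in exposure_log
    ]
```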

Feature engineering is componentized: one‑to‑one transformations, one‑to‑many encodings, many‑to‑many combinations, supporting both expression‑based and predefined components. Model fusion combines result‑level and feature‑level fusion, e.g., GBDT‑encoded features, neural network embeddings.
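The three component arities can be illustrated with toy transformations, one per type; the function names are hypothetical:

```python
import math

# one-to-one: a single raw feature mapped to one transformed feature
def log_price(price):
    return math.log1p(price)

# one-to-many: a categorical value expanded into a one-hot vector
def one_hot(value, vocabulary):
    return [1.0 if value == v else 0.0 for v in vocabulary]

# many-to-many: two categorical fields crossed into combination features;
# the same idea underlies feature-level fusion such as feeding
# GBDT leaf indices into an LR model as crossed one-hot features
def cross(values_a, values_b):
    return [f"{a}&{b}" for a in values_a for b in values_b]
```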

Model training typically uses 14 days of data with a 1:15 positive-to-negative sample ratio, a 3-day test set, and AUC as the offline metric. Features are discretized for the LR models, while XGBoost is used for high-dimensional features. Reported gains include a >20% conversion uplift for rental listings, a >30% ECPM increase for premium housing, and a >10% improvement in phone-call connect rate.
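That offline setup can be sketched with two stdlib helpers: negative downsampling toward the stated 1:15 ratio, and a rank-based (Mann-Whitney) AUC. Score ties are ignored for brevity, and the helper names are hypothetical:

```python
import random

def downsample_negatives(samples, ratio=15, seed=0):
    """Keep every positive and at most `ratio` negatives per positive,
    matching a 1:15 positive-to-negative training ratio."""
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    rng = random.Random(seed)
    keep = min(len(negatives), ratio * len(positives))
    return positives + rng.sample(negatives, keep)

def auc(scores, labels):
    """Mann-Whitney AUC: the probability that a random positive
    outscores a random negative (ties ignored for brevity)."""
    ranked = sorted(zip(scores, labels))
    pos_rank_sum = sum(i + 1 for i, (_, y) in enumerate(ranked) if y == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

A perfect ranker scores 1.0 and a random one about 0.5, which is why AUC works well as a single offline gate before online experiments.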

The platform consists of three modules: a base module (log samples, machine learning, online experiments), a user workspace (data configuration), and a data module (process, sample, log, and effect databases). These enable workflow creation, configuration, monitoring, model conversion, push, and reporting, while supporting feature‑open, data, process, experiment, and report management.

Final results show 40‑60% ECPM improvement for premium listings and ~10% conversion uplift across ordinary business lines. Future work will explore deeper learning models, integrate TensorFlow, and build an online‑offline unified learning platform leveraging rich multimodal data.

Tags: algorithm, machine learning, feature engineering, search ranking, online advertising, efficiency optimization
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
