
Efficiency Optimization Practices for 58.com Search Ranking

This article surveys 58.com's search efficiency optimization practice: the business background; the data, algorithm, and engineering components of the ranking framework; the three-stage ranking process; strategy and platform optimizations; feature engineering and model upgrades; and the resulting performance improvements.

DataFunTalk

The talk begins with an introduction to 58.com’s diverse business scenarios—housing, second‑hand goods, recruitment, local services—and explains how these varied product forms (top, premium, normal) create a complex ranking problem.

The ranking algorithm is divided into three stages: coarse ranking (reducing model load by filtering candidates using timeliness and quality‑factor de‑weighting), fine ranking (optimizing click‑through rate, conversion, personalization, relevance), and re‑ordering (applying business‑related and filter strategies). These stages deliver three core capabilities: quality governance, efficiency optimization, and traffic control.
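As an illustration, the three stages can be sketched as a simple pipeline. All names, scoring formulas, and cut-off values below are hypothetical stand-ins, not 58.com's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Post:
    post_id: int
    quality: float      # quality factor in [0, 1]
    age_hours: float    # hours since posting (timeliness)
    model_score: float  # fine-ranking model output, e.g. predicted CTR
    is_premium: bool = False

def coarse_rank(posts, keep=100):
    """Cheap candidate filtering: de-weight stale and low-quality posts
    so the fine-ranking model scores far fewer items."""
    def score(p):
        timeliness = 1.0 / (1.0 + p.age_hours / 24.0)  # decays with age
        return p.quality * timeliness
    return sorted(posts, key=score, reverse=True)[:keep]

def fine_rank(posts):
    """Order survivors by the model score (CTR/conversion/relevance)."""
    return sorted(posts, key=lambda p: p.model_score, reverse=True)

def re_rank(posts):
    """Business re-ordering: premium listings float above normal ones;
    Python's stable sort preserves fine-rank order within each group."""
    return sorted(posts, key=lambda p: not p.is_premium)
```

Chaining the three stages (`re_rank(fine_rank(coarse_rank(posts)))`) mirrors the funnel: each stage sees fewer candidates but applies a more expensive or more business-specific criterion.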

Efficiency optimization is built on three pillars—data, algorithm, and engineering. The framework iterates on strategy paths and an efficiency‑optimization platform, which includes log merging/cleaning, feature engineering, model training, and online validation.

Strategy optimization follows four phases: feedback strategy (early‑stage smoothing, position bias removal, time decay), basic model (LR, GBDT, personalization, structured posts), feature upgrade (timeliness, combinatorial, text‑image features), and model upgrade (fusion of LR and GBDT, deep learning with Wide&Deep and DeepFM).
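The feedback-strategy phase lends itself to a short sketch: Beta-prior smoothing for low-exposure posts, exponential time decay, and a simple position-bias correction. The prior values, half-life, and bias curve below are illustrative assumptions, not the talk's actual parameters:

```python
def smoothed_ctr(clicks, impressions, alpha=3.0, beta=100.0):
    """Beta-prior smoothing: posts with little exposure shrink toward
    the prior CTR alpha / (alpha + beta) instead of a noisy raw ratio."""
    return (clicks + alpha) / (impressions + alpha + beta)

def time_decayed(value, age_days, half_life_days=7.0):
    """Exponential time decay: feedback loses half its weight every half-life."""
    return value * 0.5 ** (age_days / half_life_days)

def position_debiased_ctr(ctr, position, bias_curve=None):
    """Remove position bias by dividing out an (assumed known) exposure
    probability per rank, estimated offline from click logs."""
    bias_curve = bias_curve or {1: 1.0, 2: 0.7, 3: 0.5}
    return ctr / bias_curve.get(position, 0.3)
```

With zero impressions, `smoothed_ctr` falls back to the prior rather than dividing by zero, which is exactly the early-stage smoothing problem this phase addresses.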

The efficiency platform refines logs (IP and user anti‑fraud), defines multi‑metric thresholds (exposure, CTR, conversion), and generates raw samples from exposure, click, and conversion logs. Samples are enriched on a feature‑open platform covering post structure, feedback, image features, and user personalization.
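A minimal sketch of the sample-generation step: join exposure, click, and conversion logs on a (user, post) key, and gate a post's feedback on multi-metric thresholds. The log shapes and threshold values are assumptions for illustration only:

```python
def passes_thresholds(impressions, clicks, conversions,
                      min_impressions=50, min_ctr=0.01, min_cvr=0.001):
    """Multi-metric gate: feedback only enters the sample pool once the
    post has enough exposure and a plausible CTR and conversion rate."""
    if impressions < min_impressions:
        return False
    return (clicks / impressions >= min_ctr
            and conversions / impressions >= min_cvr)

def build_raw_samples(exposure_log, click_log, conversion_log):
    """Join the three logs on (user_id, post_id) to label raw samples:
    0 = exposed only, 1 = clicked, 2 = converted."""
    clicked, converted = set(click_log), set(conversion_log)
    return [
        (key, 2 if key in converted else 1 if key in clicked else 0)
        for key in exposure_log
    ]
```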

Feature engineering is componentized: one‑to‑one transformations, one‑to‑many encodings, many‑to‑many combinations, supporting both expression‑based and predefined components. Model fusion combines result‑level and feature‑level fusion, e.g., GBDT‑encoded features, neural network embeddings.
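The three component arities can be illustrated with toy transformations, one per type; the function names are hypothetical:

```python
import math

# one-to-one: a single raw feature mapped to one transformed feature
def log_price(price):
    return math.log1p(price)

# one-to-many: a categorical value expanded into a one-hot vector
def one_hot(value, vocabulary):
    return [1.0 if value == v else 0.0 for v in vocabulary]

# many-to-many: two categorical fields crossed into combination features;
# the same idea underlies feature-level fusion such as feeding
# GBDT leaf indices into an LR model as crossed one-hot features
def cross(values_a, values_b):
    return [f"{a}&{b}" for a in values_a for b in values_b]
```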

Model training typically uses 14 days of data with a 1:15 positive-to-negative sample ratio, a 3-day test set, and AUC as the offline metric. Features are discretized for the LR models, while XGBoost is used for high-dimensional features. Reported gains include a >20% conversion uplift for rental listings, a >30% ECPM increase for premium housing, and a >10% improvement in phone-call connect rate.
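That offline setup can be sketched with two stdlib helpers: negative downsampling toward the stated 1:15 ratio, and a rank-based (Mann-Whitney) AUC. Score ties are ignored for brevity, and the helper names are hypothetical:

```python
import random

def downsample_negatives(samples, ratio=15, seed=0):
    """Keep every positive and at most `ratio` negatives per positive,
    matching a 1:15 positive-to-negative training ratio."""
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    rng = random.Random(seed)
    keep = min(len(negatives), ratio * len(positives))
    return positives + rng.sample(negatives, keep)

def auc(scores, labels):
    """Mann-Whitney AUC: the probability that a random positive
    outscores a random negative (ties ignored for brevity)."""
    ranked = sorted(zip(scores, labels))
    pos_rank_sum = sum(i + 1 for i, (_, y) in enumerate(ranked) if y == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

A perfect ranker scores 1.0 and a random one about 0.5, which is why AUC works well as a single offline gate before online experiments.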

The platform consists of three modules: a base module (log samples, machine learning, online experiments), a user workspace (data configuration), and a data module (process, sample, log, and effect databases). These enable workflow creation, configuration, monitoring, model conversion, push, and reporting, while supporting feature‑open, data, process, experiment, and report management.

Final results show 40‑60% ECPM improvement for premium listings and ~10% conversion uplift across ordinary business lines. Future work will explore deeper learning models, integrate TensorFlow, and build an online‑offline unified learning platform leveraging rich multimodal data.

Tags: algorithm, machine learning, feature engineering, search ranking, online advertising, efficiency optimization
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
