Design and Implementation of an End-to-End Efficiency Optimization Platform for 58.com Classified Listings
This article describes the design and implementation of an efficiency-optimization platform at 58.com. It covers the end-to-end workflow, from log aggregation and feature extraction through machine-learning model training and online experimentation, and highlights the modular, configurable, and scalable solutions used for multi-business, multi-product ranking.
The article introduces the background of 58.com, which it describes as the largest classified information platform in China, and the need for an efficient list-page ranking system that can handle multiple business lines (housing, used cars, recruitment, etc.) and product formats. It explains that click-through rate (CTR) and conversion rate (CVR) models are the core factors for improving connection efficiency, that is, how effectively the platform connects users with relevant listings.
It then presents the overall architecture of the Efficiency Optimization Platform, which consists of four foundational modules: log sampling, machine learning, online experimentation, and platform-wide integration. The platform follows three architectural principles: end-to-end process coverage, scenario-agnostic configurability, and iterative construction.
The Log Sampling Module handles log merging, anti-fraud filtering, listing (post) feature extraction, user personalization, and sample generation. Its pipeline covers log preprocessing (merging exposure, click, and conversion logs), multi-dimensional anti-fraud mechanisms, extraction of basic, feedback, and text and image features, and the creation of historical and near-real-time personalized features.
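The log-merging step described above can be pictured as a join of the three log streams into labeled samples. The following is a minimal sketch, not the platform's actual implementation: the record layout and the join key `(request_id, post_id)` are assumptions made for illustration, and the anti-fraud and feature-extraction stages are omitted.

```python
def merge_logs(exposures, clicks, conversions):
    """Label each exposure with whether it was clicked and converted.

    All records are plain dicts. The join key (request_id, post_id) is a
    hypothetical choice; the article does not specify the real key.
    """
    clicked = {(r["request_id"], r["post_id"]) for r in clicks}
    converted = {(r["request_id"], r["post_id"]) for r in conversions}

    samples = []
    for exposure in exposures:
        key = (exposure["request_id"], exposure["post_id"])
        sample = dict(exposure)  # copy so the input log record is not mutated
        sample["click"] = int(key in clicked)
        sample["conversion"] = int(key in converted)
        samples.append(sample)
    return samples


if __name__ == "__main__":
    exposures = [
        {"request_id": "r1", "post_id": 101, "user_id": "u1"},
        {"request_id": "r1", "post_id": 102, "user_id": "u1"},
    ]
    clicks = [{"request_id": "r1", "post_id": 101}]
    conversions = []
    for row in merge_logs(exposures, clicks, conversions):
        print(row)
```

Every exposure yields exactly one sample, so unclicked exposures become the negative examples for CTR training.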
The Machine Learning Module takes the generated samples as input and performs sample selection, feature engineering, and model training. Sample selection supports label selection, negative-sample down-sampling, and positive-sample up-sampling. Feature engineering provides feature selection, transformation (discretization, normalization, encoding), and combination (Cartesian-product and matching features). The platform supports LR, GBDT, FM, and deep-learning models (TensorFlow-based Wide & Deep and DeepFM), with fusion at either the result level or the feature level.
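To make the sampling and feature-engineering options above more concrete, here is a small sketch of three of them: negative-sample down-sampling, discretization into buckets, and a Cartesian cross of two categorical features. The function names, the `click` label key, and the string encoding of the crossed feature are illustrative assumptions, not the platform's real interface.

```python
import bisect
import random


def downsample_negatives(samples, keep_rate, label_key="click", seed=0):
    """Keep every positive sample and each negative with probability keep_rate."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        s for s in samples
        if s[label_key] == 1 or rng.random() < keep_rate
    ]


def discretize(value, boundaries):
    """Map a numeric value to a bucket index given sorted bucket boundaries."""
    return bisect.bisect_right(boundaries, value)


def cartesian_cross(name_a, value_a, name_b, value_b):
    """Combine two categorical features into one crossed feature string."""
    return f"{name_a}={value_a}&{name_b}={value_b}"


if __name__ == "__main__":
    samples = [{"click": 1}] + [{"click": 0} for _ in range(9)]
    print(len(downsample_negatives(samples, keep_rate=0.5)))
    print(discretize(25, [18, 30, 45]))
    print(cartesian_cross("city", "beijing", "category", "rental"))
```

Down-sampling negatives changes the positive rate seen by the model, so predicted probabilities generally need recalibration before they are read as true CTR or CVR values.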
The Online Experiment Module converts offline models into executable expressions, validates that online and offline scores are consistent, and pushes models to production. An experiment system performs user-ID-based traffic splitting for the recall and ranking layers, enabling A/B testing and metric reporting.
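Two of the mechanisms above lend themselves to a short illustration: user-ID-based traffic splitting and the score-consistency check. The sketch below assumes a hash-based bucketing scheme in which the layer name is mixed into the hash so that the recall and ranking layers split users independently. The article does not say how the real system assigns traffic, so the hashing scheme, the bucket counts, and all names here are hypothetical.

```python
import hashlib
import math


def assign_variant(user_id, layer, allocation):
    """Deterministically map a user to an experiment variant within a layer.

    allocation is a list of (variant_name, bucket_count) pairs. The layer
    name salts the hash so different layers split users independently.
    """
    total = sum(count for _, count in allocation)
    if total <= 0:
        raise ValueError("allocation must contain at least one bucket")
    digest = hashlib.md5(f"{layer}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total
    upper = 0
    for name, count in allocation:
        upper += count
        if bucket < upper:
            return name
    raise AssertionError("unreachable: bucket is always below total")


def scores_consistent(offline_scores, online_scores, tolerance=1e-6):
    """Check that online scores match offline scores within a tolerance."""
    if len(offline_scores) != len(online_scores):
        return False
    return all(
        math.isclose(a, b, abs_tol=tolerance)
        for a, b in zip(offline_scores, online_scores)
    )


if __name__ == "__main__":
    allocation = [("control", 90), ("treatment", 10)]
    print(assign_variant("user-42", "ranking", allocation))
    print(scores_consistent([0.12, 0.80], [0.12, 0.80]))
```

Because the assignment depends only on the user ID and the layer name, a user sees the same variant on every request, which keeps experiment metrics comparable over time.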
The Platform Integration layer wraps all modules into a unified front-end, offering workflow creation, configuration, monitoring, model adaptation, model push, and reporting. It also provides management capabilities (data, workflow, report, experiment) and an open feature platform that allows new features to be registered and incorporated easily.
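One common way to build an open feature platform of the kind described above is a registry that maps feature names to extraction functions, so that a workflow configuration can refer to features by name. The sketch below illustrates that general pattern only. The registry, the decorator, and the example features are hypothetical and are not taken from the 58.com platform.

```python
FEATURE_REGISTRY = {}


def register_feature(name):
    """Decorator that registers a feature extractor under a unique name."""
    def decorator(func):
        if name in FEATURE_REGISTRY:
            raise ValueError(f"feature already registered: {name}")
        FEATURE_REGISTRY[name] = func
        return func
    return decorator


@register_feature("title_length")
def title_length(post):
    # Number of characters in the listing title (0 if the title is missing).
    return len(post.get("title", ""))


@register_feature("has_image")
def has_image(post):
    # 1 if the listing carries at least one image, otherwise 0.
    return int(bool(post.get("images")))


def extract_features(post, feature_names):
    """Compute the configured features for one listing."""
    return {name: FEATURE_REGISTRY[name](post) for name in feature_names}


if __name__ == "__main__":
    post = {"title": "2-bedroom flat", "images": ["a.jpg"]}
    print(extract_features(post, ["title_length", "has_image"]))
```

With this pattern, adding a feature means writing one function and registering it, and no pipeline code has to change.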
The article outlines the iterative path taken in practice: rapid initial delivery of the core log modules, successive simplifications and enhancements (e.g., anti-fraud mechanisms), and the migration of multiple business-specific pipelines to the common platform. It emphasizes challenges such as the wide reach of each change, the strict requirement that migrated pipelines reproduce the original results, and coordination across multiple stages.
In the concluding section, three key takeaways are highlighted: the importance of strong engineering skills for algorithm teams, the necessity of robust data pipelines and monitoring, and the value of clear, iterative planning for platform and optimization development. Future directions include deeper integration of deep learning, online learning, configuration abstraction, and richer conversion data.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.