
Deep Learning Ranking Models for 58.com Rental Search: Architecture, Model Iterations, and Optimization

This article presents the end‑to‑end design, feature engineering, model evolution (Wide&Deep, DeepFM, DCN, DIN, DIEN), multi‑task training, and deployment optimizations that 58.com applied to improve search ranking for its rental business, demonstrating significant gains in click‑through and conversion rates.


Deep learning has greatly enhanced feature representation, enabling successful applications in vision, text, and speech, and it is now a key driver for commercial search ranking. 58.com, the largest Chinese lifestyle information platform, leverages deep models to improve its rental search service.

Application Background – Users search for specific neighborhoods or subway stations, and the system retrieves relevant posts, ranks them, and returns a sorted list. Better ranking improves retrieval efficiency, user retention, information quality, and ultimately platform revenue.

Ranking Architecture – The system consists of a data layer (log extraction, feature aggregation, offline sampling) and a strategy layer (model inference and business rules). Figure 2 (not shown) illustrates this split.

Model Iteration Path

1. Wide&Deep: combines a linear logistic-regression (LR) component, which memorizes low-order feature co-occurrence, with a DNN component that learns high-level abstractions. Figure 3 shows its structure.

2. DeepFM: replaces the LR part with a Factorization Machine (FM) that automatically captures second-order feature interactions, sharing its embedding layer with the DNN (Figure 4).

3. DCN: introduces cross layers that receive both the previous layer's output and the original input, enabling explicit higher-order feature crossing (Figure 5).

4. DIN: uses attention to model the relevance between a candidate post and each item in the user's historical click sequence, reducing reliance on handcrafted statistical features (Figure 7).

5. DIEN: extends DIN with an Interest Extraction Layer (GRU) and an Interest Evolving Layer (AUGRU with attention) to capture the temporal evolution of user interests (Figure 8).
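The attention step in DIN can be sketched as follows. This is a simplified illustration with made-up dimensions: a bilinear scorer stands in for the small MLP that DIN actually uses, and the softmax normalization is one common choice, not necessarily 58.com's.

```python
import numpy as np

def attention_pool(candidate, history, W):
    """DIN-style attention: weight each history item by its
    relevance to the candidate post, then sum.

    candidate: (d,)   embedding of the candidate post
    history:   (T, d) embeddings of the user's clicked posts
    W:         (d, d) learned interaction matrix (illustrative)
    """
    # Relevance score of each history item w.r.t. the candidate.
    scores = history @ W @ candidate            # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # softmax over the sequence
    return weights @ history                    # (d,) pooled user interest

rng = np.random.default_rng(0)
d, T = 8, 5
pooled = attention_pool(rng.normal(size=d), rng.normal(size=(T, d)), np.eye(d))
print(pooled.shape)  # (8,)
```

The pooled vector replaces a fixed-length average of the click history, so items similar to the current candidate dominate the user-interest representation.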

Data Sample Construction – Features are grouped into user‑dimensional, post‑dimensional, and contextual categories (see Table 1). User sequences are built via Flink+Kafka, capped at length 50, and stored in HDFS for offline training.

Offline Sample Pipeline – Covers sampling (positive:negative ≈ 1:4-5), feature extraction, feature engineering (normalization, bucketing, hashing), and model training. GAUC (group AUC) is the offline metric: per-user AUC is computed, users whose samples are all positive or all negative are filtered out, and the remaining users' AUCs are averaged weighted by sample count.
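The GAUC metric described above can be sketched in a few lines (a minimal illustration weighting each user by sample count, not the production evaluation code):

```python
from collections import defaultdict

def auc(labels, scores):
    """Plain AUC via pairwise comparison (fine for small per-user lists)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gauc(user_ids, labels, scores):
    by_user = defaultdict(list)
    for u, l, s in zip(user_ids, labels, scores):
        by_user[u].append((l, s))
    num = den = 0.0
    for samples in by_user.values():
        ls = [l for l, _ in samples]
        if len(set(ls)) < 2:               # skip all-positive / all-negative users
            continue
        ss = [s for _, s in samples]
        num += len(samples) * auc(ls, ss)  # weight per-user AUC by sample count
        den += len(samples)
    return num / den
```

For example, a user ranked perfectly (AUC 1.0) and a user ranked inversely (AUC 0.0), each with two samples, average to a GAUC of 0.5.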

Model Optimization

Feature construction: one‑hot encode low‑dimensional categorical features, embed high‑dimensional sparse IDs, normalize continuous features, and concatenate before the attention layer.
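The three feature treatments above can be sketched together; all names, vocabularies, and dimensions here are illustrative, and the embedding table would be learned rather than random:

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, HASH_BUCKETS = 8, 1000
emb_table = rng.normal(scale=0.01, size=(HASH_BUCKETS, EMB_DIM))  # learned in practice

def one_hot(value, vocab):
    """Low-cardinality categorical feature -> one-hot vector."""
    vec = np.zeros(len(vocab))
    vec[vocab.index(value)] = 1.0
    return vec

def embed_id(raw_id):
    """High-cardinality sparse ID -> hashed embedding lookup."""
    return emb_table[hash(raw_id) % HASH_BUCKETS]

def normalize(x, lo, hi):
    """Continuous feature -> [0, 1] via min-max scaling."""
    return (x - lo) / (hi - lo)

features = np.concatenate([
    one_hot("2_bedroom", ["1_bedroom", "2_bedroom", "3_bedroom"]),  # categorical
    embed_id("post_982341"),                                        # sparse ID
    [normalize(4500.0, 0.0, 20000.0)],                              # continuous (rent)
])
print(features.shape)  # (12,)
```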

Batch statistics: TensorFlow batch normalization and the Dice activation compute mean and variance over the current mini-batch during training but must switch to accumulated moving statistics at inference; handling this correctly keeps training and serving consistent.
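At training time batch normalization normalizes with the current batch's statistics while updating moving averages; at inference it must use the moving averages instead (TensorFlow exposes this via the `training` flag). A minimal NumPy sketch of that distinction, without the learned scale/shift parameters:

```python
import numpy as np

class BatchNorm1D:
    def __init__(self, dim, momentum=0.99, eps=1e-3):
        self.moving_mean = np.zeros(dim)
        self.moving_var = np.ones(dim)
        self.momentum, self.eps = momentum, eps

    def __call__(self, x, training):
        if training:
            mean, var = x.mean(axis=0), x.var(axis=0)
            # Update running statistics for later use at inference.
            self.moving_mean = self.momentum * self.moving_mean + (1 - self.momentum) * mean
            self.moving_var = self.momentum * self.moving_var + (1 - self.momentum) * var
        else:
            # Inference must use accumulated statistics, not the batch's own.
            mean, var = self.moving_mean, self.moving_var
        return (x - mean) / np.sqrt(var + self.eps)
```

Using batch statistics at serving time (e.g. forgetting to flip the flag, or serving with batch size 1) is a classic source of train/serve skew.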

Sequence padding: Masking is applied to ignore padded zeros, improving AUC (+0.94%) and GAUC (+1.37%).
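A sketch of why masking matters for padded sequences (illustrative shapes; the real models apply the mask inside attention rather than in a mean pool): without the mask, padding zeros dilute the pooled representation.

```python
import numpy as np

def masked_mean_pool(seq_emb, seq_len):
    """Average only the real items of a zero-padded sequence.

    seq_emb: (T, d) embeddings; rows at index >= seq_len are padding zeros
    seq_len: number of real items in the sequence
    """
    T = seq_emb.shape[0]
    mask = (np.arange(T) < seq_len).astype(seq_emb.dtype)  # 1 = real, 0 = pad
    return (mask[:, None] * seq_emb).sum(axis=0) / max(seq_len, 1)

seq = np.array([[2., 2.], [4., 4.], [0., 0.], [0., 0.]])  # 2 real items, 2 pads
print(masked_mean_pool(seq, 2))   # [3. 3.]
print(seq.mean(axis=0))           # [1.5 1.5]  <- naive mean diluted by padding
```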

Data loading: Switching to tf.data.Dataset with parallel preprocessing raises GPU utilization from 10% to 50% and reduces training time from 7 h to 1 h.
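The tf.data pattern behind that speedup looks roughly like this (shapes and the preprocessing function are illustrative; the real pipeline reads samples from HDFS):

```python
import tensorflow as tf

# Stand-in training data; in production this would stream from HDFS files.
features = tf.random.uniform((1000, 16))
labels = tf.random.uniform((1000,), maxval=2, dtype=tf.int32)

dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(1000)
    .map(lambda x, y: (tf.math.l2_normalize(x, axis=-1), y),
         num_parallel_calls=tf.data.AUTOTUNE)  # parallel feature preprocessing
    .batch(256)
    .prefetch(tf.data.AUTOTUNE)                # overlap prep with training step
)

for xb, yb in dataset.take(1):
    print(xb.shape)  # (256, 16)
```

`num_parallel_calls` and `prefetch` let CPU-side preprocessing of the next batch overlap with training on the current one, which is where the GPU-utilization gain comes from.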

Multi‑task learning: A shared embedding and attention layer feed separate DNN towers for CTR and CVR, with a weighted sum loss; this yields stable improvements over the baseline XGBoost model.
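The weighted-sum loss over the two towers can be sketched as follows (a minimal NumPy illustration; tower outputs and task weights are stand-ins, and ESMM-style sample-space details are omitted):

```python
import numpy as np

def bce(y, p, eps=1e-7):
    """Binary cross-entropy, clipped for numerical stability."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def multitask_loss(y_ctr, p_ctr, y_cvr, p_cvr, w_ctr=1.0, w_cvr=1.0):
    # Weighted sum of the two towers' losses; the shared embedding and
    # attention layers receive gradients from both tasks.
    return w_ctr * bce(y_ctr, p_ctr) + w_cvr * bce(y_cvr, p_cvr)
```

Tuning `w_ctr`/`w_cvr` trades off the two objectives; the shared lower layers are what let the sparser CVR signal benefit from CTR data.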

Prediction Optimization

Serialization format: Converting feature lists to primitive arrays (int[]/float[]) cuts serialization latency (Figure 12).
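The same idea in Python terms (the serving stack itself presumably uses JVM int[]/float[] directly): a primitive-array layout stores each float in 4 bytes, versus a much larger boxed or text encoding, so there is simply less to serialize per request.

```python
import array
import json

floats = [0.125 * i for i in range(1000)]

json_bytes = json.dumps(floats).encode()     # text list encoding: verbose
packed = array.array("f", floats).tobytes()  # primitive float[] layout: 4 B/value

print(len(packed), len(json_bytes))          # packed is far smaller
```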

User‑sequence simplification: Only one copy of user‑dimensional data is sent per request; padding is performed offline, reducing data size by 90% and lowering timeout rates from 3‑5% to < 1% (Figure 13).

Deployment: CPU‑based inference outperforms GPU in this setting, because the overhead of transferring feature data to the GPU outweighs its compute advantage.

Online Results – The final multi‑task DIEN model outperforms the XGBoost baseline in both click‑through and conversion rates (Figures 14‑15). Latency remains stable (< 12 ms per batch of 20) with timeout ratios below 0.2%.

Conclusion & Outlook – The end‑to‑end pipeline—from offline sample generation to online serving—demonstrates how deep models (DIN, DIEN) and systematic engineering can substantially boost ranking performance in rental search. Future work includes embedding‑splitting for ultra‑sparse IDs, incorporating visual features via CNNs, and extending the approach to other verticals such as recruitment and commercial real estate.

Author: Bai Bo, Senior Algorithm Engineer, 58.com TEG Search Ranking Department.

Department: 58.com TEG Search Ranking provides NLP services, recall, and ranking for core business scenarios, serving millions of daily active users.

References

Cheng et al., "Wide & Deep Learning for Recommender Systems", DLRS Workshop at RecSys 2016.

Guo et al., "DeepFM: A Factorization-Machine Based Neural Network for CTR Prediction", IJCAI 2017.

Wang et al., "Deep & Cross Network for Ad Click Predictions", ADKDD 2017.

Zhou et al., "Deep Interest Network for Click-Through Rate Prediction", KDD 2018.

Zhou et al., "Deep Interest Evolution Network for Click-Through Rate Prediction", AAAI 2019.

Ma et al., "Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate", SIGIR 2018.

Tags: model optimization, feature engineering, deep learning, recommendation system, multi-task learning, search ranking
Written by 58 Tech, the official tech channel of 58, a platform for tech innovation, sharing, and communication.
