
Deep Learning Practices for Internet Real‑Estate Recommendation at 58.com

This article details the end‑to‑end deep‑learning pipeline used by 58.com for real‑estate recommendation, covering business background, a six‑layer architecture, vector‑based recall, various embedding and ranking models, multi‑task and multi‑scenario optimization techniques, and future directions for large‑model integration.

DataFunSummit

The presentation introduces the 58 real‑estate platform, describing its business scope (new homes, second‑hand homes, rentals, commercial properties, overseas listings, and decoration services) and the critical role of recommendation in connecting landlords, agents, and home seekers.

The recommendation system is organized into six layers:

- Data layer: offline and real‑time storage, with house vectors stored in Faiss.
- Computing layer: offline/online tasks, model training, Faiss vector search, and personalized retrieval.
- Recall layer: multiple strategies, including vector recall, commercial recall, interest recall, location‑based recall, re‑marketing, and hot‑item recall.
- Ranking layer: fine‑grained ranking models.
- Re‑ranking layer: weighting, deduplication, filtering, and shuffling.
- Application layer: online serving interfaces.
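The flow of a request through these layers can be sketched as follows. The stage functions here are toy stand‑ins (none of these names come from the talk), shown only to illustrate the recall → rank → re‑rank chain:

```python
# Toy stand-ins for the real recall strategies and models, just to show the flow.
def vector_recall(u, n):   return [f"v{i}" for i in range(n)]
def interest_recall(u, n): return [f"i{i}" for i in range(n)]
def location_recall(u, n): return [f"l{i}" for i in range(n)]
def hot_recall(u, n):      return [f"h{i}" for i in range(n)]

def rank_model(u, items):
    # Stand-in for the fine-grained ranking model.
    return sorted(items)

def rerank(items):
    # Re-ranking layer: dedupe (filtering/weighting/shuffling would go here too).
    seen, out = set(), []
    for it in items:
        if it not in seen:
            seen.add(it)
            out.append(it)
    return out

def serve(user_id, per_strategy=5, k=10):
    """Request flow: multi-strategy recall -> ranking -> re-ranking -> top-k."""
    candidates = []
    for strategy in (vector_recall, interest_recall, location_recall, hot_recall):
        candidates += strategy(user_id, per_strategy)
    return rerank(rank_model(user_id, candidates))[:k]
```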

Vectorized recall relies on preprocessing click sequences, filtering noisy clicks, handling modified listings, and oversampling connection actions. Embedding models include Skip‑gram (word2vec), DeepWalk, EGES (with side information), Graph Convolutional Networks, and a B4SR BERT‑based sequential recall model that uses masked token prediction and concatenated attribute embeddings.
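The sequence preprocessing described above might look like the following sketch. The dwell‑time threshold, minimum sequence length, and oversampling factor are illustrative assumptions, not 58.com's actual values:

```python
from collections import namedtuple

Click = namedtuple("Click", ["listing_id", "dwell_seconds", "is_connection"])

MIN_DWELL = 3          # hypothetical noise threshold: drop very short clicks
MIN_SEQ_LEN = 2        # drop sessions too short to train skip-gram on
CONNECTION_BOOST = 3   # hypothetical oversampling factor for contact actions

def build_training_sequences(sessions):
    """Turn raw click sessions into skip-gram training sequences:
    filter noisy clicks, drop tiny sessions, oversample sessions
    that contain a connection (contact) action."""
    sequences = []
    for session in sessions:
        seq = [c.listing_id for c in session if c.dwell_seconds >= MIN_DWELL]
        if len(seq) < MIN_SEQ_LEN:
            continue
        repeats = CONNECTION_BOOST if any(c.is_connection for c in session) else 1
        sequences.extend([seq] * repeats)
    return sequences
```

The resulting sequences would then be fed to a skip‑gram (word2vec‑style) trainer to produce listing embeddings.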

For user‑to‑item (U2I) matching, dual‑tower architectures with separate user and item towers are employed; SENET modules are added for feature weighting, and model ensembles combine MLP, CIN, DCN, FM, and similar components via Hadamard products and logistic regression. Sentence‑BERT replaces vanilla BERT for better sentence‑level matching between user queries and property descriptions.
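A minimal NumPy sketch of the dual‑tower retrieval flow: the random projections below stand in for the real learned user and item towers, and the brute‑force argsort stands in for a Faiss inner‑product index, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Toy "towers": in production each tower is a learned network over
# user/item features; here they are fixed random projections.
D_IN, D_EMB, N_ITEMS = 16, 8, 1000
W_user = rng.normal(size=(D_IN, D_EMB))
W_item = rng.normal(size=(D_IN, D_EMB))

item_features = rng.normal(size=(N_ITEMS, D_IN))
# Item vectors are computed offline and would be loaded into a Faiss index.
item_vectors = l2_normalize(item_features @ W_item)

def recall_top_k(user_features, k=10):
    """Inner-product retrieval; Faiss (e.g. an IP index) replaces
    the exhaustive argsort at production scale."""
    u = l2_normalize(user_features @ W_user)
    scores = item_vectors @ u
    return np.argsort(-scores)[:k]
```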

Multi‑task learning addresses both click‑through‑rate (CTR) and connection tasks. Early models such as ESMM handle CTR‑CVR dependency; later improvements explore MMoE, SNR, PLE (Progressive Layered Extraction), and LHUC‑PLE, which inject scene information via learned hidden‑unit contributions. GradNorm balances loss weights, and long‑term interest sequences are incorporated.
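ESMM's key idea, decomposing pCTCVR = pCTR × pCVR and supervising both factors over the full exposure space, can be sketched as a joint loss. This is a simplified illustration of the technique, not the production model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def esmm_loss(ctr_logits, cvr_logits, clicked, connected):
    """ESMM trains two towers on all exposures: pCTR is supervised
    by click labels, and pCTCVR = pCTR * pCVR is supervised by
    post-click connection labels, so the CVR tower is never fit on
    a biased click-only sample."""
    p_ctr = sigmoid(ctr_logits)
    p_ctcvr = p_ctr * sigmoid(cvr_logits)
    eps = 1e-7
    def bce(p, y):
        return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return bce(p_ctr, clicked) + bce(p_ctcvr, connected)
```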

Challenges include city‑ and business‑type specific recall, ensuring sufficient recall volume across regions, and handling bias from commercial exposure. Solutions involve multi‑partition Faiss services, weighted embedding concatenation, and dynamic weighting of member versus personal listings.
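Weighted embedding concatenation can be sketched as below; the per‑source weights (for example, down‑weighting heavily exposed member listings relative to personal ones) are hypothetical tuning knobs, not values from the talk:

```python
import numpy as np

def weighted_concat(embeddings, weights):
    """Concatenate embeddings from several sources, scaling each by a
    tunable weight before concatenation so over-represented sources
    (e.g. commercially exposed member listings) contribute less to
    the final similarity score."""
    return np.concatenate([w * np.asarray(e) for e, w in zip(embeddings, weights)])
```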

The presentation concludes with a summary of the system's components, ongoing optimization efforts, and plans to explore large‑model applications in real‑estate recommendation.

deep learning, recommendation system, vector search, Faiss, multi-task learning, real estate recommendation
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
