
Applying Large Language Models to Real Estate Recommendation: Case Studies and Optimization Techniques

This article presents a comprehensive case study on how large language models are integrated into 58.com’s real‑estate recommendation platform, detailing challenges, data adaptation, prompt and parameter optimizations, embedding generation, conversational recommendation, and future directions for multimodal and generative recommendation systems.

58 Tech

Since the emergence of ChatGPT, large language models (LLMs) have driven technological transformation across many domains, yet their application to real‑estate recommendation still faces three main bottlenecks: the difficulty of adapting structured property data to text‑oriented models, the latency demands of real‑time response, and the unclear mode of LLM participation (direct generation of recommendations versus assistance with feature engineering).

To address these issues, the 58.com Housing Business Group (HBG) recommendation algorithm team partnered with the 58.com AI Lab, conducting multi‑business, multi‑scenario, and multi‑mode experiments that focus on LLM‑based user‑profile inference and embedding generation.

2.1 LLM Profile Inference Case – This strategy combines user behavior signals (searches, clicks, favorites, calls, chat messages) with property attributes (price, layout, location, description, images) to construct a textual prompt describing the user's house‑search path. The LLM processes this prompt and outputs a textual description of the user's preferred property features, which is then parsed into structured profile data used in the recall, fine‑ranking, and re‑ranking stages. The detailed steps include large‑scale behavior data collection, extraction of both structured and unstructured property features, and integration of the generated profile into the recommendation pipeline.
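A minimal sketch of the prompt‑construction and parsing steps described above; the field names, the JSON output contract, and the wording are assumptions for illustration, not the team's actual prompt:

```python
import json

def build_profile_prompt(behaviors):
    """Assemble a textual house-search path from behavior logs.

    `behaviors` is a list of dicts like
    {"action": "click", "layout": "3BR", "district": "Chaoyang", "price": 520}
    (price in 10k CNY). The schema is a hypothetical example.
    """
    lines = [
        f"- {b['action']}: {b['layout']} in {b['district']}, "
        f"total price {b['price']}万"
        for b in behaviors
    ]
    return (
        "You are a real-estate advisor. Based on the user's recent "
        "house-search path below, describe the property features the user "
        "prefers as JSON with keys district, layout, price_range.\n"
        + "\n".join(lines)
    )

def parse_profile(llm_output):
    """Parse the LLM's textual answer into a structured profile dict
    that downstream recall / ranking stages can consume."""
    return json.loads(llm_output)
```

In production the parsed dict would feed the recall, fine‑ranking, and re‑ranking stages as ordinary structured features.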

Optimization techniques include prompt engineering (role‑play, chain‑of‑thought, and a formatted placeholder to fix the number of returned entities, e.g., {Region1:Q1, BusinessCircle1:S1, …, Price:xx‑xx million, Area:xx‑xx sqm} ), and parameter tuning (setting temperature to 0.08 for more deterministic outputs). These refinements raised the average per‑user connection count by 2.37%, with a 5.33% lift in first‑tier cities.
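The combination of role‑play, a fixed‑entity output format, and a low temperature can be sketched as a thin wrapper; the `complete` callable and the exact prompt wording stand in for the team's serving API and are assumptions:

```python
PROMPT_TEMPLATE = (
    "Act as a veteran real-estate consultant. Think step by step about the "
    "user's house-search path, then answer ONLY in this exact format: "
    "{Region1:Q1, BusinessCircle1:S1, ..., Price:xx-xx million, Area:xx-xx sqm}"
)

def infer_profile(complete, search_path):
    # `complete` is any callable (prompt, temperature=...) -> str wrapping
    # the LLM serving endpoint; temperature 0.08 keeps output
    # near-deterministic, as described in the article.
    prompt = PROMPT_TEMPLATE + "\nUser path:\n" + search_path
    return complete(prompt, temperature=0.08)
```

Fixing the number and order of returned entities in the template makes the output parseable with simple string handling instead of free‑form extraction.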

A knowledge base of similar districts and business circles was also built offline to enrich the recommendation pool.
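A toy sketch of how such an offline knowledge base might be consumed at recall time; the district names and the mapping itself are invented examples, and how the table is actually built (e.g., from co‑click statistics) is not specified in the article:

```python
# Illustrative offline knowledge base: district -> similar districts.
SIMILAR_DISTRICTS = {
    "Chaoyang": ["Haidian", "Dongcheng"],
    "Haidian": ["Chaoyang", "Changping"],
}

def expand_recall_pool(preferred_districts):
    """Enrich the candidate pool with districts similar to the user's picks,
    preserving order and avoiding duplicates."""
    pool = list(preferred_districts)
    for d in preferred_districts:
        for sim in SIMILAR_DISTRICTS.get(d, []):
            if sim not in pool:
                pool.append(sim)
    return pool
```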

2.2 LLM Embedding Recommendation Case – Instead of textual profiles, this approach feeds user and property textual descriptions into the LLM to obtain high‑dimensional vectors (embeddings) that are directly used for similarity‑based recall and ranking. The team employed the self‑developed Wuba Text Embedding (WTE) family, specifically the WTE‑chatling‑7b model built on the Lingxi chatling‑turbo backbone, which incorporates bidirectional attention, extensive pre‑training on domain‑specific corpora, and a contrastive loss with expanded negative samples.
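The similarity‑based recall over such embeddings reduces to a cosine ranking; this generic sketch assumes precomputed user and property vectors (any embedding model, including WTE, would fit the same shapes):

```python
import numpy as np

def cosine_recall(user_vec, item_vecs, top_k=3):
    """Rank properties by cosine similarity to the user embedding.

    user_vec: (d,) array; item_vecs: (n, d) array of property embeddings.
    Returns the indices of the top_k most similar properties.
    """
    u = user_vec / np.linalg.norm(user_vec)
    m = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    scores = m @ u                      # cosine similarity per property
    return np.argsort(-scores)[:top_k]  # highest similarity first
```

In production this brute‑force ranking would typically be replaced by an approximate nearest‑neighbor index over the same vectors.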

Optimization techniques for embeddings include: (1) Prompt refinement that preserves numeric IDs and reorders features to improve token importance; (2) Model acceleration using Rust‑based Text Embedding Inference and Python‑based SGLang frameworks, achieving a ten‑fold speed increase over traditional sentence‑transformer inference; (3) Exploiting embedding characteristics such as sensitivity to IDs, token positions, and high‑frequency terms, e.g., converting “total price is 5 million” to “price 5 million”. Experiments showed a 6.61% rise in per‑user connections for commercial‑real‑estate recommendation pages.
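The third technique, compressing verbose phrasings so high‑signal tokens dominate, can be sketched as a rule‑based rewrite; the article's concrete example is the "total price is 5 million" to "price 5 million" mapping, and the second rule here is an analogous assumption:

```python
import re

def normalize_for_embedding(text):
    """Compress verbose phrasings before embedding, so that high-frequency
    low-signal words do not dilute the important tokens."""
    rules = [
        (r"total price is\s+", "price "),  # example given in the article
        (r"the area is\s+", "area "),      # analogous rule, assumed
    ]
    for pattern, repl in rules:
        text = re.sub(pattern, repl, text, flags=re.IGNORECASE)
    return text
```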

2.3 Conversational Recommendation Case – In the new‑home smart micro‑chat assistant, the system parses multimodal user inputs (text, voice, images) in real time, distinguishes between Q&A, casual chat, and recommendation intents, and leverages contextual understanding to recommend properties that match the inferred preferences, dramatically improving accuracy and user experience.
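The three‑way intent split can be illustrated with a deliberately simplified router; the production assistant uses an LLM over multimodal input, so the keyword rules below are a stand‑in, not the actual classifier:

```python
def route_intent(message):
    """Toy router for the Q&A / casual chat / recommendation split.

    Keyword and punctuation heuristics only illustrate the three-way
    decision; the real system classifies with an LLM over text, voice
    transcripts, and images.
    """
    text = message.lower()
    if any(k in text for k in ("recommend", "looking for", "show me")):
        return "recommendation"
    if text.endswith("?"):
        return "qa"
    return "chitchat"
```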

Outlook – Future work will focus on three directions: (1) Multimodal fusion to incorporate images, videos, and floor plans alongside text; (2) Generative recommendation, moving from pure matching to creating personalized property descriptions and virtual advisor interactions; (3) Ecosystem co‑construction, open‑sourcing the end‑to‑end pipeline for broader business adoption.

Conclusion – By continuously integrating LLM capabilities with domain expertise, high‑performance inference frameworks, and systematic optimization, the HBG algorithm team has demonstrated substantial business impact and sets a foundation for the next generation of intelligent real‑estate recommendation systems.

Tags: prompt engineering, large language models, embedding, recommendation systems, real estate, AI optimization
Written by 58 Tech, the official tech channel of 58 and a platform for tech innovation, sharing, and communication.
