Mining User Housing Preference Schemes with Supply‑Filtered Tree‑Based Methods
The article proposes a supply‑filtered, tree‑based approach to discovering multi‑dimensional user housing preference schemes and contrasts it with fixed‑length preference mining methods. It details four algorithmic modules (split‑point search, similarity calculation, split suppression, and user clustering) intended to improve interpretability and offline applicability.
Background
Accurate understanding of user preferences is a prerequisite for online‑offline matchmaking in real‑estate services. Existing industry work emphasizes accuracy but treats preference as an influence factor rather than an explicit output, limiting interpretability for downstream offline applications.
What is a user housing scheme?
Users typically filter a property list on only a few dimensions; the selected dimensions reflect the attributes they care about. This leads to two sub‑problems: identifying the attributes a user cares about and the values within those attributes. The paper defines the combination of active (decision‑influencing) and passive (by‑product) preferences as a user housing scheme.
Fixed‑length combination preference mining methods require a priori selection of preference dimensions (e.g., price, layout). Example tables illustrate how user IDs are mapped to preference distributions for attributes such as community and price.
Common approaches
Linear‑weighted mining: simple, deterministic, but requires manually set behavior and time weights, making optimization costly.
Supervised‑model mining: formulates preference extraction as a supervised problem (e.g., Seq‑Rec), avoiding manual weights but demanding extensive labeled data and higher development effort.
Limitations of fixed‑length encoding: mixes active and passive preferences, incurs high computational cost for high‑dimensional joint distributions, and depends on strong coverage of user behavior.
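As a rough illustration of linear‑weighted mining, the sketch below scores each attribute value by summing a behavior weight times a time‑decay weight over a user's events, then normalizes the scores into a preference distribution. The behavior names, weight values, and the exponential half‑life are all hypothetical; they stand in for the manually set weights the text refers to.

```python
from collections import defaultdict

# Hypothetical hand-set weights: the manual tuning the text describes.
BEHAVIOR_WEIGHTS = {"view": 1.0, "favorite": 3.0, "contact_agent": 5.0}
HALF_LIFE_DAYS = 7.0  # assumed time-decay half-life, not from the article


def linear_weighted_preference(events):
    """events: iterable of (attribute_value, behavior, days_ago).

    Returns a normalized preference distribution over attribute values.
    """
    scores = defaultdict(float)
    for value, behavior, days_ago in events:
        time_weight = 0.5 ** (days_ago / HALF_LIFE_DAYS)
        scores[value] += BEHAVIOR_WEIGHTS[behavior] * time_weight
    total = sum(scores.values())
    return {v: s / total for v, s in scores.items()} if total else {}


events = [
    ("2-bedroom", "view", 0),
    ("2-bedroom", "favorite", 7),
    ("3-bedroom", "view", 7),
]
prefs = linear_weighted_preference(events)
```

Every number in `BEHAVIOR_WEIGHTS` and `HALF_LIFE_DAYS` has to be chosen and re‑tuned by hand, which is the optimization cost noted above.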
Proposed tree‑based method for housing schemes
Split‑point search module: recursively selects attribute‑value conditions that maximize similarity between the user‑visited property set and the supply set.
Similarity calculation module: measures similarity from spatial, probabilistic, or NLP perspectives, ensuring comparability across different attribute granularities.
Split‑suppression module: prevents over‑splitting that would create overly small subsets, balancing similarity gain against computational cost and randomness.
User clustering module: aggregates sparse user interaction data into clusters to improve robustness for low‑activity users and reduce offline computation.
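The article does not give a formula for the similarity calculation. As one possible reading of the probabilistic perspective, the sketch below compares the distribution of a single attribute in the user‑visited set with its distribution in the supply set, using one minus the Jensen‑Shannon divergence. With base‑2 logarithms the result is bounded in [0, 1], which keeps scores comparable across attributes with different numbers of values. The function names and sample data are assumptions made for illustration.

```python
import math
from collections import Counter


def distribution(values):
    """Empirical distribution of a non-empty list of categorical values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def js_similarity(p, q):
    """1 - Jensen-Shannon divergence with base-2 logs; result lies in [0, 1]."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with a[k] == 0 contribute nothing.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 1.0 - 0.5 * (kl(p) + kl(q))


# Hypothetical data: room layouts in the visited set versus the supply set.
visited_rooms = ["2", "2", "2", "3"]
supply_rooms = ["2", "3", "3", "3"]
sim = js_similarity(distribution(visited_rooms), distribution(supply_rooms))
```

Identical distributions score 1 and distributions with no shared values score 0, regardless of how many distinct values the attribute has.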
The algorithm takes the user‑visited set I and the supply set S as input, searches for the attribute P and value p that maximize similarity, partitions I and S into left and right sub‑trees on that condition, and recurses until the suppression criteria are met. Each root‑to‑leaf path yields a candidate housing scheme with an associated intensity score.
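The recursion can be sketched as follows, under several assumptions. The scoring function is pluggable, and the default `share_gap` is only a stand‑in (how much more often the user hits a condition than supply alone would suggest), not the article's similarity measure. Suppression is modeled as a minimum subset size plus a minimum score, and leaf intensity is taken to be the share of the user's visits that reach the leaf. All names and thresholds here are hypothetical.

```python
def share_gap(i_left, i_all, s_left, s_all):
    """Stand-in score (an assumption): visited share minus supply share."""
    supply_share = len(s_left) / len(s_all) if s_all else 0.0
    return len(i_left) / len(i_all) - supply_share


def mine_schemes(visited, supply, score_fn=share_gap, min_size=2,
                 min_gain=0.1, path=(), total=None):
    """visited, supply: lists of dicts mapping attribute -> categorical value.

    Returns a list of (path, intensity) pairs, one per leaf.
    """
    if not visited:
        return []
    total = len(visited) if total is None else total
    best = None
    for attr in sorted({a for item in visited for a in item}):
        for value in sorted({item[attr] for item in visited if attr in item}):
            i_left = [x for x in visited if x.get(attr) == value]
            i_right = [x for x in visited if x.get(attr) != value]
            # Split suppression: refuse splits that leave a tiny subset.
            if len(i_left) < min_size or len(i_right) < min_size:
                continue
            s_left = [x for x in supply if x.get(attr) == value]
            gain = score_fn(i_left, visited, s_left, supply)
            if gain >= min_gain and (best is None or gain > best[0]):
                best = (gain, attr, value, i_left, i_right, s_left)
    if best is None:
        # Leaf: the accumulated conditions form one candidate scheme.
        return [(path, len(visited) / total)]
    _, attr, value, i_left, i_right, s_left = best
    s_right = [x for x in supply if x.get(attr) != value]
    left = mine_schemes(i_left, s_left, score_fn, min_size, min_gain,
                        path + ((attr, "==", value),), total)
    right = mine_schemes(i_right, s_right, score_fn, min_size, min_gain,
                         path + ((attr, "!=", value),), total)
    return left + right


# Hypothetical data: the user mostly visits 2-room homes, which are scarce.
visited = [{"district": "A", "rooms": "2"}] * 4 + [{"district": "B", "rooms": "3"}] * 2
supply = ([{"district": "A", "rooms": "2"}] * 2
          + [{"district": "B", "rooms": "3"}] * 4
          + [{"district": "A", "rooms": "3"}] * 4)
schemes = mine_schemes(visited, supply)
```

In this toy data the split lands on `rooms == "2"` rather than on district, because district A is common in supply while 2‑room homes are not; that is the effect supply filtering is meant to capture.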
Evaluation measures the rank of the scheme containing the actually transacted property, both per user (percentile) and averaged across users, to assess how well the extracted schemes explain real purchases.
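A minimal sketch of this evaluation, assuming each scheme is a set of equality conditions with an intensity score: schemes are sorted by intensity, the rank of the first scheme that contains the transacted property is converted to a percentile (lower is better), and percentiles are averaged across users. Treating a property that no scheme contains as the worst percentile is an assumed convention, not something the article specifies.

```python
def matches(conditions, prop):
    """True if the property satisfies every attribute == value condition."""
    return all(prop.get(attr) == value for attr, value in conditions.items())


def rank_percentile(schemes, transacted):
    """schemes: list of (conditions, intensity) pairs.

    Returns rank / number of schemes for the first matching scheme,
    or 1.0 when no scheme matches (assumed convention).
    """
    ordered = sorted(schemes, key=lambda s: s[1], reverse=True)
    for rank, (conditions, _) in enumerate(ordered, start=1):
        if matches(conditions, transacted):
            return rank / len(ordered)
    return 1.0


def mean_percentile(per_user):
    """per_user: non-empty list of (schemes, transacted_property) pairs."""
    values = [rank_percentile(s, t) for s, t in per_user]
    return sum(values) / len(values)


# Hypothetical users: the first buys from their top scheme, the second does not.
user_1 = ([({"rooms": "2"}, 0.6), ({"district": "B"}, 0.3), ({"rooms": "3"}, 0.1)],
          {"rooms": "2", "district": "A"})
user_2 = ([({"district": "A"}, 0.7), ({"district": "B"}, 0.3)],
          {"rooms": "3", "district": "B"})
average = mean_percentile([user_1, user_2])
```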
Conclusion and outlook
The supply‑filtered scheme mining method better captures the factors influencing user decisions in scenarios with many property attributes and strong supply constraints, offering improved online‑offline consistency and paving the way for broader downstream applications.
Beike Product & Technology
