Personalized Demand Prediction and Ranking for Qunar App’s “You May Like” Card

This article describes how Qunar replaced a low‑click hot‑words card with a personalized “You May Like” recommendation card. It covers data collection, rule‑based and collaborative‑filtering association methods, learning‑to‑rank models (subjective Bayes, RankBoost, LambdaMART), training‑sample strategies, online experiments, evaluation metrics, and future plans including LSTM‑based sequence modeling.

Qunar Tech Salon

Aug 20, 2016

Background and Overall Solution

Qunar’s app originally displayed a history search card and an operational hot‑words card on the main search page, but the hot‑words card had a low click‑through rate (4%). To help users find interesting products faster, the hot‑words card was replaced with a “You May Like” recommendation card, aiming to increase clicks and reduce bounce rates.

Problem Statement

Major platforms such as Taobao, Amazon, Toutiao, and YouTube all predict user needs to drive their recommendations, and Qunar’s app likewise contains many recommendation pages and cards. The goal here is to improve the “You May Like” card on the app’s main search page.

In the previous version, the operational hot‑words card’s click‑through rate was only 4%; the new card is expected to raise clicks and lower the no‑input exit rate.

Solution Idea

Two steps are taken: first, identify demand points closely related to the user; second, rank these demand points to find the most interesting ones.

Discovering User‑Demand Associations

1. User Profile

The user profile includes permanent residence, device type, current location, and the historical behavior sequence. Logs from most Qunar business lines are collected in near real time (minute‑level latency).

2. Rule‑Based Association

A rule engine finds connections between users and demands; a learning‑to‑rank model then orders the demands.

3. Collaborative Filtering Based on User Behavior Sequences (A2B Transfer Matrix)

Traditional user‑based or item‑based CF suffers from data sparsity in tourism. Qunar proposes a sequence‑based CF: after behavior A, compute the probability of each possible behavior B occurring within T days, and select the highest‑probability B as the recommendation. The formula is shown in the following image.

Examples: users who searched for a Sanya flight tended to look at Sanya hotels within T days, and users who bought a ticket to Nanning tended to look at Guilin tours. Compared with traditional CF, this adds a temporal dimension.
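The sequence-based idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Qunar's implementation: the event format `(user, timestamp, behavior)`, the behavior labels, and the choice to estimate the probability as "users who did B within T days after A, divided by users who did A" are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def a2b_transfer(events, window_days):
    """Estimate P(B within T days after A) from (user, timestamp, behavior) events."""
    by_user = defaultdict(list)
    for user, ts, behavior in events:
        by_user[user].append((ts, behavior))

    users_with_a = defaultdict(set)         # A -> users who did A
    users_with_a_then_b = defaultdict(set)  # (A, B) -> users who did B soon after A
    window = timedelta(days=window_days)

    for user, seq in by_user.items():
        seq.sort()  # chronological order
        for i, (ts_a, a) in enumerate(seq):
            users_with_a[a].add(user)
            for ts_b, b in seq[i + 1:]:
                if ts_b - ts_a > window:
                    break  # later events fall outside the T-day window
                if b != a:
                    users_with_a_then_b[(a, b)].add(user)

    return {
        (a, b): len(users) / len(users_with_a[a])
        for (a, b), users in users_with_a_then_b.items()
    }


def best_next_behavior(transfer, a):
    """Return the behavior B with the highest estimated probability after A, or None."""
    candidates = {b: p for (src, b), p in transfer.items() if src == a}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical behavior labels, for illustration only.
events = [
    ("u1", datetime(2016, 1, 1), "search:flight:Sanya"),
    ("u1", datetime(2016, 1, 2), "view:hotel:Sanya"),
    ("u2", datetime(2016, 1, 1), "search:flight:Sanya"),
    ("u2", datetime(2016, 1, 20), "view:hotel:Sanya"),  # outside a 7-day window
    ("u3", datetime(2016, 1, 3), "search:flight:Sanya"),
    ("u3", datetime(2016, 1, 5), "view:hotel:Sanya"),
]
transfer = a2b_transfer(events, window_days=7)
print(best_next_behavior(transfer, "search:flight:Sanya"))
```

Counting distinct users rather than raw events keeps one very active user from dominating the estimate; other normalizations are equally plausible.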

4. LBS‑Based Collaborative Filtering

Statistics of the hottest search queries per city are combined with users’ permanent residence and device type. The calculation formula is illustrated below.

Examples: For Beijing locals, top searches are Happy Valley, Gubei Water Town, Beidaihe; for Beijing tourists, top searches are the Forbidden City, Summer Palace, Great Wall.
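A minimal sketch of the per-city statistics follows. The log fields (`home_city`, `current_city`, `query`) and the local/tourist split based on permanent residence are assumptions made for illustration; device type, which the text also mentions, is left out to keep the example short.

```python
from collections import Counter, defaultdict


def top_queries_by_segment(search_logs, k=3):
    """Return the k most frequent queries per (city, 'local' or 'tourist') segment."""
    counts = defaultdict(Counter)
    for log in search_logs:
        # Assumption: a user whose permanent residence equals the current city is a local.
        segment = "local" if log["home_city"] == log["current_city"] else "tourist"
        counts[(log["current_city"], segment)][log["query"]] += 1
    return {
        key: [query for query, _ in counter.most_common(k)]
        for key, counter in counts.items()
    }


# Hypothetical log records.
logs = [
    {"home_city": "Beijing", "current_city": "Beijing", "query": "Happy Valley"},
    {"home_city": "Beijing", "current_city": "Beijing", "query": "Happy Valley"},
    {"home_city": "Beijing", "current_city": "Beijing", "query": "Gubei Water Town"},
    {"home_city": "Shanghai", "current_city": "Beijing", "query": "Forbidden City"},
]
print(top_queries_by_segment(logs))
```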

5. Manual Rules

Over 120 handcrafted rules handle special scenarios, e.g., if a user books an early‑morning flight, recommend hotels near the airport.
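One way such a handcrafted rule might look in code is shown below. The `Demand` type, the order fields, and the 08:00 cutoff for "early morning" are all hypothetical; the source does not describe how its rules are encoded.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Demand:
    kind: str    # e.g. "hotel"
    target: str  # what to recommend
    rule: str    # name of the rule that produced this demand


def early_flight_airport_hotel(order: dict) -> Optional[Demand]:
    """Hypothetical rule: an early-morning flight suggests a hotel near the airport."""
    if order.get("type") != "flight":
        return None
    departure: datetime = order["departure_time"]
    if departure.hour < 8:  # assumed cutoff for "early morning"
        return Demand(
            kind="hotel",
            target=f"near {order['departure_airport']}",
            rule="early_flight_airport_hotel",
        )
    return None


order = {
    "type": "flight",
    "departure_time": datetime(2016, 8, 20, 6, 30),
    "departure_airport": "PEK",
}
print(early_flight_airport_hotel(order))
```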

Machine‑Learning Based Demand Ranking

Each rule acts as an expert, and each predicted demand is represented as a vector x = (x_1, ..., x_p), where p is the total number of experts (rules) and x_i is the confidence that expert i assigns to the demand (its rankfeature). A scoring function aggregates these features into a single ranking score.
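The representation can be sketched as follows, assuming a rule that did not fire contributes a confidence of 0. The rule names and the plain sum used as a stand-in scoring function are illustrative only; the actual models are described in the next section.

```python
def to_feature_vector(rule_confidences, rule_names):
    """Build the length-p vector x, using 0.0 for rules that did not fire."""
    return [rule_confidences.get(name, 0.0) for name in rule_names]


def rank_demands(candidates, rule_names, score):
    """Order candidate demands by score(x), highest first."""
    scored = [
        (score(to_feature_vector(confidences, rule_names)), demand)
        for demand, confidences in candidates.items()
    ]
    return [demand for _, demand in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical rule names and confidences.
rule_names = ["a2b", "lbs", "manual"]
candidates = {
    "hotel:Sanya": {"a2b": 0.6, "lbs": 0.2},
    "tour:Guilin": {"a2b": 0.3},
}
print(rank_demands(candidates, rule_names, score=sum))
```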

Evolution of Ranking Models

Three models were used sequentially:

Subjective Bayes (an expert‑system inference method from the 1970s) – simple, but assumes the rankfeatures are independent and requires them to be discretized; achieved 10% CTR.

RankBoost – combines multiple weak rankers, learns feature discretization automatically; achieved 18% CTR.

LambdaMART – integrates LambdaRank and GBDT, using regression trees to model dependencies between features; CTR has not yet improved noticeably because the number of rules is still small.
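To make the first model more concrete, here is a minimal sketch of the odds-based update used in subjective Bayesian inference: the prior odds of a click are multiplied by one likelihood ratio per piece of evidence, which is where the independence assumption and the need for discretized features come from. The bucket edges, ratio table, and prior below are invented numbers, not values from the article.

```python
def bucketize(confidence, edges):
    """Return the index of the first bucket whose upper edge exceeds the confidence."""
    for index, edge in enumerate(edges):
        if confidence < edge:
            return index
    return len(edges)


def posterior_click_probability(prior, likelihood_ratios):
    """Combine a prior with independent evidence via odds multiplication."""
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio  # independence: each ratio is applied separately
    return odds / (1.0 + odds)


# Hypothetical ratios P(bucket | click) / P(bucket | no click) per rule and bucket.
edges = [0.3, 0.7]
ratio_table = {
    "a2b": [0.5, 1.0, 3.0],
    "lbs": [0.5, 1.2, 2.0],
}
features = {"a2b": 0.8, "lbs": 0.1}
ratios = [ratio_table[rule][bucketize(value, edges)] for rule, value in features.items()]
print(posterior_click_probability(prior=0.1, likelihood_ratios=ratios))
```

RankBoost and LambdaMART remove the hand-built discretization step: the former learns thresholds through its weak rankers, and the latter through tree splits.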

Obtaining Objective Training Samples

Two approaches:

Before launch, use historical user behavior as training data – this is inaccurate (click habits differ between entry points) and inflexible (a new business line must first accumulate logs).

After launch, use logs from the “You May Like” card – limited to items already shown, causing a Matthew‑effect where low‑ranked experts never get exposure.

To evaluate new rules, random items generated by a rule are inserted into live displays, and clicks are monitored. This online experiment minimally impacts user experience.
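A sketch of this kind of insertion is given below. The exploration rate, the choice to overwrite one random slot rather than lengthen the card, and the function name are assumptions; the source does not say how often or where the items are inserted.

```python
import random


def insert_exploration_item(ranked_items, rule_items, explore_rate, rng=random):
    """With probability explore_rate, replace one random slot with a rule-generated item.

    Returns the list to display and the explored item (or None) so that clicks
    on it can be attributed to the rule under evaluation.
    """
    shown = list(ranked_items)
    candidates = [item for item in rule_items if item not in shown]
    if not shown or not candidates or rng.random() >= explore_rate:
        return shown, None
    explored = rng.choice(candidates)
    shown[rng.randrange(len(shown))] = explored
    return shown, explored


rng = random.Random(0)
shown, explored = insert_exploration_item(
    ["hotel:Sanya", "tour:Guilin", "ticket:Great Wall"],
    ["hotel:near PEK"],
    explore_rate=0.05,  # assumed small rate to limit the impact on users
    rng=rng,
)
print(shown, explored)
```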

Sampling Constraints

Constraint 1: Sample counts for each rule should be proportional to the rule’s original trigger rate. Violating this skews model training.

Constraint 2: Total sample weight for each rule should match the original trigger‑rate ratio. This is achieved by adjusting each sample’s weight with a correction factor w.

After weight correction, rule B can be sampled more often while preserving the overall weight ratio (e.g., 2:30).
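The article does not give the formula for w. One plausible reading, shown here as an assumption, is to set each rule's per-sample weight to its share of original triggers divided by its share of collected samples, so that the total weight per rule returns to the original ratio. The counts below are hypothetical and chosen to mirror the 2:30 example.

```python
def correction_weights(trigger_counts, sample_counts):
    """Per-sample weight for each rule: trigger share divided by sample share."""
    total_triggers = sum(trigger_counts.values())
    total_samples = sum(sample_counts.values())
    weights = {}
    for rule, sampled in sample_counts.items():
        trigger_share = trigger_counts[rule] / total_triggers
        sample_share = sampled / total_samples
        weights[rule] = trigger_share / sample_share
    return weights


# Hypothetical counts: rule B triggers rarely (2 vs 30) but is oversampled (20 vs 30).
trigger_counts = {"A": 30, "B": 2}
sample_counts = {"A": 30, "B": 20}

w = correction_weights(trigger_counts, sample_counts)
total_weight = {rule: sample_counts[rule] * w[rule] for rule in sample_counts}
print(w)
print(total_weight["B"] / total_weight["A"])  # ratio of total weights, B to A
```

With these weights the model sees many more examples from rule B, while the weighted contribution of B relative to A remains 2:30.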

Evaluation Metrics and Future Work

The primary metric is click‑through rate; secondary metrics include user bounce rate on the main search page.

Formulas for CTR and bounce rate are shown below.
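The two metrics are commonly defined as simple ratios. The definitions below are assumptions based on standard usage (clicks per card impression, and page visits that end without any input or click per total visits); the article's own formulas may differ in detail.

```python
def click_through_rate(card_clicks, card_impressions):
    """Clicks on the card divided by the number of times the card was shown."""
    return card_clicks / card_impressions if card_impressions else 0.0


def bounce_rate(exits_without_action, page_visits):
    """Visits to the main search page that end with no input or click, over all visits."""
    return exits_without_action / page_visits if page_visits else 0.0


# Hypothetical counts for one day.
print(click_through_rate(card_clicks=180, card_impressions=1000))
print(bounce_rate(exits_without_action=250, page_visits=1000))
```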

Since the “You May Like” project launched in December 2015, CTR has steadily improved, as illustrated by the performance curve.

Future plans include adding finer‑grained rules, further tuning the LambdaMART model, and introducing an LSTM‑based RNN to capture full user behavior sequences for better demand prediction.
