How Video Search Engines Rank Results: From Click Models to Multi‑Goal Optimization
This article explains the architecture of video search engine ranking, covering optimization objectives such as relevance, click‑through rate and watch time, and detailing pointwise, pairwise and listwise learning approaches, model training pipelines, and online serving strategies.
In a video search engine, the ranking system sits after indexing, recall, and query understanding, determining the final order of results shown to users. Unlike the recall stage, ranking emphasizes precision rather than recall, processing a smaller candidate set with more complex models.
Optimization Goals
Optimization Target: the choice of objective shapes content bias; optimizing click-through rate (CTR) favors easily clickable videos, while watch-time targets favor longer engagement. Multi-objective designs combine CTR with effective play time.
Model Structure: the ranking model maps features x to a prediction f(x) that approximates the target y, determining how efficiently the features are used.
Feature Engineering: features may include user-side signals and their representations.
Relevance Modeling: to avoid over-reliance on click signals, the system must ensure that the ranking respects content relevance.
Beyond these core topics, the ranking pipeline also involves model training, sample selection, position bias correction, and feature discretization.
Click‑Through Rate (CTR) Model (Pointwise)
The CTR model treats each sample independently. For a sample with observed click label \(\overline{P}\) and predicted click probability \(P\), the loss is the cross-entropy:

\(L = -\overline{P} \log P - (1 - \overline{P}) \log (1 - P)\)
Although the model outputs calibrated click probabilities, ranking often cares about the relative order of documents. Pointwise objectives act on single documents, while pairwise objectives consider document pairs, directly optimizing the ordering relationship.
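The pointwise cross-entropy can be sketched in a few lines; this is a minimal NumPy illustration, not the production implementation:

```python
import numpy as np

def pointwise_ctr_loss(p_pred, y_click, eps=1e-12):
    """Cross-entropy between observed click labels and predicted CTR.

    p_pred:  predicted click probabilities in (0, 1)
    y_click: binary click labels (1 = clicked, 0 = skipped)
    """
    p = np.clip(p_pred, eps, 1 - eps)  # guard against log(0)
    return -np.mean(y_click * np.log(p) + (1 - y_click) * np.log(1 - p))

# Each (query, video) sample is scored independently of all others.
preds = np.array([0.9, 0.2, 0.6])
labels = np.array([1.0, 0.0, 1.0])
loss = pointwise_ctr_loss(preds, labels)
```

Because every sample contributes to the loss on its own, nothing in this objective directly compares two documents for the same query, which is the gap pairwise learning fills.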
Pairwise Learning
Pairwise loss aligns the model‑predicted ordering probability \(P_{ij}\) with the empirical ordering probability \(\overline{P}_{ij}\) using a cross‑entropy term:

\(C_{ij} = -\overline{P}_{ij} \log P_{ij} - (1 - \overline{P}_{ij}) \log (1 - P_{ij}), \quad P_{ij} = \frac{1}{1 + e^{-(s_i - s_j)}}\)

where \(s_i\) and \(s_j\) are the model scores of documents i and j. This encourages the model’s ranking to match the ground‑truth ordering.
Experiments on Disney+ search logs across languages show that the pairwise approach improves AUC, leading to a ~1% lift in click position and a 1‑2% increase in CTR.
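A RankNet-style pairwise cross-entropy can be sketched as follows (a minimal illustration; the empirical probability convention of 1.0 / 0.5 / 0.0 for preferred / tied / dispreferred pairs is the standard RankNet choice, assumed here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pairwise_loss(s_i, s_j, p_bar_ij, eps=1e-12):
    """RankNet-style pairwise cross-entropy.

    s_i, s_j: model scores for documents i and j
    p_bar_ij: empirical probability that i should rank above j
              (1.0 if i was preferred, 0.0 if j was, 0.5 if tied)
    """
    p_ij = sigmoid(s_i - s_j)           # predicted P(i ranks above j)
    p_ij = np.clip(p_ij, eps, 1 - eps)
    return -(p_bar_ij * np.log(p_ij) + (1 - p_bar_ij) * np.log(1 - p_ij))

# Document i was clicked while j was skipped, so i is preferred.
loss_correct = pairwise_loss(2.0, 0.5, 1.0)  # model already agrees: small loss
loss_wrong = pairwise_loss(0.5, 2.0, 1.0)    # model disagrees: large loss
```

The loss depends only on the score difference \(s_i - s_j\), so the gradient directly pushes preferred documents above dispreferred ones.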
Listwise Learning (ListNet)
ListNet extends the idea to a whole query list, using the top‑1 probability to model the chance that a document is ranked first:

\(P(d_i) = \frac{e^{s_i}}{\sum_{j} e^{s_j}}\)

The loss is the cross‑entropy between the top‑1 distributions induced by the ground‑truth and predicted scores. In practice, ListNet yields an additional ~0.3% CTR gain over pairwise models.
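The top-1 probability is simply a softmax over one query's result list; a minimal sketch of the ListNet loss (illustrative, not the production code):

```python
import numpy as np

def top1_prob(scores):
    """ListNet top-1 probability: softmax over one query's score list."""
    e = np.exp(scores - np.max(scores))  # shift for numerical stability
    return e / e.sum()

def listnet_loss(pred_scores, target_scores):
    """Cross-entropy between target and predicted top-1 distributions."""
    p_target = top1_prob(np.asarray(target_scores, dtype=float))
    p_pred = top1_prob(np.asarray(pred_scores, dtype=float))
    return -np.sum(p_target * np.log(p_pred))

# One query session: ground-truth grades vs. model scores for four videos.
loss = listnet_loss(pred_scores=[2.0, 1.0, 0.5, 0.1],
                    target_scores=[3.0, 1.0, 0.0, 0.0])
```

Unlike the pairwise loss, this objective sees the entire list at once, so a single gradient step accounts for every document's position in the session.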
Duration Model and Clickbait Mitigation
Analysis reveals that 20‑30% of clicks lack subsequent playback, indicating clickbait. To address this, a duration model either filters out short plays (thresholding) or weights longer plays more heavily. Normalizing play time by video type and applying a log‑transformed weight improves relevance while avoiding over‑promotion of long videos.
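The threshold-plus-log-weight idea can be sketched as below; the function name, the 10-second cutoff, and the per-type mean are illustrative assumptions, not values from the source:

```python
import numpy as np

def play_weight(play_seconds, type_mean_seconds, min_play=10.0):
    """Weight a clicked sample by its play time (illustrative sketch).

    - Clicks with almost no playback are treated as clickbait and dropped.
    - Play time is normalized by the mean play time of the video's type,
      then log-transformed so very long videos are not over-promoted.
    """
    if play_seconds < min_play:        # filter: clickbait-like click
        return 0.0
    normalized = play_seconds / type_mean_seconds
    return float(np.log1p(normalized))  # log(1 + x) flattens the long tail

w_short = play_weight(5.0, type_mean_seconds=300.0)      # filtered out
w_typical = play_weight(300.0, type_mean_seconds=300.0)  # log(2)
w_long = play_weight(3000.0, type_mean_seconds=300.0)    # grows only slowly
```

The per-type normalization keeps a fully watched trailer and a fully watched movie comparable, and the log keeps a 10x longer play from earning a 10x larger weight.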
Multi‑Objective Framework
The final online scoring function combines relevance (\(y_{rel}\)), click (\(y_{click}\)), and normalized play time (\(y_{play}\)) predictions with learnable weights \(w_{Rel}\), \(w_{CTR}\), \(w_{VCR}\) and hyper‑parameters α, β, w.
This design ensures relevance is explicitly modeled, preventing conflicts between relevance and ranking scores, and treats click and play signals separately because they reflect distinct user decisions.
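The source does not reproduce the exact combination formula, but one plausible form, sketched here purely as an assumption, is a multiplicative blend in which α and β act as smoothing offsets so a zero click or play prediction cannot zero out the whole score:

```python
def final_score(y_rel, y_click, y_play,
                w_rel=1.0, w_ctr=0.5, w_vcr=0.5,
                alpha=0.1, beta=0.1):
    """Hypothetical multiplicative blend of the three predictions.

    All weight and offset values here are illustrative, not the
    production settings.
    """
    return (y_rel ** w_rel) \
        * ((alpha + y_click) ** w_ctr) \
        * ((beta + y_play) ** w_vcr)

# A highly relevant video with modest engagement still outranks a
# clickbait-like one with high CTR but low relevance and play time.
s_relevant = final_score(y_rel=0.9, y_click=0.3, y_play=0.5)
s_clickbait = final_score(y_rel=0.3, y_click=0.9, y_play=0.1)
```

A multiplicative form makes relevance a gate rather than just one additive term: a low relevance score suppresses the whole product, which matches the article's point that relevance must be modeled explicitly rather than inferred from clicks.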
Training Sample Collection
Positive samples are derived from clicked results; intermediate query states within a search session are also treated as positives. Invalid queries, low‑frequency clicks, and logs from abnormal server states are filtered out. Training data are shuffled hourly and split into N partitions to reduce training latency.
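The filtering and partitioning steps can be sketched as follows; the field names and the frequency threshold are illustrative assumptions, not details from the source:

```python
import random

def keep_sample(sample, min_query_freq=5):
    """Drop invalid queries, low-frequency clicks, and abnormal-server logs."""
    if not sample.get("query"):                # invalid / empty query
        return False
    if sample.get("query_freq", 0) < min_query_freq:
        return False
    if sample.get("server_abnormal", False):   # logged during an outage
        return False
    return True

def shuffle_and_partition(samples, n_parts, seed=0):
    """Shuffle one hour's samples, then split into N training partitions."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]

logs = [
    {"query": "frozen", "query_freq": 40},
    {"query": "", "query_freq": 100},                         # invalid query
    {"query": "rare typo", "query_freq": 1},                  # low frequency
    {"query": "frozen 2", "query_freq": 12, "server_abnormal": True},
]
clean = [s for s in logs if keep_sample(s)]
parts = shuffle_and_partition(clean * 8, n_parts=4)
```

Shuffling before partitioning keeps each partition statistically similar, so the N partitions can be trained in parallel without any one of them seeing a skewed hour of traffic.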
Conclusion
This two‑part series first introduced the ranking system’s objectives and data pipeline; the next article will dive deeper into model architectures and feature engineering.
Hulu Beijing