Why Vector Retrieval Outperforms Keyword Search for Personalized Video Discovery

This article explains how modern video platforms combine traditional keyword retrieval with deep‑learning‑based vector retrieval, detailing model architectures, attention mechanisms, personalization features, offline experiments, and online A/B results that show significant improvements in recall, relevance, and user experience.

Hulu Beijing

May 26, 2022

Document retrieval, also called matching, is the process of finding relevant content in a candidate set. Its goal is to maximize the recall of target documents.

When recall is measured over the top‑k retrieved documents, it can be formalized as:

$$\mathrm{Recall@}k = \frac{|T|}{|N|}$$

Here T is the set of relevant documents retrieved in the top k, and N is the full set of relevant documents.

Keyword retrieval and vector retrieval are the two essential methods in search systems. Keyword retrieval is simple and accurate, requires no pre‑trained model, and avoids cold‑start problems, which made it the baseline recall strategy before deep learning became prevalent. As deep learning matured in information retrieval, vector retrieval has become increasingly important.
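The top‑k recall metric defined above can be computed directly from a ranked result list. The following is a minimal sketch; the function name `recall_at_k` and the document ids are illustrative and do not come from the original system.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k results."""
    relevant = set(relevant_ids)           # N: all relevant documents
    if not relevant:
        return 0.0
    retrieved_relevant = set(ranked_ids[:k]) & relevant  # T: relevant docs in top-k
    return len(retrieved_relevant) / len(relevant)       # |T| / |N|


# Hypothetical ranked result list and relevance labels.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d4", "d8"}
print(recall_at_k(ranked, relevant, k=3))
print(recall_at_k(ranked, relevant, k=5))
```

Because "d8" is never retrieved in this example, recall stays below 1 however large k becomes, which is the situation a second recall channel is meant to address.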

These two methods complement each other to form a relatively complete recall system, as shown in Figure 1.
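One simple way to picture two recall channels feeding a single candidate list is sketched below. The article does not describe how the production system merges the channels, so the round‑robin interleaving, the function name `merge_candidates`, and the document ids are all assumptions made for illustration.

```python
from itertools import zip_longest


def merge_candidates(keyword_hits, vector_hits, limit):
    """Interleave two ranked lists, dropping duplicates, up to `limit` items."""
    if limit <= 0:
        return []
    merged, seen = [], set()
    for pair in zip_longest(keyword_hits, vector_hits):
        for source, doc_id in zip(("keyword", "vector"), pair):
            if doc_id is None or doc_id in seen:
                continue  # channel exhausted, or document already recalled
            seen.add(doc_id)
            merged.append((doc_id, source))
            if len(merged) >= limit:
                return merged
    return merged


# Hypothetical outputs of the two recall channels.
print(merge_candidates(["a", "b"], ["b", "c", "d"], limit=10))
```

Each merged candidate keeps the name of the channel that recalled it first, which can help when attributing gains to one channel during evaluation.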

Keyword retrieval is a mature solution, available through distributed services such as Elasticsearch, Solr, and Sphinx, as well as libraries like Lucene and Xapian. The choice depends on the scale and nature of the business.

Keyword retrieval solves three main problems: building the index, processing the query, and matching quickly. Indexing transforms structured or unstructured documents into term sequences, for example by indexing the title "Mars Inside SpaceX" and extracting keywords such as "Elon Musk" from the description. Keyword extraction is a classic NLP task, commonly addressed with models built from CRF, LSTM, and BERT components.
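The three steps (indexing, query processing, matching) can be illustrated with a toy inverted index. This is a minimal sketch and not how Elasticsearch or Lucene work internally; the tokenizer, the AND‑matching rule, and the titles for "v2" and "v3" are assumptions added for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(docs):
    """Indexing: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Query processing and matching: ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


docs = {
    "v1": "Mars Inside SpaceX",
    "v2": "Inside the Factory",   # hypothetical title
    "v3": "Journey to Mars",      # hypothetical title
}
index = build_index(docs)
print(sorted(search(index, "mars")))
print(sorted(search(index, "inside mars")))
print(sorted(search(index, "red planet")))
```

The last query returns nothing even though two titles are about Mars, because no indexed term matches exactly.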

Keyword retrieval has clear drawbacks: strict term matching cannot capture semantic similarity, it cannot easily incorporate user‑side signals (e.g., playback or purchase history), and it cannot effectively use feedback signals.

Deep‑learning‑based vector retrieval (embedding‑based retrieval) compensates for these shortcomings with powerful representation learning and goal‑driven parameter optimization.

Vector retrieval typically follows a dual‑tower architecture, in which the query and document sides are encoded independently and then compared with a similarity function. The query encoder processes the query term sequence, often by summing letter‑trigram embeddings. The document encoder handles titles with an embedding lookup followed by sum pooling, and long descriptions with BERT encoding of fixed‑length passages followed by self‑attention. The final document representation concatenates the title and description embeddings.
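The two towers can be sketched with the standard library alone. Everything below is a stand‑in under stated assumptions: `embed` returns a deterministic pseudo‑random vector in place of a learned embedding row, the description vector is passed in precomputed instead of coming from BERT and self‑attention, and the dimensions are chosen so that the concatenated document vector has the same size as the query vector (a real model would typically use learned projection layers for this).

```python
import hashlib
import random

QUERY_DIM = 8   # hypothetical size; also the size of the final document vector
TITLE_DIM = 4   # hypothetical size of the pooled title vector


def embed(token, dim):
    """Deterministic pseudo-random vector standing in for a learned embedding."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def sum_pool(vectors, dim):
    """Element-wise sum of a sequence of vectors (zero vector if empty)."""
    pooled = [0.0] * dim
    for vec in vectors:
        for i, value in enumerate(vec):
            pooled[i] += value
    return pooled


def letter_trigrams(term):
    """Split a term into overlapping letter trigrams with boundary markers."""
    padded = "#" + term.lower() + "#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def encode_query(query):
    """Query tower: sum the embeddings of all letter trigrams in the query."""
    trigrams = [g for term in query.split() for g in letter_trigrams(term)]
    return sum_pool((embed(g, QUERY_DIM) for g in trigrams), QUERY_DIM)


def encode_document(title, description_vec):
    """Document tower: sum-pooled title embedding concatenated with a
    precomputed description vector (assumed to come from a text encoder)."""
    words = title.lower().split()
    title_vec = sum_pool((embed(w, TITLE_DIM) for w in words), TITLE_DIM)
    return title_vec + list(description_vec)


print(letter_trigrams("NYC"))
query_vec = encode_query("NYC")
doc_vec = encode_document("Mars Inside SpaceX", [0.1, 0.2, 0.3, 0.4])
print(len(query_vec), len(doc_vec))
```

Because the towers share no computation, document vectors can be encoded offline and indexed, leaving only the query tower to run at request time.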

Similarity is measured with cosine similarity, and the loss function is typically softmax cross‑entropy over positive and negative samples, which optimizes recall at the first retrieval stage.
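A minimal sketch of this scoring and loss for a single training example follows. The `temperature` scaling factor is a common choice in such models but is an assumption here, as are the toy two‑dimensional vectors; the article does not give the exact formulation used in production.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax_cross_entropy(query_vec, positive_vec, negative_vecs, temperature=0.1):
    """Negative log-probability of the positive document under a softmax
    over the positive and all negative documents."""
    logits = [cosine(query_vec, positive_vec) / temperature]
    logits += [cosine(query_vec, neg) / temperature for neg in negative_vecs]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]


query = [1.0, 0.0]
aligned = [1.0, 0.0]      # points the same way as the query
orthogonal = [0.0, 1.0]   # unrelated to the query

loss_good = softmax_cross_entropy(query, aligned, [orthogonal])
loss_bad = softmax_cross_entropy(query, orthogonal, [aligned])
print(loss_good < loss_bad)
```

The loss is small when the positive document scores above the negatives and grows when a negative outscores it, which is what pushes clicked documents toward the top of the retrieved list.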

Personalization is introduced by enriching the query side with user ID, playback history, search history, device information, and request time. Multi‑head attention is applied after the embedding layer to enable richer feature interactions.
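The feature interaction step can be pictured as self‑attention over the embedded query‑side features, one vector per feature. The sketch below is deliberately simplified: it uses identity projections in place of the learned query, key, and value matrices that a real multi‑head attention layer has, and the feature vectors are made‑up numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(vectors):
    """Scaled dot-product self-attention with identity projections."""
    dim = len(vectors[0])
    outputs = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors))
                        for i in range(dim)])
    return outputs


def multi_head_self_attention(features, num_heads):
    """Split each feature vector into heads, attend per head, concatenate."""
    dim = len(features[0])
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = dim // num_heads
    head_outputs = []
    for h in range(num_heads):
        chunk = [f[h * head_dim:(h + 1) * head_dim] for f in features]
        head_outputs.append(attention_head(chunk))
    return [sum((head[i] for head in head_outputs), [])
            for i in range(len(features))]


# Hypothetical embeddings for: query text, playback history, search history.
features = [
    [0.2, 0.1, -0.3, 0.5],
    [0.4, -0.2, 0.1, 0.0],
    [-0.1, 0.3, 0.2, 0.1],
]
mixed = multi_head_self_attention(features, num_heads=2)
print(len(mixed), len(mixed[0]))
```

Each output row is a weighted mix of all input features, so the representation of the query text can shift depending on what the user has recently watched or searched for.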

Experiments show that incorporating playback history yields larger gains than changing model structures, and adding playback history on top of search history further improves performance. Offline top‑K recall and online A/B tests demonstrate that vector retrieval especially benefits long‑tail queries and short query terms, where keyword retrieval often fails (e.g., "HandMade", "NYC").

Three strategies for sampling positive and negative examples were compared; the final system uses online click data as positives and random negatives, balancing training simplicity and performance.
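The chosen strategy, clicked documents as positives and randomly drawn documents as negatives, can be sketched as follows. The log format, the function name `build_training_examples`, and the number of negatives per query are assumptions for illustration only.

```python
import random


def build_training_examples(click_log, catalog_ids, num_negatives, seed=0):
    """Pair each clicked (query, document) with random non-clicked documents."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    examples = []
    for query, clicked_id in click_log:
        pool = [doc_id for doc_id in catalog_ids if doc_id != clicked_id]
        negatives = rng.sample(pool, min(num_negatives, len(pool)))
        examples.append(
            {"query": query, "positive": clicked_id, "negatives": negatives}
        )
    return examples


# Hypothetical click log and catalog.
click_log = [("nyc", "v7"), ("handmade", "v2")]
catalog_ids = ["v1", "v2", "v3", "v4", "v5", "v6", "v7"]
examples = build_training_examples(click_log, catalog_ids, num_negatives=3)
for example in examples:
    print(example["query"], example["positive"], len(example["negatives"]))
```

Random negatives need no extra logging or labeling, which is part of what keeps training simple. A random document may occasionally be relevant to the query, and this sketch does not try to filter such cases.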

Overall, combining keyword retrieval (simple, cold‑start friendly) with vector retrieval enriched by user behavior (robust, personalized) yields the best recall system for streaming video search, as validated by production deployments on Disney+, Hulu, and Star+.

