Mitigating Exposure Bias in Tubi’s Recommendation System
This article explains how Tubi’s machine‑learning team reduces exposure bias in its video recommendation pipeline by normalizing popularity features, incorporating additional signals such as search behavior, and applying exploration techniques like bandit algorithms to diversify content exposure.
Exposure Bias in Recommendation Systems
Recommendation systems help select content from a massive library of movies and TV shows for users and learn from user feedback. Because the current recommendation list influences future recommendations, a feedback loop can create severe exposure bias, causing a small set of items to dominate user feeds—an "information island" where many videos never get shown.
When generating personalized recommendations from user actions (clicks, views), it is essential to consider exposure bias to avoid feedback loops. For example, a new user who watches a horror movie during Halloween may be repeatedly recommended horror titles, even though they might also enjoy other genres.
In this post, we present several methods Tubi uses to address exposure bias.
Feature Engineering
A simple way to reduce exposure bias is to avoid using raw popularity as a feature. Instead, we normalize popularity by exposure, for example by using the average popularity per impression. For cold-start items with few exposures, however, this metric can be unstable.
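One common way to stabilize an exposure-normalized metric for cold-start items is to shrink it toward a global prior. The sketch below is illustrative only (the function name, prior value, and prior weight are assumptions, not Tubi's actual feature code):

```python
def smoothed_ctr(clicks, impressions, prior_ctr=0.05, prior_weight=100):
    """Exposure-normalized popularity with a Bayesian-style prior.

    Acts like the raw click-through rate when impressions are plentiful,
    but falls back to prior_ctr when an item has few exposures, so
    cold-start items get a stable (rather than noisy) estimate.
    """
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)
```

With zero impressions the estimate equals the prior; with many impressions it converges to the item's observed rate.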
Algorithm:

1. Sort items by popularity and bucket them into X groups (X is a hyper-parameter), ensuring each bucket contains roughly 1/X of the total popularity.
2. Within each bucket, sort items by their average popularity, assuming equal confidence for all items in the bucket.
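The two steps above can be sketched as follows. This is a minimal illustration of the bucketing idea, not Tubi's production implementation; the tuple layout `(item_id, popularity, avg_popularity_per_exposure)` is an assumption for readability:

```python
def popularity_buckets(items, num_buckets):
    """Bucket items so each bucket holds ~1/num_buckets of total popularity,
    then rank within each bucket by exposure-normalized popularity.

    items: list of (item_id, popularity, avg_popularity_per_exposure) tuples.
    """
    # Step 1: sort by raw popularity (descending) and split into buckets
    # that each accumulate roughly an equal share of total popularity.
    ranked = sorted(items, key=lambda x: x[1], reverse=True)
    total = sum(p for _, p, _ in ranked)
    target = total / num_buckets
    buckets, current, acc = [], [], 0.0
    for item in ranked:
        current.append(item)
        acc += item[1]
        if acc >= target and len(buckets) < num_buckets - 1:
            buckets.append(current)
            current, acc = [], 0.0
    if current:
        buckets.append(current)
    # Step 2: within each bucket, re-sort by average popularity per exposure,
    # giving under-exposed long-tail items a chance to rank higher.
    return [sorted(b, key=lambda x: x[2], reverse=True) for b in buckets]
```

Because the re-sort happens only within popularity tiers, a long-tail item can outrank a similarly popular but over-exposed one without jumping over the most popular tier entirely.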
Online experiments showed that after adding this feature, long‑tail videos received significantly more impressions, improving their ranking.
Leveraging Additional Signals
Beyond homepage recommendations, we can use other sources such as likes, search queries, and watch behavior from search results. For example, if a user’s search watch history differs from their homepage history, we can incorporate the search signals to enrich the homepage feed.
In a real example, User A primarily watched horror on the homepage but also searched and watched many documentaries. By adding search‑derived features, we were able to recommend documentaries to the user, improving both play and retention metrics.
Exploration
Exploration—showing a subset of items from a category—helps collect feedback for sparsely interacted content and reduces uncertainty. Perturbing ranking scores, for example by sampling items with probability proportional to their exponentiated scores (Boltzmann exploration), hurt user experience in our tests, so we instead built an independent exploration module that continuously gathers unbiased feedback.
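For reference, Boltzmann exploration replaces a deterministic argmax over ranking scores with softmax sampling, where a temperature parameter controls how much the ranking is perturbed. A minimal sketch (standalone, not tied to Tubi's ranking stack):

```python
import math
import random

def boltzmann_sample(scores, temperature=1.0, rng=random):
    """Sample an item index with probability proportional to exp(score / T).

    Low temperature -> close to greedy argmax; high temperature -> close
    to uniform random, i.e. more exploration.
    """
    # Subtract the max score before exponentiating for numerical stability.
    m = max(scores)
    weights = [math.exp((s - m) / temperature) for s in scores]
    total = sum(weights)
    # Inverse-CDF sampling over the softmax distribution.
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(scores) - 1
```

Even with a modest temperature, every item keeps a nonzero chance of being shown, which is exactly the property that degrades the experience when applied to the main feed but is useful inside a dedicated exploration slot.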
We introduced a "Something Completely Different" category on the homepage and ran various exploration algorithms there. Feedback from this category proved valuable for improving recommendations in other categories.
Bandit Algorithms
The trade‑off between exploration and exploitation is central to bandit and reinforcement‑learning approaches. Bandit strategies help cold‑start new users by recommending fresh content while maintaining a good user experience. Our bandit model for new users broke the feedback loop of repeatedly recommending only popular items, leading to more diverse recommendations and higher satisfaction.
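To make the exploration-exploitation trade-off concrete, here is a hypothetical epsilon-greedy bandit for a cold-start user; the class, parameters, and reward signal (e.g., watch-through) are illustrative assumptions, and Tubi's production model may use a different strategy entirely:

```python
import random

class EpsilonGreedyBandit:
    """Illustrative epsilon-greedy bandit over candidate titles ('arms')."""

    def __init__(self, arms, epsilon=0.1, rng=random):
        self.epsilon = epsilon
        self.rng = rng
        self.counts = {a: 0 for a in arms}      # times each title was shown
        self.rewards = {a: 0.0 for a in arms}   # cumulative reward per title

    def select(self):
        arms = list(self.counts)
        # Explore: with probability epsilon, show a random title,
        # breaking the popularity feedback loop.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(arms)
        # Exploit: otherwise show the title with the best observed
        # mean reward (unseen titles default to 0.0).
        def mean(a):
            return self.rewards[a] / self.counts[a] if self.counts[a] else 0.0
        return max(arms, key=mean)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.rewards[arm] += reward
```

The epsilon fraction of random recommendations is what supplies fresh, unbiased feedback; the rest of the traffic preserves a good experience by exploiting what is already known.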
Conclusion
This article described Tubi’s practical approaches to mitigating exposure bias in recommendation systems, which can be quickly adapted to other platforms and problems.
If you are interested in learning more about bias mitigation, follow Tubi’s technical blog or join the Tubi Machine Learning team.
Bitu Technology
Bitu Technology is the registered company of Tubi's China team. We are engineers passionate about leveraging advanced technology to improve lives, and we hope to use this channel to connect and advance together.