
Deep Uncertainty-Aware Learning (DUAL) for Click‑Through Rate Prediction and Exploration Strategies

The paper presents Deep Uncertainty‑Aware Learning (DUAL), a scalable Bayesian deep‑learning framework that combines a neural feature extractor with a Gaussian‑process prior to model CTR prediction uncertainty, mitigates feedback‑loop bias, and enables confidence‑driven exploration (UCB and Thompson sampling) that improves long‑term utility while preserving accuracy.

Alimama Tech

This article introduces a new training method for click‑through rate (CTR) prediction called Deep Uncertainty‑Aware Learning (DUAL). The authors analyze the CTR prediction problem from the perspective of data loops, where the model’s predictions influence the data distribution, breaking the i.i.d. assumption of traditional supervised learning.

The paper first reviews the evolution of CTR models, noting that recent advances focus on richer representations (Attention, RNNs, Memory, Graph Embedding) but ignore the uncertainty of predictions. It then describes the data‑loop issue: ads shown to users generate feedback that becomes training data, creating a feedback loop that can bias the model toward sub‑optimal ads.

To address this feedback-loop problem, the authors adopt a contextual-bandits view of ad serving. They treat the ad-serving policy (the distribution over which ads are shown) as an explicit, controllable variable rather than an implicit assumption, allowing explicit exploration‑exploitation trade-offs.

The core of DUAL is to model the uncertainty of CTR predictions using Bayesian deep learning. A deep kernel is constructed by combining a deep neural network feature extractor with a Gaussian Process (GP) prior. The GP prior provides a distribution over the latent function that maps features to click probabilities, while the deep network supplies expressive representations for high‑dimensional sparse features.
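The deep-kernel idea can be sketched in a few lines: a standard RBF kernel is evaluated not on raw inputs but on features produced by a neural network. This is a minimal illustration, not the paper's implementation; the single tanh layer, the function names, and all dimensions are placeholders for an arbitrary deep feature extractor.

```python
import numpy as np

def feature_extractor(x, W, b):
    """Stand-in for the paper's deep network: a single tanh layer."""
    return np.tanh(x @ W + b)

def deep_rbf_kernel(X1, X2, W, b, lengthscale=1.0, variance=1.0):
    """Deep kernel: an RBF kernel evaluated on learned features phi(x)."""
    Z1, Z2 = feature_extractor(X1, W, b), feature_extractor(X2, W, b)
    # Pairwise squared distances between feature vectors.
    sq = (np.sum(Z1**2, 1)[:, None] + np.sum(Z2**2, 1)[None, :]
          - 2.0 * Z1 @ Z2.T)
    return variance * np.exp(-0.5 * sq / lengthscale**2)
```

Because the RBF kernel is positive semi-definite for any feature map, the composed deep kernel remains a valid GP covariance while inheriting the network's representational power.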

Because exact GP inference is infeasible at industrial scale, the authors employ Sparse Variational Gaussian Processes. They introduce inducing points, variational distributions, and an ELBO objective that can be optimized with mini‑batch stochastic gradients, making the approach scalable to billions of samples.
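The mini-batch ELBO has a simple structure: an expected log-likelihood term summed over the batch and rescaled by N/B, minus a KL divergence between the variational distribution q(u) over inducing outputs and the GP prior. The sketch below assumes a Gaussian likelihood and a plain RBF kernel for readability (the paper targets click labels at industrial scale); all names and shapes are illustrative.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * sq / ls**2)

def svgp_minibatch_elbo(Xb, yb, Z, m, S, N, noise=0.1):
    """Mini-batch ELBO for a sparse variational GP with Gaussian likelihood.

    Xb, yb : mini-batch of B inputs/targets; Z : M inducing points;
    q(u) = N(m, S) is the variational distribution over inducing outputs;
    N is the full dataset size (the likelihood term is rescaled by N/B).
    """
    B = len(yb)
    Kzz = rbf(Z, Z) + 1e-6 * np.eye(len(Z))
    Kxz = rbf(Xb, Z)
    Kzz_inv = np.linalg.inv(Kzz)
    A = Kxz @ Kzz_inv                                   # B x M projection
    mu = A @ m                                          # q(f) mean at batch points
    var = 1.0 - np.sum(A * Kxz, 1) + np.sum((A @ S) * A, 1)  # q(f) marginal variance
    # Expected log-likelihood under q(f), rescaled to the full dataset.
    ell = -0.5 * np.log(2 * np.pi * noise) - ((yb - mu)**2 + var) / (2 * noise)
    lik = (N / B) * ell.sum()
    # KL(q(u) || p(u)) with prior p(u) = N(0, Kzz).
    M = len(Z)
    kl = 0.5 * (np.trace(Kzz_inv @ S) + m @ Kzz_inv @ m - M
                + np.linalg.slogdet(Kzz)[1] - np.linalg.slogdet(S)[1])
    return lik - kl
```

Because only the batch sum depends on the data, this objective can be maximized with ordinary stochastic gradients, which is what makes the approach viable at billions of samples.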

Practical engineering tricks are discussed, including parameterizing inducing points in the latent space, a three‑stage training schedule to avoid vanishing gradients, and constraints on inducing‑point locations inspired by k‑means clustering. Online inference is accelerated by pre‑computing the GP components, leaving only a lightweight linear computation at prediction time.
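The serving-time trick can be made concrete: everything that involves inverting the inducing-point Gram matrix is cached offline, so each online request reduces to a kernel row times a precomputed vector (mean) and a small quadratic form (variance). This is a hedged sketch of the standard SVGP predictive equations, not Alimama's production code; the class and variable names are invented for illustration.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * sq / ls**2)

class PrecomputedGPHead:
    """Caches the heavy GP algebra offline so that online scoring is a
    light linear computation per candidate ad."""
    def __init__(self, Z, m, S):
        Kzz = rbf(Z, Z) + 1e-6 * np.eye(len(Z))
        Kzz_inv = np.linalg.inv(Kzz)
        self.Z = Z
        self.a = Kzz_inv @ m                        # precomputed mean weights
        self.B = Kzz_inv @ (S - Kzz) @ Kzz_inv      # precomputed variance correction

    def predict(self, X):
        Kxz = rbf(X, self.Z)
        mean = Kxz @ self.a                          # one matrix-vector product
        var = 1.0 + np.sum((Kxz @ self.B) * Kxz, 1)  # k(x, x) = 1 for this RBF
        return mean, var
```

Only `predict` runs online; its cost is linear in the (small, fixed) number of inducing points per candidate, independent of the training-set size.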

Based on the uncertainty estimates, two exploration strategies are proposed:

DUAL‑UCB: rank ads by the upper confidence bound of the predicted CTR.

DUAL‑TS: rank ads by a Thompson‑sampled CTR from the posterior distribution.
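Given a posterior mean and standard deviation per ad, both strategies reduce to a one-line scoring rule. The sketch below assumes a Gaussian posterior per ad and an illustrative exploration weight `alpha`; it is not the paper's serving code.

```python
import numpy as np

def rank_ucb(mu, sigma, alpha=1.0):
    """DUAL-UCB: order ads by predicted CTR plus a confidence bonus."""
    return np.argsort(-(mu + alpha * sigma))

def rank_ts(mu, sigma, rng):
    """DUAL-TS: order ads by one CTR sample drawn from each ad's posterior."""
    return np.argsort(-rng.normal(mu, sigma))
```

UCB deterministically favors uncertain ads; Thompson sampling randomizes, so an uncertain ad wins only in proportion to the probability that it is truly the best, which is one intuition for why DUAL-TS fares better in the paper's long-term-utility experiments.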

Experiments on Alibaba’s advertising platform and public datasets (Amazon Books/Electronics, Yahoo! R6B) show that DUAL does not degrade baseline CTR accuracy and can even improve AUC slightly. The exploration strategies, especially DUAL‑TS, achieve up to 30.7% improvement in long‑term utility compared with greedy baselines.

The paper concludes that DUAL provides a general, efficient way to obtain uncertainty‑aware CTR predictions and enables confidence‑driven exploration, opening avenues for system‑level exploration‑exploitation research.

Tags: deep learning, online advertising, CTR prediction, Contextual Bandits, Gaussian Process, Uncertainty Modeling
Written by

Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.
