Tagged articles

Bandit Algorithms

7 articles · Page 1 of 1

Aug 27, 2021 · Artificial Intelligence

Hybrid Bandit and Visual-aware Ranking Models for Advertising Creative Selection and Dynamic Optimization

The article presents a hybrid bandit framework combined with a visual‑aware ranking model to efficiently select and dynamically optimize advertising creatives, addressing cold‑start challenges, element‑level personalization, and production‑parameter search, and validates the approach with extensive offline and online experiments.

Bandit Algorithms · CTR Prediction · creative optimization

0 likes · 15 min read

Youku Technology

Dec 9, 2020 · Artificial Intelligence

Four Alibaba Papers Accepted at AAAI 2021: Bandits, Video Adaptation, Sentiment, Segmentation

AAAI 2021, the premier AI conference with a 21.4% acceptance rate, accepted four papers from Alibaba Entertainment covering non‑stationary stochastic bandits with graph feedback, spatial‑temporal causal inference for image‑to‑video adaptation, a unified MRC framework for aspect‑based sentiment analysis, and amodal segmentation using shape priors.

AAAI 2021 · Alibaba Research · Amodal Segmentation

0 likes · 5 min read

HomeTech

Jun 10, 2020 · Artificial Intelligence

Exploitation & Exploration Algorithms in Recommender Systems: ε‑Greedy, UCB, and Thompson Sampling Applications

This article introduces recommender systems and the exploitation‑exploration dilemma, explains common E&E algorithms such as ε‑greedy, Upper‑Confidence‑Bound, and Thompson Sampling, and details their practical deployment for interest‑point eviction, selection, and adaptive recall count optimization in an automotive recommendation platform.

Bandit Algorithms · Epsilon-Greedy · Exploitation

0 likes · 10 min read
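
As a rough illustration of two of the strategies named in the summary above, the following minimal sketch implements ε‑greedy and UCB1 arm selection on simulated Bernoulli (click / no click) rewards. It is not the article's implementation: the function names `epsilon_greedy`, `ucb1`, and `run`, the click rates, and the default ε of 0.1 are all assumptions chosen for the example.

```python
import math
import random


def epsilon_greedy(counts, values, rng, epsilon=0.1):
    """With probability epsilon explore a random arm, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    # Exploit: arm with the highest running mean (ties go to the lowest index).
    return max(range(len(values)), key=lambda a: values[a])


def ucb1(counts, values, rng):
    """Pick the arm with the highest mean reward plus confidence bonus.

    rng is unused; it is accepted so both selectors share one signature.
    """
    for arm, n in enumerate(counts):
        if n == 0:  # play every arm once before the bonus is defined
            return arm
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(total) / counts[a]),
    )


def run(select, true_rates, steps=5000, seed=0):
    """Simulate Bernoulli rewards (hypothetical rates); return pulls per arm."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)
    values = [0.0] * len(true_rates)  # running mean reward per arm
    for _ in range(steps):
        arm = select(counts, values, rng)
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update avoids storing the reward history.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts


if __name__ == "__main__":
    rates = [0.1, 0.5, 0.8]  # assumed click rates, for illustration only
    print("epsilon-greedy pulls:", run(epsilon_greedy, rates))
    print("UCB1 pulls:          ", run(ucb1, rates))
```

Both selectors should concentrate most pulls on the highest-rate arm. ε‑greedy keeps exploring at a fixed rate, whereas the UCB1 bonus shrinks as an arm accumulates pulls.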

DataFunTalk

Apr 19, 2020 · Artificial Intelligence

Bandit Algorithms for Recommendation Systems: Context‑Free, Thompson Sampling, and Contextual Approaches

This article explains how multi‑armed bandit methods such as Upper Confidence Bound, Thompson Sampling, and their contextual extensions can address cold‑start, diversity, and bias problems in large‑scale recommendation systems, describing practical update mechanisms, offline evaluation techniques, and deployment experiences at Ctrip.

AI · Bandit Algorithms · Exploration‑exploitation

0 likes · 15 min read
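
For readers unfamiliar with Thompson Sampling, mentioned in the summary above, here is a minimal context-free sketch using a Beta-Bernoulli model. It is an illustration under stated assumptions, not the method deployed at Ctrip: the names `thompson_select` and `simulate`, the uniform Beta(1, 1) prior, and the click rates are hypothetical.

```python
import random


def thompson_select(successes, failures, rng):
    """Sample a plausible click rate per arm from its Beta posterior
    (Beta(1 + successes, 1 + failures), assuming a uniform prior) and
    play the arm whose sample is highest."""
    samples = [
        rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=lambda a: samples[a])


def simulate(true_rates, steps=5000, seed=0):
    """Run Thompson Sampling against simulated Bernoulli rewards."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(steps):
        arm = thompson_select(successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1  # observed a click
        else:
            failures[arm] += 1  # observed no click
    return successes, failures


if __name__ == "__main__":
    wins, losses = simulate([0.1, 0.5, 0.8])  # assumed rates, illustration only
    print("pulls per arm:", [w + l for w, l in zip(wins, losses)])
```

Because arms with few observations have wide posteriors, they are still sampled occasionally, which is one reason this family of methods is often suggested for cold-start items.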

Alibaba Cloud Developer

Dec 12, 2018 · Artificial Intelligence

Tackling Pseudo-Exposure in Mobile E-Commerce: A Contextual Multiple-Play Bandit Approach

To address the pseudo-exposure problem that reduces click-through rates in mobile e-commerce recommendation, the authors model the task as a contextual multiple-play bandit, propose weighted sample and similarity-enhanced linear reward extensions, provide sublinear regret proofs, and demonstrate significant CTR gains on real Taobao data.

Bandit Algorithms · CTR optimization · contextual multi-play

0 likes · 30 min read

Alibaba Cloud Developer

Jun 29, 2018 · Artificial Intelligence

How AI Powers Heterogeneous Content Ranking in E‑Commerce Search

This paper addresses the challenge of ranking heterogeneous data in e‑commerce by proposing two algorithms—a multi‑armed bandit approach and a personalized Markov deep neural network—to select and order content streams, demonstrating superior performance over baseline models in A/B tests.

Bandit Algorithms · Deep Learning · content ranking

0 likes · 7 min read

Alibaba Cloud Developer

Feb 24, 2017 · Artificial Intelligence

How Reinforcement Learning Transforms E‑Commerce Search and Recommendation

This article explores how Taobao leverages reinforcement learning, multi‑armed bandits, and reward‑shaping techniques to improve large‑scale e‑commerce search ranking and recommendation, detailing problem modeling, algorithm designs such as Tabular Q‑learning and DDPG, experimental results from Double‑11, and advanced models like GBDT+FTRL and Wide‑&‑Deep.

Bandit Algorithms · Deep Learning · Recommendation Systems

0 likes · 19 min read