Dec 28, 2022 · Artificial Intelligence

Sustainable Online Reinforcement Learning for Auto-bidding (SORL)

The Sustainable Online Reinforcement Learning (SORL) framework tackles the inconsistency between offline training and online deployment in auto-bidding. It alternates between gathering safe online data from the real ad system with a Lipschitz-based exploration method and training a variance-suppressed conservative Q-learning policy on that data, yielding safer, more stable, and higher-performing bidding on Alibaba's platform.

Online Advertising · Safe Exploration · auto-bidding

18 min read
