Tagged articles

Safe Exploration

3 articles · Page 1 of 1

Jun 29, 2026 · Artificial Intelligence

Ensuring Safety in Real-World Reinforcement Learning: Tsinghua’s Safe Exploration Equilibrium Mechanism

The article reviews a Tsinghua University paper, published in IEEE TPAMI 2026, that introduces a Safe Exploration Equilibrium (SEE) framework for real-world reinforcement learning. The paper proves convergence to a safety equilibrium, details a two-step algorithm, and validates it on three classic control tasks, reporting zero constraint violations and rapid region expansion.

Control · Equilibrium · Real-World RL

0 likes · 8 min read


Machine Heart

Jun 23, 2026 · Artificial Intelligence

Ensuring Safety in Real-World Reinforcement Learning: Tsinghua’s Safe Exploration Equilibrium Mechanism

The article reviews a Tsinghua University paper that introduces a Safe Exploration Equilibrium (SEE) framework for real-world reinforcement learning. The paper proves that the framework converges to a mathematically defined equilibrium and validates the approach in control-task simulations, reporting zero constraint violations and rapid region expansion.

Control Systems · Convergence Proof · Equilibrium

0 likes · 8 min read

Alimama Tech

Dec 28, 2022 · Artificial Intelligence

Sustainable Online Reinforcement Learning for Auto-bidding (SORL)

The Sustainable Online Reinforcement Learning (SORL) framework tackles offline inconsistency in auto-bidding by alternating two steps: it gathers safe online data from real ad systems with a Lipschitz-based exploration method, and it trains a variance-suppressed conservative Q-learning policy. The article reports safer, more stable, and higher-performing bids on Alibaba’s platform.

Online Advertising · Safe Exploration · auto-bidding

0 likes · 18 min read
