Bilibili Tech
Dec 6, 2024 · Artificial Intelligence
Ensemble-based Offline-to-Online Reinforcement Learning (ENOTO): Methodology, Experiments, and Analysis
ENOTO adds an ensemble of Q-networks to the offline-to-online reinforcement-learning pipeline. It uses minimum-Q targets and uncertainty-driven exploration to stabilize fine-tuning and improve learning efficiency, achieving 10-25% higher cumulative returns with minimal online interaction across MuJoCo and AntMaze benchmarks.
AntMaze · ENOTO · Ensemble Q-Networks
16 min read
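
The summary names two ensemble mechanisms: a minimum-Q estimate and uncertainty-driven exploration. The following is a minimal sketch of how such mechanisms are commonly written, not ENOTO's actual implementation. The function names `min_q_target` and `select_action_ucb`, the use of the ensemble standard deviation as the uncertainty signal, and the `beta` weight are all assumptions made for illustration.

```python
import numpy as np

def min_q_target(q_next, reward, done, gamma=0.99):
    """Pessimistic TD target from an ensemble (illustrative, hypothetical helper).

    q_next: shape (N,), one next-state Q estimate per ensemble member.
    Taking the minimum over members yields a conservative target.
    """
    return reward + gamma * (1.0 - done) * np.min(q_next, axis=0)

def select_action_ucb(q_values, beta=1.0):
    """Pick the candidate action with the highest mean + beta * std.

    q_values: shape (N, A), N ensemble members by A candidate actions.
    The ensemble std is used here as an assumed proxy for uncertainty;
    beta = 0 reduces to greedy selection on the ensemble mean.
    """
    mean = q_values.mean(axis=0)
    std = q_values.std(axis=0)
    return int(np.argmax(mean + beta * std))

# Three ensemble members, two candidate actions.
q = np.array([[1.0, 0.9],
              [1.0, 1.5],
              [1.0, 0.3]])
greedy = select_action_ucb(q, beta=0.0)      # ignores disagreement
exploratory = select_action_ucb(q, beta=1.0)  # rewards disagreement
target = min_q_target(np.array([1.0, 2.0, 3.0]), reward=0.5, done=0.0, gamma=0.9)
```

In this toy example the members agree on the first action and disagree on the second, so the exploratory rule prefers the second action even though its mean estimate is lower.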