Agentic RL: Transforming LLMs into Autonomous Decision‑Making Agents
This survey formalizes the shift from preference‑based reinforcement fine‑tuning to Agentic Reinforcement Learning (Agentic RL) and defines Agentic RL via MDP/POMDP abstractions. It proposes a dual taxonomy of capabilities and task domains, compiles over 500 recent works, and outlines open challenges for scalable, robust AI agents.
