Optimizing Structured Processes in the Large‑Model Era: From Reasoning to Agentic RL

The article analyzes how large‑model development has moved from reasoning to the agentic stage, compares open‑source and closed‑source capabilities, details Reasoning RL versus Agentic RL designs, and proposes skill‑centric data and verification mechanisms to close the performance gap.

DeepSeek · GLM-5 · RL+SFT

0 likes · 10 min read

Data Party THU

Oct 20, 2025 · Artificial Intelligence

How Agentic RL Enables a 14B LLM to Outperform Giant Models – Inside rStar2‑Agent

This article analyzes the rStar2‑Agent paper, revealing how Agentic Reinforcement Learning, the GRPO‑RoC algorithm, a high‑throughput code‑execution service, and a three‑stage training recipe let a modest 14‑billion‑parameter model surpass much larger LLMs on challenging math benchmarks.

AI research · Artificial Intelligence · LLM

0 likes · 18 min read

DataFunTalk

Sep 18, 2025 · Artificial Intelligence

How Tongyi DeepResearch Turns Chatty AI into a Research Powerhouse

Tongyi DeepResearch, an open‑source AI model and framework, achieves state‑of‑the‑art results on multiple Deep Research benchmarks by combining fully open‑source models, frameworks, and data pipelines. It also introduces agentic pre‑training, fine‑tuning, and reinforcement‑learning methods that enable complex multi‑step reasoning and real‑world applications.

AI research · Open Source · Synthetic Data

0 likes · 14 min read
