Embedding Error Correction into the Policy Space: How Search‑R2 Redefines Search‑Enhanced Reasoning

The Search‑R2 framework integrates error detection, localization, and correction into a reinforcement‑learning loop for search‑enhanced reasoning. On difficult multi‑hop QA tasks, it achieves notably larger accuracy gains than baseline methods, even when those baselines receive higher sampling budgets.

Agentic AI · Error Correction · Multi-hop QA
