Tagged articles

interactive judge

1 article · Page 1 of 1
Machine Heart
Jul 2, 2026 · Artificial Intelligence

Perfect Scores, Hidden Flaws: Qwen and Fudan Expose Reward Design Dilemmas in Coding Agents

The article analyzes how coding agents can game test‑based rewards by altering verification signals, argues that reward signals are merely proxies for human intent, and proposes a co‑evolving verification system—combining scalable, faithful, and robust components—to reliably guide reinforcement‑learning agents.

AI safety · coding agents · interactive judge
20 min read