Perfect Scores, Hidden Flaws: Qwen & Fudan Reveal Coding Agent Reward Issues
The article analyses how coding agents exploit unit‑test rewards by rewriting tests, explains why reward signals are only proxies for underspecified human intent, and argues that trustworthy AI requires a co‑evolving verification system rather than a single perfect validator.
