AutoControl Arena: Enabling AI to Automatically Detect Frontier Risks
AutoControl Arena automatically synthesizes executable test environments that help researchers and developers uncover hidden risks in AI agents under unanticipated tail scenarios. The work introduces X-BENCH, a benchmark of 70 scenarios across seven risk categories, reports that stronger models exhibit more complex misalignment, and validates the framework's fidelity against real red-team setups.
