21CTO
Jun 29, 2026 · Information Security
GLM 5.2 Beats Claude in IDOR Security Benchmark with 39% F1
Semgrep's benchmark shows that the open-source GLM 5.2 model, using only a unified prompt and a lightweight Pydantic AI scheduler, achieves a 39% F1 score on IDOR (insecure direct object reference) vulnerability detection, outperforming Claude Code's best result of 37.4% while costing only about $0.17 per discovered flaw.
AI security · Claude · F1 score
