Claude’s New Code Review Feature Uses Multiple AI Agents with <1% False‑Positive Rate

Claude’s newly launched Code Review triggers automatically on GitHub pull requests and dispatches parallel AI agents to scan the changed code. It filters false positives to under 1%, reports quantified issue statistics, has uncovered real security bugs, and costs roughly $15–25 per review, a deliberate trade‑off between depth and cost.

AI Insight Log

Claude has introduced a Code Review feature that activates when a pull request is opened on GitHub. The system automatically launches multiple parallel AI agents to scan the changed code.

1. You open a PR on GitHub.
2. Claude triggers and dispatches several concurrent agents to examine the code.
3. The agents cross‑validate each other's findings to eliminate false alarms.
4. Issues are scored by severity and compiled into a single summary comment with inline annotations.
5. The final output is a curated report rather than a flood of raw messages.
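The steps above can be sketched in miniature. The internals of Anthropic's system are not public, so the agent function, the quorum rule, and the finding keys below are all hypothetical stand‑ins; the sketch only illustrates the general shape of "parallel agents plus cross‑validation to suppress false positives".

```python
# Hypothetical sketch of a multi-agent review pipeline with cross-validation.
# run_agent() is a stand-in: a real agent would call a model on the diff.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def run_agent(agent_id: int, diff: str) -> set[str]:
    """Fake agent: returns a set of finding keys for the given diff."""
    findings = {"missing-ownership-check", "unvalidated-input"}
    if agent_id == 2:
        findings.add("spurious-style-nit")  # a likely false positive
    return findings

def review(diff: str, n_agents: int = 3, quorum: int = 2) -> list[str]:
    """Run agents in parallel, then keep only findings confirmed by a quorum."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        results = list(pool.map(lambda i: run_agent(i, diff), range(n_agents)))
    votes = Counter(f for findings in results for f in findings)
    # Cross-validation step: a finding survives only if >= quorum agents agree.
    return sorted(f for f, n in votes.items() if n >= quorum)
```

With the fake agents above, `review("…diff…")` keeps the two findings all agents agree on and drops the single‑agent style nit, which is the behavior the sub‑1% false‑positive claim depends on.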

Anthropic’s internal data shows that for PRs larger than 1,000 lines, 84% contain issues with an average of 7.5 findings per PR, while PRs under 50 lines see issues in 31% of cases with an average of 0.5 findings. Engineers report a false‑positive rate of less than 1%.

Within Anthropic’s own usage, the proportion of PRs that received substantive review comments rose from 16% to 54%, indicating a significant boost in effective coverage.

The sub‑1% false‑positive figure is highlighted as a critical benchmark because AI‑driven tools often suffer from noisy output that hampers adoption.

One concrete example comes from the open‑source TrueNAS project, where Claude identified a long‑standing type‑mismatch bug that silently corrupted an encryption‑key cache—an issue that would be hard for human reviewers to spot.

Another scenario involves an authentication‑related vulnerability. Claude’s comment pointed out that the accessToken and refreshToken are returned without verifying that the requester owns the session, allowing any authenticated user to guess or enumerate session IDs to obtain another user’s tokens.

"This endpoint returns accessToken and refreshToken but does not verify that the requestor is the session owner. Any authenticated user can guess or enumerate session IDs to obtain other users' tokens."

The tool suggested adding ownership checks and removing the tokens from the response. Such flaws can reach a CVSS score of 9.1, placing them in the Critical severity range (9.0–10.0).
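The flaw and the suggested remediation can be illustrated with a toy session store. The data model and function names here are invented for illustration; only the two fixes, the ownership check and stripping the tokens from the response, come from the review comment quoted above.

```python
# Illustrative (not from the source) session-lookup flaw and its fix.
SESSIONS = {
    "sess-123": {"owner": "alice", "accessToken": "at-1", "refreshToken": "rt-1"},
}

def get_session_vulnerable(session_id: str, requester: str) -> dict:
    # BUG: returns tokens for any valid session ID, regardless of who asks,
    # so any authenticated user who guesses an ID gets another user's tokens.
    return SESSIONS[session_id]

def get_session_fixed(session_id: str, requester: str) -> dict:
    session = SESSIONS[session_id]
    # Fix 1: verify the requester actually owns the session.
    if session["owner"] != requester:
        raise PermissionError("requester does not own this session")
    # Fix 2: strip the tokens from the response body.
    return {k: v for k, v in session.items()
            if k not in ("accessToken", "refreshToken")}
```

In the vulnerable version, `get_session_vulnerable("sess-123", "mallory")` hands Mallory Alice's tokens; the fixed version refuses the request and, even for Alice herself, no longer echoes the tokens back.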

Pricing and availability: the feature is in a research‑preview phase for Team and Enterprise customers. Each review costs roughly $15–25, billed by token usage, with larger or more complex PRs costing more. Users can set a monthly spending cap, and administrators enable the service via Claude Code settings after installing the GitHub App.
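Since billing is by token usage with a monthly cap, the budgeting logic a team faces looks roughly like the following. The per‑token price and the cap‑check helper are assumptions for the sake of arithmetic; Anthropic has not published the exact rate.

```python
# Hypothetical cost model: per-token billing under a monthly spending cap.
# The $10-per-million-tokens rate is an assumed figure, not a published price.
def estimate_cost(tokens: int, usd_per_million_tokens: float = 10.0) -> float:
    """Rough review cost for a PR that consumes `tokens` tokens."""
    return tokens / 1_000_000 * usd_per_million_tokens

def within_cap(spent_usd: float, next_review_usd: float, cap_usd: float) -> bool:
    """Allow the next review only if it keeps monthly spend at or under the cap."""
    return spent_usd + next_review_usd <= cap_usd
```

Under this assumed rate, a large PR consuming 2M tokens costs $20, squarely in the quoted $15–25 band, and a team that has spent $480 of a $500 monthly cap could run exactly one more such review.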

Anthropic acknowledges that deeper reviews may be more expensive, presenting a clear cost‑vs‑benefit trade‑off.

While multi‑agent collaboration itself is not novel, applying it to code review with measurable false‑positive control and security‑bug detection is relatively rare. The jump from 16% to 54% reflects improved coverage rather than a raw increase in bug count.

For teams, whether the $15–25 per review fee is worthwhile depends on PR size and the organization’s emphasis on code security.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: AI agents, code review, GitHub, pricing, Claude, security vulnerabilities, false positives
Written by

AI Insight Log

Focused on sharing: AI programming | Agents | Tools
