Red Team Experiment: Local AI Detects Vulnerabilities That Other Tools Miss

The author benchmarks four vulnerability‑detection approaches on a known PHPIPAM LFI flaw, showing that a custom local‑AI framework consistently finds the issue while traditional SAST tools, cloud AI agents, and skill‑based AI pipelines often fail or are inconsistent.

Black & White Path

Jul 3, 2026

Benchmark Vulnerability: PHPIPAM Authenticated LFI

The vulnerability is a classic authenticated local file inclusion (LFI) limited to .php files. The controller name is taken directly from user input and concatenated into require_once without filtering. When the API is enabled (it is disabled by default) and the attacker holds a valid API token, any .php file readable by the web server can be included and executed. The issue is publicly disclosed as CVE-2026-12194.
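The path handling behind this class of bug can be illustrated with a short Python sketch. It is not the PHPIPAM code: the directory layout, the controller names, and both helper functions are assumptions made up for illustration. It only shows how an unfiltered name containing ../ resolves outside the intended directory, and how a fixed allowlist avoids that.

```python
import posixpath

# Hypothetical layout; the article does not give the real PHPIPAM paths.
CONTROLLERS_DIR = "/var/www/app/api/controllers"

# Hypothetical controller names used only for this sketch.
ALLOWED = {"sections", "subnets", "addresses"}


def resolve_unsafe(controller: str) -> str:
    """Mimic naive concatenation of user input into an include path."""
    return posixpath.normpath(f"{CONTROLLERS_DIR}/{controller}.php")


def resolve_safe(controller: str, allowed: set[str]) -> str:
    """Accept only controller names that appear in a fixed allowlist."""
    if controller not in allowed:
        raise ValueError(f"unknown controller: {controller!r}")
    return f"{CONTROLLERS_DIR}/{controller}.php"


# The traversal sequence walks out of the controllers directory.
print(resolve_unsafe("../../uploads/shell"))  # /var/www/app/uploads/shell.php
print(resolve_safe("subnets", ALLOWED))
```

The forced .php suffix is what limits the flaw to PHP files; the allowlist removes the attacker's control over the path entirely.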

Test Results

Semgrep – did not detect the LFI.

Cloud GLM 5.1 + Strix AI Agent – did not detect the LFI after ~12 hours, ~60 million tokens, cost ≈ $30.

Cloud SOTA + Skill‑Based Code Review – detection was inconsistent; sometimes the flaw was found, sometimes it was missed.

Local AI + Custom Framework – detected the LFI on every run.

Semgrep

Running semgrep scan --config auto failed to flag the LFI. A custom rule could be written for this pattern, but that presupposes knowing what to look for: static analysis tools only detect patterns they already have rules for.
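To make the "known patterns only" point concrete, here is a minimal pattern matcher in Python. It is not Semgrep rule syntax and not what the author ran; the regular expression, the function name, and the sample PHP are assumptions for illustration. It flags include/require statements whose argument contains a variable on the same line, which is roughly what a hand-written rule for this bug would target.

```python
import re

# Matches include/require(_once) followed, before the next semicolon,
# by a PHP variable such as $controller.
DYNAMIC_INCLUDE = re.compile(
    r"\b(?:include|require)(?:_once)?\b[^;]*\$\w+", re.IGNORECASE
)


def find_dynamic_includes(php_source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each dynamic include."""
    hits = []
    for lineno, line in enumerate(php_source.splitlines(), start=1):
        if DYNAMIC_INCLUDE.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Made-up PHP fragment, not taken from PHPIPAM.
SAMPLE = '''<?php
require_once "config.php";
$controller = $_GET["controller"];
require_once "controllers/" . $controller . ".php";
'''

for lineno, line in find_dynamic_includes(SAMPLE):
    print(f"line {lineno}: {line}")
```

A matcher like this has no notion of data flow: it cannot tell whether the variable is attacker-controlled, and it misses includes that are built across several statements.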

GLM 5.1 + Strix AI Agent

Strix cloned the repository, explored the codebase, installed the application, and generated a report.md after ~12 hours and ~60 million tokens. The LFI was not found. Running the same workload on GPT-5.4 or Claude Sonnet would raise the cost to an estimated $180-$300.

Cloud SOTA + Skill‑Based Code Review

A community‑contributed security‑review skill (a Markdown file) was modified: dependency‑audit steps were removed, each vulnerability type was split into dedicated sub‑agents, and extra guidance for LFI was added. The scan sometimes identified the flaw and sometimes missed it.

Local AI Model + Custom Framework

The framework processes the project file by file: for each source file, the local model reviews the file with context and writes a structured report. Once every file has been reviewed, the framework aggregates the reports and performs a final analysis. This approach found the benchmark LFI on every run.
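The article does not publish the framework, so the following is only a sketch of the per-file review loop described above. The FileReport type, the function names, and toy_reviewer (a trivial stand-in for the local model call) are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class FileReport:
    path: str            # file path relative to the project root
    findings: list[str]  # finding titles reported for that file


# A reviewer takes (path, source text) and returns finding titles.
# In the real framework this would be a call to the local model.
Reviewer = Callable[[str, str], list[str]]


def review_project(root: Path, reviewer: Reviewer, suffix: str = ".php") -> list[FileReport]:
    """Review every matching file under root and collect one report per file."""
    reports = []
    for path in sorted(root.rglob(f"*{suffix}")):
        source = path.read_text(encoding="utf-8", errors="replace")
        findings = reviewer(str(path), source)
        reports.append(FileReport(str(path.relative_to(root)), findings))
    return reports


def aggregate(reports: list[FileReport]) -> dict[str, list[str]]:
    """Group files by finding title so a final pass can look across files."""
    merged: dict[str, list[str]] = {}
    for report in reports:
        for finding in report.findings:
            merged.setdefault(finding, []).append(report.path)
    return merged


def toy_reviewer(path: str, source: str) -> list[str]:
    """Stand-in for the model: flags one textual pattern only."""
    return ["dynamic require_once"] if "require_once $" in source else []
```

Calling aggregate(review_project(Path("project"), toy_reviewer)) on a checkout would yield a mapping from finding title to affected files. Keeping the reviewer as a plain callable is a design choice of this sketch; it lets the model call be swapped or mocked without touching the loop.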

Sample report excerpt:

High Risk: Path Traversal / Arbitrary File Inclusion via Controller Parameter

Description: API entry point concatenates unfiltered user input from GET, POST, JSON, or XML directly into require_once(), enabling directory traversal (../) to include arbitrary PHP files.

Token usage: ~120 million tokens for reviewing ~800 source files.
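For a sense of scale, the figures quoted in this article can be reduced to two averages. The inputs are the article's own approximate numbers; the outputs are back-of-the-envelope values, not measurements.

```python
# Approximate figures quoted in the article.
LOCAL_TOKENS = 120_000_000   # local framework run
FILES_REVIEWED = 800         # source files reviewed
STRIX_TOKENS = 60_000_000    # cloud agent run
STRIX_COST_USD = 30.0        # cost of the cloud agent run

tokens_per_file = LOCAL_TOKENS / FILES_REVIEWED
usd_per_million = STRIX_COST_USD / (STRIX_TOKENS / 1_000_000)

print(f"{tokens_per_file:,.0f} tokens per file on average")
print(f"${usd_per_million:.2f} per million tokens implied for the Strix run")
```

This works out to roughly 150,000 tokens per reviewed file and about $0.50 per million tokens for the cloud run.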

Limitations of the Local Framework

Token-intensive: Requires substantial compute; Qwen 3.6 27B (~170k context) can be run on a hashcat rig.

High false-positive rate: Pure code review generates many false alerts; additional AI verification increases token consumption.

Lack of threat-model context: Struggles with complex access-control bugs that need broader application-level understanding.

myVesta Authenticated RCE (CVE‑2026‑12195)

During the benchmark a myVesta control‑panel instance was examined. Within 8 hours an authenticated remote code execution was discovered. The flaw resides in the FTP username deletion feature where the Username parameter is passed directly to exec without sanitization.

Proof‑of‑concept steps: replace the Username value in the request, send the request, and observe command execution on the server. A patch has been submitted.
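The root cause can be sketched in Python without touching a real system. Nothing below is myVesta code: the command name, the username pattern, and both helpers are hypothetical, and the sketch assumes that exec hands its string to a shell, as PHP's exec() does. No command is executed; the string is only tokenized to show where a shell would split it.

```python
import re
import shlex

# Hypothetical command name; the article does not name the script involved.
DELETE_CMD = "delete-ftp-user"

# Assumed username policy for this sketch: letters, digits, '_', '.', '-'.
USERNAME_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,31}")


def build_unsafe(username: str) -> str:
    """Interpolate the raw value into a single shell command string."""
    return f"{DELETE_CMD} {username}"


def build_safe(username: str) -> list[str]:
    """Validate the value and return an argument list, not a shell string."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError("invalid username")
    return [DELETE_CMD, username]


payload = "alice; id"
command = build_unsafe(payload)

# Tokenize the way a shell would, keeping operators such as ';' separate.
tokens = list(shlex.shlex(command, posix=True, punctuation_chars=True))
print(tokens)
```

The standalone ";" token marks the point where a shell would start a second command. Validating the value and passing it as a separate argument, rather than as part of a shell string, removes that split.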

Conclusion

Local AI models combined with a lightweight per-file review framework detected the PHPIPAM LFI on every run, though the approach is token-intensive and produces false positives. On the same flaw, Semgrep and the Strix cloud agent found nothing, and the skill-based cloud review was inconsistent.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
