How AutoCVE Automates Vulnerability Discovery to Deliver 30 CVEs in One Week

AutoCVE is an open‑source, multi‑agent platform that automates the full CVE discovery workflow—from project selection, code scanning, and intelligent finding via a ReAct loop, to verification and structured reporting—enabling researchers to uncover up to 30 high‑severity vulnerabilities across 14 projects in a single week.

Black & White Path
Black & White Path
Black & White Path
How AutoCVE Automates Vulnerability Discovery to Deliver 30 CVEs in One Week

Project Overview

AutoCVE (GitHub: https://github.com/larlarua/AutoCVE) is an open‑source agent‑based audit platform that automates the end‑to‑end workflow: project selection → code scanning → agent‑driven vulnerability mining → CVE report generation. In two weeks the repository gained over 900 stars, 55 forks and produced 30 CVEs across 14 open‑source projects, including Chartbrew, typebot.io, Tautulli, SillyTavern, pimcore, froxlor and JeecgBoot. Reported vulnerability types span improper access control, authorization bypass, SSRF, SQL injection and other high‑severity issues.

Core Capabilities

Fully automated pipeline covering project filtering, code audit, vulnerability mining and structured CVE report generation.

Three audit modes: Enhanced Scan (Scan → Triage), Intelligent Audit (Finding) and Comprehensive Audit (Scan → Triage + Finding).

Interactive audit sessions allow users to query the agent for detailed attack chains, reproduction steps, impact analysis or remediation guidance.

Architecture: Six Coordinating Agents

A Multi‑Agent workflow is orchestrated by an Orchestrator that schedules six specialized agents:

Orchestrator : coordinates the audit flow and decides which agents to activate.

Recon : performs project reconnaissance, identifying language, framework and entry files.

Scan : runs rule‑based, dependency and secret scans (e.g., Semgrep, Bandit).

Triage : filters false positives and retains genuine candidates.

Finding : core AI‑driven agent that reads source code and, using a ReAct loop, discovers high‑value vulnerabilities.

Verification : validates findings in a sandbox with PoC exploits.

Orchestrator → Recon → Scan → Triage → Verification
                 └─→ Finding → Verification
                                 ↓
                         Merge / Finalize

ReAct Loop in the Finding Agent

The Finding Agent implements a ReAct (Reasoning + Acting) cycle: read code, grep for patterns, analyze data flow with multiple tools, and decide when to terminate via the FinalizeFinding tool.

Nudge Mechanism

A “nudge” system forces the model to invoke FinalizeFinding. If the model only states completion without calling the tool, it receives up to two nudges; after that the result is marked “incomplete”.

Structured Output of FinalizeFinding

When the termination tool is called, the agent must provide a structured vulnerability description containing:

vulnerability_type

severity

title

file_path, line_start, line_end

code_snippet

source / sink analysis

exploit_chain

PoC information

CVSS scoring rationale

Interactive Audit Sessions

After an audit finishes, users can continue a conversation with the agent, asking for detailed attack chains, reproduction steps, real‑world impact or remediation advice, addressing the typical limitation of scanners that only provide conclusions.

Deployment

AutoCVE is distributed via Docker Compose. A single command starts the system on Linux/macOS/Git Bash or Windows PowerShell:

curl -fsSL https://raw.githubusercontent.com/larlarua/AutoCVE/v1.0.3/docker-compose.prod.yml \ | docker compose -f - up -d
curl.exe -fsSL https://raw.githubusercontent.com/larlarua/AutoCVE/v1.0.3/docker-compose.prod.yml | docker compose -f - up -d

After launch, the frontend is available at http://localhost:3000, the backend API at http://localhost:8000, and the Swagger docs at http://localhost:8000/docs. A demo account ([email protected] / demo123) is provided.

Model Configuration

The platform can attach various large language models to agents, including OpenAI GPT series, Anthropic Claude, Google Gemini, Alibaba Qwen, DeepSeek, GLM, Moonshot, Ollama (local), Baidu Wenxin and ByteDance Doubao. Different agents can use different models—for example, a powerful reasoning model for Finding and a cheaper model for Scan.

Skills Mechanism

Users can extend agents with custom Skills written in Markdown, such as Java deserialization audit, PHP file‑upload audit, SSRF detection, CVE report writing and Huntr submission workflow. Skills are loaded lazily: only a summary is loaded at startup, with detailed content fetched on demand.

Safety Notice

The tool is intended solely for authorized security research, code audit, and educational exchange. Unauthorized scanning, penetration testing, or security assessment is prohibited.

When reporting vulnerabilities, users should follow the target project's SECURITY.md policy, GitHub private vulnerability reporting, CNA processes, or other responsible disclosure guidelines.

Summary

Efficiency: batch auditing yields ~30 CVEs per week.

Standardized workflow: end‑to‑end process from reconnaissance to structured report.

Traceable audit sessions: interactive queries allow verification of conclusions.

Extensibility: Skills mechanism enables accumulation of domain‑specific knowledge.

AutoCVE multi‑Agent workflow diagram
AutoCVE multi‑Agent workflow diagram
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

security testingmulti‑agentReAct loopvulnerability automationAutoCVECVE discovery
Black & White Path
Written by

Black & White Path

We are the beacon of the cyber world, a stepping stone on the road to security.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.