How a General AI Agent Powers Scalable Gateway Route Security Audits
The article presents a practical AI‑driven security audit system for gateway routes that uses a layered “general Agent + business Skill” design, combines batch AI filtering with human verification, achieves full‑coverage, minute‑level detection, and reduces token costs by over 95 % through multiple optimizations.
Background and Core Challenges
With the rapid expansion of platform APIs, traditional manual sampling audits cover only about 20% of routes, are slow to catch newly introduced risks, and apply rules inconsistently. The need for full‑coverage, minute‑level security assessment drives the adoption of large‑language‑model‑based AI Agents, which can handle tens of thousands of APIs across hundreds of micro‑services.
Technical Architecture
The system separates responsibilities into a "super Agent" that performs all AI analysis and a lightweight task scheduler that handles code execution and result storage. The overall flow is illustrated in the architecture diagram.
Agent‑Skill Layered Design
Design principle: General Agent + Business Skill separation. The Agent provides generic capabilities (code understanding, reasoning, context management) using Claude Code/OpenCode, while each Skill encapsulates domain‑specific detection rules, analysis procedures, and report templates, enabling rapid iteration without rebuilding the Agent.
Key advantages:
Maximum reuse of Agent capabilities.
Business logic can be updated independently in the Skill layer.
All analysis steps are traceable via the --resume flag.
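The separation described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation: the class names `Skill` and `GeneralAgent`, the `model` callable (a stand-in for Claude Code/OpenCode), the rule strings, and the report template are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Skill:
    """Business layer: detection rules and a report template, no AI plumbing."""
    name: str
    rules: tuple[str, ...]
    report_template: str


@dataclass
class GeneralAgent:
    """Generic layer: works with any Skill and delegates reasoning to a model."""
    model: Callable[[str], str]  # hypothetical stand-in for the real agent runtime

    def build_prompt(self, skill: Skill, target: str) -> str:
        rule_lines = "\n".join(f"- {rule}" for rule in skill.rules)
        return f"Skill: {skill.name}\nRules:\n{rule_lines}\nTarget: {target}"

    def run(self, skill: Skill, target: str) -> str:
        finding = self.model(self.build_prompt(skill, target))
        return skill.report_template.format(target=target, finding=finding)


def fake_model(prompt: str) -> str:
    """Placeholder model so the sketch runs without any external service."""
    return "no issue found"


unauthorized_access = Skill(
    name="unauthorized-access",
    rules=("check auth_config.public", "check required_scopes"),
    report_template="[{target}] {finding}",
)
agent = GeneralAgent(model=fake_model)
report = agent.run(unauthorized_access, "/api/orders/detail")
prompt = agent.build_prompt(unauthorized_access, "/api/orders/detail")
```

Under this layout, adding a new audit scenario means writing another `Skill` value; `GeneralAgent` stays unchanged, which is the reuse property the design principle aims for.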
Vulnerability Detection Methodology (Unauthorized Access Example)
The detection process consists of four steps, with early‑exit shortcuts to minimise cost:
Route Configuration Check: Verify auth_config.public and required_scopes. If no anomalies, skip further analysis.
Login State Identification: Traverse the call chain to locate authentication nodes; trusted standard routes are bypassed.
Three‑Dimensional Code Audit: Examine permission annotations (@PreAuthorize), source of user ID (login state vs request parameter), and ownership checks (DB filter vs explicit code validation).
Fine‑Grained Hazard Assessment: Distinguish between unauthorized read (data sensitivity) and unauthorized operation (benefit flow), then assign risk levels and remediation suggestions.
Hazard assessment follows the data classification required by China's Data Security Law and Cybersecurity Law, using AI‑driven static analysis of variable names, field types, and interface definitions.
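The four steps and their early exits can be pictured as a short pipeline. The sketch below is a simplified illustration under stated assumptions: the `Route` fields are hypothetical boolean summaries of what the AI would extract from configuration and code, "anomaly" is assumed to mean a public route or an empty scope list, and the risk labels are placeholders rather than the article's actual grading scheme.

```python
from dataclasses import dataclass


@dataclass
class Route:
    path: str
    public: bool                     # mirrors auth_config.public
    required_scopes: list[str]       # mirrors required_scopes
    uses_standard_auth: bool         # assumed: call chain reaches a trusted auth node
    has_permission_annotation: bool  # e.g. @PreAuthorize is present
    user_id_from_login_state: bool   # False means taken from a request parameter
    has_ownership_check: bool        # DB filter or explicit validation
    touches_sensitive_data: bool
    changes_state: bool


def audit_route(route: Route) -> dict:
    def result(risk: str, stopped_at: str, findings=()) -> dict:
        return {"path": route.path, "risk": risk,
                "stopped_at": stopped_at, "findings": list(findings)}

    # Step 1: route configuration check; no anomaly means early exit.
    if not (route.public or not route.required_scopes):
        return result("none", "config")

    # Step 2: login state identification; trusted standard routes are bypassed.
    if route.uses_standard_auth:
        return result("none", "login_state")

    # Step 3: three-dimensional code audit.
    findings = []
    if not route.has_permission_annotation:
        findings.append("missing permission annotation")
    if not route.user_id_from_login_state:
        findings.append("user ID taken from request parameter")
    if not route.has_ownership_check:
        findings.append("no ownership check")
    if not findings:
        return result("none", "code_audit")

    # Step 4: hazard assessment (unauthorized operation vs unauthorized read).
    if route.changes_state or route.touches_sensitive_data:
        return result("high", "hazard_assessment", findings)
    return result("low", "hazard_assessment", findings)


safe = Route("/api/ping", False, ["read"], True, True, True, True, False, False)
risky = Route("/api/orders/cancel", True, [], False, False, False, False, False, True)
safe_result = audit_route(safe)
risky_result = audit_route(risky)
```

Most routes are expected to leave at step 1 or 2, so the expensive code audit in step 3 only runs on the remainder.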
Token Cost Optimisation (95%+ Reduction)
Problem diagnosis revealed three major token‑consumption hotspots: full‑context MCP calls, large file returns, and redundant code audits.
Optimisation strategies:
MCP → CLI conversion (mcp2cli): Expose MCP tools as direct CLI commands, eliminating the need to load the full MCP context each session.
Precise code extraction: Add parameters to gitlab_file to fetch only required fragments, reducing a 1.47 MB response (~500 K tokens) to a fraction of that size.
Early‑Exit mode: Trust standard authentication routes and skip deeper code analysis when safe.
YAML response format: Use YAML instead of JSON to lower token usage for AI parsing.
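To make the precise‑extraction idea concrete, here is a minimal sketch of fetching a line range instead of a whole file. The function name and the `start_line`/`end_line` parameters are assumptions for illustration; the article does not say which parameters were added to gitlab_file. The bytes‑per‑token ratio is derived from the article's own figures (taking 1.47 MB as 1.47 million bytes for about 500 K tokens) and is only a rough estimate.

```python
def extract_fragment(source: str, start_line: int, end_line: int) -> str:
    """Return lines start_line..end_line (1-indexed, inclusive) of source."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("expected 1 <= start_line <= end_line")
    lines = source.splitlines()
    return "\n".join(lines[start_line - 1:end_line])


# Rough ratio implied by the article's figures: about 2.94 bytes per token.
BYTES_PER_TOKEN = 1_470_000 / 500_000


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on UTF-8 byte length."""
    return round(len(text.encode("utf-8")) / BYTES_PER_TOKEN)


# A synthetic 1000-line file stands in for a large source file.
file_text = "\n".join(f"line {i}" for i in range(1, 1001))
fragment = extract_fragment(file_text, 10, 12)
full_cost = estimate_tokens(file_text)
fragment_cost = estimate_tokens(fragment)
```

The saving scales with how small the requested range is relative to the file, so the benefit is largest for big files where only one method matters.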
# Before optimisation: MCP call (requires full MCP context)
Claude → MCP Server → tool execution
# After optimisation: CLI call (direct execution, no extra context)
Claude → Bash → mcp2cli → tool execution
Combined effect: 61 % token saving from CLI conversion, an 88 % further reduction from precise extraction, and a 50‑70 % saving from Early‑Exit, yielding an overall >95 % reduction.
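The overall figure can be sanity‑checked with a little arithmetic. The sketch below assumes the three savings compound, that is, each one applies to whatever token budget is left after the previous one. The article does not state how the figures combine, so this is one plausible reading rather than the authors' calculation.

```python
def remaining_fraction(savings: list[float]) -> float:
    """Fraction of the original token budget left after successive savings."""
    remaining = 1.0
    for saving in savings:
        remaining *= 1.0 - saving
    return remaining


# 61 % (CLI conversion), 88 % (precise extraction), 50-70 % (Early-Exit).
low_reduction = 1.0 - remaining_fraction([0.61, 0.88, 0.50])
high_reduction = 1.0 - remaining_fraction([0.61, 0.88, 0.70])

print(f"overall reduction: {low_reduction:.1%} to {high_reduction:.1%}")
```

Under that compounding assumption the total lands between roughly 97.7 % and 98.6 %, which is consistent with the stated >95 %.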
Model Selection Principles and Decision Framework
Selection focuses on Precision and Recall. A decision matrix evaluates models against three constraint tiers (P0/P1/P2). The final choice, qwen3.5‑plus, offers 100 % recall with the lowest unit cost, making it optimal for batch scenarios.
Limitations: the evaluation is based on a manually labelled test set of about 100 samples, and rapid model updates may introduce superior alternatives not covered in this assessment.
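A decision matrix of this kind can be sketched as a filter followed by a cost comparison. Everything specific below is an assumption: the article does not define the P0/P1/P2 tiers, so the sketch treats them as a recall floor, a precision floor, and lowest cost as the tie‑breaker. The candidate names and counts are made‑up placeholders, not the article's evaluation data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    true_pos: int
    false_pos: int
    false_neg: int
    cost_per_call: float  # any consistent currency unit

    @property
    def precision(self) -> float:
        flagged = self.true_pos + self.false_pos
        return self.true_pos / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        actual = self.true_pos + self.false_neg
        return self.true_pos / actual if actual else 0.0


def select_model(candidates: list[Candidate],
                 min_recall: float = 1.0,
                 min_precision: float = 0.5) -> Optional[Candidate]:
    """Assumed tiers: P0 recall floor, P1 precision floor, P2 lowest cost."""
    eligible = [c for c in candidates
                if c.recall >= min_recall and c.precision >= min_precision]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.cost_per_call)


# Illustrative placeholder numbers only.
pool = [
    Candidate("model-a", true_pos=20, false_pos=10, false_neg=0, cost_per_call=0.30),
    Candidate("model-b", true_pos=18, false_pos=2, false_neg=2, cost_per_call=0.10),
    Candidate("model-c", true_pos=20, false_pos=15, false_neg=0, cost_per_call=0.12),
]
chosen = select_model(pool)
```

Here the cheapest model overall (model-b) is rejected because it misses real vulnerabilities, which mirrors the article's emphasis on recall first and unit cost second for batch screening that is followed by human verification.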
False‑Positive Analysis and Targeted Improvements
Root‑cause analysis identified three major sources of false positives:
Missing trust‑boundary tracking (35 % of FP): AI stopped at intermediate layers without reaching the final data‑operation layer. Solution: add guidance rules in the Skill to continue tracking until the trust boundary.
Up‑stream/down‑stream parameter inconsistency (25 % of FP): downstream methods required resource IDs that upstream calls omitted. Solution: enforce a rule to check for resource ID presence in Controller‑level request objects.
Dubbo configuration invisibility (10 % of FP): AI could not retrieve internal gateway‑to‑Dubbo parameters. Solution: introduce a new MCP tool get_dubbo_config to fetch Dubbo call configurations.
{
  "get_dubbo_config": {
    "description": "Get the route's Dubbo call configuration and parameter mapping",
    "inputSchema": {"route_path": "string", "service_name": "string"}
  }
}
Additional refinements include response‑body sensitivity rules to avoid flagging non‑sensitive reads; overall, this systematic approach reduces false positives by 70 %.
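The first fix in the list above, continuing to track until the trust boundary, amounts to not stopping a call‑chain walk at an intermediate layer. The sketch below illustrates that idea with a depth‑first search over a call graph. The graph representation, the method names, and the notion of a set of "boundary" nodes are assumptions for illustration; in the article this behaviour is expressed as guidance rules in the Skill rather than as code.

```python
from typing import Optional


def trace_to_trust_boundary(call_graph: dict[str, list[str]],
                            boundary_nodes: set[str],
                            entry: str) -> Optional[list[str]]:
    """Depth-first search for a path from entry to a data-operation node."""
    stack = [[entry]]
    visited: set[str] = set()
    while stack:
        path = stack.pop()
        node = path[-1]
        if node in boundary_nodes:
            return path  # reached the final data-operation layer
        if node in visited:
            continue  # guards against cycles in the call graph
        visited.add(node)
        # Reverse so callees are explored in their declared order.
        for callee in reversed(call_graph.get(node, [])):
            stack.append(path + [callee])
    return None


# Hypothetical call chain: controller -> service -> facade -> mapper.
call_graph = {
    "OrderController.detail": ["OrderService.getOrder"],
    "OrderService.getOrder": ["OrderFacade.query"],
    "OrderFacade.query": ["OrderMapper.selectById"],
}
boundary = {"OrderMapper.selectById"}
path = trace_to_trust_boundary(call_graph, boundary, "OrderController.detail")
```

A walk that stopped at `OrderService.getOrder` would never see whether the mapper query filters by the logged‑in user, which is the situation the article blames for 35 % of false positives.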
Conclusion and Future Directions
The AI‑driven gateway route audit establishes a repeatable, cost‑effective paradigm for large‑scale API security: batch AI screening, human‑in‑the‑loop deep verification, and systematic token optimisation. Reported costs are ¥0.23 per alert, with a full‑cluster scan under ¥10 000. Future work will extend the Skill layer to other security scenarios, further automate configuration extraction, and continuously update the model pool as newer LLMs become available.
DeWu Technology
A platform for sharing and discussing tech knowledge, guiding you toward the cloud of technology.
