Why Large‑Model AI Agents Need Strict Security Controls
The article compares AWS Rex, which enforces Cedar policies on Rhai scripts, with Vercel deepsec, which lets powerful coding agents hunt vulnerabilities, showing how both defensive and offensive approaches are shaping the emerging security model for AI agents in production.
1. AWS Rex: Policy‑bound AI scripts
Rex tackles a classic DevOps problem: a script inherits the full permissions of its host, which becomes especially fragile in the agent era. When an agent generates and executes a script without human review, traditional safeguards such as code review, approval flows, and whitelists lose their effect. By requiring a Cedar policy check before every operation, Rex ensures that generated scripts cannot exceed explicitly allowed actions.
The workflow is simple: the script declares its intent, the policy decides what is permitted, and each operation is checked before execution.
Rhai script ──► Rex SDK operation (read/write/open…) ──► Cedar policy check
                                                              │
                                                              ├─ allow ─► execute system call
                                                              └─ deny ──► ACCESS_DENIED_EXCEPTION
Technical choices
Script language: Rhai – a lightweight embedded language with zero built‑in system access.
Policy engine: Cedar – AWS’s own policy language, already used in IAM and Verified Permissions.
Runtime: rex‑runner (Rust) – the only component that can touch the host.
Rhai’s lack of read/write/exec primitives forces every host interaction to go through the Rex SDK, which checks a Cedar policy first. This cleanly decouples script logic from permissions; changing the policy changes the allowed actions without modifying the script.
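To make the decoupling concrete, here is a minimal sketch of what a policy-gated SDK operation could look like, written against the open-source cedar-policy Rust crate (v4-style API). The entity names, the gate function, and the error string are illustrative assumptions, not Rex's actual implementation:

```rust
use cedar_policy::{Authorizer, Context, Decision, Entities, EntityUid, PolicySet, Request};
use std::str::FromStr;

/// Hypothetical SDK gate: every host operation must pass the Cedar check first.
fn gated_read(policies: &PolicySet, path: &str) -> Result<String, String> {
    // Describe this single operation as an authorization request.
    let principal = EntityUid::from_str(r#"Script::"user-script""#).unwrap();
    let action = EntityUid::from_str(r#"file_system::Action::"read""#).unwrap();
    let resource = EntityUid::from_str(&format!(r#"File::"{path}""#)).unwrap();
    let request = Request::new(principal, action, resource, Context::empty(), None).unwrap();

    // Ask Cedar for a decision before touching the host at all.
    let response = Authorizer::new().is_authorized(&request, policies, &Entities::empty());
    match response.decision() {
        Decision::Allow => std::fs::read_to_string(path).map_err(|e| e.to_string()),
        Decision::Deny => Err(format!("ACCESS_DENIED_EXCEPTION: read on {path}")),
    }
}

fn main() {
    // Swapping this policy text changes what the script may do,
    // while gated_read (the "script logic") stays untouched.
    let policy = r#"permit(principal, action in [file_system::Action::"read"], resource);"#;
    let policies = PolicySet::from_str(policy).unwrap();
    println!("{:?}", gated_read(&policies, "/tmp/hello.txt"));
}
```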
What happens when an Agent hits a wall?
If an Agent, due to hallucination or prompt injection, generates a script that exceeds the policy, it receives a clear ACCESS_DENIED_EXCEPTION instead of an unexpected side‑effect. The Agent can observe the error, reason about it, and adjust.
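In an orchestration loop, that signal might be consumed roughly like this; `run_script` and `revise_script` are stand-ins for invoking rex-runner and re-prompting the model, not a real API:

```rust
/// Hypothetical outcome of one rex-runner invocation.
enum RunOutcome {
    Ok(String),
    AccessDenied(String), // e.g. `file_system::Action::"create" on /tmp/hello.txt`
}

// Stubs for illustration: in practice these would shell out to rex-runner
// and call the coding model, respectively.
fn run_script(script: &str) -> RunOutcome {
    if script.contains("write(") {
        RunOutcome::AccessDenied("create on /tmp/hello.txt".into())
    } else {
        RunOutcome::Ok("script completed".into())
    }
}

fn revise_script(script: &str, denial: &str) -> String {
    // A real agent would reason over `denial` and regenerate the script;
    // this stub just drops the offending call.
    eprintln!("policy said no: {denial}");
    script.lines().filter(|l| !l.contains("write(")).collect::<Vec<_>>().join("\n")
}

fn agent_loop(mut script: String, max_attempts: u32) -> Option<String> {
    for _ in 0..max_attempts {
        match run_script(&script) {
            RunOutcome::Ok(output) => return Some(output),
            // The denial is a structured, observable signal: feed it back
            // so the next draft stays inside the policy.
            RunOutcome::AccessDenied(detail) => script = revise_script(&script, &detail),
        }
    }
    None
}

fn main() {
    let script = "write(\"/tmp/hello.txt\", \"hi\");\ncat(\"/tmp/hello.txt\");".to_string();
    println!("{:?}", agent_loop(script, 3));
}
```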
Quick demo
Install the runner:
cargo install rex-runner
Create a policy that only permits open and read:
permit(
principal,
action in [
file_system::Action::"open",
file_system::Action::"read",
// uncomment to allow write:
//file_system::Action::"create",
//file_system::Action::"write",
],
resource
);
Write a script that tries to create then read a file:
write("/tmp/hello.txt", "Hello from Rex!");
cat("/tmp/hello.txt");Run it:
rex-runner \
--script-file script.rhai \
--policy-file policy.cedar \
--output-format human
The runner reports:
error: Permission denied:
file_system::Action::"create" on /tmp/hello.txt
Uncomment create and write in the policy, rerun, and the script prints “Hello from Rex!”. The script stays unchanged while the policy drives the behavior.
The documentation mentions integration with IAM and SSM, so enterprises can plug the policy into existing AWS permission systems.
2. Vercel deepsec: Agents as vulnerability hunters
deepsec positions itself as an “agent‑powered vulnerability scanner”, letting large‑model coding agents (Claude, Codex) explore a codebase autonomously.
The configuration pairs Opus 4.7 (max effort) with GPT‑5.5 at xhigh reasoning; scanning a large repository can cost thousands of dollars.
Customers report that deepsec finds more true positives than traditional SAST tools, with a false‑positive rate of 10–20 % after a second “revalidation” agent filters findings.
Five‑step pipeline
Scan: a regex‑based sweep of the whole repository to locate security‑sensitive files.
Investigate: the primary agent examines each candidate, tracing data flow, checking mitigations, and assigning severity.
Revalidate: a second agent cross‑checks the findings to reduce false positives.
Enrich: Git metadata is used to identify the responsible developer.
Export: results are emitted in a format consumable by both humans and downstream agents.
The revalidation step is highlighted as crucial because a single agent’s signal‑to‑noise ratio is low; the second agent brings the false‑positive rate down to the quoted 10–20 %.
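A toy sketch of that cross-validation pattern (the `Finding` type and the two confirm functions are hypothetical stand-ins for the agents' judgments, not deepsec's real pipeline):

```rust
/// A candidate vulnerability produced by the investigation stage.
#[derive(Clone, Debug)]
struct Finding {
    file: String,
    description: String,
    severity: u8, // e.g. 1 (low) .. 5 (critical)
}

// Stubs standing in for LLM calls: in deepsec these would be the primary
// and revalidation agents re-reading the code around each finding.
fn primary_confirms(f: &Finding) -> bool { f.severity >= 2 }
fn revalidator_confirms(f: &Finding) -> bool { f.severity >= 3 }

/// Keep only findings that BOTH agents agree on; the second, independent
/// pass is what pushes the false-positive rate down.
fn revalidate(findings: Vec<Finding>) -> Vec<Finding> {
    findings
        .into_iter()
        .filter(|f| primary_confirms(f) && revalidator_confirms(f))
        .collect()
}

fn main() {
    let raw = vec![
        Finding { file: "auth.rs".into(), description: "JWT signature not verified".into(), severity: 5 },
        Finding { file: "util.rs".into(), description: "noisy regex hit".into(), severity: 1 },
    ];
    let kept = revalidate(raw.clone());
    println!("{} of {} findings survive revalidation", kept.len(), raw.len());
}
```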
Running deepsec locally involves:
# In the repository root
npx deepsec init
cd .deepsec
pnpm install
The setup then asks the user to feed a coding agent a prompt that reads the tool’s SKILL.md and SETUP.md, scans representative files, and keeps its output to 50–100 lines.
pnpm deepsec scan
pnpm deepsec process
pnpm deepsec revalidate # optional, reduces false positives
pnpm deepsec export --format md-dir --out ./findings
For large codebases, deepsec can distribute the work across Vercel Sandbox instances, supporting 1000+ concurrent sandboxes to compress days of scanning into hours.
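A toy sketch of that fan-out, with plain threads standing in for remote sandboxes and `scan_shard` stubbing the per-sandbox pipeline (all names here are assumptions for illustration):

```rust
use std::thread;

// Stub standing in for "run the scan pipeline inside one remote sandbox".
fn scan_shard(shard: Vec<String>) -> Vec<String> {
    shard.into_iter().filter(|f| f.contains("auth")).collect()
}

fn main() {
    let files: Vec<String> = (0..10_000).map(|i| format!("src/module{i}/auth.rs")).collect();
    let workers = 8; // deepsec reportedly scales this to 1000+ sandboxes

    // Split the repository's file list into roughly equal shards...
    let chunk = files.len().div_ceil(workers);
    let handles: Vec<_> = files
        .chunks(chunk)
        .map(|shard| {
            let shard = shard.to_vec();
            thread::spawn(move || scan_shard(shard))
        })
        .collect();

    // ...and merge the findings once every worker returns.
    let findings: Vec<String> = handles.into_iter().flat_map(|h| h.join().unwrap()).collect();
    println!("{} candidate files flagged", findings.len());
}
```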
Contrary to the belief that security tasks need “cyber‑fine‑tuned” models, the author observed that the standard Opus 4.7 and GPT‑5.5 models are sufficient; deepsec includes a classifier that detects refusals and retries automatically, meaning ordinary subscriptions can run the tool without special approvals.
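A minimal sketch of such a detect-and-retry wrapper; the marker phrases and `call_model` are assumptions for illustration, not deepsec's actual classifier:

```rust
// Stub standing in for an LLM API call.
fn call_model(prompt: &str) -> String {
    format!("Analysis of: {prompt}")
}

/// Crude refusal detector; a real classifier would be more robust.
fn is_refusal(reply: &str) -> bool {
    ["I can't help with", "I cannot assist"].iter().any(|m| reply.contains(m))
}

/// Retry on refusal, e.g. after reframing the request as defensive analysis.
fn call_with_retries(prompt: &str, max_retries: u32) -> Option<String> {
    for attempt in 0..=max_retries {
        let reply = call_model(prompt);
        if !is_refusal(&reply) {
            return Some(reply);
        }
        eprintln!("refusal detected on attempt {attempt}, retrying");
    }
    None
}

fn main() {
    println!("{:?}", call_with_retries("review this auth handler for injection flaws", 3));
}
```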
3. Putting the two together: Agent × Security
Both projects address the same fundamental question: when AI agents write code, run scripts, or modify systems in production, how should the security model evolve?
Role: Rex – defensive; deepsec – offensive.
Trust assumption: Rex assumes agents may err or be injected; deepsec assumes agents can discover more vulnerabilities than static rules.
Implementation: Rex uses a Rust runtime plus Cedar policies; deepsec uses coding agents plus a five‑step pipeline.
Key technical bet: Rex decouples policy from code; deepsec relies on the strongest LLMs and multi‑agent cross‑validation.
Both acknowledge that a single agent cannot be trusted; the emerging pattern is “constrain agents + use agents as security researchers”. The author predicts that in the next year, tools that both bound agents and employ them for security work will become standard.
Conclusion
Rex offers an elegant Cedar + Rhai combination for teams ready to run agents in production; its current operation set is limited to basic file‑system actions, and community extensions are still to come.
deepsec is expensive and compute‑heavy but proves valuable for medium‑to‑large codebases, especially in auth, data‑layer, and backend services, where a few thousand dollars can uncover critical true‑positive vulnerabilities.
The shared insight: Agent security tooling will explode soon, making “constraining agents” and “using agents as security analysts” complementary standards.
