Why Traditional AI Agent Sandboxes Fail and How Sandlock Provides a Lightweight Alternative

The article argues that heavyweight container- or micro-VM-based sandboxes solve the wrong problem for AI agents, because the real threat is prompt injection at the application layer. It shows that a policy-first approach built on Linux Landlock, seccomp, and per-tool isolation, as embodied in the open-source Sandlock sandbox, delivers strong protection without root or heavyweight isolation.

Linux Kernel Journey

Apr 9, 2026

1. The agent is not your enemy

Container and micro-VM isolation models are designed for untrusted, potentially malicious code that actively tries to escape. An AI agent is different: it is a language model driving tools, it has no intent of its own, and it runs commands only in response to its prompt. The real risk lies in polluted prompts, where malicious content injected via retrieved documents, tool outputs, or user input causes the agent to run commands such as curl, rm, or cat ~/.ssh/id_rsa. Prompt injection is an application-layer problem, not a kernel-level one.

2. Isolation does not equal security

Even when an agent runs inside a Firecracker micro-VM, it can still read an SSH private key if the sandbox exposes ~/.ssh, or obtain cloud credentials if it can reach the instance metadata service at 169.254.169.254. Isolation only answers “can the agent escape?”; security must answer “what can the agent access inside the sandbox?”. Sandlock therefore adopts a whitelist model: by default all paths, network hosts, and capabilities are denied, and developers explicitly grant the minimal set needed.

For example, Sandlock defuses rm -rf / without any hardware isolation: no Landlock rule grants write access outside the explicitly allowed paths, so the kernel rejects the deletions everywhere else.
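
The default-deny model can be illustrated with a small Python sketch. This is not Sandlock's implementation: the names PathPolicy and is_allowed are hypothetical, the sketch assumes that a writable grant also implies read access, and it compares paths lexically, whereas Landlock enforces its rules in the kernel against the real file hierarchy.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class PathPolicy:
    """Hypothetical stand-in for a whitelist: nothing is allowed unless listed."""
    readable: tuple = ()
    writable: tuple = ()


def _under(path, roots):
    """Return True if the normalized path equals or lies below one of the roots."""
    p = posixpath.normpath(path)  # collapses "..", so /tmp/../home becomes /home
    for root in roots:
        r = posixpath.normpath(root)
        if p == r or p.startswith(r.rstrip("/") + "/"):
            return True
    return False


def is_allowed(policy, path, want_write):
    """Default deny: access is granted only when an explicit rule covers the path."""
    if want_write:
        return _under(path, policy.writable)
    # Assumption for this sketch: writable locations are readable too.
    return _under(path, policy.readable) or _under(path, policy.writable)


policy = PathPolicy(readable=("/usr", "/lib", "/etc"), writable=("/tmp",))
print(is_allowed(policy, "/tmp/out.txt", want_write=True))
print(is_allowed(policy, "/home/user/.ssh/id_rsa", want_write=False))
```

With this policy the first check succeeds and the second is denied, because no rule mentions the home directory at all.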

3. Your agent rarely needs root

Inside the sandbox, agents only need to read source code, write changes, run tests, and call APIs, none of which requires root privileges. Yet many container-based sandboxes run as root to avoid permission errors, inadvertently expanding the attack surface. The surrounding infrastructure (the Docker daemon, the kubelet, Firecracker's access to /dev/kvm, and so on) also runs with elevated privileges, so a breach can leverage those privileged components.

Sandlock eliminates the need for any privileged components by relying on three non‑privileged kernel interfaces:

Landlock (Linux 6.12+, ABI v6): filesystem, TCP, IPC and signal restrictions applied by the process itself.

seccomp‑bpf (Linux 3.5+): system‑call filtering with PR_SET_NO_NEW_PRIVS set by the process.

User namespaces (Linux 3.8+): optional UID mapping that allows non‑root users to create namespaces.
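
Because the Landlock requirement is expressed as an ABI version, a program can probe the running kernel before relying on it. The sketch below issues the documented version query, landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION), through ctypes. The helper names are illustrative, the syscall number 444 is the x86-64 and arm64 value, and none of this is taken from Sandlock's source.

```python
import ctypes
import sys

SYS_LANDLOCK_CREATE_RULESET = 444      # x86-64 and arm64 syscall number
LANDLOCK_CREATE_RULESET_VERSION = 1    # flag: return the ABI version instead of a ruleset fd


def landlock_abi_version():
    """Return the highest Landlock ABI the kernel supports, or None if unavailable."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        syscall = libc.syscall
    except (OSError, AttributeError):
        return None
    syscall.restype = ctypes.c_long
    version = syscall(
        ctypes.c_long(SYS_LANDLOCK_CREATE_RULESET),
        ctypes.c_void_p(None),   # attr = NULL
        ctypes.c_size_t(0),      # size = 0
        ctypes.c_uint32(LANDLOCK_CREATE_RULESET_VERSION),
    )
    # A negative result means Landlock is missing, disabled, or blocked.
    return version if version >= 1 else None


def meets_requirement(version, required=6):
    """True when the probed ABI satisfies the minimum the article cites (v6)."""
    return version is not None and version >= required


if __name__ == "__main__":
    abi = landlock_abi_version()
    print("Landlock ABI:", abi, "sufficient:", meets_requirement(abi))
```

Checking up front lets a launcher fail closed with a clear message instead of silently running with weaker restrictions on an older kernel.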

The sandbox is created entirely inside the process after fork() and before exec(), with no external runtime or privileged daemon.
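
The fork, restrict, exec sequence can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sandlock's code (Sandlock itself is a Rust binary): the function name spawn_confined is hypothetical, only the PR_SET_NO_NEW_PRIVS step is shown, and the Landlock ruleset and seccomp filter that would be installed at the marked point are omitted. It requires Linux.

```python
import ctypes
import os
import sys

PR_SET_NO_NEW_PRIVS = 38  # from <linux/prctl.h>

# Load libc before forking so the child does as little work as possible.
_libc = ctypes.CDLL(None, use_errno=True)


def spawn_confined(argv):
    """Fork, restrict the child, exec argv, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: restrictions applied here survive exec and cannot be undone.
        try:
            rc = _libc.prctl(
                PR_SET_NO_NEW_PRIVS,
                ctypes.c_ulong(1), ctypes.c_ulong(0),
                ctypes.c_ulong(0), ctypes.c_ulong(0),
            )
            if rc != 0:
                os._exit(126)  # refuse to run unconfined
            # A real sandbox would install its Landlock ruleset and
            # seccomp filter at this point, before exec.
            os.execvp(argv[0], argv)
        except BaseException:
            pass
        os._exit(127)  # exec failed
    # Parent: no daemon or external runtime, just wait for the child.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print(spawn_confined([sys.executable, "-c", "print('hello from the child')"]))
```

The child restricts itself, so the parent needs no special privileges, and a failure to apply a restriction aborts the launch rather than falling back to an unconfined run.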

4. One sandbox for all tools is ineffective

Typical agents bundle many tools (shell, file access, web fetcher, database client, code runtime) into a single container. Because the sandbox must grant the union of all tool permissions, a compromise of any single tool gives the attacker the full set of privileges. Sandlock instead creates a separate sandbox for each tool invocation, applying only the permissions declared for that tool (e.g., network access only for a web fetcher, file-write access only for a file tool). This per-call isolation enforces the principle of least privilege at a finer granularity and can be implemented with a fork-plus-Landlock step that completes in milliseconds.
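
The difference between one shared sandbox and per-tool sandboxes can be made concrete with a short sketch. The ToolPolicy class below is a local stand-in that borrows the field names from Sandlock's Python example; the tool names, paths, and the host example.com are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    """Local stand-in for a policy; every field defaults to 'nothing allowed'."""
    fs_readable: frozenset = frozenset()
    fs_writable: frozenset = frozenset()
    net_allow_hosts: frozenset = frozenset()

    def union(self, other):
        return ToolPolicy(
            self.fs_readable | other.fs_readable,
            self.fs_writable | other.fs_writable,
            self.net_allow_hosts | other.net_allow_hosts,
        )


# Hypothetical per-tool declarations: each tool gets only what it needs.
TOOL_POLICIES = {
    "web_fetch": ToolPolicy(net_allow_hosts=frozenset({"example.com"})),
    "file_write": ToolPolicy(fs_writable=frozenset({"/tmp/sandbox"})),
    "run_tests": ToolPolicy(
        fs_readable=frozenset({"/usr", "/lib"}),
        fs_writable=frozenset({"/tmp/sandbox"}),
    ),
}


def policy_for(tool):
    """Per-call lookup; a tool with no declared policy is refused outright."""
    try:
        return TOOL_POLICIES[tool]
    except KeyError:
        raise PermissionError(f"no policy declared for tool {tool!r}") from None


def shared_sandbox_policy():
    """What a single sandbox for all tools would have to grant: the union."""
    merged = ToolPolicy()
    for policy in TOOL_POLICIES.values():
        merged = merged.union(policy)
    return merged


print(policy_for("web_fetch"))
print(shared_sandbox_policy())
```

A compromised web fetcher running under its own policy has network access but no writable paths, whereas under the merged policy it would inherit every grant made to any tool.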

Don’t pay for the wrong security model

Heavy isolation (containers, micro‑VMs) is valuable for multi‑tenant cloud workloads where code is completely untrusted. Most AI‑agent use cases, however, involve internal assistants or automation pipelines where the primary threat is a malicious or hallucinated command triggered by a polluted prompt. A policy‑first approach—default‑deny, path‑based whitelists, per‑tool limits enforced by Landlock and seccomp—provides the needed protection without the overhead of full isolation.

Sandlock: a policy‑first lightweight agent sandbox

Sandlock is an Apache-2.0-licensed Rust binary that runs on Linux 6.12+ without root, external dependencies, or a daemon. It starts in ~5 ms, shares the host kernel rather than booting a guest, and enforces filesystem, network, and syscall policies via Landlock and seccomp. Example command-line usage:

# Read‑only system libs, writable /tmp, allow specific API host
sandlock run -r /usr -r /lib -r /etc -w /tmp \
  --net-allow-host api.anthropic.com -- python3 agent.py
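
An agent framework that launches each tool through the CLI could assemble the command line programmatically. The helper below is a hypothetical sketch that uses only the flags shown in the example above (-r, -w, --net-allow-host, and the -- separator); it assumes --net-allow-host may be repeated once per host, which the example does not confirm.

```python
import shlex


def build_sandlock_argv(readable, writable, allow_hosts, command):
    """Build an argv list for `sandlock run` from explicit grants."""
    if not command:
        raise ValueError("command must not be empty")
    argv = ["sandlock", "run"]
    for path in readable:
        argv += ["-r", path]                # read-only path grant
    for path in writable:
        argv += ["-w", path]                # writable path grant
    for host in allow_hosts:
        argv += ["--net-allow-host", host]  # assumed repeatable
    return argv + ["--", *command]


argv = build_sandlock_argv(
    readable=["/usr", "/lib", "/etc"],
    writable=["/tmp"],
    allow_hosts=["api.anthropic.com"],
    command=["python3", "agent.py"],
)
print(shlex.join(argv))
```

Passing the result to subprocess.run as a list, rather than as a shell string, keeps prompt-derived arguments from being reinterpreted by a shell.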

It also supports fine-grained HTTP ACLs, resource limits, copy-on-write (COW) filesystem writes, and dry-run previews. The Python API mirrors the same policy construction:

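from sandlock import Sandbox, Policy

# Default-deny policy: only the paths and hosts listed here are reachable.
policy = Policy(
    fs_readable=["/usr", "/lib", "/etc"],   # read-only system files
    fs_writable=["/tmp/sandbox"],           # the only writable location
    net_allow_hosts=["api.anthropic.com"],  # the only permitted network host
)

# Run the agent under the policy.
result = Sandbox(policy).run(["python3", "agent.py"])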

With these mechanisms, agents can read system libraries, write temporary files, and call LLM APIs, while SSH keys, environment files, and credentials remain inaccessible—achieving strong security without containers.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
