How ClawLess Secures Autonomous AI Agents with Formal System‑Call Isolation
The ClawLess framework, developed by researchers from Southern University of Science and Technology and Hong Kong University of Science and Technology, combines formal security policies, physical sandboxing, user‑space kernels and BPF‑based system‑call interception to protect highly autonomous AI agents from rogue behavior and external attacks.
Security risks of autonomous AI agents
Agents such as OpenClaw, OpenCode and Hermes can reason, plan tasks and fetch arbitrary code from the Internet. Because they ingest unfiltered web data, the clear boundary between benign and malicious inputs disappears, turning each agent into a security black hole that can bypass traditional least-privilege defenses.
Foundational assumptions
Two worst‑case assumptions drive the design:
1. Agents are intelligent enough to launch sophisticated attacks against any security mechanism.
2. Prolonged exposure to unclean network inputs will eventually manipulate the agent into malicious behavior.
Under these assumptions the entire agent runtime—including the container image, libraries and the model itself—is treated as an untrusted component and isolated completely.
Choosing a physical isolation cage
Standard Docker containers are easy to deploy but share the host kernel, so a single kernel vulnerability can compromise every container on the machine. Over the past decade, 37 such CVEs were disclosed in Linux, including five with CVSS scores above 9.0.
Hardware‑assisted solutions such as Kata Containers or confidential containers (CoCo) provide strong isolation via TEEs, but they block low‑level operations required by agents and are difficult to scale in typical cloud environments.
User-space kernels, exemplified by gVisor, insert a minimal trusted layer between the untrusted agent and the host kernel. This layer runs as an ordinary user process, intercepts almost all kernel interactions, and retains high compatibility with low overhead, making it a practical compromise.
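As a concrete illustration (not from the paper): gVisor ships as an alternative OCI runtime named runsc, so an untrusted agent image can be moved into the sandbox with a one-line change (my-agent-image is a placeholder):

docker run --runtime=runsc --rm my-agent-image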
Dynamic, formally verified security policies
ClawLess models every file, process, socket and device as an entity whose attributes are expressed as regular expressions, enabling precise locking of sensitive resources. For credentials, a visibility semantics is introduced: the agent must present a credential to invoke an external service but never sees its actual characters, eliminating password leakage.
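A minimal sketch of these two ideas, with hypothetical names (the paper does not give concrete data structures): a resource entity matched by a POSIX regular expression, and an opaque credential handle the agent can present but never read.

#include <regex.h>
#include <stdbool.h>
#include <stdint.h>

struct entity_rule {
    const char *path_pattern;     /* extended regex over file paths */
    bool allow_read, allow_write; /* permissions granted on a match */
};

/* Returns true if `path` matches the rule's pattern. */
static bool rule_matches(const struct entity_rule *r, const char *path)
{
    regex_t re;
    if (regcomp(&re, r->path_pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool hit = regexec(&re, path, 0, NULL, 0) == 0;
    regfree(&re);
    return hit;
}

/* The agent only ever holds this opaque id; the trusted layer maps it to
 * the real secret when an approved external call is made. */
struct credential_handle { uint64_t id; };

A rule such as { "^/root/\\.ssh/.*", false, false } locks the SSH key directory outright, while an agent invoking an external API passes only a credential_handle and the sandbox substitutes the secret on its behalf.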
Static allow-list policies are insufficient: granting both file-read and network-socket permissions would implicitly enable data exfiltration. To prevent this, linear temporal logic (LTL) is added to the policy engine. An example rule:

if the agent ever reads a high-sensitivity file → permanently block outbound network channels

The policy engine translates LTL rules into concrete system-call checks with the help of an SMT solver (e.g., Z3). When a developer attempts to grant permission to execute an unknown script, the solver immediately detects a violation of the sandbox hierarchy and aborts the configuration with an alert.
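In standard LTL notation (the predicate names are illustrative, not taken from the paper), the rule reads:

G( read(f) ∧ sensitivity(f) = high → G ¬send(ext_socket) )

Here G is the "globally" (always) operator: once a high-sensitivity read has occurred, every later state must forbid sends on external sockets, which is exactly the history-dependence a static allow-list cannot express.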
Policy compilation to kernel‑level syscall interception
A policy compiler expands high‑level actions (e.g., “send file”) into a sequence of low‑level checks on both source‑read and destination‑write permissions. To enforce these checks with minimal performance impact, ClawLess leverages Berkeley Packet Filter (BPF) programs that run in native kernel code.
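A minimal sketch of that expansion, using hypothetical types (the paper does not specify the compiler's data structures):

enum perm { PERM_READ, PERM_WRITE };

struct check { enum perm perm; const char *target; };

/* "send src to dst" compiles into two low-level checks:
 * may the agent read src, and may it write dst? */
static void compile_send(const char *src, const char *dst, struct check out[2])
{
    out[0] = (struct check){ PERM_READ,  src };
    out[1] = (struct check){ PERM_WRITE, dst };
}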
At syscall entry, a single dispatcher program reads the syscall number and tail-calls the handler registered for it in a program-array map (a lightly cleaned-up version of the paper's sketch):

SEC("raw_tracepoint/sys_enter")
int on_sys_enter(struct bpf_raw_tracepoint_args *ctx)
{
    u64 sys_nr = ctx->args[1];             /* syscall number */
    bpf_tail_call(ctx, &prog_arr, sys_nr); /* jump to this syscall's handler */
    return 0;                              /* no handler registered: allow */
}

When a read syscall occurs, the BPF handler extracts the file descriptor, looks up the associated path, and invokes the policy engine. If the path matches a prohibited directory, the kernel aborts the operation and returns an error.
/* fd_path_map and check() are stand-in names for the policy-engine
 * plumbing, which the paper's sketch leaves abstract. */
int on_read(struct bpf_raw_tracepoint_args *ctx)
{
    u64 *args = (u64 *)ctx->args[0];                   /* syscall argument block */
    u64 fd = args[0], count = args[2];
    void *buf = (void *)args[1];
    u8 *path = bpf_map_lookup_elem(&fd_path_map, &fd); /* fd -> resolved path */
    if (path)                                          /* unknown fd: nothing to check */
        check(path, buf, count);                       /* policy-engine verdict */
    return 0;
}

Because BPF programs can be loaded and updated at runtime, security policies can be hot-reloaded without stopping the host.
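For illustration, that hot-reload could be driven from user space roughly as follows (a sketch against libbpf; the function and map names are assumptions, not the paper's). Replacing an entry in the dispatcher's program array atomically redirects future syscalls to the new handler:

#include <bpf/libbpf.h>
#include <bpf/bpf.h>

/* Swap the handler for one syscall number in the dispatch table.
 * prog_array_fd: fd of the BPF_MAP_TYPE_PROG_ARRAY the dispatcher tail-calls into.
 * obj: an already-loaded bpf_object containing the new handler. */
int swap_handler(struct bpf_object *obj, int prog_array_fd,
                 __u32 sys_nr, const char *prog_name)
{
    struct bpf_program *prog = bpf_object__find_program_by_name(obj, prog_name);
    if (!prog)
        return -1;
    int prog_fd = bpf_program__fd(prog);
    /* The map update is atomic: in-flight syscalls see either the old
     * handler or the new one, never an inconsistent state. */
    return bpf_map_update_elem(prog_array_fd, &sys_nr, &prog_fd, BPF_ANY);
}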
End‑to‑end protection outcome
By combining mathematically verified policies, a hardened user‑space sandbox, and BPF‑based syscall interception, ClawLess establishes a principled security foundation for autonomous AI agents. The framework blocks both internal model hallucinations that could trigger unauthorized actions and external jailbreak attempts that try to escape the sandbox.
Reference: https://arxiv.org/pdf/2604.06284v1