Inside Claude Code: How AI Uses Four Permission Modes and a Two‑Stage Classifier to Guard Itself

This article dissects Claude Code’s permission system: the four exposed permission modes, the eight‑source rule hierarchy, the traditional Bash matching logic, and the YOLO Classifier, whose fast first stage and deep second stage automatically approve safe actions while falling back to user prompts for risky operations.

Shuge Unlimited

1. Permission Model Overview

Claude Code defines six internal permission modes but exposes four to users: default (prompt for every sensitive action), plan (read‑only), acceptEdits (auto‑approve edits, other actions still prompt), and bypassPermissions (completely skip checks, gated by a remote feature flag). The auto mode powers the YOLO Classifier and is not publicly selectable.

Four Permission Modes

default          → ask user each time
plan             → allow read‑only, block modifications
acceptEdits      → auto‑approve file edits, ask otherwise
bypassPermissions→ skip all checks (remote toggle)

Source path: utils/permissions/PermissionMode.ts.
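The mode table above can be sketched as a small decision function. This is a minimal reconstruction, assuming a simplified action model ('read' | 'edit' | 'other'); the real checks in the source are considerably richer.

```typescript
// Hypothetical sketch of the four exposed modes; names follow the article,
// the action model and this helper are illustrative only.
type PermissionMode = 'default' | 'plan' | 'acceptEdits' | 'bypassPermissions';
type ActionKind = 'read' | 'edit' | 'other';
type Decision = 'allow' | 'ask' | 'deny';

function decide(mode: PermissionMode, action: ActionKind): Decision {
  switch (mode) {
    case 'bypassPermissions':
      return 'allow';                                // skip all checks
    case 'plan':
      return action === 'read' ? 'allow' : 'deny';   // read-only, block modifications
    case 'acceptEdits':
      return action === 'edit' ? 'allow' : 'ask';    // auto-approve edits, ask otherwise
    default:
      return 'ask';                                  // prompt the user each time
  }
}
```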

Eight Rule Sources

Each permission rule can originate from eight sources, ordered by priority:

userSettings → projectSettings → localSettings → policySettings → flagSettings → command → cliArg → session

In plain language this is "global → project → local → enterprise policy → startup flags → slash commands → CLI args → session rules".
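One way to picture this hierarchy: walk the sources from lowest to highest priority and let later writes win. The merge helper and the rule shape below are hypothetical illustrations, not the actual source.

```typescript
// Illustrative merge: a higher-priority source's rule for the same key
// overrides a lower-priority one. Source names follow the article.
type RuleSource =
  | 'userSettings' | 'projectSettings' | 'localSettings' | 'policySettings'
  | 'flagSettings' | 'command' | 'cliArg' | 'session';

const PRIORITY: RuleSource[] = [
  'userSettings', 'projectSettings', 'localSettings', 'policySettings',
  'flagSettings', 'command', 'cliArg', 'session',
];

interface SourcedRule { source: RuleSource; key: string; behavior: 'allow' | 'deny' | 'ask' }

function mergeRules(rules: SourcedRule[]): Map<string, SourcedRule> {
  const merged = new Map<string, SourcedRule>();
  // Lowest priority first, so later (higher-priority) sources overwrite.
  for (const source of PRIORITY) {
    for (const r of rules.filter((x) => x.source === source)) {
      merged.set(r.key, r);
    }
  }
  return merged;
}
```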

PermissionRule Structure

{
  toolName: string,          // e.g., "Bash"
  ruleContent: string,       // optional pattern, e.g., "git:*"
  behavior: 'allow' | 'deny' | 'ask'
}

The rule is intentionally minimal to keep matching logic transparent.

2. Traditional Permission Matching

Before the AI classifier, Claude Code relied on static matching of Bash commands. shellRuleMatching.ts categorises patterns into five shapes: exact, prefix (:*), suffix-wildcard (*), full wildcard (*), and option‑prefix (-*). Example patterns:

Exact:           git log
Prefix:          npm:*
Suffix‑wildcard: git*
Option‑prefix:   npm -*

The dangerousPatterns.ts file lists 80 lines of high‑risk executables (e.g., python, node, curl, aws) plus ant‑only internal tools. The key insight is that danger lies in a command’s side‑effects, not merely its syntax.
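A rough sketch of the five pattern shapes, assuming simple string semantics; the real shellRuleMatching.ts logic is more involved (quoting, argument parsing, and so on).

```typescript
// Hypothetical matcher for the five shapes described above.
function matchesPattern(pattern: string, command: string): boolean {
  if (pattern === '*') return true;                      // full wildcard
  if (pattern.endsWith(':*')) {                          // prefix, e.g. "git:*" matches "git log"
    const prefix = pattern.slice(0, -2);
    return command === prefix || command.startsWith(prefix + ' ');
  }
  if (pattern.endsWith(' -*')) {                         // option-prefix, e.g. "npm -*"
    return command.startsWith(pattern.slice(0, -3) + ' -');
  }
  if (pattern.endsWith('*')) {                           // suffix-wildcard, e.g. "git*"
    return command.startsWith(pattern.slice(0, -1));
  }
  return command === pattern;                            // exact
}
```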

Read‑only validation (readOnlyValidation.ts) distinguishes truly read‑only Bash commands (ls, cat) from deceptive ones like echo "hi" > file.txt or tee, which write to disk.
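A conservative sketch of that idea: treat a command as read-only only when its executable is on a safe list and it contains no write-like constructs. The list and the checks are illustrative, not the real readOnlyValidation.ts (which, for instance, need not reject every pipe).

```typescript
// Hypothetical read-only check: safe binary AND no redirection/pipe.
const READ_ONLY_BINARIES = new Set(['ls', 'cat', 'grep', 'head', 'tail']);

function isReadOnly(command: string): boolean {
  const [binary] = command.trim().split(/\s+/);
  if (!READ_ONLY_BINARIES.has(binary)) return false;
  // Conservatively reject redirections and pipes: "cat f > g" writes,
  // and "… | tee out" writes through the pipe.
  if (/[>|]/.test(command)) return false;
  return true;
}
```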

Static rules are immutable: they cannot infer context, so a benign‑looking curl https://api.example.com may be safe in one scenario and dangerous in another. This limitation motivates the AI‑driven classifier.

3. YOLO Classifier – AI Judging AI

The YOLO (You Only Live Once) Classifier implements “bold auto‑approval”. It runs as a separate side‑query to avoid the assistant’s own text influencing the decision. Only tool_use blocks and the user’s message are kept in the transcript.

Transcript Construction

Layer 1: keep user text + tool_use, discard assistant replies.

Layer 2: serialize each tool call either as a text‑prefixed block or JSONL.

Claude’s configuration (CLAUDE.md) is injected as a cached user message to give the classifier project‑specific intent.
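The two-layer filter above might look roughly like this, assuming a simplified message shape; the real transcript format is richer.

```typescript
// Hypothetical transcript filter: keep user messages, keep only tool_use
// blocks from assistant messages, and drop assistant messages with no tools.
interface Block { type: 'text' | 'tool_use'; content: string }
interface Message { role: 'user' | 'assistant'; blocks: Block[] }

function buildClassifierTranscript(history: Message[]): Message[] {
  return history.flatMap((msg) => {
    if (msg.role === 'user') return [msg];                        // keep user text
    const toolCalls = msg.blocks.filter((b) => b.type === 'tool_use');
    return toolCalls.length ? [{ role: msg.role, blocks: toolCalls }] : [];
  });
}
```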

Two‑Stage XML Classifier

Stage 1 – Fast (≈64 tokens):

max_tokens = 64, stop at </block>.

Bias toward block (fail‑closed).

If result is allow, return immediately.

Stage 2 – Thinking (≈4096 tokens):

Full chain‑of‑thought reasoning. Outputs <thinking>, <block>yes/no</block>, and <reason>…</reason>, and can overturn a Stage 1 block.

The design mirrors CPU branch prediction: a cheap fast path handles roughly 80 % of cases, while the expensive deep analysis handles the remaining risky operations.

Operating Modes

both (default) → Stage 1 + Stage 2, used in production
fast           → Stage 1 only, max_tokens = 256, for low latency
thinking       → Stage 2 only, for high‑security contexts

Output Schema

{
  thinking: string,
  shouldBlock: boolean,
  reason: string
}

When allowing, the <reason> tag is omitted to save tokens.

Safe Allow‑List

Tools such as FileRead, Grep, task‑management commands, and UI helpers bypass the classifier automatically. Write/edit tools are excluded and must pass through the acceptEdits fast‑path logic.

Caching

The system prompt, CLAUDE.md, and the action block are cached (TTL ≈ 1 hour) via Anthropic’s cache_control. Stage 2 reuses Stage 1’s cache, reducing token overhead.

4. Auto Mode

Auto Mode state lives in autoModeState.ts as three booleans: autoModeActive, autoModeFlagCli, and autoModeCircuitBroken. A remote GrowthBook flag can instantly disable Auto Mode (circuit breaker).

Permission Degradation

Entering Auto Mode strips all dangerous allow entries (e.g., Bash(*), PowerShell(*), Agent(*)); leaving Auto Mode restores them, analogous to a database transaction’s begin/commit.

Decision Logic

Safe tool → auto‑approve. Uncertain → block (the YOLO decision). Never show a confirmation UI.

Denial Tracking

More than 3 consecutive denials, or more than 20 in total, trigger a fallback to an interactive prompt, ensuring a human can override a misbehaving classifier.

5. Persistence & Synchronisation

Permission rules persist across sessions via configuration files. Priority runs global < project < user, and shared settings (project, policy, command) are kept separate from personal settings (user, local, session).
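The two‑stage dispatch and its three operating modes from section 3 can be sketched as a small driver. The stage functions are injected placeholders and the verdict shape is simplified; everything here is a hypothetical reconstruction, not the actual source.

```typescript
// Hypothetical two-stage driver: a fast allow short-circuits; a fast block
// escalates to the thinking stage, which may overturn it.
interface Verdict { shouldBlock: boolean; reason?: string }

function classify(
  action: string,
  fastStage: (a: string) => Verdict,       // Stage 1: ~64 tokens, fail-closed
  thinkingStage: (a: string) => Verdict,   // Stage 2: ~4096 tokens, chain-of-thought
  mode: 'both' | 'fast' | 'thinking' = 'both',
): Verdict {
  if (mode !== 'thinking') {
    const fast = fastStage(action);
    if (!fast.shouldBlock) return fast;    // a fast allow returns immediately
    if (mode === 'fast') return fast;
  }
  return thinkingStage(action);            // Stage 2 can overturn a Stage 1 block
}
```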
Shadowed‑rule detection (shadowedRuleDetection.ts) flags unreachable allow rules overridden by higher‑priority deny or ask entries, and provides fix suggestions.

6. Tool‑Level Permission Flow

The entry point is the React hook useCanUseTool (204 lines). It creates a permission context, invokes hasPermissionsToUseTool, and handles three outcomes:

allow → records the approval and returns.
deny  → logs the denial and shows a notification.
ask   → proceeds through four sub‑handlers – Coordinator, Swarm Worker, Speculative Classifier (2 s timeout), and Interactive UI.

Speculative checks run in parallel with the UI; a high‑confidence allow auto‑approves before the user sees a prompt.

Permission Context Lifecycle

resolveIfAborted – abort handling.
logDecision      – analytics.
buildAllow       – construct the allow result.
cancelAndAbort   – cleanup.

Each context is tied to a toolUseID for tracking.

Permission Explainer

When a confirmation UI appears, permissionExplainer.ts generates a short AI‑written explanation containing explanation, reasoning, risk, and riskLevel (LOW/MEDIUM/HIGH).

Conclusion

Claude Code’s permission architecture demonstrates a shift from static rule‑based security to a layered AI‑driven model. The YOLO Classifier upgrades static checks into context‑aware judgments, the two‑stage design balances cost and safety, and denial tracking provides a pragmatic human‑in‑the‑loop safeguard. This multi‑layered, fail‑closed approach offers a blueprint for teams building AI agents that must operate securely while minimizing friction.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: Security, Claude Code, auto mode, AI permissions, denial tracking, two‑stage judgment, YOLO classifier
Written by

Shuge Unlimited

Formerly "Ops with Skill", now officially upgraded. Fully dedicated to AI, we share both the why (fundamental insights) and the how (practical implementation). From technical operations to breakthrough thinking, we help you understand AI's transformation and master the core abilities needed to shape the future. ShugeX: boundless exploration, skillful execution.
