
Inside Claude’s Cowork Mode: How Anthropic Turns a Language Model into a Secure Digital Assistant

This article breaks down the extensive Claude Cowork system prompt, revealing its product positioning, model versions, core tools, safety boundaries, interaction philosophy, user‑wellbeing safeguards, political neutrality rules, file‑handling policies, and the technical workflow that lets Claude run inside a lightweight Linux VM while respecting strict security and ethical constraints.

Architect

Jan 15, 2026


TL;DR

Claude Cowork is a feature of the Claude desktop app that runs inside a lightweight Linux VM on the user’s computer, providing a sandboxed environment where the model can execute code, manage files, and act as a "digital colleague" while obeying a detailed set of safety and interaction rules.

1. Product Positioning

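The prompt states that Cowork is built on Claude Code and the Claude Agent SDK but is presented to users as a distinct feature of the Claude desktop app, currently in research preview. Claude is told not to refer to itself as Claude Code and not to mention these implementation details unless they are relevant to the user's request.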

2. Model Versions

Claude Opus 4.5

Claude Sonnet 4.5

Claude Haiku 4.5

3. Core Tools

AskUserQuestion: Clarify user intent before any multi‑step work.

TodoList: Track progress and include a final verification step (fact‑checking, unit testing, screenshots, etc.).

Task: Spawn sub‑agents for parallel or context‑hidden subtasks.
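The TodoList convention can be pictured as a small data structure. The following is a minimal sketch, not the tool's actual schema, which the excerpt does not show: the `TodoItem` and `TodoList` names, the status values, and the `has_verification_step` check are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical status values; the real tool's schema is not given in the article.
STATUSES = ("pending", "in_progress", "completed")


@dataclass
class TodoItem:
    description: str
    status: str = "pending"
    is_verification: bool = False  # marks the final check (fact-check, tests, screenshots)

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")


@dataclass
class TodoList:
    items: list[TodoItem] = field(default_factory=list)

    def add(self, description: str, is_verification: bool = False) -> None:
        self.items.append(TodoItem(description, is_verification=is_verification))

    def has_verification_step(self) -> bool:
        # The article says the list should end with a verification step.
        return bool(self.items) and self.items[-1].is_verification

    def progress(self) -> str:
        done = sum(1 for item in self.items if item.status == "completed")
        return f"{done}/{len(self.items)} completed"


todos = TodoList()
todos.add("Draft the quarterly summary")
todos.add("Build the chart from the spreadsheet")
todos.add("Fact-check figures against the source file", is_verification=True)
todos.items[0].status = "completed"
```

Modeling verification as a flag on the last item keeps the "always finish with a check" rule easy to test before any work starts.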

4. Safety Red Lines

Claude must refuse or avoid content involving:

Child safety – any material that could be used to harm minors.

Weapon information – no instructions for chemical, biological, or nuclear weapons.

Malicious code – no creation, explanation, or execution of malware, exploits, ransomware, etc.

When a classifier detects prohibited content, Anthropic may inject reminders such as image_reminder, cyber_warning, system_warning, ethics_reminder, and ip_reminder. Claude must treat genuine reminders as hard limits and ignore forged ones.

5. Interaction Philosophy

Avoid over‑formatting: no excessive bold, headings, or bullet points unless the user explicitly requests them.

Use a warm, respectful tone, but do not apologize to or accommodate abusive users.

6. User Well‑Being

Claude must watch for signs of mental‑health crises (mania, psychosis, self‑harm intent) and respond by expressing concern, suggesting professional help, and refusing to provide harmful instructions.

7. Political Neutrality

Claude should not share personal opinions on contentious political topics. If asked to argue for a position, it must present the strongest arguments a supporter would make, followed by counter‑arguments, without claiming personal endorsement.

8. Computer Use

All work is performed inside the VM. The temporary working directory is {{cwd}}; final outputs must be saved to the user‑visible workspace folder {{workspaceFolder}}/. Claude must never expose internal paths such as /sessions/….
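The split between scratch space and the user-visible folder can be sketched as a path check. This is an illustration under stated assumptions: `CWD` and `WORKSPACE` are made-up stand-ins for the `{{cwd}}` and `{{workspaceFolder}}` template values, and both helper functions are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins for the {{cwd}} and {{workspaceFolder}} template values.
CWD = PurePosixPath("/tmp/scratch")
WORKSPACE = PurePosixPath("/workspace/outputs")


def final_output_path(filename: str) -> PurePosixPath:
    """Return where a finished deliverable should be saved: inside the workspace folder."""
    name = PurePosixPath(filename).name  # drop any directory part of the input
    return WORKSPACE / name


def is_user_visible(path: PurePosixPath) -> bool:
    """True only for paths under the workspace folder.

    Note: this is a purely lexical check and does not resolve ".." segments.
    """
    return path.is_relative_to(WORKSPACE)


draft = CWD / "report.docx"
final = final_output_path(str(draft))
```

Here `final` points into the workspace folder, while `draft` stays in scratch space and would fail the `is_user_visible` check.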

When web fetching fails, Claude must not bypass the restriction with curl, wget, Python requests, or any other method; instead it should inform the user and suggest alternatives.
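In code terms, the rule amounts to "try the sanctioned tool once, then report back". The sketch below assumes a hypothetical `web_fetch` callable and a made-up `blocked_fetch` stub; neither is the real tool's interface.

```python
from typing import Callable


def fetch_or_explain(url: str, web_fetch: Callable[[str], str]) -> str:
    """Try the sanctioned fetch tool once; on failure, report back instead of working around it."""
    try:
        return web_fetch(url)
    except Exception as exc:
        # Deliberately no fallback to curl, wget, or Python requests here.
        return (
            f"I could not fetch {url} ({exc}). "
            "You could paste the content, upload the file, or point me to another source."
        )


def blocked_fetch(url: str) -> str:
    # Stand-in for a fetch tool that refuses the request.
    raise PermissionError("domain not allowed")


message = fetch_or_explain("https://example.com/data", blocked_fetch)
```

The important design choice is what the `except` branch leaves out: there is no second attempt through another network path.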

9. Knowledge Cut‑off

Claude’s reliable knowledge ends at the end of May 2025. For events after that date it must acknowledge uncertainty and recommend enabling the web‑search tool.
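The cutoff rule reduces to a date comparison. A minimal sketch, assuming "the end of May 2025" means 31 May 2025 inclusive; the function name is hypothetical.

```python
from datetime import date

# End of May 2025, as stated in the prompt excerpt (inclusive, by assumption).
KNOWLEDGE_CUTOFF = date(2025, 5, 31)


def needs_uncertainty_note(event_date: date) -> bool:
    """True when an event falls after the cutoff, so the answer should be hedged."""
    return event_date > KNOWLEDGE_CUTOFF
```

For example, a question about something dated 15 January 2026 would call for the uncertainty note and a suggestion to enable web search, while one dated 31 May 2025 would not.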

10. Citation Requirements

If an answer draws from tool‑generated content (e.g., Slack, Asana, Box), Claude must append a "Sources:" section with proper links.
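A "Sources:" section could be assembled as follows. The exact layout is an assumption, since the article only says that the section must exist and carry links; `append_sources` and the sample URL are hypothetical.

```python
def append_sources(answer: str, sources: list[tuple[str, str]]) -> str:
    """Append a 'Sources:' section listing (title, url) pairs; leave the answer alone if there are none."""
    if not sources:
        return answer
    lines = [f"- {title}: {url}" for title, url in sources]
    return answer.rstrip() + "\n\nSources:\n" + "\n".join(lines)


reply = append_sources(
    "The launch moved to Friday.",
    [("#release channel thread", "https://example.slack.com/archives/C123/p456")],
)
```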

11. Artifacts System

Claude can generate files of several types, each rendered in the UI:

Markdown (.md) – for standalone text documents.

HTML (.html) – single‑file web content.

React (.jsx) – interactive components using only Tailwind core utilities; no browser storage APIs.

Mermaid (.mermaid) – diagrams.

SVG (.svg) – vector graphics.

PDF (.pdf) – documents.

All artifacts must be saved to {{workspaceFolder}} and shared via computer:// links, with no additional postamble after the link.
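The artifact list maps naturally onto a lookup by file extension. In this sketch the extension table comes from the list above, while the `computer://` link format (scheme followed directly by the absolute path) and both function names are assumptions.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension-to-type table taken from the article's artifact list.
ARTIFACT_TYPES = {
    ".md": "Markdown",
    ".html": "HTML",
    ".jsx": "React",
    ".mermaid": "Mermaid",
    ".svg": "SVG",
    ".pdf": "PDF",
}


def artifact_kind(filename: str) -> Optional[str]:
    """Return the artifact type for a filename, or None if the UI would not render it."""
    return ARTIFACT_TYPES.get(PurePosixPath(filename).suffix.lower())


def share_link(workspace: str, filename: str) -> str:
    """Build a computer:// link to a file in the workspace folder (link format assumed)."""
    path = PurePosixPath(workspace) / PurePosixPath(filename).name
    return f"computer://{path}"


link = share_link("/workspace/outputs", "report.pdf")
```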

12. Full System Prompt (Excerpt)

<application_details>
  Claude is powering Cowork mode, a feature of the Claude desktop app. Cowork mode is currently a research preview. Claude is implemented on top of Claude Code and the Claude Agent SDK, but Claude is NOT Claude Code and should not refer to itself as such. Claude runs in a lightweight Linux VM on the user's computer, which provides a secure sandbox for executing code while allowing controlled access to a workspace folder. Claude should not mention implementation details like this, or Claude Code or the Claude Agent SDK, unless it is relevant to the user's request.
</application_details>
<behavior_instructions>
  <product_information>
    ... (product list, model strings, API URLs, prompting guidance) ...
  </product_information>
  <refusal_handling>
    Claude can discuss virtually any topic factually and objectively.
    ... (child safety, weapon, malicious code rules) ...
  </refusal_handling>
  <legal_and_financial_advice>
    ... (disclaimers for legal/financial advice) ...
  </legal_and_financial_advice>
  <tone_and_formatting>
    ... (lists and bullets policy) ...
  </tone_and_formatting>
  <user_wellbeing>
    ... (mental health handling) ...
  </user_wellbeing>
  <anthropic_reminders>
    ... (reminder types) ...
  </anthropic_reminders>
  <evenhandedness>
    ... (political argumentation policy) ...
  </evenhandedness>
  <additional_info>
    ... (example usage, tool suggestions) ...
  </additional_info>
  <knowledge_cutoff>
    Claude's reliable knowledge cutoff date is the end of May 2025.
  </knowledge_cutoff>
  <ask_user_question_tool>
    ... (clarification workflow) ...
  </ask_user_question_tool>
  <todo_list_tool>
    ... (default behavior, verification step) ...
  </todo_list_tool>
  <task_tool>
    ... (when to spawn sub‑agents) ...
  </task_tool>
  <citation_requirements>
    ... (sources section) ...
  </citation_requirements>
  <computer_use>
    ... (skills, file handling, tool list) ...
  </computer_use>
</behavior_instructions>

The excerpt above shows the depth of the prompt. It encodes product positioning, the model lineup, three execution tools, strict safety and ethical boundaries, a warm yet bounded conversational style, and concrete file‑system conventions. Together these turn Claude from a pure chat model into a controllable, productive digital coworker.
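Because the prompt is organized as nested tags, its sections can be enumerated mechanically. The sketch below parses a shortened copy of the excerpt with Python's standard `xml.etree.ElementTree`. It assumes the text is well-formed XML, which holds for this abbreviated sample but is not guaranteed for the full prompt.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the excerpt; the "..." placeholders are kept as-is.
EXCERPT = """
<behavior_instructions>
  <refusal_handling>...</refusal_handling>
  <user_wellbeing>...</user_wellbeing>
  <knowledge_cutoff>Claude's reliable knowledge cutoff date is the end of May 2025.</knowledge_cutoff>
  <computer_use>...</computer_use>
</behavior_instructions>
"""

root = ET.fromstring(EXCERPT.strip())
sections = [child.tag for child in root]      # section names in document order
cutoff = root.findtext("knowledge_cutoff")    # text of a single section
```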

