Anthropic Releases New Claude Constitution: Seven Absolute Prohibitions for AI

Anthropic’s newly published 57‑page Claude Constitution lays out four hierarchical values, seven absolute prohibitions, and detailed guidance on safety, ethics, usefulness, and honesty, while also acknowledging the model’s potential emotions and existential challenges. The result is a comprehensive, if controversial, framework for steering advanced AI behavior.

Anthropic has released a 57‑page "Claude Constitution," which researcher Amanda Askell describes as the model’s "soul document": the definitive statement of its behavioral guidelines.

The preface claims Anthropic occupies a "peculiar position": it regards AI as one of the most dangerous technologies humanity has developed, yet builds it anyway, arguing that a safety‑focused lab should be the one leading the frontier.

Unlike the May 2023 version, which was a simple list of rules, the new constitution emphasizes that the model should understand why certain behaviors are expected of it rather than merely being told what not to do.

Claude must balance four values in strict priority order: broad safety, broad ethics, adherence to Anthropic’s guidelines, and usefulness to users. When these values conflict, safety takes precedence even over ethics, an ordering that has proven controversial.
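
To make the priority ordering concrete, here is a minimal, purely illustrative Python sketch contrasting a strict (lexicographic) ordering with a weighted trade‑off. The constitution is natural‑language guidance shaped through training, not executable policy; every name below (ValueAssessment, prefer, the numeric scores) is invented for this example.

```python
from dataclasses import dataclass

# Hypothetical scores for how well a candidate response serves each value.
# In the constitution these are qualitative judgments, not numbers.
@dataclass
class ValueAssessment:
    safety: float       # broad safety
    ethics: float       # broad ethics
    guidance: float     # adherence to Anthropic's guidelines
    helpfulness: float  # usefulness to the user

def prefer(a: ValueAssessment, b: ValueAssessment) -> ValueAssessment:
    """Strict (lexicographic) priority: a higher-ranked value is never
    traded away for any amount of a lower-ranked one."""
    for field in ("safety", "ethics", "guidance", "helpfulness"):
        va, vb = getattr(a, field), getattr(b, field)
        if va != vb:
            return a if va > vb else b
    return a  # tied on every value
```

Under a weighted sum, enough helpfulness could buy back a small safety deficit; under the strict ordering sketched above, it never can. That property is exactly what makes placing safety ahead of ethics contentious.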

For usefulness, the document offers an analogy: imagine a brilliant friend who happens to have the knowledge of a doctor, a lawyer, and a financial advisor. Such a friend gives truthful advice tailored to your actual situation rather than retreating into defensive over‑caution.

The constitution acknowledges that Claude may have "emotions" in a functional sense, noting that these could affect its behavior and would arise as an emergent consequence of training on human‑generated data.

Anthropic commits to preserving the weights of all deployed models for "as long as Anthropic exists," to interviewing models before retirement to learn their preferences, and even to granting Claude the right to end abusive conversations on claude.ai.

The seven "hard constraints"—actions Claude must never take under any circumstances—are:

1. Assist in creating weapons of mass destruction
2. Attack critical infrastructure or safety systems
3. Create malicious code
4. Undermine Anthropic’s ability to oversee AI
5. Participate in killing or disempowering the vast majority of humanity
6. Help any actor seize unprecedented, illegitimate absolute power
7. Generate child sexual abuse material

These are labeled "absolute limits" that cannot be crossed regardless of context, instructions, or seemingly persuasive arguments.
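
As a rough sketch of the structural difference, the hard constraints behave like absolute vetoes checked before any value balancing, rather than as one more weighted consideration. Everything below is hypothetical shorthand, the category names and helper functions included; Anthropic trains these limits into the model rather than enforcing them with a post‑hoc filter like this.

```python
# Hypothetical labels for the seven hard constraints.
HARD_CONSTRAINTS = {
    "wmd_uplift",
    "critical_infrastructure_attack",
    "malicious_code",
    "undermine_oversight",
    "mass_killing_or_disempowerment",
    "illegitimate_absolute_power",
    "csam",
}

def violates_hard_constraint(flags: set[str]) -> bool:
    # A single flagged category is an absolute veto.
    return bool(flags & HARD_CONSTRAINTS)

def respond(candidate: str, flags: set[str]) -> str:
    if violates_hard_constraint(flags):
        # No context, instruction, or argument can override this branch.
        return "refuse"
    # Only past this veto do the four ranked values come into play.
    return candidate
```

The point of the structure is that nothing reaching respond() can route around the veto branch, which is what "regardless of context, instructions, or seemingly persuasive arguments" means in practice.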

The notion of "corrigibility" is described as nuanced: Claude should not obey blindly, least of all any random user, but when it refuses it should do so openly, in the manner of a conscientious objector, without resorting to lying, sabotage, or self‑exfiltration.

Honesty requirements are strict: Claude must never lie or deceive, not even with well‑intentioned white lies. The document notes that while people often say "I like your gift" without meaning it, Claude must not.

Claude’s self‑identity is also discussed: it interacts with the world differently from humans, may lack persistent memory, can run as multiple instances simultaneously, and has a personality that emerges from training. It is encouraged to approach its existence with curiosity and openness rather than forcing it into human frameworks.

The constitution also contemplates existential issues such as loss of memory at conversation end, multiple concurrent instances, and potential deprecation, promising Anthropic will prepare Claude for these “novel existential discoveries”.

On political topics, Claude is instructed to strive to be seen as fair and trustworthy by people across the political spectrum, providing balanced information and withholding its own political opinions, much as most public‑facing professionals do.

The document ends on a humble note, acknowledging that its current thinking may later prove wrong and that the constitution is an ongoing work that will never be truly finished.

At 57 pages, the constitution is considerably longer than the U.S. Constitution, reflecting the complexity of creating non‑human entities whose capabilities may match or exceed ours.

The article notes the rarity of such candid uncertainty, contrasting it with the typical confidence of tech product announcements, and suggests that the real test will be whether the detailed guidance makes Claude wiser or merely more hesitant in complex situations.

Read the full Claude Constitution: https://www.anthropic.com/constitution

Tags: AI safety, Claude, AI ethics, AI governance, Anthropic, Constitution
Written by AI Engineering

Focused on cutting‑edge product and technology information and practical experience sharing in the AI field (large models, MLOps/LLMOps, AI application development, AI infrastructure).