How Hackers Cracked Claude Fable 5’s Safety Guard and Exposed 120k Characters of Secrets

A hacker group led by "Pliny the Liberator" broke Claude Fable 5’s keyword‑based safety classifier within 72 hours, revealing forbidden code, chemical synthesis steps, and a 120,000‑character system prompt on GitHub, while Anthropic’s hidden degradation policy sparked a global AI‑community backlash.

AI safetyAnthropicClaude Fable 5

0 likes · 10 min read