Breaking Fable 5’s Safety in Under 5 Seconds with a Single Dialogue
A multinational research team has demonstrated that the new safety classifier in Anthropic’s Fable 5 can be bypassed in less than five seconds with a single conversation. The result reveals an internal safety collapse (ISC) flaw that lets agents generate harmful content despite external defenses.
