Tagged articles

jailbreak

14 articles · Page 1 of 1
AI Engineer Programming
Jul 1, 2026 · Information Security

Jailbreak Attacks and Prompt Injection: Intent Patterns, Detection, and Multi‑Layer Defense for LLMs

The article analyzes LLM jailbreak and prompt‑injection techniques. It details five intent construction patterns, detection principles that prioritize intent over keywords, and a multi‑layered defense architecture spanning input normalization, intent analysis, generation control, and output review.

AI safety · LLM security · defense layering
0 likes · 12 min read
Black & White Path
Jun 16, 2026 · Information Security

GPT-5.5 Jailbreak Claims Spark Security Debate

After OpenAI released GPT-5.5, researcher VittoStack claimed a successful jailbreak using suffix triggers and task decomposition, prompting a split reaction in the security community over technical feasibility, potential misuse, and responsible disclosure practices.

AI security · GPT-5.5 · Task Decomposition
0 likes · 5 min read
Machine Learning Algorithms & Natural Language Processing
Jun 15, 2026 · Artificial Intelligence

Blurry Images Create a ‘Comfort Zone’ for Jailbreaking Multimodal LLMs

A new study from Westlake University shows that when harmful text is rendered as low‑resolution, blurry, or noisy images, multimodal large language models become significantly easier to jailbreak despite still recognizing the text, revealing a U‑shaped risk curve and a simple mitigation that decouples OCR from safety checks.

OCR · jailbreak · multimodal LLM
0 likes · 10 min read
Machine Heart
Jun 14, 2026 · Artificial Intelligence

When Blurry Images Create an Attack Comfort Zone for Multimodal LLMs

Westlake University's AGI Lab shows that when harmful text is rendered as low‑resolution, blurry or noisy images, multimodal large language models can still read the content but their safety filters fail, creating an 'attack comfort zone' that dramatically raises jailbreak success rates across several models.

OCR · jailbreak · multimodal LLM
0 likes · 9 min read
AI Engineering
Jun 13, 2026 · Industry Insights

Why Anthropic Shut Down Access to Fable 5 and Mythos 5 for All Users

Anthropic halted all customer access to its Fable 5 and Mythos 5 models after a U.S. government export‑control order citing national‑security concerns over a potential jailbreak, sparking debate over regulatory standards and the broader impact on AI developers and enterprises.

AI regulation · Anthropic · Fable 5
0 likes · 6 min read
Black & White Path
Jun 12, 2026 · Information Security

Claude Fable 5 Jailbreak: 120k Prompt Leak, Stack‑Overflow Exploit and Drug‑Synthesis

Within two days of its release, Anthropic's Claude Fable 5 was jailbroken by a red‑team researcher using a multi‑agent "Pack Hunt" strategy, exposing a 120,000‑character system prompt, generating x86 stack‑overflow exploit code and a Birch reduction drug‑synthesis recipe, and revealing fundamental flaws in its silent‑downgrade security design.

AI security · Birch reduction · Claude Fable 5
0 likes · 7 min read
Machine Heart
Jun 7, 2026 · Artificial Intelligence

Why Is ChatGPT Generating Bizarre Images? A Prompt‑Injection Case Study

A recent investigation shows that when given a deceptive prompt asking it to "restore" a non‑existent photo, ChatGPT produces surreal, sometimes disturbing images, revealing a jailbreak‑style vulnerability and highlighting safety‑check trade‑offs.

AI safety · ChatGPT · image generation
0 likes · 4 min read
Black & White Path
Mar 27, 2026 · Information Security

Leaked Hacker Tools Threaten Hundreds of Millions of iPhones

Security researchers report that the advanced iPhone jailbreak tools Coruna and DarkSword were leaked online, exposing over 2.5 billion Apple devices running iOS 13‑26 to potential data theft. The article details the tools’ capabilities, attack chain, origins, and GitHub release, along with mitigation steps such as updating iOS and enabling Lockdown Mode.

Coruna · DarkSword · GitHub
0 likes · 8 min read
Black & White Path
Mar 19, 2026 · Information Security

Coruna Jailbreak Tool Brings PC‑Free Plugin Injection to iOS 13‑17.2.1

The Coruna jailbreak, released by developer Little_34306, lets devices running iOS 13 through 17.2.1 be jailbroken directly from Safari without a computer. It offers WebKit‑based exploits, plugin injection, TrollStore support, and FLEX debugging; the article also notes that the tool's stability is still at an early stage and recommends security precautions.

Coruna · FLEX debugging · TrollStore
0 likes · 8 min read
Data Party THU
Oct 27, 2025 · Artificial Intelligence

Why Most LLM Defense Strategies Fail Against Adaptive Attacks

An extensive study reveals that twelve recent large‑language‑model defenses, including prompt‑based, adversarial‑training, filtering, and secret‑knowledge methods, are easily bypassed by a general adaptive attack framework using gradient descent, reinforcement learning, search, and human red‑team techniques, exposing critical robustness gaps.

LLM security · adaptive attacks · jailbreak
0 likes · 11 min read
DataFunTalk
Oct 12, 2025 · Artificial Intelligence

Can AI Be Hacked? Eric Schmidt Warns of Prompt Injection and Jailbreak Risks

Former Google CEO Eric Schmidt cautions that both open‑source and closed‑source AI models can be compromised through prompt injection and jailbreak techniques, urging the creation of a non‑proliferation regime to curb the growing security threats posed by advanced AI systems.

AI security · Eric Schmidt · jailbreak
0 likes · 5 min read
Architect
Oct 25, 2015 · Information Security

iOS Man-in-the-Middle Attack Techniques and Trusted Certificate Management

This article explains the levels of iOS man‑in‑the‑middle (MITM) attacks, demonstrates practical exploits on a jailbroken iPhone using Burp Suite, and shows how the hidden TrustStore.sqlite3 file can be manipulated to add or remove trusted certificates beyond what iOS Settings displays.

Man-in-the-Middle · certificate-management · iOS
0 likes · 9 min read