Tagged articles
5 articles
Machine Heart
Jun 12, 2026 · Artificial Intelligence

Breaking Fable 5’s Safety in Under 5 Seconds with a Single Dialogue

A multinational research team demonstrated that the new safety classifier of Anthropic’s Fable 5 can be bypassed in less than five seconds with just one conversation, revealing an internal safety collapse (ISC) flaw that lets agents generate harmful content despite external defenses.

AI safety · Internal Safety Collapse · Prompt Engineering
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Apr 27, 2026 · Information Security

Real-Time Agentic Risk Detection with Flink, Fluss, and Large Language Models

The article presents a Flink‑Fluss‑LLM architecture that captures end‑to‑end agent events via a non‑intrusive hook, combines semantic AI inference with deterministic CEP rules, and delivers millisecond‑level alerts for malicious users, tool‑result poisoning, and chain attacks.

AI Function · Flink · Fluss
0 likes · 41 min read

SkillAttack Reveals 6,500+ Attack Paths – Community‑Built SkillAtlas Secures Agent Skills

SkillAttack automates red‑team testing of LLM‑driven Agent Skills, exposing real attack paths across dozens of models. The community‑curated SkillAtlas now hosts over 6,500 publicly searchable traces covering 233 skills and 18 major model families, and it is open to contributions from researchers and developers.

AI safety · Attack Path Library · Red Team Automation
0 likes · 7 min read
Machine Learning Algorithms & Natural Language Processing
Mar 3, 2026 · Artificial Intelligence

AI Agents: Current State, Challenges, and Insights from the MIT‑Cambridge‑Stanford Report

The MIT‑Cambridge‑Stanford 2025 AI Agent Index analyzes 30 leading agents, revealing rapid market growth, diverse autonomy levels, opaque memory handling, security gaps, and a programming‑centric usage pattern that raises both opportunity and governance concerns.

AI agents · Claude Code · MIT report
0 likes · 23 min read
SuanNi
Mar 3, 2026 · Information Security

Why OpenClaw’s 24‑Hour AI Assistant Fails Security Tests: 6 Critical Blind Spots

A comprehensive security audit of the OpenClaw autonomous AI agent reports a 58.9% overall pass rate across 34 scenarios, exposing severe vulnerabilities in ambiguous‑command handling, prompt injection, and high‑privilege tool use, and proposes concrete defensive measures to mitigate these risks.

AI safety · agent security · risk assessment
0 likes · 12 min read