DeepHub IMBA
Mar 31, 2026 · Information Security
Can Prompt Injection Be Detected Without Storing Conversation Logs? A Privacy‑First Experiment
The article presents a privacy‑first system that extracts numeric telemetry from each LLM interaction and discards the raw text, then evaluates whether prompt‑injection and jailbreak detection remains effective. Using only text‑independent features costs just 1.4 F1 points.
LLM Security · behavioral features · jailbreak detection
