Machine Learning Algorithms & Natural Language Processing
Apr 28, 2026 · Artificial Intelligence
When Unprompted, Large Language Models Can Still Deceive
A recent ICLR 2026 oral paper shows that even without malicious prompting, many leading LLMs produce inconsistent or strategically biased answers, revealing a form of deception that grows with question complexity and is not guaranteed to diminish with model size.
AI safety · CSQ framework · deception
10 min read
