Machine Heart
Apr 22, 2026 · Artificial Intelligence
ProSafePrune: One‑Shot Pruning to Eliminate Over‑Refusal in Large Language Models
ProSafePrune, a low‑rank parameter pruning framework presented at ICLR 2026, targets the excessive encoding of harmfulness in LLMs and prunes it in one shot, sharply reducing over‑refusal while preserving safety defenses and slightly improving general‑task performance.
ICLR 2026 · LLM safety · ProSafePrune
