Jan 28, 2026 · Artificial Intelligence

BLM‑Guard: Explainable Multimodal Ad Moderation Using Chain‑of‑Thought and Policy‑Aligned RL

The paper introduces BLM‑Guard, an explainable multimodal ad‑moderation framework that combines interleaved‑modal chain‑of‑thought reasoning with a policy‑aligned reinforcement‑learning reward to detect hidden cross‑modal violations in short‑video ads. It also presents a new benchmark on which the framework reports state‑of‑the‑art performance across multiple risk scenarios.

Chain-of-Thought · ad risk detection · benchmark

12 min read

Xiaohongshu Tech REDtech

Jan 15, 2026 · Information Security

How Hi-Guard Improves Trustworthy Multimodal Content Moderation with Policy‑Aligned Reasoning

The Hi-Guard framework aligns multimodal models with policy rules through hierarchical prompting, a structured taxonomy, and soft‑margin reinforcement learning, and reports significant gains in accuracy, precision, recall, and explainability for large‑scale user‑generated content platforms.

Reinforcement Learning · content moderation · explainability

9 min read
