Machine Heart
Mar 30, 2026 · Artificial Intelligence
Proactive Interaction for Video Multimodal Models: MMDuet2 & ProactiveVideoQA
This article surveys the ICLR 2026 papers ProactiveVideoQA and MMDuet2. It covers how video multimodal large models can autonomously decide when to reply, the PAUC benchmark for evaluating both timeliness and accuracy, a reinforcement-learning training pipeline that requires no precise timestamps, and experimental findings on data construction, frame-sampling density, and state-of-the-art performance.
MMDuet2 · PAUC · benchmark
17 min read
