Jan 5, 2026 · Artificial Intelligence

Can AI Really Understand Dynamic First‑Person Scenes? Inside the New EOC‑Bench

The article introduces EOC‑Bench, a benchmark that evaluates multimodal large language models (MLLMs) on dynamic first‑person visual tasks across three time dimensions: past, present, and future. It covers the benchmark’s 3,277 questions, its novel multi‑scale temporal accuracy metric, extensive model comparisons, and a detailed error analysis that reveals current models’ limitations in temporal perception and memory.

MLLM evaluation · Temporal Reasoning · dynamic perception

10 min read
