AI Frontier Lectures
Dec 9, 2025 · Artificial Intelligence
CrossVid: The New Benchmark Exposing AI’s Struggle with Cross‑Video Reasoning
CrossVid is an open‑source benchmark that evaluates multimodal large language models on cross‑video reasoning. It offers 5,331 videos and 9,015 high‑quality QA pairs across four reasoning dimensions, and it reveals that even the strongest models reach only about 50% accuracy, short of human performance.
AI evaluation · cross-video reasoning · video understanding
