Kuaishou Tech
Dec 4, 2025 · Artificial Intelligence

Can a Tree‑Reasoned Model Master Video Emotion Understanding?

The paper introduces VidEmo, a multimodal video foundation model that uses a two‑stage emotion‑clue‑guided reasoning framework and a large emotion‑centric dataset (Emo‑CFG) to achieve state‑of‑the‑art performance on facial attribute, expression, and fine‑grained emotion tasks, surpassing Gemini 2.0.

AI · Dataset · Computer Vision