Tagged articles
5 articles
Machine Heart
Apr 29, 2026 · Artificial Intelligence

VEGA-3D: Unleashing Implicit 3D Priors in Video Generation for Scene Understanding

VEGA-3D extracts the implicit 3D priors embedded in large video generation models and fuses them with semantic features via token‑level adaptive gating. It reports markedly higher multi‑view consistency and state‑of‑the‑art results on 3D scene‑understanding benchmarks such as ScanRefer, ScanQA, VSI‑Bench, and LIBERO, all without any additional 3D annotations.

Embodied AI · VEGA-3D · Video Generation
0 likes · 10 min read
Data Party THU
Oct 9, 2025 · Artificial Intelligence

Can One Model Master All Audio‑Visual Tasks? Introducing Crab’s Unified Approach

This article presents Crab, a unified audio‑visual scene understanding model built on an explicit‑cooperation learning paradigm. It introduces the AV‑UIE dataset, which includes explicit reasoning steps, and reports superior performance across temporal, spatial, pixel‑level, and spatio‑temporal tasks, supported by extensive experiments and ablations.

Benchmark · Dataset · LoRA
0 likes · 12 min read
AI Frontier Lectures
Jun 20, 2025 · Artificial Intelligence

Can One Model Master All Audio‑Visual Tasks? Introducing Crab’s Unified Approach

Researchers from RUC, Tsinghua, and Tencent present Crab, a unified audio‑visual scene understanding model that leverages explicit cooperation and a new AV‑UIE dataset with visible reasoning steps, achieving state‑of‑the‑art performance across temporal, spatial, pixel‑level, and spatio‑temporal tasks.

LoRA · audio-visual · scene understanding
0 likes · 13 min read
DataFunTalk
Mar 20, 2023 · Artificial Intelligence

Construction and Application of Meituan Hotel & Travel Knowledge Graph

This article details Meituan's hotel‑travel knowledge graph: its background, scene‑recognition challenges, multi‑layer graph architecture, and technical modules for knowledge mining, relation discrimination, and supply tagging. It also covers practical applications in search, recommendation, ranking, and risk control, and closes with future directions and a Q&A session.

AI · Knowledge Graph · Meituan
0 likes · 15 min read
DataFunTalk
Sep 30, 2022 · Artificial Intelligence

Applying Knowledge Graphs to Scene Understanding in Meituan's Hotel and Travel Search

This presentation details how Meituan leverages knowledge‑graph technology to model hotel and travel business characteristics, perform scene cognition, build a multi‑layer knowledge graph, and design a five‑stage search architecture that combines precise and generic queries with AI‑driven ranking and explainable recommendation techniques.

AI · Knowledge Graph · Meituan
0 likes · 21 min read