Tag

CVPR2025

1 views collected around this technical thread.

AntTech
AntTech
Jun 15, 2025 · Artificial Intelligence

21 Ant Research Papers Shaping CVPR 2025: AI Image & Video Generation Breakthroughs

The Interactive Intelligence Lab of Ant Technology Research Institute presented 21 accepted CVPR 2025 papers covering visual generation, editing, 3D vision, digital humans and multimodal AI, highlighting tools such as MagicQuill, Lumos, Aurora, FLARE, LeviTor, MangaNinja, AniDoc, Mimir, AvatarArtist, DiffListener, MotionStone, TensorialGaussianAvatars, DualTalk, CompreCap and Uni-AD.

CVPR2025Multimodal Modelscomputer vision
0 likes · 20 min read
21 Ant Research Papers Shaping CVPR 2025: AI Image & Video Generation Breakthroughs
Kuaishou Audio & Video Technology
Kuaishou Audio & Video Technology
Jun 11, 2025 · Artificial Intelligence

Kuaishou Showcases 12 Cutting-Edge CVPR 2025 Papers on Video Generation and AI

Kuaishou presented twelve peer‑reviewed papers at CVPR 2025 covering video quality assessment, large‑scale video datasets, dynamic 3D avatar reconstruction, 4D scene simulation, controllable video generation, scaling laws for diffusion transformers, multimodal foundations, and more, highlighting the company's leading research in computer vision and AI.

AI researchCVPR2025computer vision
0 likes · 21 min read
Kuaishou Showcases 12 Cutting-Edge CVPR 2025 Papers on Video Generation and AI
Kuaishou Tech
Kuaishou Tech
Jun 10, 2025 · Artificial Intelligence

Top 12 Cutting-Edge Video Generation Papers from Kuaishou at CVPR 2025

The article highlights CVPR 2025’s acceptance statistics and showcases twelve cutting‑edge video‑generation papers from Kuaishou, spanning datasets, quality assessment, style control, scaling laws, 4D simulation, interleaved image‑text data, vision‑language acceleration, high‑fidelity avatars, patch‑wise super‑resolution, narrative‑driven benchmarks, sketch‑based editing, and spatio‑temporal diffusion, each with links and abstracts.

CVPR2025Kuaishoucomputer vision
0 likes · 20 min read
Top 12 Cutting-Edge Video Generation Papers from Kuaishou at CVPR 2025
AntTech
AntTech
Mar 14, 2025 · Artificial Intelligence

MP-GUI: Modality Perception with Multimodal Large Language Models for GUI Understanding

The CVPR 2025 paper "MP-GUI: Modality Perception with MLLMs for GUI Understanding" presents a novel algorithm that enhances multimodal large language models' ability to perceive and reason about graphical user interfaces by integrating text, visual, and spatial signals through specialized perception modules and a dynamic fusion gate, achieving state‑of‑the‑art performance on multiple GUI benchmarks.

CVPR2025GUI UnderstandingMLLM
0 likes · 5 min read
MP-GUI: Modality Perception with Multimodal Large Language Models for GUI Understanding