What’s New in AI? Video QA, Audio Generation, and Major Industry Moves

This roundup covers recent AI news: Zhipu AI's video‑understanding model for temporal Q&A, Tencent's video‑to‑audio generation system, Vimeo's AI‑content labeling policy, the addition of ByteDance’s depth model to Apple’s Core ML model library, AMD’s acquisition of Silo AI, Claude’s new editing features, Quark’s all‑in‑one AI search, TikTok’s VR live streaming on Vision Pro, the launch of the "Xinliu" AI search assistant, and Canva’s restrictions on AI‑generated political posters.

Baidu MEUX

1. Zhipu AI releases CogVLM2‑Video, enhancing temporal QA

Zhipu AI announced CogVLM2‑Video, an open‑source model that can answer time‑based questions about videos. Its training data came from an automatic temporal‑localization data construction method that generated 30,000 video‑QA pairs, and the model takes multi‑frame images together with their timestamps as encoder inputs.

CogVLM2‑Video illustration
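
To make the idea of pairing sampled frames with timestamps concrete, here is a minimal sketch in plain Python. It is not CogVLM2‑Video's actual preprocessing: the names `TimedFrame`, `sample_timed_frames`, and `format_timestamp_tags`, the midpoint sampling rule, and the tag format are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TimedFrame:
    index: int        # frame index in the source video
    timestamp: float  # seconds from the start of the video


def sample_timed_frames(total_frames: int, fps: float, num_samples: int) -> list[TimedFrame]:
    """Pick evenly spaced frames and attach a timestamp to each one.

    Hypothetical helper: the sampling rule is an assumption, not the
    method used by any particular model.
    """
    if total_frames <= 0 or fps <= 0 or num_samples <= 0:
        raise ValueError("total_frames, fps, and num_samples must be positive")
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    # Split the video into equal segments and take the middle frame of each.
    step = total_frames / num_samples
    frames = []
    for i in range(num_samples):
        index = int(step * i + step / 2)
        frames.append(TimedFrame(index=index, timestamp=index / fps))
    return frames


def format_timestamp_tags(frames: list[TimedFrame]) -> list[str]:
    """Render each sampled frame as a text tag that could accompany its image."""
    return [f"<frame {f.index} @ {f.timestamp:.2f}s>" for f in frames]


if __name__ == "__main__":
    # A 10-second clip at 30 fps, reduced to 3 frames.
    for tag in format_timestamp_tags(sample_timed_frames(300, 30.0, 3)):
        print(tag)
```

Giving the encoder an explicit time for each frame is what would let a question such as "when does the event happen?" be answered in seconds rather than in frame order alone.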

2. Tencent launches VTA‑LDM for video‑to‑audio alignment

Tencent AI Lab introduced the "Implicitly Aligned Video‑to‑Audio Generation" model VTA‑LDM, which generates high‑quality audio that aligns precisely with input video, expanding the applications of video generation technology.

VTA‑LDM illustration

3. Vimeo adds AI‑generated content labeling

Vimeo announced a policy requiring creators to clearly label AI‑generated or AI‑modified videos, joining YouTube and TikTok in mandating disclosure for realistic AI‑created content.

Vimeo AI labeling

4. ByteDance’s Depth Anything V2 joins Apple Core ML

ByteDance’s monocular depth‑estimation model Depth Anything V2, now part of Apple’s Core ML model library, scales from 25 M to 1.3 B parameters and supports use cases such as video effects, autonomous driving, 3D modeling, and AR.

Depth Anything V2

5. AMD acquires Finnish AI startup Silo AI for $665 M

AMD announced a $665 million cash acquisition of Silo AI, a leading European AI lab focused on custom AI models and platforms, adding a 300‑person team to accelerate AMD’s large‑language‑model development.

AMD acquisition

6. Claude 3.5 Sonnet adds rapid content editing

Anthropic added sharing and remixing capabilities for Artifacts in Claude 3.5 Sonnet, allowing users to edit and share generated games, apps, and code, and positioning Claude as a versatile creation partner.

Claude 3.5 Sonnet

7. Quark launches "Super Search Box" AI service

Quark unveiled an upgraded "Super Search Box" that combines intelligent answering, content creation, and summarization, delivering precise text and video responses and handling complex, cross‑disciplinary queries with deep learning.

Quark Super Search Box

8. TikTok VR live streams available on Apple Vision Pro

TikTok announced that its VR live‑streaming app can be downloaded from the Apple Vision Pro App Store, supporting 6DoF 3D, 180° and 360° immersive live experiences.

TikTok VR on Vision Pro

9. "Xinliu" AI search assistant launches

The large‑model product "Xinliu" is now live, offering AI‑powered search, knowledge Q&A, intelligent reading, and creative assistance, with a web version released and mobile app versions forthcoming.

Xinliu AI assistant

10. Canva bans AI‑generated political posters

Canva clarified that its Magic Media AI tools cannot be used to create political or medical content, citing potential harm or inappropriateness.

Canva policy
Written by

Baidu MEUX

MEUX is the Baidu Mobile Ecosystem UX Design Center, which handles end-to-end experience design for user and commercial products in Baidu's mobile ecosystem. Send resumes to [email protected]