What’s New in AI? Video QA, Audio Generation, and Major Industry Moves
This roundup covers ten recent AI developments: Zhipu AI’s video-understanding model for temporal Q&A, Tencent’s video-to-audio generation system, Vimeo’s AI-content labeling policy, the addition of ByteDance’s depth model to Apple’s Core ML library, AMD’s acquisition of Silo AI, Claude’s new editing features, Quark’s all-in-one AI search, TikTok’s VR live streaming on Vision Pro, the launch of the "Xinliu" AI search assistant, and Canva’s restrictions on AI-generated political posters.
1. Zhipu AI releases CogVLM2‑Video, enhancing temporal QA
Zhipu AI announced CogVLM2-Video, an open-source model that can answer time-based questions about videos. Its training data came from an automated temporal-localization data construction method that generated 30,000 video-QA pairs, and the model takes multiple video frames together with their timestamps as encoder inputs.
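To make the "multi-frame images and timestamps" idea concrete, here is a minimal sketch of how sampled frames might be paired with timestamps before being passed to an encoder. It is an illustration only, not CogVLM2-Video's actual preprocessing: the function names, the uniform segment-center sampling strategy, and the record layout are all assumptions.

```python
def sample_frame_timestamps(duration_s: float, num_frames: int) -> list[float]:
    """Pick one timestamp (in seconds) at the center of each equal-length segment.

    Hypothetical helper: uniform sampling is an assumed strategy, chosen
    only because it is simple to reason about.
    """
    if duration_s <= 0 or num_frames <= 0:
        raise ValueError("duration_s and num_frames must be positive")
    step = duration_s / num_frames
    return [round((i + 0.5) * step, 2) for i in range(num_frames)]


def build_timestamped_inputs(duration_s: float, num_frames: int) -> list[dict]:
    """Pair each sampled frame slot with its timestamp.

    In a real pipeline each record would also carry the decoded image;
    here only the index and timestamp are kept so the sketch stays
    dependency-free.
    """
    timestamps = sample_frame_timestamps(duration_s, num_frames)
    return [
        {"frame_index": i, "timestamp_s": t}
        for i, t in enumerate(timestamps)
    ]


if __name__ == "__main__":
    for record in build_timestamped_inputs(duration_s=10.0, num_frames=4):
        print(record)
```

Keeping the timestamp next to each frame is what lets a model ground an answer such as "when does the goal happen?" in a specific moment rather than in the clip as a whole.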
2. Tencent launches VTA‑LDM for video‑to‑audio alignment
Tencent AI Lab introduced the "Implicitly Aligned Video‑to‑Audio Generation" model VTA‑LDM, which generates high‑quality audio that aligns precisely with input video, expanding the applications of video generation technology.
3. Vimeo adds AI‑generated content labeling
Vimeo announced a policy requiring creators to clearly label AI‑generated or AI‑modified videos, joining YouTube and TikTok in mandating disclosure for realistic AI‑created content.
4. ByteDance’s Depth Anything V2 joins Apple Core ML
ByteDance’s monocular depth-estimation model Depth Anything V2 is now part of Apple’s Core ML model library. The model family ranges from 25 M to 1.3 B parameters and supports use cases such as video effects, autonomous driving, 3D modeling, and AR.
5. AMD acquires Finnish AI startup Silo AI for $665 M
AMD announced a $665 million cash acquisition of Silo AI, a leading European AI lab focused on custom AI models and platforms, adding a 300‑person team to accelerate AMD’s large‑language‑model development.
6. Claude 3.5 Sonnet adds rapid content editing
Anthropic released Claude 3.5 Sonnet with new Artifacts sharing and remixing capabilities, allowing users to edit and share generated games, apps, and code, positioning Claude as a versatile creation partner.
7. Quark launches "Super Search Box" AI service
Quark unveiled an upgraded "Super Search Box" that combines intelligent answering, content creation, and summarization, delivering precise text and video responses and handling complex, cross‑disciplinary queries with deep learning.
8. TikTok VR live streams available on Apple Vision Pro
TikTok announced that its VR live‑streaming app can be downloaded from the Apple Vision Pro App Store, supporting 6DoF 3D, 180° and 360° immersive live experiences.
9. "Xinliu" AI search assistant launches
The large‑model product "Xinliu" is now live, offering AI‑powered search, knowledge Q&A, intelligent reading, and creative assistance, with a web version released and mobile app versions forthcoming.
10. Canva bans AI‑generated political posters
Canva clarified that its Magic Media AI tools cannot be used to create political or medical content, citing potential harm or inappropriateness.
Baidu MEUX
MEUX is the Baidu Mobile Ecosystem UX Design Center, responsible for end-to-end experience design for user and commercial products in Baidu's mobile ecosystem. Send resumes to [email protected]
