
What’s New in AI? Video QA, Audio Generation, and Major Industry Moves

This roundup highlights the latest AI breakthroughs, including Zhipu AI's video‑understanding model for temporal Q&A, Tencent's video‑to‑audio generation system, Vimeo's AI‑content labeling policy, Apple’s Core ML inclusion of ByteDance’s depth model, AMD’s acquisition of Silo AI, Claude’s new editing features, Quark’s all‑in‑one search AI, TikTok’s VR live streaming on Vision Pro, the launch of the "Xinliu" AI search assistant, and Canva’s restrictions on political AI‑generated posters.

Baidu MEUX

Jul 24, 2024


1. Zhipu AI releases CogVLM2‑Video, enhancing temporal QA

Zhipu AI announced CogVLM2‑Video, an open‑source model that can answer time‑based questions about videos. It was trained on 30,000 video‑QA pairs generated by an automated temporal‑localization data construction method, and its encoder takes multiple video frames together with their timestamps as input.
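
To make the idea of pairing frames with timestamps concrete, here is a minimal sketch in Python. It is not Zhipu AI's code: the function names, the uniform midpoint sampling strategy, and the dictionary layout are all assumptions chosen for illustration. It only shows how a clip of known duration could be reduced to a fixed number of frame slots, each tagged with the time it represents.

```python
def sample_timestamps(duration_s: float, num_frames: int) -> list[float]:
    """Return num_frames timestamps (in seconds) spread evenly over a clip.

    Hypothetical helper: the clip is split into equal segments and the
    midpoint of each segment is used as the sampling time.
    """
    if duration_s <= 0 or num_frames <= 0:
        raise ValueError("duration_s and num_frames must be positive")
    step = duration_s / num_frames
    return [round((i + 0.5) * step, 3) for i in range(num_frames)]


def build_frame_slots(duration_s: float, num_frames: int) -> list[dict]:
    """Pair each sampled frame index with its timestamp.

    The resulting records stand in for the (frame, timestamp) pairs that
    a video model's encoder might consume; decoding actual pixels is
    deliberately left out of this sketch.
    """
    return [
        {"frame_index": i, "timestamp_s": t}
        for i, t in enumerate(sample_timestamps(duration_s, num_frames))
    ]


if __name__ == "__main__":
    for slot in build_frame_slots(8.0, 4):
        print(slot)
```

With timestamps attached to each frame, a question such as "when does the event happen?" can in principle be answered in seconds rather than by frame position alone.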

2. Tencent launches VTA‑LDM for video‑to‑audio alignment

Tencent AI Lab introduced the "Implicitly Aligned Video‑to‑Audio Generation" model VTA‑LDM, which generates high‑quality audio that aligns precisely with input video, expanding the applications of video generation technology.

3. Vimeo adds AI‑generated content labeling

Vimeo announced a policy requiring creators to clearly label AI‑generated or AI‑modified videos, joining YouTube and TikTok in mandating disclosure for realistic AI‑created content.

4. ByteDance’s Depth Anything V2 joins Apple Core ML

ByteDance’s monocular depth‑estimation model Depth Anything V2, now part of Apple’s Core ML model library, scales from 25 M to 1.3 B parameters and supports use cases such as video effects, autonomous driving, 3D modeling, and AR.

5. AMD acquires Finnish AI startup Silo AI for $665 M

AMD announced a $665 million cash acquisition of Silo AI, a leading European AI lab focused on custom AI models and platforms, adding a 300‑person team to accelerate AMD’s large‑language‑model development.

6. Claude 3.5 Sonnet adds rapid content editing

Anthropic added sharing and remixing capabilities to Artifacts in Claude 3.5 Sonnet, allowing users to edit and share generated games, apps, and code, and positioning Claude as a versatile creation partner.

7. Quark launches "Super Search Box" AI service

Quark unveiled an upgraded "Super Search Box" that combines intelligent answering, content creation, and summarization, delivering precise text and video responses and handling complex, cross‑disciplinary queries with deep learning.

8. TikTok VR live streams available on Apple Vision Pro

TikTok announced that its VR live‑streaming app can be downloaded from the Apple Vision Pro App Store, supporting 6DoF 3D, 180° and 360° immersive live experiences.

9. "Xinliu" AI search assistant launches

The large‑model product "Xinliu" is now live, offering AI‑powered search, knowledge Q&A, intelligent reading, and creative assistance, with a web version released and mobile app versions forthcoming.

10. Canva bans AI‑generated political posters

Canva clarified that its Magic Media AI tools cannot be used to create political or medical content, citing potential harm or inappropriateness.


