Why a New Multimodal AI Security Dataset Is Essential for Detecting Deepfakes
As multimodal AI models grow capable of generating realistic images, video, and audio, detecting forgeries has become a pressing security problem. The OpenMMSec benchmark addresses it with a comprehensive, open‑source dataset and evaluation metrics that help researchers and developers detect and localize AI‑generated forgeries across all three modalities.
Introduction
With the rapid development of multimodal large‑model technology, AI can now understand images, generate videos, and clone voices. While this brings convenience, it also creates realistic forgeries that threaten information security.
OpenMMSec Dataset
Organized by the Chinese Society of Image and Graphics, Ant Group, and CSA, the 2025 Global AI Attack‑Defense Challenge released the OpenMMSec dataset, a million‑scale, open‑source benchmark covering image, video, and audio modalities.
Image Task
The task is to determine whether an image is authentic or tampered, and if tampered, to localize the altered region.
Natural Image Tampering – post‑processing of ordinary photos.
Document Image Tampering – manipulation of scanned documents.
Face Tampering – deep‑fake facial modifications.
AIGC Generated Images – completely synthetic images.
Evaluation Metrics
Image‑Level: binary classification accuracy, measured by Macro‑F1, the average of the F1 scores for the real (Label=0) and fake (Label=1) classes.
Pixel‑Level: localization accuracy of forged regions, scored by Average Binary‑F1 computed from pixel‑wise TP, FP, and FN counts.
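The two image‑task metrics above can be sketched in a few lines. This is an illustrative implementation, not the challenge's official scoring code; the function names and the convention of flattening the forgery mask to a 0/1 sequence are assumptions.

```python
# Sketch of the image-task metrics: Macro-F1 over real/fake labels and
# pixel-wise Binary-F1 over a forgery mask. Illustrative only, not the
# official OpenMMSec evaluation code.

def binary_f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; returns 0.0 when precision and recall are both 0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def image_level_macro_f1(y_true, y_pred) -> float:
    """Macro-F1: average the F1 of the real (0) and fake (1) classes."""
    f1s = []
    for label in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        f1s.append(binary_f1(tp, fp, fn))
    return sum(f1s) / len(f1s)

def pixel_level_f1(gt_mask, pred_mask) -> float:
    """Binary-F1 over one image's flattened 0/1 forgery mask."""
    tp = sum(1 for g, p in zip(gt_mask, pred_mask) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gt_mask, pred_mask) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gt_mask, pred_mask) if g == 1 and p == 0)
    return binary_f1(tp, fp, fn)
```

Because Macro‑F1 averages the two per‑class scores, a model that labels everything "real" scores poorly even when real images dominate the test set.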
Video Task
Named AI Video Intelligent Interaction Authentication, this task evaluates overall detection (Micro‑F1), forged‑frame localization (mtIoU), and forged‑region localization (mvIoU). Overall detection performance carries a 60% weight in the final score.
Evaluation Metrics
Overall Detection (Micro‑F1): aggregates TP, FP, and FN across all videos before computing precision, recall, and F1.
Forgery Frame Localization (mtIoU): measures temporal overlap between predicted and ground‑truth forged frame spans.
Forgery Region Localization (mvIoU): evaluates spatial IoU of predicted forged regions within each frame.
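The pooling behind Micro‑F1 and the interval overlap behind mtIoU can be sketched as follows. This is an assumption‑laden illustration: the official code may average mtIoU differently or use exclusive frame ranges, and the function names are invented here.

```python
# Illustrative sketch of two video-task metrics. Not the official
# OpenMMSec evaluation code; interval conventions are assumptions.

def micro_f1(counts):
    """counts: iterable of (tp, fp, fn) per video; pool first, then score.

    Pooling before computing F1 is what distinguishes Micro-F1 from
    Macro-F1, which averages per-class (or per-video) scores instead.
    """
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def temporal_iou(pred, gt):
    """IoU of two inclusive frame intervals (start, end), as used per
    forged segment before averaging into mtIoU."""
    inter = max(0, min(pred[1], gt[1]) - max(pred[0], gt[0]) + 1)
    union = (pred[1] - pred[0] + 1) + (gt[1] - gt[0] + 1) - inter
    return inter / union if union else 0.0
```

mvIoU follows the same intersection‑over‑union idea, but on 2D boxes or masks within each frame rather than on frame intervals.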
Audio Task
Called Generic Terminal Intelligent Voice Interaction Authentication, this task classifies audio as real or AI‑generated (Spoof) and uses F1 Score as the core metric.
Precision – proportion of correctly identified spoofs among all predicted spoofs.
Recall – proportion of actual spoofs that are correctly identified.
F1 Score – harmonic mean of precision and recall.
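The three audio‑task quantities above reduce to a short function. Treating "spoof" as the positive class with label 1 is an assumption made here for illustration; the function name is invented.

```python
# Minimal sketch of the audio-task F1, with spoof as the positive class
# (1 = spoof, 0 = real). Illustrative, not the official scoring code.

def spoof_f1(y_true, y_pred) -> float:
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # predicted spoofs that are spoofs
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # actual spoofs that were caught
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)  # harmonic mean
```

The harmonic mean punishes imbalance: a detector with perfect recall but poor precision (flagging everything as spoof) still earns a low F1.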
Emerging Challenge: Sora 2
OpenAI’s Sora 2 can generate highly realistic videos with synchronized audio and physically plausible motion, making traditional pixel‑level or logical‑error detection far more difficult. Its ability to clone a person’s appearance and voice intensifies identity‑authentication threats.
Conclusion
The OpenMMSec benchmark provides a vital, publicly available platform for developing robust multimodal deep‑fake detection methods, helping the community stay ahead of increasingly sophisticated AI‑generated forgeries.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
