How to Strengthen AIGC Content Safety with Multimodal Data and Model Upgrades

The article examines the security challenges introduced by large‑model AIGC, outlines three technical upgrade paths—richer training data, few‑shot model fine‑tuning, and multimodal fusion—and demonstrates practical implementations that dramatically improve detection efficiency, accuracy, and scalability.

NetEase Smart Enterprise Tech+

Introduction

Advanced technologies such as large models and AI‑generated content (AIGC) deliver significant new application value, but they also create serious security risks: question‑answering systems may produce harmful replies, and text‑to‑image generators can create sensitive images unsuitable for public distribution. To mitigate these risks, the model itself must be aligned for safety during training, and an independent third‑party content detection capability should be added as an extra safeguard.

Key Challenges

Rapid production of harmful content that can spread quickly online, demanding faster detection.

Continuously evolving harmful content types, including attacks, misinformation, and illegal variants, requiring agile response.

Increasing difficulty of detection as generative techniques become more sophisticated, necessitating stronger, finer‑grained detection.

Although earlier solutions partially address these issues, the rise of AIGC raises the difficulty to a new level, making a full technical upgrade essential.

Three Upgrade Directions

Training‑data layer: Build richer, more diverse datasets that cover a wide range of harmful content and are regularly updated.

Model‑training layer: Apply few‑shot learning on large pre‑trained models, fine‑tuning them with small, targeted samples to improve accuracy and robustness.

Algorithm‑strategy layer: Explore multimodal fusion, combining text and image information for more comprehensive detection.

Technical Practice

1. More Efficient Data Collection

Using cross‑modal generation (txt2img, inpainting) with Stable Diffusion, LoRA, ControlNet, and Roop, we augment training data for tasks such as stylized‑person recognition. This approach reduces labeling effort by over 60% and raises recall and precision from 30% to above 90%.

Algorithm flow: (diagram omitted)
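The flow described above — prompt‑driven generation, optional region inpainting, deduplication, automatic labeling — can be sketched as follows. The generation calls are stubs standing in for Stable Diffusion / ControlNet / Roop invocations; all function names and return values here are illustrative, not the actual pipeline:

```python
import hashlib

def txt2img(prompt, seed):
    """Stub for a Stable Diffusion text-to-image call (e.g. via diffusers)."""
    return f"img:{prompt}:{seed}"

def inpaint(image, region_prompt):
    """Stub for an inpainting/face-swap call (e.g. ControlNet or Roop)."""
    return f"{image}|inpaint:{region_prompt}"

def augment(seed_prompts, variants_per_prompt=2, region_prompts=("style-A",)):
    """Generate labeled training images from textual category descriptions.

    Flow: prompt -> txt2img (several seeds) -> inpainting variants,
    with hash-based deduplication before images enter the training set.
    """
    seen, dataset = set(), []
    for prompt in seed_prompts:
        for seed in range(variants_per_prompt):
            base = txt2img(prompt, seed)
            for img in [base] + [inpaint(base, r) for r in region_prompts]:
                h = hashlib.md5(img.encode()).hexdigest()
                if h not in seen:          # drop duplicate generations
                    seen.add(h)
                    dataset.append((img, prompt))  # image + automatic label
    return dataset

samples = augment(["stylized person, anime"], variants_per_prompt=2)
print(len(samples))  # 2 seeds x (1 base + 1 inpainted) = 4 samples
```

Because the label travels with the prompt, every generated image arrives pre‑labeled, which is where the reduction in manual labeling effort comes from.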

2. Cross‑Modal Data Mining

By converting target categories into textual prompts and leveraging large cross‑modal models (e.g., BLIP2, GroundingDINO), we retrieve relevant images from massive corpora. This "automatic mining + data cleaning" pipeline cuts labeling volume by up to 80%, saves more than ten days of work, and improves recall by over 20%.
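At its core, this mining step is embedding‑space retrieval: encode the category prompt and all corpus images with the same cross‑modal model, rank by cosine similarity, and keep only confident hits (the cleaning step). A minimal sketch with toy 2‑D embeddings, where precomputed unit vectors stand in for the real BLIP2/CLIP‑style encoders:

```python
import numpy as np

def retrieve(text_vec, image_vecs, k=3, threshold=0.5):
    """Rank a corpus of image embeddings against a text-prompt embedding.

    Both sides are assumed L2-normalized, so the dot product is cosine
    similarity; hits below `threshold` are dropped (the cleaning step).
    """
    sims = image_vecs @ text_vec
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order if sims[i] >= threshold]

# Toy corpus: 2-D unit vectors; the text query points along the x axis.
query = np.array([1.0, 0.0])
corpus = np.array([
    [1.0, 0.0],   # exact match
    [0.8, 0.6],   # related image
    [0.0, 1.0],   # unrelated image
])
hits = retrieve(query, corpus, k=3, threshold=0.5)
print(hits)  # [(0, 1.0), (1, 0.8)] — the unrelated image is filtered out
```

Only the surviving hits go to human review, which is how the pipeline cuts labeling volume rather than merely reordering it.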

3. Cross‑Modal Data Annotation

Using the open‑set detector GroundingDINO, we built a pipeline that generates detection‑level annotations without additional training data. The tool supports batch conversion, visualization, redundant‑box removal, and rapid export, reducing detection‑annotation time by about 80%.
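The redundant‑box removal step is typically greedy non‑maximum suppression: keep the highest‑scoring box, discard any remaining box that overlaps it too much. A stdlib‑only sketch (the box format and threshold are illustrative choices, not GroundingDINO's internals):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def remove_redundant(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop any remaining box that overlaps it above `iou_thresh`."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(remove_redundant(boxes, scores))  # [0, 2]: box 1 duplicates box 0
```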

4. Stronger Base Models

A self‑training platform built on image‑text multimodal models enables rapid, code‑free fine‑tuning for vertical tasks such as minor‑protection (child‑safety) identification. The workflow covers data ingestion, training, validation, and deployment.

5. Unified Speech Base Model

By applying LoRA and prompt‑tuning to a large speech foundation model, we achieve over 10% improvement in general speech tasks and more than 15% gain on difficult cases, with up to 20% better performance in low‑resource scenarios.
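LoRA's economics are visible in a few lines: the pretrained weight stays frozen, and only two small low‑rank matrices are trained, so the effective weight becomes W + (alpha / r) · B · A. A toy NumPy sketch (the dimensions, rank, and scaling here are illustrative, not the production speech model):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Forward pass through a frozen weight W plus a low-rank update.

    W is never updated; only A (r x d_in) and B (d_out x r) are trained,
    giving the effective weight W + (alpha / r) * B @ A.
    """
    r = A.shape[0]
    return x @ (W + (alpha / r) * B @ A).T

d_in, d_out, r = 512, 512, 8
W = np.zeros((d_out, d_in))   # frozen pretrained weight (toy values)
A = np.zeros((r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))      # trainable up-projection

y = lora_forward(np.ones((1, d_in)), W, A, B, alpha=16.0)
full, lora = d_in * d_out, A.size + B.size
print(lora, full, round(lora / full, 3))  # 8192 262144 0.031
```

Training roughly 3% of the layer's parameters is what makes rapid per‑task adaptation of a large foundation model affordable, including in the low‑resource scenarios mentioned above.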

Comprehensive Solutions

1. Multi‑Feature Fusion for Person Recognition

Beyond facial features, we fuse body pose, clothing, and scene cues using self‑supervised pre‑training and triplet loss fine‑tuning, boosting face recall by 10% and reaching 98% precision.
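The triplet‑loss objective used in fine‑tuning can be sketched in a few lines; the embeddings here are toy 3‑D vectors standing in for the fused face/body/clothing/scene features:

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared Euclidean distances: pushes the
    anchor-positive distance below the anchor-negative distance by at
    least `margin`."""
    d = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v))
    return max(0.0, d(anchor, positive) - d(anchor, negative) + margin)

# Fused embeddings (toy 3-D vectors standing in for face + context cues).
anchor   = [1.0, 0.0, 0.0]
positive = [0.9, 0.1, 0.0]   # same person, different shot
negative = [0.0, 1.0, 0.0]   # different person

print(triplet_loss(anchor, positive, negative))  # 0.0 — already separated
```

Minimizing this loss pulls shots of the same person together and pushes different people apart, which is why recall improves when the extra cues are fused into the embedding.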

2. Feature‑Enhanced Few‑Shot Retrieval

We introduce a key‑value cache module on top of vision‑language models, enabling zero‑shot and few‑shot retrieval with significant performance gains, achieving 97% recall with 87.6% precision in a real project.
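A key‑value cache head of this kind (in the style of Tip‑Adapter) stores few‑shot support embeddings as keys and their one‑hot labels as values; a query blends the labels by similarity, with no extra training. A toy NumPy sketch (the embeddings, beta, and class layout are illustrative assumptions):

```python
import numpy as np

def kv_cache_logits(query, keys, values, beta=5.0):
    """Training-free key-value cache head (Tip-Adapter style).

    `keys` are L2-normalized few-shot image embeddings, `values` their
    one-hot labels; each key's affinity exp(-beta * (1 - cos_sim)) is
    used to weight its label when scoring the query.
    """
    affinity = np.exp(-beta * (1.0 - keys @ query))  # (n_shots,)
    return affinity @ values                         # (n_classes,)

# Two classes, two support shots each, 2-D unit embeddings (toy example).
keys = np.array([
    [1.0, 0.0], [0.96, 0.28],   # class 0 support images
    [0.0, 1.0], [0.28, 0.96],   # class 1 support images
])
values = np.array([
    [1, 0], [1, 0],
    [0, 1], [0, 1],
], dtype=float)

query = np.array([0.99, 0.141])
query /= np.linalg.norm(query)          # query near class 0
logits = kv_cache_logits(query, keys, values)
print(int(np.argmax(logits)))  # 0
```

Because the cache is just stacked embeddings, adding a new category only requires appending a few labeled vectors, which is what makes zero‑ and few‑shot retrieval practical in production.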

3. LLM‑Based Pre‑Filtering

We fine‑tune large language models for domain‑specific content safety routing, drastically shortening development cycles and lowering costs while maintaining high detection accuracy.
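The routing idea reduces to a cheap pre‑filter placed in front of the expensive specialized detectors. In the sketch below, the classifier is a stub standing in for the fine‑tuned LLM; the interface, labels, and keyword rule are assumptions for illustration, not the actual system:

```python
def route(content, risk_classifier):
    """Pre-filter router: a (stubbed) fine-tuned LLM assigns a coarse risk
    label; safe content short-circuits, and only risky content is sent on
    to the expensive specialized detectors."""
    label = risk_classifier(content)
    if label == "safe":
        return ("pass", None)
    return ("review", label)   # downstream detector chosen by the label

# Stand-in for the fine-tuned LLM classifier (assumed interface, not a real API).
def toy_classifier(text):
    return "violence" if "attack" in text else "safe"

print(route("a sunny picnic", toy_classifier))       # ('pass', None)
print(route("plans for an attack", toy_classifier))  # ('review', 'violence')
```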

Conclusion

The widespread adoption of large‑model AIGC introduces challenges such as faster harmful content generation, diverse content types, and harder detection. By upgrading training data, model fine‑tuning, and algorithm strategies, third‑party detection can keep pace, improve accuracy, and provide a robust safety shield for users while supporting the continued, responsible growth of AIGC technologies.

Tags: data augmentation, large models, multimodal, AIGC, few‑shot learning, AI security, content safety
Written by

NetEase Smart Enterprise Tech+

Get cutting-edge insights from NetEase's CTO, access the most valuable tech knowledge, and learn NetEase's latest best practices. NetEase Smart Enterprise Tech+ helps you grow from a thinker into a tech expert.
