Artificial Intelligence 10 min read

Key Insights from Wang Weiqiang’s Speech on Large‑Model Security at the AI Innovation and Governance Conference

Wang Weiqiang, chief scientist of Ant Group’s Security Lab, highlighted the urgent need for both rapid detection and long‑term trustworthy safeguards for large AI models, outlining Ant’s data‑detox, guard‑rail, and detection platforms as core solutions to emerging risks such as hallucinations, bias, and data leakage.

AntTech
AntTech
AntTech
Key Insights from Wang Weiqiang’s Speech on Large‑Model Security at the AI Innovation and Governance Conference

On December 19, the "AI Innovation and Governance Conference" was held in Guangzhou, bringing together experts such as Academician Wu Hequan and Chen Junlong, with Ant Group’s security lab chief scientist Wang Weiqiang delivering a keynote on the urgency and practice of large‑model security.

Wang emphasized that large‑model security must be both "fast"—quickly detecting and eliminating malicious content—and "slow"—building systematic, long‑term trustworthiness across the entire AI ecosystem.

He described the myriad risks introduced by large models, including AI hallucinations, data leakage, bias, discrimination, uncontrolled outputs, and the ease with which non‑experts can misuse these powerful tools.

The shift in platform responsibility was highlighted: whereas traditional content risk management focused on user‑generated content, today the liability rests on model providers and internet platforms that must manage content safety, privacy, ethics, and AI‑generated content labeling.

Risks arise at multiple stages: toxic pre‑training data, biased or malicious fine‑tuning annotations, and the inherent probabilistic nature of generation that can produce uncontrolled or harmful results.

Ant Group has been investing in trustworthy AI since 2015, defining four pillars—privacy protection, explainability, robustness, and fairness—and extending them to address the unique challenges of large‑model AI.

Three key safety measures were outlined: (1) data detoxification at the source, (2) reinforced guard‑rails to mitigate black‑box inference issues, and (3) adversarial testing to defend against external attacks.

For data detox, Ant developed a pipeline capable of screening billions of risk samples daily and performing fine‑grained annotation, significantly reducing the raw model’s risk rate.

Controllability is pursued through techniques such as SFT, RLHF/RRHF, and RLAIF for human alignment, image‑risk suppression, and a massive safety knowledge base that guides generation toward positive values.

A multi‑layer guard‑rail system evaluates user queries using the safety knowledge base and risk‑question‑answer modules, ensuring that generated content remains reliable and controllable.

The resulting defense suite, branded as the "Tianjian" large‑model risk‑defense platform, offers fence, rapid, and scenario‑based defenses to filter content, protect privacy, enforce ethics, and manage compliance.

Ant also launched the "Yijian 2.0" large‑model security detection platform, an industrial‑grade system that generates tens of thousands of test samples daily, conducts adversarial attacks, and performs comprehensive pre‑deployment scans.

In addition, Ant has built a deep‑fake detection capability covering text, image, audio, and video, leveraging multimodal generative algorithms and hierarchical classification to distinguish AI‑generated media from authentic content.

These integrated solutions, collectively referred to as "Ant Tianjian," are now publicly available, embodying the "fast" and "slow" security philosophy and underscoring the long‑term, collaborative effort required to achieve trustworthy, controllable AI.

Ant Group reaffirms its commitment to advancing trustworthy AI, acknowledging that large‑model security is still in its early stages and will require ongoing research, industry cooperation, and societal governance.

Large Modelstrustworthy AIRisk MitigationAI safetyai governanceAnt Group
AntTech
Written by

AntTech

Technology is the core driver of Ant's future creation.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.