Ensuring Trustworthy and Secure AI: Insights from the 2023 Pujiang Innovation Forum
The 2023 Pujiang Innovation Forum highlighted the rapid rise of generative AI, its associated security and privacy risks, and presented Ant Group's multi‑stage, multi‑layered approach—including data, training, and inference controls and three core defense technologies—to achieve safe, reliable, and open knowledge sharing in the era of large language models.
The 2023 Pujiang Innovation Forum, co‑hosted by the Ministry of Science and Technology and the Shanghai Municipal Government, gathered over 300 experts from more than 32 countries to discuss "Open Science: Embracing Knowledge Sharing and Scientific Collaboration".
Speakers emphasized that while the booming digital economy brings unprecedented opportunities, the rapid evolution of AI—especially large‑scale generative models—creates escalating security threats, including hallucinations, privacy leaks, and malicious misuse.
Sun Bowen, technical lead of Ant Group's trustworthy AI detection platform "Yijian," outlined a three‑stage control framework: data controllability (quality assurance and toxic‑data removal), training controllability (adversarial data and reinforcement learning to reduce toxicity), and inference controllability (reliability testing, explainability tools, and logic‑graph knowledge fusion).
To operationalize these controls, Ant Group implemented three core defense technologies—Fence Defense (atomic risk intent detection in user inputs), Rapid Defense (fast iteration of risk mitigation), and Scenario Defense (context‑aware risk assessment).
The risk‑mitigation workflow follows a three‑step process: pre‑deployment risk detection using a robust evaluation suite, in‑process adversarial training with risk‑derived samples, and post‑deployment adversarial sample reconstruction to uncover attack vectors and refine defenses.
All these capabilities are integrated into the publicly available Ant Jian AI Security Detection Platform, a joint Ant‑Tsinghua University effort that supports text, image, and multimodal content, employing over a million test cases, dozens of generation‑inducing methods, and a taxonomy of 199 sub‑categories covering content, data, and ethical safety.
The platform also offers detection of AI‑generated content across text, image, and video by leveraging frame extraction, speech transcription, and pretrained models enriched with part‑of‑speech features and attention mechanisms.
In conclusion, Sun reaffirmed the importance of trustworthy AI for safe knowledge sharing, advocating for balanced openness that preserves creativity while mitigating misinformation and privacy risks.
AntTech
Technology is the core driver of Ant's future creation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.