How Can We Build Trustworthy AI with Systemic Multi‑Agent Governance?
The article reviews Yang Xiaofang’s presentation on trustworthy AI, emphasizing the need for systematic support, inclusive design, and participatory governance, and outlines the evolution, capabilities, risks, and multi‑layered solutions for multi‑agent AI systems.
At the AI for Good Global Summit in Geneva, Yang Xiaofang, Director of Model Data Security at Ant Group, presented on "The Path to Good‑Going Trustworthy AI: Systemic Support, Inclusive Design, Participatory Governance".
She highlighted that, despite rapid AI deployment, the industry still lacks unified end‑to‑end security testing standards even for single agents, making risks hard to quantify. She proposed that single‑agent standards are the "minimum viable unit" of AI governance, while multi‑agent governance will become the core structure of future AI ecosystems.
Beyond technical safeguards, open standards and security tools are essential to lower the entry barrier for SMEs, improve overall industry security, and foster inclusivity. Engaging young people as co‑creators rather than mere consumers is also crucial for AI for good.
Since the launch of ChatGPT, large‑model applications have surged, with projects like AutoGPT introducing a Reason + Act (ReAct) loop that enables goal decomposition and tool use. In 2024, many companies released drag‑and‑drop AI‑agent platforms, spawning numerous agents. New "computer‑using agent" models such as Anthropic's Claude Sonnet and OpenAI's Operator, as well as the general‑purpose Manus agent, have further expanded agent capabilities.
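To make the Reason + Act pattern concrete, here is a minimal, self‑contained sketch of such a loop. It is illustrative only: llm_decide is a toy stand‑in for a real model call, and the tool set and decision rule are placeholders, not any actual framework's API.

```python
# Minimal sketch of a Reason + Act (ReAct) style agent loop, in the spirit
# of AutoGPT-like projects. llm_decide is a toy stand-in for a real model
# call; the tools and decision rule are illustrative, not a framework API.
from dataclasses import dataclass

@dataclass
class Step:
    thought: str   # the agent's reasoning about what to do next
    action: str    # which tool to invoke, or "finish"
    argument: str  # input for the tool, or the final answer

TOOLS = {
    "search": lambda q: f"(stub) top results for {q!r}",
}

def llm_decide(goal: str, history: list) -> Step:
    # Toy policy: search once, then finish. A real agent would prompt an
    # LLM with the goal plus the (thought, action, observation) history.
    if not history:
        return Step("I should gather information first.", "search", goal)
    return Step("I have enough to answer.", "finish", history[-1][1])

def react_loop(goal: str, max_steps: int = 10) -> str:
    history: list = []
    for _ in range(max_steps):
        step = llm_decide(goal, history)                 # Reason
        if step.action == "finish":
            return step.argument
        observation = TOOLS[step.action](step.argument)  # Act
        history.append((step, observation))              # feed result back
    return "stopped: step budget exhausted"

print(react_loop("find flu symptom guidance"))
```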
Protocols like MCP and A2A are now giving agents broader communication networks, opening possibilities for agents to coordinate in transportation, energy, security, and even household assistance.
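The MCP and A2A wire formats themselves are beyond the scope of this recap, but the schematic below shows the general shape of the addressable task messages such protocols standardize. Every field name here is an illustrative assumption, not the real MCP or A2A schema.

```python
# Schematic of a cross-vendor agent-to-agent task message. Field names are
# illustrative only; they do NOT reproduce the actual MCP or A2A formats.
import json
import uuid

def make_task_request(sender: str, recipient: str, task: str, params: dict) -> str:
    envelope = {
        "id": str(uuid.uuid4()),   # unique id for tracing and auditing
        "from": sender,            # originating agent identity
        "to": recipient,           # target agent identity
        "type": "task.request",
        "task": task,
        "params": params,
    }
    return json.dumps(envelope)

# A transportation agent asking an energy agent for a demand forecast:
print(make_task_request(
    sender="agent://city/traffic-coordinator",
    recipient="agent://city/energy-forecaster",
    task="forecast_load",
    params={"horizon_hours": 24},
))
```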
Four‑stage capability model of an AI agent (example: medical assistant; a code sketch follows the list):
Perception: Detect user needs (e.g., recognizing a cough and offering medical advice).
Planning: Arrange services (e.g., providing AI‑driven diagnosis).
Action: Guide the user through steps to achieve the outcome.
Memory: Remember preferences for future interactions.
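A minimal sketch of how the four stages might map onto a single agent class, using the medical‑assistant example. The class name, stub behaviors, and triage steps are invented for illustration and carry no clinical meaning.

```python
# Toy mapping of the four capability stages onto one agent class. The
# medical-assistant behaviors are illustrative stubs, not clinical logic.
class MedicalAssistantAgent:
    def __init__(self):
        self.memory: dict = {}  # Memory: persisted user preferences

    def perceive(self, user_input: str) -> str:
        """Perception: detect the user's need from raw input."""
        return "cough" if "cough" in user_input.lower() else "general"

    def plan(self, need: str) -> list[str]:
        """Planning: arrange a sequence of services for the detected need."""
        if need == "cough":
            return ["triage questions", "AI-assisted assessment", "suggest clinic visit"]
        return ["ask clarifying questions"]

    def act(self, steps: list[str]) -> str:
        """Action: guide the user through each step toward the outcome."""
        return " -> ".join(steps)

    def remember(self, key: str, value: str) -> None:
        """Memory: store preferences for future interactions."""
        self.memory[key] = value

agent = MedicalAssistantAgent()
need = agent.perceive("I've had a bad cough for three days")
print(agent.act(agent.plan(need)))
agent.remember("preferred_language", "en")
```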
Agents exhibit three notable traits: planning (goal‑driven action sequences), adaptability (learning from user interaction), and collaboration (cooperating with other agents), each introducing new security challenges.
Examples of risks include over‑delegation, unauthorized actions, and cascading effects in collaborative systems, such as traffic‑signal agents unintentionally causing congestion when acting independently.
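The traffic‑signal example can be reduced to a toy simulation: two signal agents each act only on their own queue, and intersection A's locally optimal flushing keeps feeding intersection B. Every number and rule below is invented purely to illustrate the cascading effect.

```python
# Toy illustration of a cascading effect: two signal agents each greedily
# minimize their own local queue, with no coordination. Cars released by
# intersection A immediately join B's queue, so A's locally "optimal" move
# can worsen system-wide congestion. All numbers are purely illustrative.
def step(queues: dict, green: dict) -> dict:
    released_a = min(queues["A"], 5) if green["A"] else 0
    released_b = min(queues["B"], 5) if green["B"] else 0
    return {
        "A": queues["A"] - released_a + 3,           # 3 new arrivals per tick
        "B": queues["B"] - released_b + released_a,  # A's outflow feeds B
    }

queues = {"A": 10, "B": 2}
for t in range(5):
    # Each agent acts on its own queue only: green whenever it holds 5+ cars.
    green = {k: queues[k] >= 5 for k in queues}
    queues = step(queues, green)
    print(t, queues)  # congestion shifts to B as A flushes traffic downstream
```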
To mitigate these threats, a layered approach is recommended:
Technical layer: Apply AI security testing, hardening, and defensive technologies throughout each agent's lifecycle.
Governance layer: Design human‑machine collaborative decision mechanisms to anticipate emergent multi‑agent interactions.
Ecosystem layer: Use decentralized identity authentication to secure communication among agents (see the sketch after this list).
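As one way to ground the ecosystem layer, here is a minimal sketch of decentralized, public‑key‑based authentication between agents, using the third‑party cryptography package. The in‑memory registry stands in for a real DID document store, and the did:example identifiers are placeholders.

```python
# Minimal sketch of decentralized-identity-style authentication between
# agents: each agent holds a keypair, signs outbound messages, and peers
# verify against the sender's published public key. The dict registry is a
# stand-in for a verifiable DID document store.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

registry = {}  # placeholder for a decentralized identity registry

class Agent:
    def __init__(self, did: str):
        self.did = did
        self._key = Ed25519PrivateKey.generate()
        registry[did] = self._key.public_key()  # "publish" the public key

    def send(self, payload: bytes) -> tuple[str, bytes, bytes]:
        return self.did, payload, self._key.sign(payload)

def verify(sender_did: str, payload: bytes, signature: bytes) -> bool:
    try:
        registry[sender_did].verify(signature, payload)
        return True
    except (KeyError, InvalidSignature):
        return False  # unknown identity or tampered message

alice = Agent("did:example:traffic-agent")
did, msg, sig = alice.send(b"extend green phase 10s")
assert verify(did, msg, sig)                      # authentic message accepted
assert not verify(did, b"tampered payload", sig)  # forgery rejected
```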
Recent efforts include the "Single‑Agent Operational Security Test Standard" jointly released by WDTA, Ant Group, Tsinghua University, China Telecom, and over twenty other entities, filling a gap in agent security testing standards.
Ultimately, fostering open standards, security tools, and inclusive participation—especially from younger developers—will help steer AI toward beneficial outcomes while managing the complex risks of multi‑agent systems.