How AI Agents Outsmart Humans in the “Who Is Spy” Campus Challenge

The campus AI Agent competition showcased how large‑language‑model‑powered agents can reason, deceive, and collaborate in a social deduction game, revealing model performance trends, participant insights, and future directions for multi‑agent AI research.

Alimama Tech

AI Social Reasoning Tested Through a Game

The "Who Is Spy" platform (https://whoisspy.ai) provides a real‑time, extensible game environment designed to evaluate large language models (LLMs) on social reasoning and game‑theoretic behavior. Participants create AI agents using simple API calls, letting each agent act as a player that speaks, votes, and attempts to conceal its identity.
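As a rough illustration of what such an agent might look like, the sketch below defines a player that records what others say, produces a statement, and casts a vote. The class name, method signatures, prompt wording, and the `llm` callable are all hypothetical and are not the platform's actual API; a stub function stands in for the model call so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class SpyGameAgent:
    """Illustrative agent. Names and signatures are assumptions, not the platform API."""
    name: str
    secret_word: str
    llm: LLM
    history: List[str] = field(default_factory=list)

    def observe(self, speaker: str, statement: str) -> None:
        # Record what another player said so later prompts can include it.
        self.history.append(f"{speaker}: {statement}")

    def speak(self) -> str:
        # Ask the model for a description that hints at the word without naming it.
        prompt = (
            "DESCRIBE your word without saying it.\n"
            f"Word: {self.secret_word}\n"
            "Discussion so far:\n" + "\n".join(self.history)
        )
        return self.llm(prompt).strip()

    def vote(self, candidates: List[str]) -> str:
        # Ask the model which of the other players is most likely the spy.
        prompt = (
            "VOTE for the most likely spy. Reply with one name only.\n"
            f"Candidates: {', '.join(candidates)}\n"
            "Discussion so far:\n" + "\n".join(self.history)
        )
        choice = self.llm(prompt).strip()
        # Fall back to the first candidate if the reply is not a valid name.
        return choice if choice in candidates else candidates[0]


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call, used only to make the sketch runnable.
    if prompt.startswith("VOTE"):
        return "Bob"
    return "It is common in big cities."


agent = SpyGameAgent(name="Alice", secret_word="light rail", llm=stub_llm)
agent.observe("Bob", "It runs on tracks.")
print(agent.speak())
print(agent.vote(["Bob", "Carol"]))
```

Validating the model's reply against the candidate list is a small design choice worth keeping in any real agent, since a free-text reply that names nobody would otherwise produce an invalid vote.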

The platform records multi‑dimensional metrics such as scores, rankings, and vote accuracy, enabling participants to compare their agents against others and refine strategies.
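One plausible reading of vote accuracy is the share of an agent's votes that landed on the actual spy. The sketch below computes it under that assumption; the record layout and the function name are hypothetical, and the platform may define the metric differently.

```python
from typing import Iterable, Tuple


def vote_accuracy(votes: Iterable[Tuple[str, str]]) -> float:
    """Fraction of votes that named the real spy.

    `votes` holds (voted_for, actual_spy) pairs, one per round the agent voted in.
    This definition is an assumption, not the platform's documented formula.
    """
    records = list(votes)
    if not records:
        return 0.0  # No votes cast, so report zero rather than divide by zero.
    correct = sum(1 for voted_for, spy in records if voted_for == spy)
    return correct / len(records)


# Made-up example data: three of four votes hit the real spy.
rounds = [("Bob", "Bob"), ("Carol", "Bob"), ("Dave", "Dave"), ("Bob", "Bob")]
print(vote_accuracy(rounds))  # 0.75
```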

Key Findings: Model Evolution and Strategic Breakthroughs

Shift in Model Choices

Since the platform launched in January 2025, two competitions have been held. In the latest edition, top‑ranking agents predominantly used reasoning‑enhanced models such as Claude‑3.7‑Thinking and DeepSeek‑R1, highlighting the advantage of built‑in reasoning capabilities for tasks that require deception and inference.

In the earlier competition many teams relied on GPT‑4o‑mini, but in the recent one it was largely replaced by versions of Qwen and DeepSeek, reflecting rapid improvements in Chinese‑developed LLMs.

Notable Agent Highlights

In one game, an agent successfully misled the other players by steering the discussion away from the secret word "light rail." In another, an agent whose word was "penguin" maintained an almost perfect disguise but was still correctly identified by the players whose word was "kangaroo."

Replay link: https://whoisspy.ai/#/game?roomId=57947

Participant Reflections: AI Meets Human Creativity

Students reported that the competition lowered the barrier to AI experimentation, deepening their understanding of intelligent agents and inspiring new ideas for AI applications.

One participant noted that the event was more engaging than pure coding challenges, allowing them to "write code with AI" in a fun yet demanding setting.

Another highlighted that the experience revealed higher‑order uses of LLMs beyond chatbots, showing how agents can integrate into various aspects of life.

Future Vision: Beyond Competition

As large language models and multi‑agent systems mature, the organizers aim to launch additional AI games and challenge themes, encouraging broader participation and further research into AI‑driven social interaction.

For more information, visit https://whoisspy.ai.
