How Genspark’s Super Agent Outperforms OpenAI and Manus in GAIA Benchmarks

Genspark’s newly released Super Agent is built on a Mixture‑of‑Agents architecture that combines eight specialized LLMs with more than 80 tools. It claims to autonomously plan, execute, and integrate external services across tasks such as travel planning and video summarization, reportedly surpasses OpenAI and Manus on the GAIA benchmark, and is available instantly without an invitation code.

BirdNest Tech Talk

Jing Kun, former Vice President of Baidu Group and CEO of Xiaodu Technology, was fully responsible for the Xiaodu Assistant AI operating system and the Xiaodu product line, including its R&D, operations, business, and marketing. In June 2024, he and former Xiaodu CTO Zhu Kaihua co‑founded the AI startup MainFunc and launched its first AI‑agent search product, Genspark.

A month earlier, the AI agent Manus had attracted intense attention but sparked controversy: it required an invitation code, and codes were being resold for tens of thousands of yuan, shutting most users out. Genspark, by contrast, offers open access with no invitation code (though a US phone number is required).

Prominent Weibo accounts such as “Internet Things” and “AI Toolbox” have promoted Genspark, highlighting several claims:

Super Agent defeats OpenAI and Manus in the GAIA benchmark.

Like Manus, it possesses autonomous thinking, planning, execution, and tool‑calling capabilities.

It can understand user needs, devise plans, and execute tasks across scenarios ranging from daily chores to complex research.

Examples include travel planning, phone‑based restaurant reservations, summarizing a five‑hour YouTube video into a PPT, and generating video episodes.

Super Agent is built on a Mixture‑of‑Agents system.

It integrates eight LLMs of varying scales, each specialized: small models handle quick simple queries, while large models perform deep reasoning for complex tasks.

It bundles more than 80 tools—search, data analysis, communication (e.g., AI voice dialing)—enabling seamless interaction with external systems.

In GAIA benchmark tests, Super Agent outperformed both OpenAI and Manus.

The workflow is highly automated; from task decomposition to execution, users only need to state their requirement.
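The small-model/large-model split described above can be sketched as a simple router. Everything here is hypothetical: the model names, the complexity heuristic, and the threshold are illustrative stand-ins, not Genspark's actual design.

```python
# Hypothetical sketch of Mixture-of-Agents routing: send quick, simple
# queries to a small model and complex tasks to a large one. The model
# names and the heuristic below are illustrative assumptions.

SMALL_MODEL = "small-llm"   # fast, cheap; handles quick simple queries
LARGE_MODEL = "large-llm"   # deep reasoning for complex tasks

def estimate_complexity(query: str) -> float:
    """Crude heuristic: longer, multi-step queries score higher."""
    markers = ("plan", "analyze", "compare", "summarize", "research")
    score = min(len(query) / 200, 1.0)
    score += 0.5 * sum(m in query.lower() for m in markers)
    return min(score, 1.0)

def route(query: str, threshold: float = 0.5) -> str:
    """Pick a model for the query based on estimated complexity."""
    return LARGE_MODEL if estimate_complexity(query) >= threshold else SMALL_MODEL
```

A production router would likely use a learned classifier or the models' own self-assessment rather than keyword matching, but the routing decision itself has this shape.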

In the author's test, response speed was impressively fast, and the generated visual pages and documents were clear and intuitive. A replay of the test is available at https://www.genspark.ai/agents?id=52fd24b9-bb18-4eb0-8184-63852ad389c6&continueFlag=9d3f96b14d4e488dcaf91d40a9160f41 .

Probing the agent's execution logic, the author observed that each tool runs inside a sandbox. Despite the system's safeguards, its prompts reveal the underlying workflow: the original request is decomposed step by step into individual subtasks, and the agent invokes the appropriate tool for each subtask to achieve the overall goal.
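The decompose-then-execute pattern observed here can be sketched as a short loop. The planner, the tool registry, and the tool names below are hypothetical stand-ins for illustration; Genspark's actual tools and planning prompts are not public.

```python
# Illustrative sketch of the observed pattern: decompose a request into
# subtasks, pick a tool for each, and execute. The planner and the tools
# below are hypothetical; a real system would run each tool in a sandbox.

from typing import Callable

# Hypothetical tool registry (the real system bundles 80+ tools).
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda arg: f"search results for {arg!r}",
    "summarize": lambda arg: f"summary of {arg!r}",
}

def plan(request: str) -> list[tuple[str, str]]:
    """Hypothetical planner: map a request to (tool, argument) subtasks."""
    return [("search", request), ("summarize", request)]

def run_agent(request: str) -> list[str]:
    """Execute each planned subtask with its tool, collecting results."""
    results = []
    for tool_name, arg in plan(request):
        tool = TOOLS[tool_name]    # look up the tool for this subtask
        results.append(tool(arg))  # sandboxed execution in a real system
    return results
```

In a real agent the planner would be an LLM call and each tool invocation would run in an isolated sandbox, but the decomposition-to-tool-call structure matches what the system prompts reveal.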

Reference:

Test playback: https://www.genspark.ai/agents?id=52fd24b9-bb18-4eb0-8184-63852ad389c6&continueFlag=9d3f96b14d4e488dcaf91d40a9160f41

Tags: Automation, LLM, AI Agent, GAIA benchmark, Mixture-of-Agents
Written by BirdNest Tech Talk

Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.