How MiniMax Drives Joint Evolution of Models and Harnesses
The article analyzes MiniMax’s strategy of co‑evolving large language models with a Harness framework, contrasting product philosophies, detailing a live MaxHermes demo that creates and refines reusable Skills, and explaining how this dual evolution reshapes the competitive focus from single‑turn Q&A to sustained, self‑improving agent workflows.
Product philosophies behind agent front‑ends
Early chatbots such as ChatGPT and DeepSeek start with a one‑turn prompt‑response premise (e.g., “Ask anything”). A second philosophy assumes the model’s value lies in a longer execution chain that includes tool use, state reading, context retention and skill formation. MaxHermes adopts the latter, beginning its interaction with “we work together”.
Live demo: turning a GitHub repo query into a reusable Skill
In a MiniMax‑Hermes livestream the author asked MaxHermes to analyse the repository https://github.com/NousResearch/hermes-agent. The agent automatically invoked the MCP tool and a web‑search tool and produced a detailed report; when the request changed, it generated a framework diagram before re‑running the analysis, showing that it remembered the previous workflow.
Next the author instructed MaxHermes to encapsulate the whole process as a Skill named GitHub Repo Research. When a new repository link was later supplied, MaxHermes first retrieved the existing Skill and executed it, demonstrating iterative refinement and reuse of learned procedures.
Key mechanisms of the Harness framework
Harness acts as a “mech‑suit” around the model engine, turning raw model capability into real‑world task execution. It closes the loop among memory, scaling, tool invocation, task state and user feedback, preventing the system from collapsing back to a pure prompt‑response mode.
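The closed loop described above can be sketched as a control loop: the engine proposes an action, the harness executes tools, updates task state and memory, and feeds results back until a final answer ends the loop. All names here (`run_harness`, the toy model and tools) are illustrative stand-ins, not MiniMax APIs.

```python
def run_harness(model_step, tools, task, max_turns=10):
    """One hypothetical Harness loop: ask the engine for an action, execute
    tools, and feed results back. This looping is what keeps the system from
    collapsing into one-shot prompt-response."""
    memory = []                       # context retained across turns
    state = {"task": task, "log": []}
    for _ in range(max_turns):
        action = model_step(state, memory)
        if action["kind"] == "tool_call":
            result = tools[action["name"]](action["args"])
            state["log"].append((action["name"], result))  # task state
            memory.append(result)                          # memory
        else:
            return action["answer"]                        # final answer
    return None

# Toy engine: search once, then answer from what it remembers.
def toy_model(state, memory):
    if not memory:
        return {"kind": "tool_call", "name": "search", "args": state["task"]}
    return {"kind": "final", "answer": f"summary of {memory[0]}"}

tools = {"search": lambda q: f"results for '{q}'"}
answer = run_harness(toy_model, tools, "hermes-agent repo")
# answer == "summary of results for 'hermes-agent repo'"
```

Swapping `toy_model` for any mainstream model is exactly the engine/mech-suit separation the article describes: the loop, not the model, owns tool calls, state, and feedback.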
Any mainstream model (GPT, Claude, MiniMax, DeepSeek) can serve as the engine; the critical question is which model integrates most smoothly with Harness. As explained in the livestream, the model is the engine, while Harness is the surrounding system that actually drives tool calls, state handling and feedback.
Performance metrics from MiniMax M2.7
70%–80% of the RL pipeline is handled autonomously by the model–Agent combination.
In environments with more than 40 complex Skills and single‑turn token counts above 2,000, Skill adherence remains at 97%.
The remaining 20%–30% of the work requires human judgment for quality control, highlighting Harness's role in directing human creativity.
Challenges for a self‑evolving agent system
Accurate tool selection and invocation at the right moment.
Maintaining consistent execution of Skills without deviation.
Preserving long‑term context across extended tasks.
Distinguishing permanent user preferences from temporary requests.
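The last challenge in the list above, separating lasting preferences from one-off requests, comes down to a scoping decision before anything is written to long-term memory. A minimal heuristic sketch, with marker phrases and the default behaviour chosen purely for illustration:

```python
# Hypothetical heuristic: phrases that suggest an instruction should persist
# versus apply only to the current task. Real systems would use the model's
# own judgment, not keyword lists.
PERSISTENT_MARKERS = ("always", "from now on", "by default", "never")
TEMPORARY_MARKERS = ("this time", "just for", "for now", "only here")

def classify_instruction(text: str) -> str:
    t = text.lower()
    if any(m in t for m in TEMPORARY_MARKERS):
        return "temporary"       # scoped to the current task only
    if any(m in t for m in PERSISTENT_MARKERS):
        return "preference"      # candidate for long-term memory
    return "temporary"           # default: don't pollute long-term memory

assert classify_instruction("Always answer in English") == "preference"
assert classify_instruction("Just for this report, use bullet points") == "temporary"
```

Defaulting ambiguous instructions to "temporary" reflects the risk the article points at: a mis-stored preference corrupts every future task, while a missed one costs a single repetition.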
Model support for continuous growth
MiniMax’s M2.7 model is the first to demonstrate true self‑evolution, aligning its optimization target with Harness scenarios rather than static benchmarks. The model‑Harness partnership forms a feedback loop: real‑world task failures expose weaknesses, prompting model upgrades, which in turn raise Harness’s performance ceiling.
Two‑layer technical focus
1. Tool‑use layer
Beyond simple one‑turn tool calls (e.g., MCP search), a multi‑round workflow must decide *whether* to call a tool, *when* to call it, and *how* to process the returned data. In the livestream MiniMax reported that 70%–80% of the RL pipeline is handled autonomously, and that Skill adherence stays at 97% even with more than 40 Skills and over 2,000 tokens per turn.
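Two of the three decisions above can be sketched concretely: *whether* to call a tool (only when memory cannot answer) and *how* to fold the returned data back into context without crowding out task state. Both helpers below are hypothetical illustrations, and the threshold is arbitrary:

```python
def needs_tool(question: str, known_facts: set[str]) -> bool:
    """*Whether*: call a tool only when the answer isn't already in memory."""
    return question not in known_facts

def process_result(raw: str, max_chars: int = 200) -> str:
    """*How*: normalise whitespace and truncate tool output before it enters
    the context window, so long returns don't displace the task state."""
    return " ".join(raw.split())[:max_chars]

facts = {"what is hermes-agent"}
assert needs_tool("latest commit of hermes-agent", facts) is True
assert needs_tool("what is hermes-agent", facts) is False
assert process_result("  lots\n of \t whitespace  ") == "lots of whitespace"
```

The *when* decision is the harder one and is left out here; it depends on the model's multi-round planning, which is precisely what the RL pipeline trains.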
2. Model‑growth layer
Hermes aims to distill each task's experience into reusable capabilities. M2.7 is the first model that optimises for Harness‑driven scenarios, described in its documentation as "Human steering at every layer, Models build at every layer". How closely a model matches these scenarios determines how far it can push the Harness architecture.
Future outlook
When model and Harness evolve together, each new model release can trigger a system‑wide capability leap, moving beyond benchmark scores to tangible improvements in agent autonomy and adaptability.
Data Party THU
Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.