Unlocking New Potential in Agent Swarms: OpenJiuwen’s MANGO Multi‑Agent Flow Network

The MANGO framework combines reinforcement learning, text‑gradient updates, and a node‑skipping mechanism to jointly optimize workflow structure and execution policy in multi‑agent systems. It reports the best accuracy across seven benchmark datasets and, on MATH500, 41.5% shorter training time and 47.4% shorter inference time than MaAS.

Machine Heart

Jun 8, 2026

Problem

Multi‑agent collaboration is vulnerable to error propagation: when a generated workflow is wrong, or an individual agent hallucinates, the mistake cascades through the rest of the collaboration chain.

MANGO Framework

MANGO (Multi‑Agent Network Gradient Optimization) is built on the AgentOS execution and scheduling layer. It models system architecture, task decomposition, and path selection end‑to‑end, enabling joint optimization of workflow paths and execution policies.

Core Features

End‑to‑end reinforcement‑learning optimization trains the whole system directly against a global objective.

Text‑gradient updates allow node prompts to adapt to dynamic tasks.

A node‑skipping (Skip‑k) mechanism reduces computational overhead while preserving accuracy.

Construction Steps

Flow‑Network Construction – Each action from the historical workflows is inserted into a flow network one at a time. The new action is compared with every existing node by vector (embedding) similarity. If the highest similarity is below a threshold, a new node is created; otherwise the action is merged into the most similar existing node. Each node is assigned its own large‑model agent, and together these agents form the multi‑agent system.
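The insertion rule above can be sketched in a few lines. This is a minimal illustration, not the MANGO implementation: the node layout (a dict with a `centroid` and an `actions` list), the default threshold of 0.8, the `embed` callable, and the choice to add an edge between consecutive actions of a workflow are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def insert_action(nodes, action, embedding, threshold=0.8):
    """Merge an action into the most similar node, or open a new node.

    nodes: list of {"centroid": [...], "actions": [...]} (hypothetical layout).
    Returns the index of the node that received the action.
    """
    best_idx, best_sim = -1, -1.0
    for i, node in enumerate(nodes):
        sim = cosine(embedding, node["centroid"])
        if sim > best_sim:
            best_idx, best_sim = i, sim
    if best_idx == -1 or best_sim < threshold:
        # Nothing is similar enough: the action becomes a new node.
        nodes.append({"centroid": list(embedding), "actions": [action]})
        return len(nodes) - 1
    nodes[best_idx]["actions"].append(action)
    return best_idx

def build_flow_network(workflows, embed, threshold=0.8):
    """Insert every action of every workflow; link consecutive actions (assumed)."""
    nodes, edges = [], set()
    for workflow in workflows:
        prev = None
        for action in workflow:
            idx = insert_action(nodes, action, embed(action), threshold)
            if prev is not None and prev != idx:
                edges.add((prev, idx))
            prev = idx
    return nodes, edges
```

With toy two-dimensional embeddings, two workflows that differ only in wording ("parse" versus "parse input") collapse onto the same pair of nodes and share a single edge.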

Reinforcement‑Learning Edge Optimization – Once the network has been built from historical workflows, the goal is to choose a path of agents from the source node to the sink node, with each agent on the path solving one sub‑task. The edge‑selection problem is cast as a Markov Decision Process:

State: vector similarity between the current node’s problem content/role description and those of neighboring nodes.

Action: choose one of the neighboring edges.

Reward: weighted combination of process‑level correctness and final‑task performance.

Policy: optimized with the REINFORCE algorithm to maximize expected cumulative reward.
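As a rough sketch of how this MDP could be trained, the following toy uses a softmax policy over outgoing edges and the standard REINFORCE update. Several details are assumptions for illustration only: each edge logit is a learnable parameter plus the similarity score as a fixed bias, the reward weight `alpha` is 0.5, a moving-average baseline is used to reduce variance, and the graph is assumed to be acyclic with every non-sink node having at least one outgoing edge.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def edge_probs(theta, sim, graph, u):
    """Policy at node u: softmax over (learned logit + similarity) of each edge."""
    nbrs = graph[u]
    return nbrs, softmax([theta[(u, v)] + sim[(u, v)] for v in nbrs])

def sample_path(theta, sim, graph, source, sink, rng):
    path, u = [source], source
    while u != sink:
        nbrs, probs = edge_probs(theta, sim, graph, u)
        u = rng.choices(nbrs, weights=probs)[0]
        path.append(u)
    return path

def combined_reward(process_score, final_score, alpha=0.5):
    """Weighted mix of process-level correctness and final-task performance."""
    return alpha * process_score + (1.0 - alpha) * final_score

def reinforce_step(theta, sim, graph, path, reward, baseline, lr=0.5):
    """theta += lr * (reward - baseline) * grad log pi, for each edge on the path."""
    advantage = reward - baseline
    for u, v in zip(path, path[1:]):
        nbrs, probs = edge_probs(theta, sim, graph, u)
        for n, p in zip(nbrs, probs):
            grad_log_pi = (1.0 if n == v else 0.0) - p
            theta[(u, n)] += lr * advantage * grad_log_pi

def train(graph, sim, score_path, source, sink, episodes=300, seed=0):
    """score_path(path) -> (process_score, final_score); hypothetical evaluator."""
    rng = random.Random(seed)
    theta = {(u, v): 0.0 for u in graph for v in graph[u]}
    baseline = 0.0
    for _ in range(episodes):
        path = sample_path(theta, sim, graph, source, sink, rng)
        reward = combined_reward(*score_path(path))
        reinforce_step(theta, sim, graph, path, reward, baseline)
        baseline += 0.1 * (reward - baseline)  # moving-average baseline
    return theta
```

On a four-node diamond graph where only one branch yields a correct answer, the learned policy shifts its probability mass onto that branch. In a real system `score_path` would run the agents along the path and grade their outputs.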

Text‑Gradient Node Optimization – For each node, both task content and role description are updated using text gradients derived from global signals (final task outcome) and local feedback (intermediate execution results), preventing gradient vanishing in long workflows.
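The shape of one text-gradient update can be sketched as follows. This is only a structural outline under stated assumptions: `critique` and `rewrite` are hypothetical callables standing in for large-model calls (one turns feedback into a natural-language "gradient", the other applies it to a prompt field), and `NodePrompt` is an invented container. The demo functions are deterministic stand-ins, not how a model would actually respond.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NodePrompt:
    task: str  # task content given to the node's agent
    role: str  # role description given to the node's agent

def text_gradient_step(prompt, local_feedback, global_feedback, critique, rewrite):
    """Return an updated prompt; the input prompt is left unchanged.

    critique(prompt, local_feedback, global_feedback) -> str  (textual gradient)
    rewrite(text, gradient) -> str                            (applies the gradient)
    Both callables are hypothetical wrappers around a large-model call.
    """
    gradient = critique(prompt, local_feedback, global_feedback)
    return NodePrompt(
        task=rewrite(prompt.task, gradient),
        role=rewrite(prompt.role, gradient),
    )

# Deterministic stand-ins so the sketch runs without a model.
def demo_critique(prompt, local_feedback, global_feedback):
    return f"local: {local_feedback}; global: {global_feedback}"

def demo_rewrite(text, gradient):
    return f"{text} [revise per {gradient}]"
```

Passing both the local and the global signal into the same critique is what lets a node far from the sink still receive usable feedback.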

The RL edge optimizer and the text‑gradient node updater form a mutually dependent loop: updated prompts alter state representations, influencing path selection, while sampled paths determine which node prompts receive gradient updates.

Node‑Skipping Mechanism

A skip‑k mechanism selectively bypasses already‑well‑optimized nodes. Skipped nodes inherit outputs from recorded intermediate steps, reusing real execution traces. The mechanism is controlled by a Skip‑k parameter (e.g., Skip‑3 skips up to three consecutive nodes).
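A minimal sketch of this behavior, assuming a hypothetical `is_stable` predicate that marks well-optimized nodes, a `cache` of recorded outputs keyed by node, and a `run_node` callable that executes a node's agent:

```python
def run_with_skips(path, run_node, cache, is_stable, k):
    """Walk a path, reusing cached outputs for at most k consecutive stable nodes.

    run_node(node, previous_output) -> output   (hypothetical agent call)
    cache: dict of recorded outputs from earlier real executions
    Returns (outputs_by_node, list_of_nodes_actually_executed).
    """
    outputs, executed = {}, []
    consecutive_skips = 0
    previous_output = None
    for node in path:
        can_skip = consecutive_skips < k and node in cache and is_stable(node)
        if can_skip:
            output = cache[node]          # inherit the recorded trace
            consecutive_skips += 1
        else:
            output = run_node(node, previous_output)
            cache[node] = output          # record for later reuse
            executed.append(node)
            consecutive_skips = 0
        outputs[node] = output
        previous_output = output
    return outputs, executed
```

With `k=2` and a path of five nodes whose first four are cached and stable, the first two are skipped, the third is forced to run because the skip budget is exhausted, the fourth is skipped again, and the fifth runs because it has no recorded output.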

Experimental Evaluation

Evaluation uses seven datasets: code generation (HumanEval, MBPP), mathematics (MATH500, GSM8K), reading comprehension (DROP), and multi‑domain QA (MMLU, GPQA‑Diamond). GPT‑4o‑mini serves as the base large model. Metrics: pass@1 for code tasks, accuracy for math and QA, F1 for DROP.
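For orientation, the three metric families can be written as small functions. These are simplified textbook versions, not the benchmarks' official scorers: the F1 here is a plain bag-of-tokens F1 with lowercase whitespace tokenization (the official DROP scorer applies additional normalization), and pass@1 is taken as the fraction of problems whose single generated solution passes its tests.

```python
from collections import Counter

def token_f1(prediction, reference):
    """Bag-of-tokens F1 between a predicted and a reference answer string."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def accuracy(predictions, references):
    """Fraction of exact matches."""
    return sum(p == r for p, r in zip(predictions, references)) / len(references)

def pass_at_1(passed):
    """passed: one boolean per problem, True if the single sample passed all tests."""
    return sum(passed) / len(passed)
```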

Results: MANGO achieves the best performance across all domains. On MATH500, accuracy improves by 12.8% over MaAS; on DROP, F1 improves by 5.1% over AFlow, even with a Skip‑2 setting.

Efficiency and Cost

On MATH500 with Skip‑3, MANGO cuts training time by 41.5% and inference time by 47.4% relative to MaAS while retaining the highest accuracy. API cost is computed at GPT‑4o‑mini's rates of $0.15 per million prompt tokens and $0.60 per million completion tokens.
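The per-token rates quoted above translate into a dollar cost as follows. The token counts in the usage note are made-up numbers for illustration; the article does not report MANGO's actual token usage.

```python
PROMPT_USD_PER_MILLION = 0.15       # rate quoted in the article
COMPLETION_USD_PER_MILLION = 0.60   # rate quoted in the article

def api_cost_usd(prompt_tokens, completion_tokens):
    """Dollar cost of a run given its prompt and completion token counts."""
    return (
        prompt_tokens * PROMPT_USD_PER_MILLION
        + completion_tokens * COMPLETION_USD_PER_MILLION
    ) / 1_000_000
```

For a hypothetical run that consumes 2,000,000 prompt tokens and 500,000 completion tokens, the function returns about 0.60 dollars (0.30 for prompts plus 0.30 for completions). Skipping nodes lowers both counts, which is how Skip‑k reduces cost.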

Conclusion

MANGO demonstrates that a data‑driven, end‑to‑end framework combining reinforcement learning, text‑gradient updates, and node‑skipping can mitigate error propagation, streamline workflow generation, and improve efficiency and stability of multi‑agent systems.

Paper: https://arxiv.org/abs/2605.12943

Repository: https://github.com/openJiuwen-ai/agent-store/tree/main/community/mango
