Unlocking New Potential in Agent Swarms: OpenJiuwen’s MANGO Multi‑Agent Flow Network
The MANGO framework integrates reinforcement learning, text‑gradient updates, and a node‑skipping mechanism to jointly optimize workflow structure and execution policy in multi‑agent systems. Across seven benchmark datasets it reports the highest accuracy among the compared methods, and on MATH500 it cuts training time by 41.5% and inference time by 47.4% relative to MaAS.
Problem
Multi‑agent collaboration is vulnerable to error propagation: when the generated workflow is wrong, or an individual agent hallucinates, the mistake cascades through the rest of the collaboration chain.
MANGO Framework
MANGO (Multi‑Agent Network Gradient Optimization) is built on the AgentOS execution and scheduling layer. It models system architecture, task decomposition, and path selection end‑to‑end, enabling joint optimization of workflow paths and execution policies.
Core Features
End‑to‑end reinforcement‑learning optimization trains the whole system toward a single global objective rather than tuning each stage in isolation.
Text‑gradient updates allow node prompts to adapt to dynamic tasks.
A node‑skipping (skip‑k) mechanism reduces computational overhead while preserving accuracy.
Construction Steps
Flow‑Network Construction – Each workflow action is iteratively inserted into a flow network. The new action is compared with every existing node by vector similarity. If the highest similarity falls below a threshold, a new node is created; otherwise the action is placed in the most similar existing node. Each node is assigned a distinct large‑model agent, forming the multi‑agent system.
Reinforcement‑Learning Edge Optimization – After building the network from historical workflows, the goal is to select agents from source to sink to solve sub‑tasks. The edge‑selection problem is cast as a Markov Decision Process:
State: vector similarity between the current node’s problem content/role description and those of neighboring nodes.
Action: choose one of the neighboring edges.
Reward: weighted combination of process‑level correctness and final‑task performance.
Policy: optimized with the REINFORCE algorithm to maximize expected cumulative reward.
Text‑Gradient Node Optimization – For each node, both task content and role description are updated using text gradients derived from global signals (final task outcome) and local feedback (intermediate execution results), preventing gradient vanishing in long workflows.
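The flow‑network construction step above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the cosine similarity measure, the threshold of 0.8, the `Node` class, the `insert_action` helper, and the toy two‑dimensional embeddings are all assumptions made for the example.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Node:
    # Assumption: a node is represented by the embedding of its first action.
    centroid: list
    actions: list = field(default_factory=list)


def insert_action(nodes, name, embedding, threshold=0.8):
    """Place an action in the most similar node, or open a new node."""
    best, best_sim = None, -1.0
    for node in nodes:
        sim = cosine(node.centroid, embedding)
        if sim > best_sim:
            best, best_sim = node, sim
    if best is None or best_sim < threshold:
        best = Node(centroid=list(embedding))
        nodes.append(best)
    best.actions.append(name)
    return best


# Toy embeddings (hypothetical); a real system would use a text encoder.
nodes = []
insert_action(nodes, "write_code", [1.0, 0.0])
insert_action(nodes, "implement_function", [0.9, 0.1])
insert_action(nodes, "run_tests", [0.0, 1.0])
for node in nodes:
    print(node.actions)
```

In this toy run the two coding actions land in one node and the testing action opens a second node, which is the merging behavior the threshold is meant to control.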
The RL edge optimizer and the text‑gradient node updater form a mutually dependent loop: updated prompts alter state representations, influencing path selection, while sampled paths determine which node prompts receive gradient updates.
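The edge‑selection MDP can be illustrated with a small REINFORCE loop. The sketch below makes several simplifying assumptions that do not come from the paper: the policy is a table of per‑edge logits instead of a function of similarity features, the baseline is the batch‑mean reward, and the three‑node graph, the `SCORES` table, and the weight `ALPHA` are invented for the example.

```python
import math
import random

ALPHA = 0.3  # hypothetical weight on process-level correctness
# Hypothetical (process score, final score) for each intermediate node.
SCORES = {"planner": (0.5, 0.0), "solver": (0.5, 1.0)}


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_path(graph, logits, source, sink, rng):
    """Walk from source to sink, sampling one outgoing edge per step."""
    path, node = [], source
    while node != sink:
        probs = softmax(logits[node])
        idx = rng.choices(range(len(probs)), weights=probs)[0]
        path.append((node, idx))
        node = graph[node][idx]
    return path


def reward(path, graph):
    """Weighted mix of process-level and final-task scores."""
    node, idx = path[0]
    process, final = SCORES[graph[node][idx]]
    return ALPHA * process + (1 - ALPHA) * final


def reinforce_step(logits, episodes, lr=0.5):
    """One REINFORCE update using the batch-mean reward as baseline."""
    baseline = sum(r for _, r in episodes) / len(episodes)
    grads = {node: [0.0] * len(v) for node, v in logits.items()}
    for path, r in episodes:
        advantage = r - baseline
        for node, idx in path:
            probs = softmax(logits[node])
            for j, p in enumerate(probs):
                # d log pi(idx) / d logit_j = 1[j == idx] - p_j
                grads[node][j] += advantage * ((1.0 if j == idx else 0.0) - p)
    for node, g in grads.items():
        for j, gj in enumerate(g):
            logits[node][j] += lr * gj / len(episodes)


graph = {"source": ["planner", "solver"], "planner": ["sink"], "solver": ["sink"]}
logits = {node: [0.0] * len(edges) for node, edges in graph.items()}
rng = random.Random(0)

for _ in range(100):
    episodes = []
    for _ in range(8):
        path = sample_path(graph, logits, "source", "sink", rng)
        episodes.append((path, reward(path, graph)))
    reinforce_step(logits, episodes)

p_solver = softmax(logits["source"])[1]
print(f"P(source -> solver) = {p_solver:.3f}")
```

Because the route through `solver` earns the higher reward in this toy setup, the probability of choosing that edge rises above its initial 0.5 as training proceeds. In the full framework the state would also change between updates as node prompts are rewritten, which this sketch leaves out.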
Node‑Skipping Mechanism
A skip‑k mechanism selectively bypasses already‑well‑optimized nodes. Skipped nodes inherit outputs from recorded intermediate steps, reusing real execution traces. The mechanism is controlled by a Skip‑k parameter (e.g., Skip‑3 skips up to three consecutive nodes).
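A minimal sketch of how a skip‑k pass over a sampled path might look is given below. The article does not say how a node is judged well optimized, so the `stable` set is an assumption, as are the `run_path` helper, the `trace` dictionary of recorded outputs, and the placeholder agents.

```python
def run_path(path, agents, trace, stable, k):
    """Execute nodes in order, reusing recorded outputs for at most
    k consecutive stable nodes before forcing a real execution."""
    outputs, executed = {}, []
    streak, prev = 0, None
    for node in path:
        if node in stable and node in trace and streak < k:
            outputs[node] = trace[node]  # inherit the recorded output
            streak += 1
        else:
            outputs[node] = agents[node](prev)  # real agent call
            executed.append(node)
            streak = 0
        prev = outputs[node]
    return outputs, executed


path = ["plan", "draft", "check", "refine", "answer"]
stable = {"plan", "draft", "check", "refine"}  # hypothetical stability flags
trace = {name: f"cached {name}" for name in stable}
# Placeholder agents that only record what they received as input.
agents = {name: (lambda prev, name=name: f"{name}<-{prev}") for name in path}

outputs, executed = run_path(path, agents, trace, stable, k=3)
print(executed)
```

With `k=3`, the first three stable nodes reuse their recorded outputs, the fourth is executed because the consecutive‑skip budget is exhausted, and the final node is executed because it is not marked stable. Only two of the five nodes trigger an agent call.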
Experimental Evaluation
Evaluation uses seven datasets: code generation (HumanEval, MBPP), mathematics (MATH500, GSM8K), reading comprehension (DROP), and multi‑domain QA (MMLU, GPQA‑Diamond). GPT‑4o‑mini serves as the base large model. Metrics: pass@1 for code tasks, accuracy for math and QA, F1 for DROP.
Results: MANGO reports the best performance in all four task domains. On MATH500, accuracy improves by 12.8% over MaAS; on DROP, F1 improves by 5.1% over AFlow, even with a Skip‑2 setting.
Efficiency and Cost
On MATH500 with Skip‑3, API usage is billed at $0.15 per million prompt tokens and $0.60 per million completion tokens. Compared to MaAS, training time drops by 41.5% and inference time by 47.4%, while MANGO retains the highest accuracy.
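To turn the per‑token rates above into a dollar figure, the arithmetic is a simple weighted sum. The token counts in this sketch are hypothetical and are not taken from the paper; only the two rates come from the article.

```python
PROMPT_RATE = 0.15      # USD per million prompt tokens (from the article)
COMPLETION_RATE = 0.60  # USD per million completion tokens (from the article)


def api_cost(prompt_tokens, completion_tokens):
    """Total API cost in USD for the given token counts."""
    total = prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE
    return total / 1_000_000


# Hypothetical run: 2M prompt tokens and 0.5M completion tokens.
example_cost = api_cost(2_000_000, 500_000)
print(f"${example_cost:.2f}")
```

For these made‑up counts the prompt and completion sides each contribute $0.30. Skipping nodes lowers both counts, which is where the savings would come from.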
Conclusion
MANGO demonstrates that a data‑driven, end‑to‑end framework combining reinforcement learning, text‑gradient updates, and node‑skipping can mitigate error propagation, streamline workflow generation, and improve efficiency and stability of multi‑agent systems.
Paper: https://arxiv.org/abs/2605.12943
Repository: https://github.com/openJiuwen-ai/agent-store/tree/main/community/mango
