How HiveMind Optimizes LLM Multi‑Agent Trading Systems via Contribution‑Guided Online Prompts
The HiveMind framework introduces Contribution‑Guided Online Prompt Optimization (CG‑OPO), which quantifies each LLM‑driven agent’s impact with Shapley values and uses a DAG‑Shapley algorithm to attribute credit efficiently. This enables real‑time adaptive optimization of multi‑agent stock‑trading systems, with higher reported returns than the baselines and far fewer LLM calls than a full Shapley computation.
Background
Large language models (LLMs) enable multi‑agent systems (MAS) to handle complex collaborative tasks such as financial trading. In dynamic real‑world environments, two challenges remain: (1) autonomously improving under‑performing agents at runtime, and (2) accurately attributing each agent’s contribution to overall system performance when agents interact through a directed acyclic graph (DAG) workflow.
Problem Definition
The work targets (1) runtime self‑optimization of poorly performing agents and (2) precise contribution measurement for each agent within a DAG‑structured workflow.
Method
HiveMind Framework Overview
HiveMind is an adaptive framework that closes the loop between performance evaluation, contribution measurement, bottleneck identification, prompt optimization, and system update. The core component, Contribution‑Guided Online Prompt Optimization (CG‑OPO), uses Shapley‑value‑based contribution scores to generate targeted prompt enhancements for the identified bottleneck agents, eliminating manual intervention.
System Formalization
The MAS is modeled as a DAG G = (V, E), where V = {a_1, …, a_N} is the set of agents and an edge (a_i, a_j) indicates that the output of a_i feeds a_j. A topological order π satisfies π(a_i) < π(a_j) for every edge, so information flows strictly from earlier agents to later ones.
The information‑access function I maps each agent to the subset of data it may read, combining external data D_{external}^i and outputs of predecessor agents O(a_j).
In the financial‑trading case study, the system comprises seven agents organized into three layers: an analysis layer (news, technical, and fundamental analysts), an outlook layer (bullish, bearish, and neutral outlook agents), and a decision layer (a trader agent) that produces executable orders.
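As a rough illustration of this formalization, the sketch below builds the seven‑agent DAG and checks the topological‑order property. The agent names and the full layer‑to‑layer wiring are assumptions made for the example; the source only states the three layers and their members.

```python
from graphlib import TopologicalSorter

# Hypothetical agent names; the layer membership follows the text, while the
# "every agent feeds every agent in the next layer" wiring is an assumption.
ANALYSIS = ["news", "technical", "fundamental"]
OUTLOOK = ["bullish", "bearish", "neutral"]
DECISION = ["trader"]

# predecessors[a_j] is the set of agents a_i with an edge (a_i, a_j).
predecessors = {a: set() for a in ANALYSIS}
predecessors.update({o: set(ANALYSIS) for o in OUTLOOK})
predecessors.update({d: set(OUTLOOK) for d in DECISION})

# static_order() raises graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(predecessors).static_order())
pi = {agent: rank for rank, agent in enumerate(order)}

# Every edge (a_i, a_j) must satisfy pi(a_i) < pi(a_j).
edges_respect_order = all(
    pi[i] < pi[j] for j, preds in predecessors.items() for i in preds
)
print(order, edges_respect_order)
```

Any valid order places the three analysts first, the three outlook agents next, and the trader last, which is the execution order the layered description implies.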
Contribution‑Guided Online Prompt Optimization (CG‑OPO)
Contribution Measurement: Shapley values φ_i attribute the overall performance to each agent a_i, where the value function v(S) is the Sharpe‑ratio‑based performance of agent subset S.
Bottleneck Identification: The agent with the lowest contribution score is selected as the target a*.
Trigger Condition: Optimization is invoked when φ_{a*}(t) falls below a predefined threshold.
Performance‑Based Reflection: The performance history H_t(a*) of the target agent is split into failure cases F* and success cases S*. A meta‑optimizer extracts lessons L_t from these cases and injects them into the target agent’s prompt.
Prompt Transformation: Structured enhancements derived from L_t are merged into the agent’s prompt while preserving its core functionality.
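The first three steps (contribution measurement, bottleneck identification, trigger) can be sketched with an exact, brute‑force Shapley computation. This is a minimal sketch, not the authors' implementation: the three agent names, the additive toy value function, and the threshold of 0.0 are all made up for illustration, whereas in the source v(S) is based on the Sharpe ratio.

```python
from itertools import combinations
from math import factorial


def shapley_values(agents, v):
    """Exact Shapley values for a value function v(frozenset) -> float.

    phi_i = sum over S not containing i of
            |S|! (n - |S| - 1)! / n! * (v(S + {i}) - v(S))
    """
    n = len(agents)
    phi = {}
    for a in agents:
        others = [b for b in agents if b != a]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (v(s | {a}) - v(s))
        phi[a] = total
    return phi


# Toy additive value function (illustrative numbers, not from the source).
SKILL = {"news": 0.5, "technical": 0.25, "trader": -0.25}


def toy_value(coalition):
    return sum(SKILL[a] for a in coalition)


phi = shapley_values(list(SKILL), toy_value)

# Bottleneck identification: the agent with the lowest contribution score.
bottleneck = min(phi, key=phi.get)

# Trigger condition: optimize only if the score is below a threshold.
THRESHOLD = 0.0  # hypothetical value
needs_optimization = phi[bottleneck] < THRESHOLD
print(bottleneck, needs_optimization)
```

For an additive value function each Shapley value equals the agent's own term, so here the trader is selected and the trigger fires. The brute‑force loop visits every subset, which is the cost that DAG‑Shapley is meant to reduce.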
DAG‑Shapley: Efficient Contribution Measurement
Connection‑Space Pruning: Only sub‑graphs that can affect the final output are retained; infeasible connections receive a zero contribution and are pruned.
Generalized Hierarchical Memoization (GHM): For each feasible connection S^k, inputs at layer L_i are computed once and reused across higher‑level connections, reducing redundant evaluations.
Complexity Analysis: DAG‑Shapley reduces the number of Shapley evaluations from 2^N to a linear combination of unique layer configurations |U_i|, achieving substantial computational savings.
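To make pruning and memoization concrete, here is a simplified sketch for the seven‑agent, three‑layer system. It assumes full layer‑to‑layer wiring, treats a coalition as feasible only if every layer is represented, and counts one LLM call per agent per distinct input. These are assumptions for illustration, and the helper names are hypothetical; the paper's actual algorithm may differ.

```python
from itertools import combinations

ANALYSIS = ("news", "technical", "fundamental")
OUTLOOK = ("bullish", "bearish", "neutral")
TRADER = "trader"
AGENTS = ANALYSIS + OUTLOOK + (TRADER,)


def all_subsets(items):
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def is_feasible(coalition):
    """Pruning rule (assumed): a coalition reaches the final output only if
    it contains the trader and at least one agent from each earlier layer."""
    return (
        TRADER in coalition
        and any(a in coalition for a in ANALYSIS)
        and any(o in coalition for o in OUTLOOK)
    )


# Memoization table: each new (agent, inputs) key stands for one LLM call.
llm_cache = {}


def call_agent(agent, inputs):
    key = (agent, inputs)
    if key not in llm_cache:
        # Placeholder string instead of a real LLM call.
        llm_cache[key] = f"{agent}({', '.join(sorted(inputs))})"
    return llm_cache[key]


def evaluate(coalition):
    # Analysts read only external data, so their outputs never change.
    analysis_out = frozenset(
        call_agent(a, frozenset()) for a in ANALYSIS if a in coalition
    )
    # Outlook agents are re-run only for analysis configurations not seen yet.
    outlook_out = frozenset(
        call_agent(o, analysis_out) for o in OUTLOOK if o in coalition
    )
    return call_agent(TRADER, outlook_out)


feasible = [s for s in all_subsets(AGENTS) if is_feasible(s)]
for s in feasible:
    evaluate(s)

print(len(feasible), len(llm_cache))  # 49 73
```

Under these assumptions there are 7 × 7 = 49 feasible coalitions and 3 + 21 + 49 = 73 cached calls, which coincides with the figures reported in the Computational Efficiency section.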
Experiments
Dataset and Evaluation Framework
A 7‑agent DAG powered by GLM‑4‑Flash (temperature = 0) is evaluated on four major tech stocks (AAPL, META, MSFT, NVDA) across a bull market (2024‑10‑01 to 2024‑12‑30) and a bear market (2025‑01‑02 to 2025‑03‑28). CG‑OPO runs an optimization cycle every five trading days, recomputing Sharpe ratios for all feasible agent subsets.
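Since the Sharpe ratio is the quantity recomputed in each cycle, a minimal sketch of it may help. The source does not state the exact formula used, so the zero risk‑free rate, the 252‑day annualization factor, and the function name are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of daily returns.

    Assumptions (not from the source): zero risk-free rate by default,
    252 trading days per year, sample standard deviation.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    sd = stdev(excess)  # needs at least two observations
    if sd < 1e-12:
        return 0.0  # avoid dividing by zero on a flat return series
    return sqrt(periods_per_year) * mean(excess) / sd


# One five-day window of made-up daily returns.
window = [0.004, -0.002, 0.007, 0.001, -0.003]
print(round(sharpe_ratio(window), 2))
```

With only five observations per cycle the estimate is noisy, so a real system would likely use a longer lookback or smooth scores across cycles; the source does not say which window feeds each evaluation.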
Baselines
Static configuration without CG‑OPO (w/o CG‑OPO).
Technical indicators: MACD signal strategy and SMA trend‑following.
Buy‑and‑hold as a passive benchmark.
Results and Analysis
Trading Performance
CG‑OPO consistently outperforms baselines. In the bull market, META’s return rises from 4.44 % to 13.72 % (+209 %), NVDA from 14.76 % to 22.60 % (+53 %), and MSFT from 6.15 % to 7.95 % (+29 %). In the bear market, CG‑OPO yields positive returns for META (15.15 % vs. –3.75 % for buy‑and‑hold), MSFT (12.62 % vs. –9.50 %), and AAPL (3.89 % vs. –10.64 %). NVDA’s loss is limited to –21.81 % compared with –41.71 % for the static baseline. Rule‑based technical indicators underperform in all scenarios.
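The percentages in parentheses are relative improvements over the static configuration, not percentage‑point differences. The short check below recomputes them from the bull‑market returns quoted above; the function name is just for illustration.

```python
def relative_gain(before, after):
    """Relative change in percent, e.g. 4.44 -> 13.72 is about +209 %."""
    return (after - before) / abs(before) * 100


# Bull-market returns in percent: (without CG-OPO, with CG-OPO).
bull_returns = {
    "META": (4.44, 13.72),
    "NVDA": (14.76, 22.60),
    "MSFT": (6.15, 7.95),
}

for ticker, (before, after) in bull_returns.items():
    print(f"{ticker}: {relative_gain(before, after):+.0f} %")
```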
Computational Efficiency
DAG‑Shapley reduces Shapley evaluations from 128 to 49 (‑61.7 %) and LLM calls from 448 to 73 (‑83.7 %) while preserving attribution accuracy identical to the full Shapley computation, making real‑time deployment feasible.
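The reported reductions follow directly from the counts. Note that 128 is the number of subsets of seven agents, and 448 is consistent with one LLM call per active agent in every subset; that reading is an interpretation, not something the source states.

```latex
\begin{align*}
\text{Shapley evaluations:}\quad & 2^{7} = 128 \;\to\; 49,
  & 1 - \tfrac{49}{128} &\approx 61.7\,\% \\
\text{LLM calls:}\quad & 7 \cdot 2^{6} = 448 \;\to\; 73,
  & 1 - \tfrac{73}{448} &\approx 83.7\,\%
\end{align*}
```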
Discussion
The results indicate that CG‑OPO improves returns across market regimes, especially for high‑volatility stocks in bull markets, and provides defensive gains in bear markets. However, the adaptive strategy exhibits higher volatility and larger maximum drawdowns, indicating a more aggressive risk profile. Limitations include a focus on a narrow set of tech stocks and a limited time horizon; future work should broaden asset classes, extend evaluation periods, and incorporate explicit risk‑management mechanisms.