BiCNet: Mastering Multi-Agent Cooperation in StarCraft Battles

The paper introduces BiCNet, a bidirectional coordination network that learns cooperative multi‑agent strategies in StarCraft micro‑battles, ranging from collision‑free movement to cover attacks and focused fire. It reports higher win rates than prior state‑of‑the‑art methods and suggests the approach could scale to real‑world cooperative AI tasks.

Alibaba Cloud Developer

Introduction

Alibaba Cognitive Computing Lab and UCL Computer Science collaborated to study multi‑agent cooperation using the micro‑battle scenarios of StarCraft: Brood War. The proposed BiCNet (Bidirectional Coordination Network) automatically learns optimal strategies for multiple agents, from collision‑free movement to basic attack/retreat, up to complex cover attacks and focused fire.

Motivation

Cooperative intelligence is essential for achieving artificial general intelligence (AGI). While single agents have mastered games such as Atari, Go, and poker, true human intelligence involves social and collaborative abilities. Multi‑agent systems can solve problems beyond the capability of individuals, and the emerging algorithmic economy sees AI agents cooperating in markets, advertising, and recommendation.

BiCNet Architecture

BiCNet consists of an actor network and a critic network, both built on a bidirectional recurrent neural network (RNN). Parameters are shared across agents, making the model size independent of the number of agents. The actor produces individual actions while communicating via the bidirectional RNN; the critic estimates local Q‑values which are combined to form a global return.
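
The parameter sharing and bidirectional communication described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain tanh RNN cell, random untrained weights, and hypothetical names (`init_params`, `bidirectional_actor`, `count_params`) and dimensions chosen only for illustration. It shows two properties stated in the text: the same weights are reused for every agent, so the parameter count does not grow with the number of agents, and each agent's action depends on the observations of all other agents through the forward and backward passes.

```python
import numpy as np


def init_params(obs_dim, hidden_dim, action_dim, seed=0):
    """Create one set of weights shared by all agents (illustrative, untrained)."""
    rng = np.random.default_rng(seed)
    scale = 0.3
    return {
        "W_in_f": rng.normal(0.0, scale, (obs_dim, hidden_dim)),    # forward input weights
        "W_h_f": rng.normal(0.0, scale, (hidden_dim, hidden_dim)),  # forward recurrent weights
        "W_in_b": rng.normal(0.0, scale, (obs_dim, hidden_dim)),    # backward input weights
        "W_h_b": rng.normal(0.0, scale, (hidden_dim, hidden_dim)),  # backward recurrent weights
        "W_out": rng.normal(0.0, scale, (2 * hidden_dim, action_dim)),
    }


def bidirectional_actor(params, observations):
    """Map per-agent observations (n_agents, obs_dim) to per-agent actions.

    The recurrence runs over the agent axis, not over time: the forward pass
    carries information from agent 0 to agent n-1, the backward pass carries
    it from agent n-1 to agent 0.
    """
    n_agents = observations.shape[0]
    hidden_dim = params["W_h_f"].shape[0]
    h_fwd = np.zeros((n_agents, hidden_dim))
    h_bwd = np.zeros((n_agents, hidden_dim))

    h = np.zeros(hidden_dim)
    for i in range(n_agents):
        h = np.tanh(observations[i] @ params["W_in_f"] + h @ params["W_h_f"])
        h_fwd[i] = h

    h = np.zeros(hidden_dim)
    for i in reversed(range(n_agents)):
        h = np.tanh(observations[i] @ params["W_in_b"] + h @ params["W_h_b"])
        h_bwd[i] = h

    # Each agent's action is computed from both directions' hidden states.
    features = np.concatenate([h_fwd, h_bwd], axis=1)
    return np.tanh(features @ params["W_out"])


def count_params(params):
    """Total number of scalar weights; independent of the number of agents."""
    return sum(p.size for p in params.values())
```

Calling `bidirectional_actor` with 3 agents or with 15 agents uses the same `params` dictionary, which is the sense in which the model size is independent of team size. A critic could reuse the same recurrence with a scalar output per agent; how the local values are combined is described only at a high level in the text, so it is left out of this sketch.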

[Figure: BiCNet architecture]

Learned Cooperative Strategies

Collision‑free coordinated movement

Attack and retreat tactics

Cover attacks

Focused fire without wasting shots on dead targets

Cooperation among heterogeneous agents

[Figures: coordinated movement, attack and retreat, cover attack, focused fire, heterogeneous agent cooperation]

Experimental Results

BiCNet was evaluated on a series of StarCraft micro‑battle tasks of increasing difficulty and compared against several baselines (e.g., CommNet). It achieved superior win rates across scenarios such as 3 Marines vs 1 Super Zergling, 4 Dragoons vs 2 Ultralisks, and large‑scale fights (15 Marines vs 16 Marines). The results demonstrate scalability and robustness of the learned policies.

[Figure: Performance comparison]

Conclusion

The bidirectional coordination network provides a deep multi‑agent reinforcement‑learning framework that learns effective cooperation through end‑to‑end training. Future work includes investigating the relationship between reward design and learning dynamics, and exploring Nash equilibria when both sides employ deep multi‑agent models.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
