Tagged articles
2 articles
Meituan Technology Team
Feb 20, 2025 · Artificial Intelligence

Offline Multi-Agent Reinforcement Learning via In‑Sample Sequential Policy Optimization (InSPO)

The paper introduces InSPO, an offline multi‑agent reinforcement‑learning algorithm that combines behavior‑regularized Markov games with in‑sample sequential policy updates. It uses reverse KL divergence and maximum‑entropy regularization to avoid out‑of‑distribution joint actions and improve coordination, guarantees monotonic policy improvement with convergence to a quantal response equilibrium (QRE), and is validated on the XOR game, the bridge game, and StarCraft II benchmarks.

StarCraft II · behavior regularization · bridge game
19 min read
Python Programming Learning Circle
Feb 20, 2025 · Artificial Intelligence

Building a StarCraft II AI Bot with DeepMind's pysc2 in Python

This article provides a step‑by‑step guide, complete with Python code examples, for creating a Protoss AI bot using DeepMind's pysc2 library to mine resources, construct buildings, train units, implement scouting, and execute attack strategies against increasingly difficult computer opponents in StarCraft II.

AI bot · DeepMind · Reinforcement Learning
28 min read