playing-games

Starcraft

3260 papers • 126 benchmarks • 313 datasets

Starcraft I is a RTS game; the task is to train an agent to play the game. ( Image credit: Macro Action Selection with Deep Reinforcement Learning in StarCraft )

(Image credit: Papersgraph)

Benchmarks

These leaderboards are used to track progress in starcraft

Trend

Dataset

Best Model

Actions

No benchmarks available.

Libraries

i

Use these libraries to find starcraft models and implementations

hyunghona/emu

4 papers 20

Datasets

VizDoom

StarCraft II Learning Environment

Subtasks

No subtasks available.

Most implemented papers

The StarCraft Multi-Agent Challenge

Tabish Rashid, Mikayel Samvelyan, C. S. D. Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson, Philip H. S. Torr, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung•Sun Feb 10 2019

The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem and an open-source deep multi-agent RL learning framework including state-of-the-art algorithms is opened.

1131

Content

oxwhirl/pymarl

3 papers 1,821

facebookresearch/benchmarl

3 papers 234

3 papers 36

2 papers 2,966

2 papers 851

2 papers 555

cyanrain7/trust-region-policy-optim…

2 papers 180

cyanrain7/trpo-in-marl

2 papers 180

chauncygu/multi-agent-constrained-p…

2 papers 139

morning9393/HAPPO-HATRPO

2 papers 38

nju-rl/acorm

2 papers 19

anonymous-iclr22/trust-region-in-mu…

2 papers 11

0

Paper Graph

The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games

Eugene Vinitsky, A. Bayen, Chao Yu, Akash Velu, Yu Wang, Yi Wu•Mon Mar 01 2021

It is shown that PPO-based multi-agent algorithms achieve surprisingly strong performance in four popularMulti-agent testbeds: the particle-world environments, the StarCraft multi- agent challenge, Google Research Football, and the Hanabi challenge, with minimal hyperparameter tuning and without any domain-specific algorithmic modifications or architectures.

1898 0

Paper Graph

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Tabish Rashid, Mikayel Samvelyan, C. S. D. Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson•Thu Mar 29 2018

QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforce that the joint-action value is monotonic in the per- agent values, which allows tractable maximisation of the jointaction-value in off-policy learning.

1846 0

Paper Graph

StarCraft II: A New Challenge for Reinforcement Learning

T. Ewalds, O. Vinyals, K. Simonyan, Julian Schrittwieser, Stig Petersen, David Silver, T. Lillicrap, H. V. Hasselt, T. Schaul, Petko Georgiev, John Quan, J. Agapiou, Sergey Bartunov, A. Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, Stephen Gaffney, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, J. Repp, Rodney Tsing•Tue Aug 15 2017

This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the StarCraft II game that offers a new and challenging environment for exploring deep reinforcement learning algorithms and architectures and gives initial baseline results for neural networks trained from this data to predict game outcomes and player actions.

935 0

Paper Graph

HeR-DRL:Heterogeneous Relational Deep Reinforcement Learning for Single-Robot and Multi-Robot Crowd Navigation

Xinyu Zhou, Songhao Piao, Wenzheng Chi, Liguo Chen, Wei Li•Wed Apr 30 2025

Crowd navigation has garnered significant research attention in recent years, particularly with the advent of DRL-based methods. Current DRL-based methods have extensively explored interaction relationships in single-robot scenarios. However, the heterogeneity of multiple interaction relationships is often disregarded. This “interaction blind spot” hinders progress towards more complex scenarios, such as multi-robot crowd navigation. In this letter, we propose a heterogeneous relational deep reinforcement learning method, named HeR-DRL, which utilizes a customized heterogeneous Graph Neural Network (GNN) to enhance overall performance in crowd navigation. Firstly, we devised a method for constructing robot-crowd heterogenous relation graph that effectively simulates the heterogeneous pair-wise interaction relationships. Based on this graph, we proposed a novel heterogeneous GNN to encode interaction relationship information. Finally, we incorporate the encoded information into deep reinforcement learning to explore the optimal policy. HeR-DRL is rigorously evaluated by comparing it to state-of-the-art algorithms in both single-robot and multi-robot circle crossing scenarios. The experimental results demonstrate that HeR-DRL surpasses the state-of-the-art approaches in overall performance, particularly excelling in terms of efficiency and comfort. This underscores the significance of heterogeneous interactions in crowd navigation.

3 0

Paper Graph

Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

C. S. D. Witt, Shimon Whiteson, Philip H. S. Torr, Viktor Makoviychuk, Tarun Gupta, Denys Makoviichuk, Mingfei Sun•Tue Nov 17 2020

It is demonstrated that Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local value function, can perform just as well as or better than state-of-the-art joint learning approaches on popular multi-agent benchmark suite SMAC with little hyperparameter tuning.

485 0

Paper Graph

Perceiver IO: A General Architecture for Structured Inputs & Outputs

Jean-Baptiste Alayrac, Sebastian Borgeaud, O. Vinyals, Andrew Zisserman, M. Botvinick, Evan Shelhamer, João Carreira, Carl Doersch, Skanda Koppula, Andrew Jaegle, Catalin Ionescu, David Ding, Andrew Brock, Olivier J. H'enaff•Thu Jul 29 2021

This work proposes Perceiver IO, a general-purpose architecture that handles data from arbitrary settings while scaling linearly with the size of inputs and outputs and augments the Perceiver with a flexible querying mechanism that enables outputs of various sizes and semantics, doing away with the need for task-specific architecture engineering.

739 0

Paper Graph

Counterfactual Multi-Agent Policy Gradients

Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson, Nantas Nardelli, Triantafyllos Afouras•Tue May 23 2017

A new multi-agent actor-critic method called counterfactual multi- agent (COMA) policy gradients that uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies.

2380 0

Paper Graph

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Pushmeet Kohli, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson, Philip H. S. Torr, Nantas Nardelli, Triantafyllos Afouras•Mon Feb 27 2017

Two methods using a multi-agent variant of importance sampling to naturally decay obsolete data and conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory enable the successful combination of experience replay with multi- agent RL.

639 0

Paper Graph

QPLEX: Duplex Dueling Multi-Agent Q-Learning

Yang Yu, Jianhao Wang, Zhizhou Ren, Terry Liu, Chongjie Zhang•Sun Aug 02 2020

A novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function and encodes the IGM principle into the neural network architecture and thus enables efficient value function learning.

553 0

Paper Graph

Adding a benchmark result helps the community track progress.