The StarCraft Multi-Agent Challenge (SMAC) is a benchmark that provides elements of partial observability, challenging dynamics, and high-dimensional observation spaces. SMAC is built using the StarCraft II game engine, creating a testbed for research in cooperative MARL where each game unit is an independent RL agent.
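For orientation, a minimal random-agent episode using the open-source `smac` Python package (this assumes StarCraft II and `smac` are installed; it is a sketch of the interface, not a training recipe) shows how each unit gets its own local observation and available-action mask while the team shares one reward:

```python
from smac.env import StarCraft2Env
import numpy as np

# One shared environment; each of the n_agents units is controlled separately.
env = StarCraft2Env(map_name="8m")
env_info = env.get_env_info()
n_agents = env_info["n_agents"]

env.reset()
terminated, episode_reward = False, 0.0
while not terminated:
    obs = env.get_obs()      # list of per-agent local observations
    state = env.get_state()  # global state (available to a centralised trainer only)
    actions = []
    for agent_id in range(n_agents):
        # Each agent acts on its own action mask; here actions are random.
        avail = np.nonzero(env.get_avail_agent_actions(agent_id))[0]
        actions.append(np.random.choice(avail))
    reward, terminated, _ = env.step(actions)  # shared team reward
    episode_reward += reward
env.close()
print("episode reward:", episode_reward)
```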
These leaderboards are used to track progress on SMAC.
Use these libraries to find SMAC models and implementations.
The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.
It is demonstrated that Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local value function, can perform as well as or better than state-of-the-art joint learning approaches on the popular multi-agent benchmark suite SMAC with little hyperparameter tuning.
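As an illustrative sketch of what "each agent estimates its local value function" means (not the authors' exact architecture; the hidden size and discrete-action head are assumptions), independent learning of this kind amounts to one actor-critic per agent whose critic sees only that agent's local observation:

```python
import torch
import torch.nn as nn

class IndependentActorCritic(nn.Module):
    """One actor-critic per agent; the critic conditions only on that agent's
    local observation, unlike centralised critics that see the global state."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )
        self.critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, local_obs: torch.Tensor):
        logits = self.actor(local_obs)   # per-agent policy logits
        value = self.critic(local_obs)   # local value estimate V(o_i)
        return torch.distributions.Categorical(logits=logits), value
```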
We present mlrMBO, a flexible and comprehensive R toolbox for model-based optimization (MBO), also known as Bayesian optimization, which addresses the problem of expensive black-box optimization by approximating the given objective function through a surrogate regression model. It is designed for both single- and multi-objective optimization with mixed continuous, categorical and conditional parameters. Additional features include multi-point batch proposal, parallelization, visualization, logging and error-handling. mlrMBO is implemented in a modular fashion, such that single components can be easily replaced or adapted by the user for specific use cases, e.g., any regression learner from the mlr toolbox for machine learning can be used, and infill criteria and infill optimizers are easily exchangeable. We empirically demonstrate that mlrMBO provides state-of-the-art performance by comparing it on different benchmark scenarios against a wide range of other optimizers, including DiceOptim, rBayesianOptimization, SPOT, SMAC, Spearmint, and Hyperopt.
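mlrMBO itself is an R toolbox; the Python sketch below only illustrates the generic surrogate-based loop the abstract describes (fit a regression surrogate, maximise an infill criterion such as expected improvement, evaluate, repeat) on an arbitrary toy objective, and is not mlrMBO's API:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(candidates, gp, y_best):
    """Standard expected-improvement infill criterion for minimisation."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def mbo_minimise(objective, bounds, n_init=5, n_iter=20, rng=None):
    """Generic model-based optimisation loop: fit a surrogate, propose the
    candidate maximising the infill criterion, evaluate it, and repeat."""
    rng = rng or np.random.default_rng(0)
    dim = len(bounds)
    lo, hi = np.array(bounds).T
    X = rng.uniform(lo, hi, size=(n_init, dim))          # initial design
    y = np.array([objective(x) for x in X])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        cand = rng.uniform(lo, hi, size=(1000, dim))      # random infill search
        x_next = cand[np.argmax(expected_improvement(cand, gp, y.min()))]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()

# Toy usage: minimise a 1-D black-box function.
best_x, best_y = mbo_minimise(lambda x: (x[0] - 2.0) ** 2, bounds=[(-5.0, 5.0)])
print(best_x, best_y)
```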
FACMAC is a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces that uses a centralised but factored critic, which combines per-agent utilities into the joint action-value function via a non-linear monotonic function.
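A hedged sketch of the "non-linear monotonic" mixing idea follows, in the style of a QMIX-like mixer rather than FACMAC's full actor-critic (the embedding size and hypernetwork layout are assumptions): state-conditioned mixing weights are kept non-negative so the joint value is monotonic in each per-agent utility.

```python
import torch
import torch.nn as nn

class MonotonicMixer(nn.Module):
    """Mixes per-agent utilities Q_i into a joint value Q_tot through a
    state-conditioned network whose weights are constrained non-negative,
    so dQ_tot/dQ_i >= 0 (the monotonicity constraint of QMIX-style mixers)."""
    def __init__(self, n_agents: int, state_dim: int, embed: int = 32):
        super().__init__()
        # Hypernetworks produce the mixing weights from the global state.
        self.w1 = nn.Linear(state_dim, n_agents * embed)
        self.b1 = nn.Linear(state_dim, embed)
        self.w2 = nn.Linear(state_dim, embed)
        self.b2 = nn.Linear(state_dim, 1)
        self.embed, self.n_agents = embed, n_agents

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor):
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        w1 = torch.abs(self.w1(state)).view(-1, self.n_agents, self.embed)
        b1 = self.b1(state).unsqueeze(1)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.w2(state)).view(-1, self.embed, 1)
        q_tot = torch.bmm(hidden, w2).squeeze(-1) + self.b2(state)
        return q_tot  # joint action-value, shape (batch, 1)
```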
This paper analyses value-based methods that are known to have superior performance in complex environments and proposes a novel approach called MAVEN that hybridises value and policy-based methods by introducing a latent space for hierarchical control.
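Only the latent-conditioning idea is sketched below; MAVEN additionally trains a hierarchical latent policy with a mutual-information objective, which is omitted here, and the dimensions are placeholders. Each agent's utility network takes a shared latent variable sampled once per episode, so committing to different latents yields distinct joint behaviours to explore over.

```python
import torch
import torch.nn as nn

class LatentConditionedAgent(nn.Module):
    """Per-agent utility network conditioned on a shared latent variable z
    that a hierarchical policy samples once per episode."""
    def __init__(self, obs_dim: int, n_actions: int, latent_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor, z: torch.Tensor):
        return self.net(torch.cat([obs, z], dim=-1))  # Q(obs, ., z)

# Sample one latent per episode (here from a uniform categorical prior).
latent_dim = 4
z = torch.distributions.OneHotCategorical(logits=torch.zeros(latent_dim)).sample()
agent = LatentConditionedAgent(obs_dim=30, n_actions=10, latent_dim=latent_dim)
q_values = agent(torch.zeros(30), z)
```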
A simple and fast variant of Planet Wars is presented as a test-bed for statistical-planning-based game AI agents and for noisy hyper-parameter optimisation using the recently developed N-Tuple Bandit Evolutionary Algorithm.
The results indicate that Differential Evolution outperforms SMAC on most datasets when tuning a given machine learning algorithm, particularly when ties are broken in a first-to-report fashion.
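As a minimal, hypothetical example of DE-based hyperparameter tuning (not the paper's experimental setup; the SVM, dataset, and search ranges are assumptions), SciPy's `differential_evolution` can search log-scaled hyperparameters against a cross-validation objective:

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

def neg_cv_accuracy(params):
    """Objective for DE: parameters are log10(C) and log10(gamma)."""
    C, gamma = 10.0 ** params[0], 10.0 ** params[1]
    score = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()
    return -score  # DE minimises, so negate accuracy

result = differential_evolution(
    neg_cv_accuracy,
    bounds=[(-3, 3), (-4, 1)],  # search ranges for log10(C) and log10(gamma)
    maxiter=20,
    seed=0,
)
print("best CV accuracy:", -result.fun, "at", result.x)
```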