The StarCraft Multi-Agent Challenge (SMAC) is a benchmark featuring partial observability, challenging dynamics, and high-dimensional observation spaces. SMAC is built on the StarCraft II game engine and provides a testbed for research in cooperative multi-agent reinforcement learning (MARL), in which each game unit is an independent RL agent.
The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.
It is demonstrated that Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local value function, can perform as well as or better than state-of-the-art joint-learning approaches on the popular multi-agent benchmark suite SMAC with little hyperparameter tuning.
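The core idea behind independent learning can be illustrated with a toy sketch: each agent maintains its own local value estimate and updates it from its own observations only, treating its teammates as part of the environment. The tabular TD learner below is a deliberately simplified stand-in for IPPO's learned value function; all names and numbers are illustrative, not the paper's implementation.

```python
# Minimal sketch of independent learning (illustrative, not IPPO itself):
# every agent keeps a private value function over its local observations
# and updates it with a one-step TD rule, ignoring the other agents.
from collections import defaultdict

class IndependentLearner:
    def __init__(self, lr=0.1, gamma=0.99):
        self.values = defaultdict(float)  # local value estimate V(obs)
        self.lr = lr
        self.gamma = gamma

    def update(self, obs, reward, next_obs):
        # One-step TD update using only this agent's local observation.
        target = reward + self.gamma * self.values[next_obs]
        self.values[obs] += self.lr * (target - self.values[obs])

# A team of agents learning independently from a shared team reward.
agents = [IndependentLearner() for _ in range(3)]
for agent in agents:
    agent.update(obs="o0", reward=1.0, next_obs="o1")
```

Each learner's update depends only on its own experience, which is what makes the approach "independent"; the surprise reported above is that this simple decentralised scheme remains competitive with joint-learning methods on SMAC.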
We present mlrMBO, a flexible and comprehensive R toolbox for model-based optimization (MBO), also known as Bayesian optimization, which addresses the problem of expensive black-box optimization by approximating the given objective function through a surrogate regression model. It is designed for both single- and multi-objective optimization with mixed continuous, categorical and conditional parameters. Additional features include multi-point batch proposal, parallelization, visualization, logging and error-handling. mlrMBO is implemented in a modular fashion, such that single components can be easily replaced or adapted by the user for specific use cases, e.g., any regression learner from the mlr toolbox for machine learning can be used, and infill criteria and infill optimizers are easily exchangeable. We empirically demonstrate that mlrMBO provides state-of-the-art performance by comparing it on different benchmark scenarios against a wide range of other optimizers, including DiceOptim, rBayesianOptimization, SPOT, SMAC, Spearmint, and Hyperopt.
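The model-based optimization loop that toolboxes like mlrMBO implement can be sketched in a few lines: fit a cheap surrogate to the evaluations made so far, optimize an infill criterion on the surrogate, then evaluate the proposed point on the expensive objective. The sketch below (in Python rather than R, purely for illustration) uses a quadratic least-squares surrogate and plain surrogate minimization as the infill rule; real MBO toolboxes use regression learners such as Gaussian processes and criteria such as expected improvement.

```python
# A minimal model-based optimization (MBO) loop, assuming a 1-D
# black-box objective. The objective and surrogate are illustrative.
import numpy as np

def expensive_objective(x):
    return (x - 2.0) ** 2 + 1.0  # stand-in for a costly black box

xs = [0.0, 1.0, 4.0]                       # initial design points
ys = [expensive_objective(x) for x in xs]  # their observed values

for _ in range(5):
    # Surrogate: quadratic least-squares fit to the observed points.
    coeffs = np.polyfit(xs, ys, deg=2)
    # Infill: minimize the surrogate over a dense candidate grid
    # (toolboxes like mlrMBO use criteria such as expected improvement).
    grid = np.linspace(-1.0, 5.0, 601)
    proposal = grid[np.argmin(np.polyval(coeffs, grid))]
    xs.append(float(proposal))
    ys.append(expensive_objective(proposal))

best = min(ys)  # best objective value found so far
```

Because every iteration costs only one true evaluation plus a cheap surrogate fit, the loop is economical when the black box is expensive, which is exactly the setting MBO targets.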
This paper analyses value-based methods, which are known to have superior performance in complex environments, and proposes a novel approach called MAVEN that hybridises value-based and policy-based methods by introducing a latent space for hierarchical control.
FACMAC is a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces that uses a centralised but factored critic, which combines per-agent utilities into the joint action-value function via a non-linear monotonic function.
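The monotonic factorisation that FACMAC (like QMIX before it) relies on can be shown with a toy mixing function: per-agent utilities are combined into a joint value through a function whose weights are constrained to be non-negative, so the joint value is monotone in every agent's utility. The function and numbers below are made up for illustration; the actual methods learn the mixing weights with a hypernetwork.

```python
# Toy sketch of monotonic value mixing (illustrative, not FACMAC itself):
# non-negative weights guarantee d(joint)/d(utility_i) >= 0 for every
# agent, so each agent's greedy action also maximizes the joint value.
import numpy as np

def monotonic_mix(utilities, weights, bias=0.0):
    w = np.abs(weights)                  # enforce non-negative weights
    return np.tanh(w @ utilities + bias) # non-linear yet still monotone

utilities = np.array([0.2, 0.5, 0.1])  # per-agent utilities (made up)
weights = np.array([-0.3, 0.8, 0.4])   # raw weights before abs()
joint = monotonic_mix(utilities, weights)
```

The monotonicity constraint is what lets decentralised agents act greedily on their own utilities while still maximizing the centrally learned joint value, the property the SMAC result on the next line revisits.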
The experimental results show that QMIX with normalized optimizations outperforms other methods in SMAC and, beyond the common wisdom from these works, that the monotonicity constraint can improve sample efficiency in SMAC and DEPP.
A new deterministic and efficient hyperparameter optimization method that employs radial basis functions as error surrogates, called HORD, which significantly outperforms the well-established Bayesian optimization methods such as GP, SMAC, and TPE.
A simple and fast variant of Planet Wars used as a test-bed for statistical planning based Game AI agents, and for noisy hyper-parameter optimisation, using the recently developed N-Tuple Bandit Evolutionary Algorithm.
The results indicate that Differential Evolution outperforms SMAC for most datasets when tuning a given machine learning algorithm - particularly when breaking ties in a first-to-report fashion.
This article proposes an approach, named SMIX, that uses off-policy training to avoid the greedy assumption commonly made in centralized value function (CVF) learning, and that can serve as a general tool for improving the overall performance of other centralized-training-with-decentralized-execution (CTDE) algorithms by enhancing their CVFs.