Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

Published in

arXiv.org (2020)

TL;DR

Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local value function, is shown to perform as well as or better than state-of-the-art joint learning approaches on the popular multi-agent benchmark suite SMAC, with little hyperparameter tuning.

Abstract

Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function. In this paper, we demonstrate that, despite its various theoretical shortcomings, Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local value function, can perform just as well as or better than state-of-the-art joint learning approaches on the popular multi-agent benchmark suite SMAC with little hyperparameter tuning. We also compare IPPO to several variants; the results suggest that IPPO's strong performance may be due to its robustness to some forms of environment non-stationarity.
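
The idea of each agent optimising PPO against its own local value estimate can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the batch layout, the clipping range of 0.2, the value-loss weight of 0.5, and the plain return-minus-value advantage are all assumptions chosen for illustration.

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped PPO surrogate (a loss to minimise) for one agent's batch."""
    ratio = np.exp(new_logp - old_logp)  # pi_new(a|o) / pi_old(a|o)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))

def independent_ppo_losses(batches, clip_eps=0.2, value_coef=0.5):
    """Return one loss per agent, each computed from that agent's data only.

    `batches` is a hypothetical list of dicts (one per agent) holding arrays
    under the keys "new_logp", "old_logp", "returns" and "values", where
    "values" are the agent's own local value estimates.
    """
    losses = []
    for b in batches:
        # Local advantage: no joint or centralised value function is used.
        advantages = b["returns"] - b["values"]
        policy_loss = ppo_clip_loss(b["new_logp"], b["old_logp"], advantages, clip_eps)
        value_loss = np.mean((b["returns"] - b["values"]) ** 2)
        losses.append(policy_loss + value_coef * value_loss)
    return losses

# Two agents with unrelated local batches (made-up numbers).
agent_batches = [
    {"new_logp": np.array([-1.0, -2.0]), "old_logp": np.array([-1.0, -2.0]),
     "returns": np.array([1.0, 2.0]), "values": np.array([0.5, 1.5])},
    {"new_logp": np.array([-0.5]), "old_logp": np.array([-0.5]),
     "returns": np.array([0.0]), "values": np.array([0.0])},
]
per_agent_losses = independent_ppo_losses(agent_batches)
```

Because no term in one agent's loss depends on another agent's observations, actions, or value estimates, each loss could be minimised by a separate optimiser; whether network parameters are shared across agents is a separate design choice that this sketch leaves open.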

Authors

C. S. D. Witt

6 papers

Shimon Whiteson

13 papers

Philip H. S. Torr

52 papers

References (35 items)

Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms on a Building Energy Demand Coordination Task

Revisiting Design Choices in Proximal Policy Optimization

Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control

Authors (continued)

Viktor Makoviychuk

2 papers

Tarun Gupta

1 paper

Denys Makoviichuk

1 paper

Mingfei Sun

1 paper

References (continued)

MAVEN: Multi-Agent Variational Exploration

Deep Coordination Graphs

Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning

QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning

Truly Proximal Policy Optimization

The StarCraft Multi-Agent Challenge

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Guided Deep Reinforcement Learning for Swarm Systems

StarCraft II: A New Challenge for Reinforcement Learning

Proximal Policy Optimization Algorithms

Value-Decomposition Networks For Cooperative Multi-Agent Learning

Counterfactual Multi-Agent Policy Gradients

A Concise Introduction to Decentralized POMDPs

Multi-agent reinforcement learning as a rehearsal for decentralized planning

Asynchronous Methods for Deep Reinforcement Learning

High-Dimensional Continuous Control Using Generalized Advantage Estimation

Trust Region Policy Optimization

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

Multiagent Learning: Basics, Challenges, and Prospects

An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination

A Comprehensive Survey of Multiagent Reinforcement Learning

Distributed agent-based air traffic flow management

Cooperative Multi-Agent Learning: The State of the Art

The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems

A Complexity Analysis of Cooperative Mechanisms in Reinforcement Learning

Reinforcement Learning, second edition: An Introduction. Bradford Books, Cambridge, Massachusetts

Reinforcement Learning, second edition: An Introduction

Lenient Learning in Independent-Learner Stochastic Cooperative Games

Multi Agent Reinforcement Learning Independent vs Cooperative Agents

Value Function Clipping for PPO · Issue #136 · ikostrikov/pytorch-a2c-ppo-acktr-gail

Field of Study

Computer Science

Journal Information

Name

ArXiv

Volume

abs/2005.00687

Venue Information

Name

arXiv.org

Type

URL

https://arxiv.org

Alternate Names

ArXiv