Melting Pot 2.0 (2022-11-24)

TL;DR

This report describes Melting Pot 2.0, which revises and expands on Melting Pot, introduces support for scenarios with asymmetric roles, and explains how to integrate them into the evaluation protocol.

Abstract

Multi-agent artificial intelligence research promises a path to develop intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence. It provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios. Each scenario pairs a physical environment (a "substrate") with a reference set of co-players (a "background population") to create a social situation with substantial interdependence between the individuals involved. For instance, some scenarios were inspired by institutional-economics-based accounts of natural resource management and public-good-provision dilemmas. Others were inspired by considerations from evolutionary biology, game theory, and artificial life. Melting Pot aims to cover a maximally diverse set of interdependencies and incentives. It includes the commonly studied extreme cases of perfectly competitive (zero-sum) motivations and perfectly cooperative (shared-reward) motivations, but does not stop with them. As in real life, a clear majority of scenarios in Melting Pot have mixed incentives. They are neither purely competitive nor purely cooperative, and thus demand that successful agents be able to navigate the resulting ambiguity. Here we describe Melting Pot 2.0, which revises and expands on Melting Pot. We also introduce support for scenarios with asymmetric roles, and explain how to integrate them into the evaluation protocol. This report also contains: (1) details of all substrates and scenarios; (2) a complete description of all baseline algorithms and results. Our intention is for it to serve as a reference for researchers using Melting Pot 2.0.
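To make the scenario structure concrete, the following is a minimal sketch in Python, not the Melting Pot API. The `Scenario` class, its field names, the role labels, and the scoring function `focal_mean_return` are all hypothetical names introduced here for illustration. The sketch assumes that each player slot carries a role (so roles may differ between slots, as in the asymmetric case), that slots not filled by the agents under test are filled by the background population, and that a scenario score is the mean return of the agents under test. That scoring rule is an assumption of this sketch rather than something stated in the abstract.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container: a substrate paired with a background population."""

    substrate: str                # name of the physical environment
    roles: Tuple[str, ...]        # one role label per player slot (assumed labels)
    is_focal: Tuple[bool, ...]    # True where an agent under test plays,
                                  # False where a background co-player plays

    def __post_init__(self) -> None:
        if len(self.roles) != len(self.is_focal):
            raise ValueError("roles and is_focal must have one entry per slot")
        if not any(self.is_focal):
            raise ValueError("at least one slot must hold an agent under test")


def focal_mean_return(scenario: Scenario, episode_returns: Sequence[float]) -> float:
    """Average the episode returns of the agents under test only.

    Returns earned by background co-players are ignored (assumed scoring rule).
    """
    if len(episode_returns) != len(scenario.is_focal):
        raise ValueError("need exactly one return per player slot")
    focal = [r for r, f in zip(episode_returns, scenario.is_focal) if f]
    return mean(focal)


# Usage: three slots with asymmetric (hypothetical) roles; slots 0 and 2 are tested.
example = Scenario(
    substrate="example_substrate",
    roles=("predator", "prey", "prey"),
    is_focal=(True, False, True),
)
score = focal_mean_return(example, [2.0, 10.0, 4.0])
```

Keeping the role labels on the scenario, rather than on the agents, is one way to let the same agent be evaluated in different roles. Whether Melting Pot 2.0 organizes this the same way is not stated in the abstract.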

Territory Inside Out: scenario 1

Running with Scissors in the Matrix Repeated: scenarios 0, 1, 2, 3, 4. The ACB exploiter was no better than random, so we trained an OPRE exploiter as a replacement.

Collaborative Cooking Asymmetric / Circuit / Cramped / Figure Eight: scenario 2

Predator Prey Random Forest: scenario 3