Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these limitations, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN-LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global mini-batch size of 128k, achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
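To make the coupling idea concrete, the sketch below shows one way a simulator-side stub for such a probabilistic execution protocol could look. This is a minimal illustration, not the paper's actual protocol or wire format: the message layout, class name, and JSON-over-TCP transport here are assumptions chosen for readability, whereas a production protocol would use an efficient cross-platform binary encoding.

```python
import json
import socket
import struct

# Hypothetical sketch of a probabilistic execution protocol client.
# The simulator pauses at every random draw, asks an inference engine
# running in another process (possibly another language) for a value,
# and reports observed data back to it. All names and the message
# format below are illustrative assumptions.

def _send(sock, msg):
    # Length-prefixed JSON frame (big-endian 4-byte length header).
    payload = json.dumps(msg).encode()
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv(sock):
    (length,) = struct.unpack(">I", sock.recv(4))
    buf = b""
    while len(buf) < length:
        buf += sock.recv(length - len(buf))
    return json.loads(buf)

class ProtocolClient:
    """Simulator-side stub that replaces the simulator's own RNG calls."""

    def __init__(self, host="localhost", port=29500):
        self.sock = socket.create_connection((host, port))

    def sample(self, address, distribution, params):
        # Ask the inference engine to choose (or replay) this draw.
        _send(self.sock, {"type": "sample", "address": address,
                          "distribution": distribution, "params": params})
        return _recv(self.sock)["value"]

    def observe(self, address, distribution, params, value):
        # Condition on data produced inside the simulator.
        _send(self.sock, {"type": "observe", "address": address,
                          "distribution": distribution, "params": params,
                          "value": value})
```

Routing every random draw through the engine is what lets the framework drive an unmodified simulator: an MCMC engine can rerun the program while replaying proposed values at each address, and an IC engine can substitute proposals from a trained neural network conditioned on the observations.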
Gilles Louppe, Lawrence Meadows, Lei Shao, Lukas Heinrich, W. Bhimji, Andreas Munk, Bradley Gram-Hansen, K. Cranmer, Frank D. Wood, Jialin Liu, Saeid Naderiparizi, Mingfei Ma, Xiaohui Zhao, V. Lee