Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these limitations, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN-LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global mini-batch size of 128k, achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
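To make the coupling idea concrete, the sketch below shows one way a simulator-side stub for such a probabilistic execution protocol could look. This is a minimal illustration, not the paper's actual protocol or wire format: the message layout, class name, and JSON-over-TCP transport here are assumptions chosen for readability, whereas a production protocol would use an efficient cross-platform binary encoding.

```python
import json
import socket
import struct

# Hypothetical sketch of a probabilistic execution protocol client.
# The simulator pauses at every random draw, asks an inference engine
# running in another process (possibly another language) for a value,
# and reports observed data back to it. All names and the message
# format below are illustrative assumptions.

def _send(sock, msg):
    # Length-prefixed JSON frame (big-endian 4-byte length header).
    payload = json.dumps(msg).encode()
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv(sock):
    (length,) = struct.unpack(">I", sock.recv(4))
    buf = b""
    while len(buf) < length:
        buf += sock.recv(length - len(buf))
    return json.loads(buf)

class ProtocolClient:
    """Simulator-side stub that replaces the simulator's own RNG calls."""

    def __init__(self, host="localhost", port=29500):
        self.sock = socket.create_connection((host, port))

    def sample(self, address, distribution, params):
        # Ask the inference engine to choose (or replay) this draw.
        _send(self.sock, {"type": "sample", "address": address,
                          "distribution": distribution, "params": params})
        return _recv(self.sock)["value"]

    def observe(self, address, distribution, params, value):
        # Condition on data produced inside the simulator.
        _send(self.sock, {"type": "observe", "address": address,
                          "distribution": distribution, "params": params,
                          "value": value})
```

Routing every random draw through the engine is what lets the framework drive an unmodified simulator: an MCMC engine can rerun the program while replaying proposed values at each address, and an IC engine can substitute proposals from a trained neural network conditioned on the observations.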
Gilles Louppe, Lawrence Meadows, Lei Shao, Lukas Heinrich, W. Bhimji, Andreas Munk, Bradley Gram-Hansen, K. Cranmer, Frank D. Wood, Jialin Liu, Saeid Naderiparizi, Mingfei Ma, Xiaohui Zhao, V. Lee