Large Batch Optimization for Deep Learning: Training BERT in 76 minutes (2019-04-01T00:00:00.000000Z)