ZeRO: Memory Optimizations Toward Training Trillion Parameter Models (2019-10-04T00:00:00.000000Z)