This paper proposes a simple and effective remedy, SGDP and AdamP: remove the radial component, i.e. the norm-increasing direction, from the update at each optimizer step. Owing to the scale invariance of the weights, this modification only alters the effective step sizes without changing the effective update directions, thus enjoying the original convergence properties of GD optimizers.
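As a minimal sketch of this projection step (not the authors' reference implementation; the helper name `remove_radial_component` and its signature are assumptions for illustration), the radial part of an update can be subtracted from the gradient of a weight tensor as follows:

```python
import torch

def remove_radial_component(weight, grad, eps=1e-8):
    """Return the part of `grad` orthogonal to `weight`.

    Subtracting the projection of the gradient onto the weight vector
    discards the radial (norm-increasing) direction, so the remaining
    update does not grow ||weight|| to first order.
    """
    w = weight.view(-1)
    g = grad.view(-1)
    # Radial component: projection of g onto w / ||w||.
    radial = (torch.dot(w, g) / (w.norm() ** 2 + eps)) * w
    return (g - radial).view_as(grad)
```

In practice the projection would be applied per scale-invariant parameter tensor inside the optimizer's update rule; this sketch only illustrates the geometric operation itself.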