Dayal Kalra
Kalra, D., & Barkeshli, M. (2023). Phase diagram of early training dynamics in deep neural networks: effect of the learning rate, depth, and width. In 37th Conference on Neural Information Processing Systems (NeurIPS). http://doi.org/10.5555/3666122.3668370 (Original work published September 2023)