This lecture discusses the challenges of Batch Normalization in ResNets and presents an alternative family of models, Normalizer-Free Networks (NFNets). It explores the benefits of BatchNorm and the drawbacks it introduces, then shows how NFNets address those drawbacks by downscaling the residual branch and by using adaptive gradient clipping, explicit regularization, and Scaled Weight Standardization. The presentation covers the impact of these modifications on signal propagation, large-batch training, implicit regularization, and mean-shift elimination in ReLU networks. It concludes with the ImageNet results of NFNets, which show better performance and efficiency than the existing models they are compared against.
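The summary names adaptive gradient clipping without stating the rule. As a minimal sketch, assuming the unit-wise formulation commonly associated with NFNets: the gradient of each output unit is rescaled whenever the ratio of its norm to the norm of that unit's weights exceeds a threshold. The function names, the row-per-unit layout, and the `clip` and `eps` values below are illustrative assumptions, not details taken from the lecture.

```python
import math


def unit_norms(matrix):
    """Return the L2 norm of each row (assumed layout: one row per output unit)."""
    return [math.sqrt(sum(x * x for x in row)) for row in matrix]


def adaptive_gradient_clip(weights, grads, clip=0.01, eps=1e-3):
    """Rescale each gradient row so that ||g_i|| <= clip * max(||w_i||, eps).

    `clip` and `eps` are hypothetical defaults chosen for illustration.
    `eps` keeps zero-initialised units from having their gradients clipped to zero.
    """
    clipped = []
    for w_norm, g_norm, g_row in zip(unit_norms(weights), unit_norms(grads), grads):
        max_norm = clip * max(w_norm, eps)
        if g_norm > max_norm:
            # Gradient is too large relative to the weights: shrink it onto the bound.
            scale = max_norm / g_norm
            clipped.append([g * scale for g in g_row])
        else:
            # Gradient is already within the bound: leave it unchanged.
            clipped.append(list(g_row))
    return clipped


weights = [[3.0, 4.0], [0.0, 0.0]]   # row norms: 5.0 and 0.0
grads = [[0.3, 0.4], [0.0, 0.001]]   # row norms: 0.5 and 0.001
clipped = adaptive_gradient_clip(weights, grads, clip=0.05)
```

Because the bound is relative to the weight norm, the same threshold applies to layers whose weights have very different scales, which a single global norm clip does not give you.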