This lecture covers Stochastic Gradient Descent (SGD) as an optimization method for minimizing non-strongly convex functions. It explains the convergence rate of the SGD iterates and how the choice of step size affects convergence. The lecture also discusses the modified Huber loss function, how to compute it, and the optimal step size for running SGD on this loss. Finally, different convergence behaviors are compared and the reasons for the observed differences across scenarios are explained.