Lecture

Stochastic Gradient Descent: Optimization Techniques

Description

This lecture covers Stochastic Gradient Descent (SGD) with averaging and compares it with Gradient Descent (GD). It explains the motivation for averaging the iterates to damp the oscillations that stochastic updates introduce, and discusses different averaging schemes and their impact on convergence rates. The lecture then turns to SGD in large-scale optimization and the advantages of variants such as Mini-batch SGD and SGD with Momentum, examines the challenges of non-convex stochastic optimization and how SGD performs in that setting, and concludes with sparse recovery and the Lasso formulation for solving non-smooth convex minimization problems.
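
To make the averaging idea concrete, the following is a minimal sketch of SGD with uniform (Polyak-Ruppert style) iterate averaging on a least-squares objective. The objective, function name, and parameter choices are illustrative assumptions and not taken from the lecture itself.

```python
import numpy as np

def sgd_with_averaging(A, b, step_size=0.01, n_iters=1000, seed=0):
    """Illustrative sketch: SGD with uniform iterate averaging on
    f(x) = (1/2n) * ||A x - b||^2, sampling one row per iteration.
    The lecture's exact setup and assumptions may differ.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)        # current iterate
    x_avg = np.zeros(d)    # running average of all iterates

    for t in range(1, n_iters + 1):
        i = rng.integers(n)                      # sample one data point uniformly
        grad = (A[i] @ x - b[i]) * A[i]          # stochastic gradient of (1/2)(a_i^T x - b_i)^2
        x = x - step_size * grad                 # plain SGD step
        x_avg += (x - x_avg) / t                 # incremental uniform average of iterates

    return x_avg  # averaged iterate, typically less noisy than the last iterate x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((500, 10))
    x_true = rng.standard_normal(10)
    b = A @ x_true + 0.1 * rng.standard_normal(500)
    x_hat = sgd_with_averaging(A, b, step_size=0.05, n_iters=5000)
    print("error of averaged iterate:", np.linalg.norm(x_hat - x_true))
```

Uniform averaging over all iterates is only one possible choice; variants such as averaging only the last iterates (tail averaging) or weighting recent iterates more heavily trade robustness to early noisy steps against faster adaptation, which is one way the choice of averaging scheme can affect the convergence rate discussed in the lecture.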
