Lecture

Optimization for Machine Learning: Stochastic Gradient Descent

Description

This lecture covers Stochastic Gradient Descent (SGD) and non-convex optimization in the context of machine learning. It presents the SGD algorithm, the unbiasedness of the stochastic gradient, and how the convergence rate of SGD compares with that of Gradient Descent. The lecture also examines the assumption of bounded stochastic gradients and the role of smoothness and bounded Hessians in the analysis. It further discusses the behavior of gradient descent on non-convex functions and the use of mini-batch SGD. The importance of strong convexity and the convergence properties of gradient descent on smooth functions are also explored.
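
Since the lecture centers on the SGD update rule and its mini-batch variant, the following minimal sketch may help fix ideas. It is not taken from the lecture itself: the least-squares objective, step size, batch size, and function names are illustrative assumptions.

```python
import numpy as np

def minibatch_sgd(X, y, lr=0.1, batch_size=32, epochs=50, seed=0):
    """Mini-batch SGD on the least-squares objective f(w) = 1/(2n) ||Xw - y||^2.

    The gradient computed on a uniformly sampled mini-batch is an unbiased
    estimate of the full gradient, which is the key property the convergence
    analysis of SGD relies on.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Stochastic gradient of the loss on the current mini-batch
            grad = Xb.T @ (Xb @ w - yb) / len(idx)
            # SGD update: w_{t+1} = w_t - gamma * g_t
            w -= lr * grad
    return w

if __name__ == "__main__":
    # Example usage on synthetic data (hypothetical setup for illustration)
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 5))
    w_true = rng.standard_normal(5)
    y = X @ w_true + 0.01 * rng.standard_normal(500)
    w_hat = minibatch_sgd(X, y)
    print("distance to true weights:", np.linalg.norm(w_hat - w_true))
```

With batch_size equal to 1 this reduces to plain SGD, and with batch_size equal to the dataset size it recovers full-batch Gradient Descent, which is the trade-off the lecture's convergence-rate comparison addresses.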
