This lecture covers Stochastic Gradient Descent (SGD) and non-convex optimization in the context of machine learning. It presents the SGD algorithm, the unbiasedness of stochastic gradients, and a comparison of the convergence rates of SGD and Gradient Descent. The lecture also examines the assumption of bounded stochastic gradients and the role of smoothness and bounded Hessians in optimization. It further discusses the behavior of gradient descent on non-convex functions and the use of mini-batch SGD, and it explores the importance of strong convexity and the convergence properties of gradient descent on smooth functions.
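As a rough illustration of the kind of algorithm discussed, the sketch below runs plain mini-batch SGD on a synthetic least-squares problem. The function name, step size, batch size, and data sizes are illustrative assumptions, not taken from the lecture; the point is only that a uniformly sampled mini-batch gradient is an unbiased estimate of the full gradient, which is the property the convergence analysis relies on.

```python
# Minimal sketch of mini-batch SGD on a least-squares objective.
# All names and hyperparameters below are illustrative, not from the lecture.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for f(w) = (1/n) * sum_i 0.5 * (x_i^T w - y_i)^2
n_samples, dim = 1000, 5
X = rng.normal(size=(n_samples, dim))
w_true = rng.normal(size=dim)
y = X @ w_true + 0.1 * rng.normal(size=n_samples)

def minibatch_sgd(X, y, lr=0.05, batch_size=32, n_steps=2000):
    """Plain mini-batch SGD; sampling indices uniformly makes the
    mini-batch gradient an unbiased estimate of the full gradient."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_steps):
        idx = rng.integers(0, n, size=batch_size)    # uniform sampling
        residual = X[idx] @ w - y[idx]
        grad = X[idx].T @ residual / batch_size      # stochastic gradient
        w -= lr * grad                               # SGD update
    return w

w_hat = minibatch_sgd(X, y)
print("distance to w_true:", np.linalg.norm(w_hat - w_true))
```

With batch_size equal to the full dataset this reduces to ordinary Gradient Descent, which is one way to see the trade-off the lecture compares: cheaper steps with noisier gradients versus exact but more expensive ones.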