Lecture

Optimization for Machine Learning: Stochastic Gradient Descent

Description

This lecture covers Stochastic Gradient Descent (SGD) and non-convex optimization in the context of machine learning. It presents the SGD algorithm, the unbiasedness of the stochastic gradient, and how the convergence rate of SGD compares with that of Gradient Descent. The lecture also examines the assumption of bounded stochastic gradients and the role of smoothness and bounded Hessians in the analysis. It further discusses the behavior of gradient descent on non-convex functions and the use of mini-batch SGD. The importance of strong convexity and the convergence properties of gradient descent on smooth functions are also explored.
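
Since the lecture centers on the SGD update rule and its mini-batch variant, the following minimal sketch may help fix ideas. It is not taken from the lecture itself: the least-squares objective, step size, batch size, and function names are illustrative assumptions.

```python
import numpy as np

def minibatch_sgd(X, y, lr=0.1, batch_size=32, epochs=50, seed=0):
    """Mini-batch SGD on the least-squares objective f(w) = 1/(2n) ||Xw - y||^2.

    The gradient computed on a uniformly sampled mini-batch is an unbiased
    estimate of the full gradient, which is the key property the convergence
    analysis of SGD relies on.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Stochastic gradient of the loss on the current mini-batch
            grad = Xb.T @ (Xb @ w - yb) / len(idx)
            # SGD update: w_{t+1} = w_t - gamma * g_t
            w -= lr * grad
    return w

if __name__ == "__main__":
    # Example usage on synthetic data (hypothetical setup for illustration)
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 5))
    w_true = rng.standard_normal(5)
    y = X @ w_true + 0.01 * rng.standard_normal(500)
    w_hat = minibatch_sgd(X, y)
    print("distance to true weights:", np.linalg.norm(w_hat - w_true))
```

With batch_size equal to 1 this reduces to plain SGD, and with batch_size equal to the dataset size it recovers full-batch Gradient Descent, which is the trade-off the lecture's convergence-rate comparison addresses.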
