Explores optimization methods for training machine learning models, such as gradient descent and subgradient methods, including adaptive techniques like Adam.
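As a concrete illustration of the Adam update, here is a minimal NumPy sketch; the step size, beta parameters, and the toy quadratic objective are illustrative assumptions, not taken from the lecture itself:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and its
    square, with bias correction for the early steps."""
    m = beta1 * m + (1 - beta1) * grad            # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2       # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                  # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)   # adaptive per-coordinate step
    return w, m, v

# Toy usage: minimize f(w) = ||w||^2, whose gradient is 2w.
w = np.array([5.0, -3.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 1001):
    w, m, v = adam_step(w, 2 * w, m, v, t)
print(w)  # close to the minimizer at the origin
```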
Covers gradient descent methods for convex and non-convex problems, including smooth unconstrained convex minimization and maximum likelihood estimation, with examples such as ridge regression and image classification.
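A minimal sketch of gradient descent on the ridge regression objective f(w) = (1/2n)||Xw − y||² + (λ/2)||w||²; the synthetic data, step size, and λ below are illustrative assumptions:

```python
import numpy as np

def ridge_gd(X, y, lam=0.1, lr=0.1, n_iters=500):
    """Gradient descent on f(w) = (1/2n)||Xw - y||^2 + (lam/2)||w||^2."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / n + lam * w  # gradient of the ridge objective
        w -= lr * grad                           # fixed-step gradient descent update
    return w

# Synthetic regression problem.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(100)
w_hat = ridge_gd(X, y)
```

Because the ridge objective is strongly convex, a fixed step size below 2/L (L the smoothness constant) gives linear convergence to the unique minimizer.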
Discusses stochastic gradient descent (SGD) and its application to non-convex optimization, focusing on convergence rates and practical challenges in machine learning.
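A minimal minibatch SGD sketch; the least-squares loss, data, step size, and batch size are illustrative assumptions (the loss here is convex, used only to show the mechanics, while the convergence discussion also covers smooth non-convex objectives, where SGD reaches approximate stationary points at an O(1/√T) rate in expectation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200)

def grad_fn(w, batch):
    """Unbiased minibatch estimate of the full mean-squared-error gradient."""
    Xb, yb = X[batch], y[batch]
    return Xb.T @ (Xb @ w - yb) / len(batch)

def sgd(w0, lr=0.05, n_epochs=20, batch_size=10):
    """Minibatch SGD: shuffle each epoch, step along a stochastic gradient."""
    w = w0.copy()
    n = X.shape[0]
    for _ in range(n_epochs):
        for batch in rng.permutation(n).reshape(-1, batch_size):
            w -= lr * grad_fn(w, batch)
    return w

w_hat = sgd(np.zeros(5))
```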