This lecture covers the transition from stochastic gradient descent to non-smooth optimization, focusing on topics such as sparsity, compressive sensing, and atomic norms. It then turns to stochastic programming, synthetic least-squares problems, and the convergence of SGD for strongly convex problems. The instructor explains how step-size selection and iterate averaging affect the convergence behavior of SGD.
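As a rough illustration of the step-size and averaging discussion, the sketch below (not code from the lecture; the problem dimensions, step-size schedules, and variable names are all assumptions made for this example) runs SGD on a synthetic least-squares instance and compares the last iterate against a Polyak-Ruppert running average under two common step-size schedules.

```python
# Minimal sketch, assuming a synthetic least-squares setup:
# SGD with decaying step sizes and a running (Polyak-Ruppert) average.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic strongly convex least-squares problem: min_x (1/2n)||Ax - b||^2
n, d = 1000, 20
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]   # reference solution

def sgd(num_steps, step_size_fn):
    """Run SGD sampling one row per step; return last and averaged iterates."""
    x = np.zeros(d)
    x_avg = np.zeros(d)
    for t in range(1, num_steps + 1):
        i = rng.integers(n)
        grad = (A[i] @ x - b[i]) * A[i]          # stochastic gradient of (1/2)(a_i^T x - b_i)^2
        x = x - step_size_fn(t) * grad
        x_avg += (x - x_avg) / t                 # running average of the iterates
    return x, x_avg

for label, eta in [("eta_t = 1/t", lambda t: 1.0 / t),
                   ("eta_t = 0.1/sqrt(t)", lambda t: 0.1 / np.sqrt(t))]:
    x_last, x_bar = sgd(20000, eta)
    print(f"{label}:  ||x_last - x*|| = {np.linalg.norm(x_last - x_star):.4f}  "
          f"||x_avg - x*|| = {np.linalg.norm(x_bar - x_star):.4f}")
```

Running the script prints the distance of the last iterate and of the averaged iterate to the least-squares solution for each schedule, which makes the effect of the step-size choice and of averaging easy to compare empirically.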