This lecture covers the optimality of the convergence rate of gradient descent, acceleration techniques, and the stochastic gradient descent algorithm. It discusses the convergence rate of gradient descent for convex functions, information-theoretic lower bounds, accelerated gradient descent algorithms, global convergence properties, and variable metric gradient descent algorithms. The lecture also explores adaptive first-order methods, including AdaGrad, its convergence rates, and AcceleGrad, and compares the performance of optimization algorithms on convex and non-convex problems.
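As a rough illustration of two of the methods named above, the following minimal sketch contrasts plain gradient descent with a diagonal AdaGrad update, in which each coordinate's step is divided by the square root of that coordinate's accumulated squared gradients. The quadratic test objective, the starting point, the step sizes, and the iteration count are all assumptions chosen for illustration; they are not taken from the lecture.

```python
import math

def f(x):
    # Hypothetical ill-conditioned convex objective: 0.5 * (x0^2 + 100 * x1^2).
    return 0.5 * (x[0] ** 2 + 100.0 * x[1] ** 2)

def grad(x):
    # Gradient of the objective above.
    return [x[0], 100.0 * x[1]]

def gradient_descent(x0, step, iters):
    # Plain gradient descent with a fixed step size.
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def adagrad(x0, step, iters, eps=1e-8):
    # Diagonal AdaGrad: per-coordinate steps scaled by the inverse square
    # root of the running sum of squared gradients.
    x = list(x0)
    acc = [0.0] * len(x)
    for _ in range(iters):
        g = grad(x)
        acc = [a + gi * gi for a, gi in zip(acc, g)]
        x = [xi - step * gi / (math.sqrt(a) + eps)
             for xi, gi, a in zip(x, g, acc)]
    return x

x_start = [1.0, 1.0]
# Step 1/L with L = 100, the largest curvature of this assumed objective.
x_gd = gradient_descent(x_start, step=0.01, iters=500)
# The AdaGrad base step of 0.5 is an arbitrary illustrative choice.
x_ada = adagrad(x_start, step=0.5, iters=500)
print("gradient descent:", f(x_gd))
print("AdaGrad:         ", f(x_ada))
```

On this particular objective, gradient descent is limited by the step size dictated by the steepest coordinate, while the per-coordinate scaling in AdaGrad treats both coordinates almost identically. This is a property of the toy problem and should not be read as a general performance claim.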