This lecture explores structure in non-convex optimization, with a focus on scalable optimization for deep learning. Topics include optimization formulations of deep learning training problems, stochastic gradient descent (SGD) and its variants, the classification of critical points, the strict saddle property, convergence of SGD, avoidance of saddle points, the speed of convergence to local minimizers, and the optimization landscape of overparametrized neural networks. A minimal sketch of the basic SGD update follows below.
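As a point of reference for the topics above, here is a minimal sketch (not taken from the lecture) of the stochastic gradient descent update rule and its momentum variant, applied to a toy least-squares problem; the function names `sgd` and the data setup are illustrative assumptions, not part of the lecture material.

```python
# Minimal sketch of SGD: w <- w - lr * grad f_i(w), with optional heavy-ball momentum.
import numpy as np

def sgd(grad_fn, w0, data, lr=0.1, momentum=0.0, epochs=20, seed=0):
    """Sample one example per step and take a (momentum) gradient step."""
    rng = np.random.default_rng(seed)
    w, v = w0.copy(), np.zeros_like(w0)
    n = len(data)
    for _ in range(epochs):
        for i in rng.permutation(n):      # one pass over shuffled examples
            g = grad_fn(w, data[i])       # stochastic gradient at example i
            v = momentum * v + g          # velocity (momentum = 0 gives plain SGD)
            w = w - lr * v                # parameter update
    return w

# Toy problem: least squares with per-example loss f_i(w) = 0.5 * (x_i . w - y_i)^2.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)
data = list(zip(X, y))

grad = lambda w, ex: (ex[0] @ w - ex[1]) * ex[0]   # gradient of the one-example loss
w_hat = sgd(grad, np.zeros(3), data, lr=0.05, momentum=0.9)
print(np.round(w_hat, 2))  # should recover something close to w_true
```

This toy objective is convex, so it only illustrates the update rule itself; the lecture's interest is in how the same iteration behaves on non-convex losses, e.g. near saddle points and local minimizers.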