This lecture covers the optimization problem underlying adversarial training, with attention to non-smooth activation functions and the structure of neural networks. It then turns to the practical implementation of stochastic adversarial training and its application to improving interpretability and fairness in machine learning. Building on the concept of directional derivatives, the lecture introduces the Wasserstein distance and neural-network distances inspired by the 1-Wasserstein distance, which lead to the formulation of Wasserstein GANs and to practical strategies for training them.
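The inner maximization step of adversarial training can be sketched in a few lines. The following is a minimal, illustrative example (not the lecture's own code): a projected-gradient (PGD-style) attack on a logistic-regression loss, where the gradient is available in closed form; all names (`w`, `x`, `y`, `epsilon`, the step sizes) are assumptions chosen for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_perturbation(w, x, y, epsilon=0.5, step=0.1, n_steps=20):
    """Find delta with ||delta||_inf <= epsilon maximizing the logistic loss.

    For the loss L = -log sigmoid(y * w.(x + delta)) with labels y in {-1, +1},
    the gradient with respect to delta is -y * sigmoid(-y * w.(x + delta)) * w.
    """
    delta = np.zeros_like(x)
    for _ in range(n_steps):
        margin = y * w @ (x + delta)
        grad = -y * sigmoid(-margin) * w           # gradient of the loss in delta
        delta += step * np.sign(grad)              # ascent step (maximize the loss)
        delta = np.clip(delta, -epsilon, epsilon)  # project back onto the L_inf ball
    return delta

w = np.array([1.0, -2.0])
x = np.array([0.5, 0.5])
delta = adversarial_perturbation(w, x, y=1)
# Each coordinate is pushed against the sign of y * w until it
# saturates the L_inf budget: delta == [-0.5, +0.5].
```

In full adversarial training this maximization is nested inside the usual training loop: for each minibatch one first computes `delta`, then takes a gradient step on the model parameters evaluated at the perturbed inputs `x + delta`.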
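The Wasserstein-GAN critic objective can likewise be illustrated in the simplest possible setting. The sketch below (an illustrative assumption, not the lecture's implementation) uses a linear critic `f(x) = w * x` on 1-D samples, with weight clipping `|w| <= 1` as a crude stand-in for the 1-Lipschitz constraint; maximizing `E[f(real)] - E[f(fake)]` then recovers the dual form of the 1-Wasserstein distance.

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(loc=2.0, scale=1.0, size=10_000)  # stand-in "data" samples
fake = rng.normal(loc=0.0, scale=1.0, size=10_000)  # stand-in "generator" samples

w = 0.0
lr = 0.05
for _ in range(200):
    # Critic objective: maximize E[f(real)] - E[f(fake)] = w * (mean gap).
    grad = real.mean() - fake.mean()
    w += lr * grad              # gradient ascent on the critic parameter
    w = np.clip(w, -1.0, 1.0)   # weight clipping enforces 1-Lipschitzness

# At the clipped optimum w = 1, the objective estimates W_1 between the two
# Gaussians, which for equal variances is just the gap between their means.
estimate = w * (real.mean() - fake.mean())
```

In a real WGAN the critic is a neural network trained for several steps per generator update, and later variants replace weight clipping with a gradient penalty; the alternating maximization over the critic and minimization over the generator is the same.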