This lecture explores the fundamental limits of gradient-based learning on neural networks. Topics covered include the binomial theorem, exponential series, moment-generating functions, chi-squared random variables, and the multinomial distribution. The lecture also examines properties of cumulative distribution functions, conditional probability functions, and exponential families, with worked examples and calculations throughout.
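As a quick orientation to the probabilistic tools named above (these are standard definitions, not specific results taken from the lecture), the moment-generating function of a random variable $X$ and its closed form for a chi-squared variable with $k$ degrees of freedom are

$$
M_X(t) = \mathbb{E}\!\left[e^{tX}\right], \qquad
X \sim \chi^2_k \;\Rightarrow\; M_X(t) = (1 - 2t)^{-k/2}, \quad t < \tfrac{1}{2},
$$

while the binomial theorem and the exponential series, the two expansions listed first, read

$$
(x + y)^n = \sum_{j=0}^{n} \binom{n}{j} x^{j} y^{\,n-j}, \qquad
e^{x} = \sum_{j=0}^{\infty} \frac{x^{j}}{j!}.
$$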