Delves into the fundamental limits of gradient-based learning on neural networks, covering topics such as the binomial theorem, the exponential series, and moment-generating functions.
Discusses the Dirichlet distribution and Bayesian inference in the Dirichlet-Multinomial model, including conjugate priors, the posterior mean and variance, and the predictive distribution.
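
As a minimal sketch of the Dirichlet-Multinomial quantities listed above, the following assumes a symmetric Dirichlet(1, 1, 1) prior and a made-up count vector. Both are illustrative choices, and the function names are hypothetical helpers rather than any library's API. It uses the standard conjugacy results: the posterior is Dirichlet(alpha + counts), and the predictive probability of the next single observation equals the posterior mean.

```python
def dirichlet_posterior(alpha, counts):
    """Conjugate update: Dirichlet(alpha) prior plus multinomial counts
    gives a Dirichlet(alpha + counts) posterior."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    return [a + n for a, n in zip(alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of Dirichlet(alpha): alpha_i / alpha_0, where alpha_0 = sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]


def dirichlet_variance(alpha):
    """Marginal variance: alpha_i * (alpha_0 - alpha_i) / (alpha_0^2 * (alpha_0 + 1))."""
    total = sum(alpha)
    return [a * (total - a) / (total ** 2 * (total + 1)) for a in alpha]


# Illustrative inputs (assumptions, not taken from the text).
prior = [1.0, 1.0, 1.0]   # symmetric Dirichlet prior over 3 categories
counts = [3, 1, 0]        # observed multinomial counts

posterior = dirichlet_posterior(prior, counts)
post_mean = dirichlet_mean(posterior)
post_var = dirichlet_variance(posterior)

# For one future draw, the predictive distribution coincides with the posterior mean.
predictive = post_mean

print("posterior parameters:", posterior)
print("posterior mean:", post_mean)
print("posterior variance:", post_var)
```

Note that the category with zero observed counts still receives nonzero predictive probability, because the prior pseudo-counts are added to the data before normalizing.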