Covers Markov processes, transition densities, and conditional distributions given the available information, along with the classification of states and stationary distributions.
Delves into the fundamental limits of gradient-based learning on neural networks, covering topics such as the binomial theorem, exponential series, and moment-generating functions.
Introduces Hidden Markov Models, explaining the basic problems and algorithms like Forward-Backward, Viterbi, and Baum-Welch, with a focus on Expectation-Maximization.
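
As a rough illustration of the Viterbi algorithm named in the Hidden Markov Models summary, the following minimal sketch finds the most likely hidden state sequence for a toy two-state model. The function name `viterbi`, the parameter names (`start_p`, `trans_p`, `emit_p`), and the toy probabilities are assumptions chosen for illustration and are not taken from the lecture itself.

```python
def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best_path, best_prob) for a sequence of observation indices.

    start_p[i]    : P(first state = i)
    trans_p[i][j] : P(next state = j | current state = i)
    emit_p[i][o]  : P(observation = o | state = i)
    """
    n_states = len(start_p)
    # delta[i]: probability of the best path that ends in state i at the current step
    delta = [start_p[i] * emit_p[i][obs[0]] for i in range(n_states)]
    backptr = []  # backptr[t][j]: best predecessor of state j at step t + 1

    for o in obs[1:]:
        prev = delta
        delta, step_ptr = [], []
        for j in range(n_states):
            # Pick the predecessor that maximizes the path probability into j
            best_i = max(range(n_states), key=lambda i: prev[i] * trans_p[i][j])
            step_ptr.append(best_i)
            delta.append(prev[best_i] * trans_p[best_i][j] * emit_p[j][o])
        backptr.append(step_ptr)

    # Trace back from the most probable final state
    last = max(range(n_states), key=lambda i: delta[i])
    path = [last]
    for step_ptr in reversed(backptr):
        path.append(step_ptr[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy model (illustrative values): states 0 = "healthy", 1 = "fever";
# observations 0 = "normal", 1 = "cold", 2 = "dizzy".
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

path, prob = viterbi([0, 1, 2], start_p, trans_p, emit_p)
print(path, prob)
```

For longer sequences the products underflow, so a practical version would usually work with log-probabilities and sums instead of raw probabilities and products.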