Lecture
This lecture introduces policy gradient methods using a simple example of a single neuron with binary output. The lecture covers how to associate actions with stimuli, optimize rewards directly, and change neuron weights to maximize expected rewards.