Lecture

Subtracting the mean reward via the value function