,
We propose a metric for evaluating the generalization ability of deep neural networks trained with mini-batch gradient descent. Our metric, called gradient disparity, is the l2 norm distance between the gradient vectors of two mini-batches drawn from the t ...
Springer2021