The following problem is considered: given a joint distribution $P_{XY}$ and an event $E$, bound $P_{XY}(E)$ in terms of $P_X P_Y(E)$ (where $P_X P_Y$ is the product of the marginals of $P_{XY}$) and a measure of the dependence between $X$ and $Y$. Such bounds have direct applications in the analysis of the generalization error of learning algorithms, where $E$ represents a large-error event and the measure of dependence controls the degree of overfitting. Herein, bounds are demonstrated using several information-theoretic metrics, in particular: mutual information, lautum information, maximal leakage, and $J_\infty$. The mutual-information bound can outperform comparable bounds in the literature by an arbitrarily large factor.
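As a concrete illustration of the kind of bound considered, the following is a minimal numerical sketch (not the strengthened bounds of this work): on a toy $3\times 3$ joint distribution it checks the standard mutual-information route, obtained by applying the data-processing inequality for KL divergence to the indicator of $E$, namely $d\big(P_{XY}(E)\,\|\,P_X P_Y(E)\big) \le I(X;Y)$, together with the weaker explicit consequence $P_{XY}(E) \le \big(I(X;Y) + \log 2\big)/\log\!\big(1/P_X P_Y(E)\big)$. The distribution, the event, and all names are illustrative assumptions, not taken from the source.

```python
import numpy as np

def binary_kl(p, q):
    """Binary KL divergence d(p || q) in nats; assumes 0 < q < 1."""
    total = 0.0
    for a, b in ((p, q), (1 - p, 1 - q)):
        if a > 0:
            total += a * np.log(a / b)
    return total

# Toy joint distribution P_XY on a 3x3 alphabet (rows: x, columns: y) -- an assumption.
P_xy = np.array([[0.20, 0.05, 0.05],
                 [0.05, 0.20, 0.05],
                 [0.05, 0.05, 0.30]])
P_x = P_xy.sum(axis=1)
P_y = P_xy.sum(axis=0)
P_prod = np.outer(P_x, P_y)          # product of marginals, P_X P_Y

# Mutual information I(X;Y) = D(P_XY || P_X P_Y), in nats.
mask = P_xy > 0
I_xy = np.sum(P_xy[mask] * np.log(P_xy[mask] / P_prod[mask]))

# An arbitrary event E given as a set of (x, y) pairs (here the "diagonal") -- an assumption.
E = [(0, 0), (1, 1), (2, 2)]
p = sum(P_xy[x, y] for x, y in E)      # P_XY(E)
q = sum(P_prod[x, y] for x, y in E)    # P_X P_Y(E)

# Data-processing inequality for the indicator of E: d(p || q) <= I(X;Y).
assert binary_kl(p, q) <= I_xy + 1e-12

# Standard explicit (looser) consequence, valid when P_X P_Y(E) < 1:
#   P_XY(E) <= (I(X;Y) + log 2) / log(1 / P_X P_Y(E)).
bound = (I_xy + np.log(2)) / np.log(1 / q)
print(f"P_XY(E) = {p:.4f},  P_X P_Y(E) = {q:.4f}")
print(f"I(X;Y) = {I_xy:.4f} nats,  MI-based bound on P_XY(E): {bound:.4f}")
```

On this example the bound is nontrivial (strictly below 1) while $P_{XY}(E)$ exceeds $P_X P_Y(E)$; the strengthened bounds referred to in the text are sharper than this standard form.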