Workshop: Information-Theoretic Methods for Rigorous, Responsible, and Reliable Machine Learning (ITR3)

Towards a Unified Information-Theoretic Framework for Generalization

Mahdi Haghifam · Gintare Karolina Dziugaite · Shay Moran


In this work, we investigate the expressiveness of the "conditional mutual information" (CMI) framework of Steinke and Zakynthinou (2020) and the prospect of using it to provide a unified framework for proving generalization bounds, with a focus on the realizable setting. We first demonstrate that one can use this framework to express non-trivial (but sub-optimal) bounds for any learning algorithm that outputs hypotheses from a class of bounded VC dimension. Then, we prove that the CMI framework yields the optimal bound on the expected risk of Support Vector Machines (SVMs) for learning halfspaces. This result is an application of our general result showing that stable compression schemes (Bousquet et al., 2020) of size k have uniformly bounded CMI of order O(k). We further show that an inherent limitation of proper learning of VC classes contradicts the existence of a proper learner with constant CMI, which implies a negative resolution to an open problem of Steinke and Zakynthinou (2020). Finally, we study the CMI of empirical risk minimizers (ERMs) for a class H and show that it is possible to output all consistent classifiers (the version space) with bounded CMI if and only if H has a bounded star number (Hanneke and Yang, 2015).
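
For readers unfamiliar with the quantity, the following is a brief sketch of how CMI is usually defined in Steinke and Zakynthinou (2020). The abstract does not fix any notation, so the symbols here (the supersample \tilde{Z}, the selector U, the data distribution D, the algorithm A, and the losses L_D and L_S) are assumptions chosen for illustration, and the bound shown is the general bounded-loss bound from that paper, not one of the sharper realizable-case results of this work.

```latex
% Supersample: n pairs of i.i.d. draws from the data distribution D.
\tilde{Z} \in \mathcal{Z}^{n \times 2}, \qquad \tilde{Z}_{i,j} \overset{\text{i.i.d.}}{\sim} D

% Selector: picks one element of each pair, independently of \tilde{Z}.
U \sim \mathrm{Unif}\bigl(\{0,1\}^n\bigr), \qquad
S = \tilde{Z}_U = \bigl(\tilde{Z}_{i,U_i}\bigr)_{i=1}^{n}

% CMI of algorithm A with respect to D (measured in nats).
\mathrm{CMI}_D(A) = I\bigl(A(\tilde{Z}_U);\, U \,\big|\, \tilde{Z}\bigr),
\qquad 0 \le \mathrm{CMI}_D(A) \le n \log 2

% Generalization bound for a loss taking values in [0,1].
\Bigl|\, \mathbb{E}\bigl[ L_D(A(S)) - L_S(A(S)) \bigr] \Bigr|
\;\le\; \sqrt{\frac{2\,\mathrm{CMI}_D(A)}{n}}
```

Under this definition, a CMI of order O(k) for a size-k stable compression scheme is small compared with the trivial upper bound n log 2 whenever k is much smaller than n.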
