Workshop: Information-Theoretic Methods for Rigorous, Responsible, and Reliable Machine Learning (ITR3)

Learning under Distribution Mismatch and Model Misspecification

Mohammad Saeed Masiha · Mohammad Reza Aref


We study learning algorithms when the training and test datasets are drawn from different distributions. We quantify the effect of this mismatch on the generalization error and on model misspecification. Moreover, we establish a connection between the generalization error and rate-distortion theory, which allows bounds from rate-distortion theory to be used to derive new bounds on the generalization error, and vice versa. In particular, the rate-distortion-based bound strictly improves on the earlier bound of Xu and Raginsky, even when there is no mismatch.
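
For reference, the bound of Xu and Raginsky mentioned above is usually stated as follows for the case with no mismatch. The notation is an assumption made here for illustration and is not taken from this abstract: S is a training set of n i.i.d. samples from a distribution \mu, W is the output of the learning algorithm P_{W|S}, the loss \ell(w, Z) is \sigma-subgaussian under \mu for every w, and I(S; W) is the mutual information between the training set and the output.

```latex
% Generalization error: expected gap between population risk and empirical risk
\operatorname{gen}(\mu, P_{W|S}) = \mathbb{E}\left[ L_\mu(W) - L_S(W) \right]

% Mutual-information bound (Xu and Raginsky), assuming a \sigma-subgaussian loss
\left| \operatorname{gen}(\mu, P_{W|S}) \right| \le \sqrt{ \frac{2\sigma^2}{n} \, I(S; W) }
```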
