Plenary Speaker
in High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning
Generalization Error of min-norm interpolators in transfer learning
Pragya Sur
Min-norm interpolators naturally emerge as implicitly regularized limits of modern machine learning algorithms. Recently, their out-of-distribution risk was studied in settings where test samples are unavailable during training. In many applications, however, a limited amount of test data is available during training. Properties of min-norm interpolation in this setting are not well understood. In this talk, I will present a characterization of the bias and variance of pooled min-L2-norm interpolation under covariate and model shifts. I will show that the pooled interpolator captures both early fusion and a form of intermediate fusion. Our results have several implications. For example, under model shift, adding data always hurts prediction when the signal-to-noise ratio is low. For higher signal-to-noise ratios, however, transfer learning helps as long as the shift-to-signal ratio lies below a threshold that I will define. I will further present data-driven methods to determine (i) when the pooled interpolator outperforms the target-based interpolator and (ii) the optimal number of target samples that minimizes the generalization error. Our results also show that under covariate shift, if the source sample size is small relative to the dimension, heterogeneity between domains reduces the risk. Time permitting, I will introduce a novel anisotropic local law that enables some of these characterizations and may be of independent interest in random matrix theory. This is based on joint work with Yanke Song and Sohom Bhattacharya.
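To make the central object concrete, the following is a minimal simulation sketch (not from the talk) comparing the pooled min-L2-norm interpolator, fit on the combined source and target samples, against the target-only interpolator under a model shift. All parameter values (dimension, sample sizes, noise level, shift size) are illustrative assumptions.

    # Minimal sketch: pooled vs. target-only min-L2-norm interpolation
    # under model shift. Parameter values are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    p, n_s, n_t = 500, 300, 100   # overparameterized: p > n_s + n_t
    sigma = 0.5                   # noise level (assumed)
    beta_t = rng.normal(size=p) / np.sqrt(p)                  # target signal
    beta_s = beta_t + 0.3 * rng.normal(size=p) / np.sqrt(p)   # model shift

    def sample(n, beta):
        # Isotropic Gaussian covariates (no covariate shift in this sketch)
        X = rng.normal(size=(n, p))
        y = X @ beta + sigma * rng.normal(size=n)
        return X, y

    X_s, y_s = sample(n_s, beta_s)
    X_t, y_t = sample(n_t, beta_t)

    def min_norm(X, y):
        # lstsq returns the minimum-L2-norm solution of the
        # underdetermined system X @ beta = y (Moore-Penrose pseudoinverse)
        return np.linalg.lstsq(X, y, rcond=None)[0]

    beta_pooled = min_norm(np.vstack([X_s, X_t]), np.concatenate([y_s, y_t]))
    beta_target = min_norm(X_t, y_t)

    # Excess risk on the target distribution; for isotropic covariates
    # this equals the squared distance to the target signal.
    for name, b in [("pooled", beta_pooled), ("target-only", beta_target)]:
        print(f"{name:12s} excess risk: {np.sum((b - beta_t) ** 2):.4f}")

Varying sigma and the shift size in this sketch probes the low versus high signal-to-noise regimes discussed in the abstract, where pooling respectively hurts or helps.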