The Interplay Between Interpolation and Aggregation in Regression: Optimal Sample Complexity
Mikael Møller Høgsgaard ⋅ Kasper Green Larsen ⋅ Liang-Yu Zou
Abstract
This work theoretically investigates the interplay between interpolation and aggregation in regression. We establish that the $\gamma$-graph dimension characterizes learnability for a broad class of natural aggregation procedures. Furthermore, we prove that an extremely simple aggregation procedure, which combines three interpolating hypotheses via the median, is optimal among all these aggregation procedures and is strictly more powerful than proper learning. Finally, we show that some hypothesis classes are learnable only by aggregating infinitely many hypotheses or by using non-interpolating aggregation rules (which may predict outside the range of their inputs); for these classes, any finite interpolating aggregation fails to achieve even trivial performance.
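As a rough illustration of the median-of-three aggregation mentioned above, the following minimal sketch combines three predictors pointwise via the median. The function name `median_of_three` and the three toy hypotheses are hypothetical and chosen only for illustration; the abstract does not specify how the three interpolating hypotheses are obtained, so that step is not shown.

```python
from statistics import median
from typing import Callable

# A hypothesis maps a real-valued input to a real-valued prediction.
Hypothesis = Callable[[float], float]


def median_of_three(h1: Hypothesis, h2: Hypothesis, h3: Hypothesis) -> Hypothesis:
    """Return the predictor that outputs the pointwise median of h1, h2, h3."""

    def aggregate(x: float) -> float:
        return median([h1(x), h2(x), h3(x)])

    return aggregate


# Hypothetical toy hypotheses (assumptions for illustration, not from the paper).
def h1(x: float) -> float:
    return 2.0 * x


def h2(x: float) -> float:
    return x + 1.0


def h3(x: float) -> float:
    return 10.0  # an outlying predictor


f = median_of_three(h1, h2, h3)

for x in (0.0, 1.0, 3.0):
    preds = [h1(x), h2(x), h3(x)]
    # The median never leaves the range spanned by its inputs.
    assert min(preds) <= f(x) <= max(preds)
    print(x, f(x))
```

The median of three values always lies between the smallest and largest of them, so this rule never predicts outside the range of its inputs, in contrast to the non-interpolating aggregation rules described above.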