Oral
Learning Deep ResNet Blocks Sequentially using Boosting Theory
Furong Huang · Jordan Ash · John Langford · Robert Schapire

Wed Jul 11th 05:40 -- 05:50 PM @ K1 + K2

We prove a \emph{multi-channel telescoping sum boosting} theory for ResNet architectures that simultaneously yields a new technique for boosting over features (in contrast to labels) and a new training algorithm for ResNet-style architectures. Our proposed training algorithm, \emph{BoostResNet}, is particularly suitable for non-differentiable architectures. It requires only the relatively inexpensive sequential training of $T$ ``shallow ResNets''. We prove that the training error decays exponentially with the depth $T$ if the weak module classifiers that we train perform slightly better than some weak baseline. In other words, we propose a weak learning condition and prove a boosting theory for ResNet under that condition. We also prove a generalization error bound based on margin theory, which suggests that a ResNet with $\ell_1$-norm-bounded weights could be resistant to overfitting.
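
The "telescoping sum" structure mentioned in the abstract can be illustrated with a small numerical sketch. This is not the BoostResNet algorithm itself: no training happens here, and the abstract does not spell out the notation. The sketch assumes that each residual block computes $g_{t+1}(x) = g_t(x) + f_t(g_t(x))$, that a linear readout on top of block $t$ gives a score $o_t(x)$, and that the weak module classifier is the difference of consecutive scaled scores, $h_t(x) = \alpha_{t+1} o_{t+1}(x) - \alpha_t o_t(x)$ with $\alpha_0 o_0(x) = 0$. The function names, the tanh block, and the random weights are all hypothetical choices made for illustration.

```python
import math
import random


def residual_block(g, weights):
    """Toy residual block g + tanh(W g); a hypothetical stand-in for f_t."""
    return [
        gi + math.tanh(sum(w * gj for w, gj in zip(row, g)))
        for gi, row in zip(g, weights)
    ]


def telescoping_demo(depth=5, dim=4, seed=0):
    """Return (sum of weak module outputs, scaled score of the last block)."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(dim)]  # g_0(x): input features
    prev_score = 0.0                           # alpha_0 * o_0(x), assumed to be 0
    weak_outputs = []
    for _ in range(depth):
        W = [[rng.gauss(0, 0.5) for _ in range(dim)] for _ in range(dim)]
        g = residual_block(g, W)                   # g_{t+1}(x)
        w = [rng.gauss(0, 1) for _ in range(dim)]  # random linear readout (untrained)
        alpha = rng.uniform(0.5, 1.5)              # arbitrary positive scale
        score = alpha * sum(wi * gi for wi, gi in zip(w, g))  # alpha_{t+1} o_{t+1}(x)
        weak_outputs.append(score - prev_score)    # h_t(x)
        prev_score = score
    return sum(weak_outputs), prev_score


total, final_score = telescoping_demo()
# The intermediate scores cancel, so the sum equals the last block's scaled score.
print(math.isclose(total, final_score, rel_tol=1e-9, abs_tol=1e-9))
```

Under these assumptions, the ensemble of weak module classifiers collapses to the output of the deepest block. This is what allows the blocks to be trained one after another, as the abstract describes, rather than by end-to-end backpropagation.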

Author Information

Furong Huang (University of Maryland College Park)
Jordan Ash (Princeton University)
John Langford (Microsoft Research)
Robert Schapire (Microsoft Research)
