This work studies the behavior of neural networks trained with the logistic loss via gradient descent on binary classification data where the underlying data distribution is general and the (optimal) Bayes risk is not necessarily zero. In this setting, it is shown that gradient descent with early stopping achieves population risk arbitrarily close to optimal, not just in terms of the logistic and misclassification losses, but also in terms of calibration: the sigmoid mapping of the network's outputs approximates the true underlying conditional distribution arbitrarily finely. Moreover, the iteration, sample, and architectural complexities required by the analysis all scale naturally with a certain complexity measure of the true conditional model. Lastly, while it is not shown that early stopping is necessary, it is shown that any univariate classifier satisfying a local interpolation property is necessarily inconsistent.
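As an illustration of this setting (a minimal sketch, not the paper's construction), the code below trains a small ReLU network with the logistic loss by gradient descent on one-dimensional data whose conditional probability eta(x) is known and whose Bayes risk is nonzero, early-stops on held-out logistic loss, and then measures the calibration gap |sigmoid(f(x)) - eta(x)|. The architecture size, step size, data distribution, and validation-based stopping rule are all illustrative assumptions; the paper's analysis stops at a theoretically chosen iteration rather than by validation.

```python
# Hedged sketch: early-stopped gradient descent with logistic loss on noisy
# binary data, then a calibration check against the true conditional eta(x).
import torch

torch.manual_seed(0)

def eta(x):                        # assumed true conditional P(y=1 | x); Bayes risk > 0
    return torch.sigmoid(3.0 * torch.sin(2.0 * x)).squeeze(-1)

def sample(n):
    x = torch.rand(n, 1) * 4.0 - 2.0          # x ~ Uniform[-2, 2]
    y = (torch.rand(n) < eta(x)).float()      # noisy labels drawn from eta
    return x, y

x_tr, y_tr = sample(2000)
x_va, y_va = sample(2000)

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = torch.nn.BCEWithLogitsLoss()        # the logistic loss

best_va, best_state, patience = float("inf"), None, 0
for step in range(5000):
    opt.zero_grad()
    loss_fn(net(x_tr).squeeze(-1), y_tr).backward()
    opt.step()
    with torch.no_grad():
        va = loss_fn(net(x_va).squeeze(-1), y_va).item()
    if va < best_va - 1e-4:
        best_va, patience = va, 0
        best_state = [p.clone() for p in net.parameters()]
    else:
        patience += 1
        if patience > 200:        # early stopping: held-out logistic loss plateaued
            break

with torch.no_grad():
    for p, b in zip(net.parameters(), best_state):
        p.copy_(b)                # restore the early-stopped iterate
    x_te = torch.linspace(-2.0, 2.0, 400).unsqueeze(-1)
    gap = (torch.sigmoid(net(x_te).squeeze(-1)) - eta(x_te)).abs().mean()
print(f"mean calibration gap |sigmoid(f) - eta|: {gap:.3f}")
```

On noisy labels such as these, running gradient descent to interpolation would drive sigmoid(f) toward 0/1 outputs; stopping early is what keeps sigmoid(f) close to eta, which is the calibration phenomenon the abstract describes.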
Author Information
Ziwei Ji (University of Illinois at Urbana-Champaign)
Matus Telgarsky (University of Illinois at Urbana-Champaign)
More from the Same Authors
- 2022 Poster: Agnostic Learnability of Halfspaces via Logistic Loss
  Ziwei Ji · Kwangjun Ahn · Pranjal Awasthi · Satyen Kale · Stefani Karp
- 2022 Oral: Agnostic Learnability of Halfspaces via Logistic Loss
  Ziwei Ji · Kwangjun Ahn · Pranjal Awasthi · Satyen Kale · Stefani Karp
- 2021 Poster: Fast margin maximization via dual acceleration
  Ziwei Ji · Nati Srebro · Matus Telgarsky
- 2021 Spotlight: Fast margin maximization via dual acceleration
  Ziwei Ji · Nati Srebro · Matus Telgarsky
- 2019 Poster: A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization
  Yucheng Chen · Matus Telgarsky · Chao Zhang · Bolton Bailey · Daniel Hsu · Jian Peng
- 2019 Oral: A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization
  Yucheng Chen · Matus Telgarsky · Chao Zhang · Bolton Bailey · Daniel Hsu · Jian Peng
- 2017 Poster: Neural networks and rational functions
  Matus Telgarsky
- 2017 Talk: Neural networks and rational functions
  Matus Telgarsky