Recent works have shown that when a linear predictor is trained over separable data using a gradient method with an exponentially tailed loss function, the predictor asymptotically converges in direction to the max-margin predictor; consequently, the predictor does not overfit asymptotically. A recent study showed that for certain gradient methods, overfitting does not occur even non-asymptotically, by obtaining finite-time generalization bounds for gradient flow, gradient descent (GD), and stochastic GD. In this work, we continue this line of research and obtain finite-time generalization bounds for other first-order methods, namely normalized GD and Nesterov's accelerated GD. Our results show that for these methods, faster convergence of the training loss aligns with improved generalization in terms of the test error.
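The setting above can be illustrated with a minimal sketch, assuming a toy separable dataset and arbitrary step sizes (this is not the paper's exact setup or analysis): normalized GD on the logistic loss moves a fixed distance along the negative gradient at each step, so the norm of the iterate grows while its direction stabilizes toward a max-margin separator.

```python
import numpy as np

# Hedged sketch, not the paper's method: normalized GD on the logistic
# loss over linearly separable toy data. The names (w_star, step size 0.5,
# margin 0.3) are illustrative assumptions.
rng = np.random.default_rng(0)

# Separable data: labels from a ground-truth direction w_star, then each
# point is shifted so its margin along w_star is at least 0.3.
w_star = np.array([1.0, -1.0]) / np.sqrt(2.0)
X = rng.normal(size=(200, 2))
y = np.sign(X @ w_star)
X += 0.3 * y[:, None] * w_star

def logistic_loss_and_grad(w):
    margins = y * (X @ w)
    # Numerically stable mean of log(1 + exp(-margin)).
    loss = np.mean(np.logaddexp(0.0, -margins))
    # sigmoid(-margins), computed stably for large margins.
    p = np.exp(-np.logaddexp(0.0, margins))
    grad = -(p * y) @ X / len(y)
    return loss, grad

w = np.zeros(2)
for _ in range(2000):
    loss, grad = logistic_loss_and_grad(w)
    grad_norm = np.linalg.norm(grad)
    if grad_norm < 1e-12:
        break
    w -= 0.5 * grad / grad_norm  # normalized GD: fixed-length step

direction = w / np.linalg.norm(w)
min_margin = float(np.min(y * (X @ direction)))  # > 0 iff w separates
```

Because the normalized step size does not shrink with the gradient, the training loss is driven down quickly, and `min_margin` of the final direction is positive, consistent with convergence toward a separating (max-margin) direction.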
Author Information
Puneesh Deora (Indian Statistical Institute)
I am a Visiting Researcher in the CVPR Unit, Indian Statistical Institute Kolkata. I completed my Bachelor's in ECE at the Indian Institute of Technology Roorkee in 2020. My interests broadly lie in machine learning theory, optimization, and adversarial robustness. I am currently working on improving deep metric learning approaches via optimal negative vectors in the embedding space. In the past, I have worked on generative modeling for compressive sensing image reconstruction, zero-shot learning for action recognition in videos, and signal processing for fetal heart rate monitoring.
Bhavya Vasudeva (University of Southern California)
I am a Visiting Researcher at the CVPR Unit, ISI Kolkata. I graduated from IIT Roorkee with a bachelor's in Electronics and Communication Engineering in 2020. I am broadly interested in machine learning, computer vision, ML for health, adversarial robustness, explainable AI, and causal inference. In the past, I have worked on deep metric learning, zero-shot action recognition in videos, generative adversarial network-based methods for compressive sensing MRI reconstruction, and efficient methods for fetal heart rate monitoring.
Vatsal Sharan (USC)
Christos Thrampoulidis (University of British Columbia)
More from the Same Authors
- 2021: Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation
  Ke Wang · Vidya Muthukumar · Christos Thrampoulidis
- 2021: Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting and Regularization
  Ke Wang · Christos Thrampoulidis
- 2021: Label-Imbalanced and Group-Sensitive Classification under Overparameterization
  Ganesh Ramachandra Kini · Orestis Paraskevas · Samet Oymak · Christos Thrampoulidis
- 2023: Generalization and Stability of Interpolating Neural Networks with Minimal Width
  Hossein Taheri · Christos Thrampoulidis
- 2023: Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters
  Ganesh Ramachandra Kini · Vala Vakilian · Tina Behnia · Jaidev Gill · Christos Thrampoulidis
- 2023: On the Training and Generalization Dynamics of Multi-head Attention
  Puneesh Deora · Rouzbeh Ghaderi · Hossein Taheri · Christos Thrampoulidis
- 2023: Mitigating Simplicity Bias in Deep Learning for Improved OOD Generalization and Robustness
  Bhavya Vasudeva · Kameron Shahabi · Vatsal Sharan
- 2023 Poster: On the Role of Attention in Prompt-tuning
  Samet Oymak · Ankit Singh Rawat · Mahdi Soltanolkotabi · Christos Thrampoulidis
- 2022 Poster: FedNest: Federated Bilevel, Minimax, and Compositional Optimization
  Davoud Ataee Tarzanagh · Mingchen Li · Christos Thrampoulidis · Samet Oymak
- 2022 Oral: FedNest: Federated Bilevel, Minimax, and Compositional Optimization
  Davoud Ataee Tarzanagh · Mingchen Li · Christos Thrampoulidis · Samet Oymak
- 2021 Poster: Safe Reinforcement Learning with Linear Function Approximation
  Sanae Amani · Christos Thrampoulidis · Lin Yang
- 2021 Spotlight: Safe Reinforcement Learning with Linear Function Approximation
  Sanae Amani · Christos Thrampoulidis · Lin Yang