Timezone: »

Fast Test Error Rates for Gradient-based Algorithms on Separable Data
Puneesh Deora · Bhavya Vasudeva · Vatsal Sharan · Christos Thrampoulidis

Recent works have shown that when training a linear predictor over separable data using a gradient method and an exponentially-tailed loss function, the predictor asymptotically converges in direction to the max-margin predictor. Consequently, the predictors asymptotically do not overfit. A recent study revealed that for certain gradient methods, overfitting does not occur even non-asymptotically, by obtaining finite-time generalization bounds for gradient flow, gradient descent (GD) and stochastic GD. In this work, we continue this line of research and obtain finite-time generalization bounds for other first-order methods, namely normalized GD and Nesterov's accelerated GD. Our results show that for these methods, faster train loss convergence aligns with improved generalization in terms of the test error.

Author Information

Puneesh Deora (Indian Statistical Institute)

I am a Visting Researcher in the CVPR unit, Indian Statistical Institute Kolkata. I completed my Bachelor's in ECE from the Indian Institute of Technology Roorkee in 2020. My areas of interest broadly lie in machine learning theory, optimization, adversarial robustness. I am currently working on improving deep metric learning approaches via optimal negative vectors in the embedding space. In the past, I have worked on generative modeling for compressive sensing image reconstruction, zero-shot learning for action recognition in videos, signal processing for fetal heart rate monitoring.

Bhavya Vasudeva (University of Southern California)

I am a Visiting Researcher at CVPR Unit, ISI Kolkata. I graduated from IIT Roorkee with a bachelor’s in Electronics and Communication Engineering in 2020. I am broadly interested in the areas of machine learning, computer vision, ML for health, adversarial robustness, explainable AI and causal inference. In the past, I have worked on deep metric learning, zero-shot action recognition in videos, generative adversarial network-based methods for compressive sensing MRI reconstruction, and efficient methods for fetal heart rate monitoring.

Vatsal Sharan (USC)
Christos Thrampoulidis (University of British Columbia)

More from the Same Authors