Spotlight
Investigating Generalization by Controlling Normalized Margin
Alexander Farhang · Jeremy Bernstein · Kushal Tirumala · Yang Liu · Yisong Yue

Thu Jul 21 11:00 AM -- 11:05 AM (PDT) @ Ballroom 1 & 2

Weight norm ‖W‖ and margin γ participate in learning theory via the normalized margin γ/‖W‖. Since standard neural net optimizers do not control normalized margin, it is hard to test whether this quantity causally relates to generalization. This paper designs a series of experimental studies that explicitly control normalized margin and thereby tackle two central questions. First: does normalized margin always have a causal effect on generalization? The paper finds that no—networks can be produced where normalized margin has seemingly no relationship with generalization, counter to the theory of Bartlett et al. (2017). Second: does normalized margin ever have a causal effect on generalization? The paper finds that yes—in a standard training setup, test performance closely tracks normalized margin. The paper suggests a Gaussian process model as a promising explanation for this behavior.
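To make the quantity γ/‖W‖ concrete, here is a minimal sketch for the simplest possible case, a linear classifier with labels in {+1, -1}. This setup is an assumption chosen for illustration, not the experimental setup of the paper, which studies neural networks. The function name `normalized_margin` and the toy data are hypothetical. The sketch takes γ to be the smallest signed margin over the data and shows that rescaling the weights changes γ and ‖W‖ but leaves their ratio unchanged.

```python
import math


def normalized_margin(weights, inputs, labels):
    """Return min_i y_i * <w, x_i> divided by the Euclidean norm of w.

    Illustrative only: assumes a linear classifier and labels in {+1, -1}.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        raise ValueError("weight vector must be nonzero")
    margins = [
        y * sum(w * x for w, x in zip(weights, xs))
        for xs, y in zip(inputs, labels)
    ]
    return min(margins) / norm


# Hypothetical toy data: two points, one per class.
inputs = [[1.0, 1.0], [-1.0, -2.0]]
labels = [1, -1]

base = normalized_margin([3.0, 4.0], inputs, labels)
# Scaling the weights by 10 scales both the margin and the norm by 10.
scaled = normalized_margin([30.0, 40.0], inputs, labels)
print(base, scaled)
```

Both calls give approximately 1.4 (a raw margin of 7 over a norm of 5, then 70 over 50). Since the raw margin can be inflated just by rescaling the weights, it is the ratio that carries the information.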

Author Information

Alexander Farhang (Caltech)
Jeremy Bernstein (Caltech)
Kushal Tirumala (California Institute of Technology)
Yang Liu (Abacus.AI)
Yisong Yue (Caltech)

Yisong Yue is an assistant professor in the Computing and Mathematical Sciences Department at the California Institute of Technology. He was previously a research scientist at Disney Research. Before that, he was a postdoctoral researcher in the Machine Learning Department and the iLab at Carnegie Mellon University. He received a Ph.D. from Cornell University and a B.S. from the University of Illinois at Urbana-Champaign. Yisong's research interests lie primarily in the theory and application of statistical machine learning. He is particularly interested in developing novel methods for interactive machine learning and structured prediction. In the past, his research has been applied to information retrieval, recommender systems, text classification, learning from rich user interfaces, analyzing implicit human feedback, data-driven animation, behavior analysis, sports analytics, policy learning in robotics, and adaptive planning & allocation problems.
