A recent study of overparameterized neural networks by Huh et al. (2021) shows that gradient-based training drives their parameters toward low rank, suggesting that implicit low-rank constraints are at work. Inspired by this, we empirically study the effective VC dimension of neural networks with low-rank parameters. We find that their effective VC dimension is proportional to a specific weighted sum of per-layer parameter counts, which we call the effective number of parameters. Since the effective VC dimension lower-bounds the VC dimension, our result suggests that the analytic VC dimension upper bound of Bartlett et al. (2019) may indeed be tight for neural networks with low-rank parameters.
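As a rough illustration of the quantities mentioned above, the minimal sketch below counts the parameters of an MLP whose weight matrices are constrained to rank r via a factorization W = UV, and combines the per-layer counts into a weighted sum. The function names, the example layer dimensions, and the uniform weighting are assumptions made purely for illustration; the paper's specific weighting that defines the effective number of parameters is not reproduced here.

```python
# Hypothetical sketch, not the authors' exact formula: per-layer parameter counts for a
# rank-r factorized MLP and an illustrative weighted sum of those counts.

def lowrank_layer_params(d_in, d_out, r):
    """A rank-r factorization W = U V of a d_out x d_in matrix uses r * (d_in + d_out) parameters."""
    return r * (d_in + d_out)

def effective_num_params(layer_dims, r, weights=None):
    """Weighted sum of per-layer parameter counts.

    layer_dims: e.g. [784, 512, 512, 10] for a 3-layer MLP (assumed example).
    weights: per-layer weights; uniform weights are used here only as a placeholder,
             since the abstract does not state the specific weighting used in the paper.
    """
    counts = [lowrank_layer_params(d_in, d_out, r)
              for d_in, d_out in zip(layer_dims[:-1], layer_dims[1:])]
    if weights is None:
        weights = [1.0] * len(counts)
    return sum(w * c for w, c in zip(weights, counts))

if __name__ == "__main__":
    dims = [784, 512, 512, 10]  # assumed example architecture
    for r in (4, 16, 64):
        print(f"rank {r}: effective params = {effective_num_params(dims, r):,.0f}")
```

The point of the sketch is only that the parameter count of each low-rank layer grows linearly in the rank r, so any weighted sum of per-layer counts (and hence the effective VC dimension the abstract describes) scales with the imposed rank rather than with the full dense parameter count.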
Author Information
Daewon Seo (UW-Madison)
Hongyi Wang (University of Wisconsin-Madison, IBM Research)
I am currently a second-year Ph.D. student in the Computer Sciences Department at the University of Wisconsin-Madison, advised by Prof. Dimitris Papailiopoulos. My research interests lie in machine learning, distributed systems, and large-scale optimization.
Dimitris Papailiopoulos (University of Wisconsin - Madison)
Kangwook Lee (UW Madison)
I am an Assistant Professor in the Electrical and Computer Engineering department and the Computer Sciences department (by courtesy) at the University of Wisconsin-Madison. Previously, I was a Research Assistant Professor at the Information and Electronics Research Institute of KAIST, working with Prof. Changho Suh. Before that, I was a postdoctoral scholar at the same institute. I received my PhD in May 2016 from the Electrical Engineering and Computer Science department at UC Berkeley and my Master of Science degree from the same department in December 2012, both under the supervision of Prof. Kannan Ramchandran. I was a member of the Berkeley Laboratory of Information and System Sciences (BLISS, aka the Wireless Foundation) and the BASiCS Group. I received my Bachelor of Science degree in Electrical Engineering from the Korea Advanced Institute of Science and Technology (KAIST) in May 2010.
More from the Same Authors
- 2022 Poster: GenLabel: Mixup Relabeling using Generative Models
  Jy yong Sohn · Liang Shang · Hongxu Chen · Jaekyun Moon · Dimitris Papailiopoulos · Kangwook Lee
- 2022 Spotlight: GenLabel: Mixup Relabeling using Generative Models
  Jy yong Sohn · Liang Shang · Hongxu Chen · Jaekyun Moon · Dimitris Papailiopoulos · Kangwook Lee
- 2021 Poster: Coded-InvNet for Resilient Prediction Serving Systems
  Tuan Dinh · Kangwook Lee
- 2021 Oral: Coded-InvNet for Resilient Prediction Serving Systems
  Tuan Dinh · Kangwook Lee
- 2021 Poster: Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information
  Changhun Jo · Kangwook Lee
- 2021 Spotlight: Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information
  Changhun Jo · Kangwook Lee
- 2020 Poster: FR-Train: A Mutual Information-Based Approach to Fair and Robust Training
  Yuji Roh · Kangwook Lee · Steven Whang · Changho Suh
- 2019: Poster Session I
  Stark Draper · Mehmet Aktas · Basak Guler · Hongyi Wang · Venkata Gandikota · Hyegyeong Park · Jinhyun So · Lev Tauz · hema venkata krishna giri Narra · Zhifeng Lin · Mohammadali Maddahali · Yaoqing Yang · Sanghamitra Dutta · Amirhossein Reisizadeh · Jianyu Wang · Eren Balevi · Siddharth Jain · Paul McVay · Michael Rudow · Pedro Soto · Jun Li · Adarsh Subramaniam · Umut Demirhan · Vipul Gupta · Deniz Oktay · Leighton P Barnes · Johannes Ballé · Farzin Haddadpour · Haewon Jeong · Rong-Rong Chen · Mohammad Fahim
- 2018 Poster: DRACO: Byzantine-resilient Distributed Training via Redundant Gradients
  Lingjiao Chen · Hongyi Wang · Zachary Charles · Dimitris Papailiopoulos
- 2018 Oral: DRACO: Byzantine-resilient Distributed Training via Redundant Gradients
  Lingjiao Chen · Hongyi Wang · Zachary Charles · Dimitris Papailiopoulos