Fri 12:00 p.m. - 12:01 p.m. | Opening Remarks (Remarks)
Fri 12:00 p.m. - 12:45 p.m. | Feature Learning in Two-layer Neural Networks under Structured Data (Plenary Speaker) | Murat A. Erdogdu
Fri 12:45 p.m. - 1:15 p.m. | Contributed talks 1 (Contributed talks) | Mengqi Lou · Zhichao Wang
Fri 1:15 p.m. - 2:15 p.m. | Poster Session/Coffee Break
Fri 2:15 p.m. - 3:00 p.m. | High-dimensional Optimization in the Age of ChatGPT (Plenary Speaker) | Sanjeev Arora
Fri 3:00 p.m. - 4:30 p.m. | Lunch
Fri 4:30 p.m. - 5:15 p.m. | Multi-level theory of neural representations: Capacity of neural manifolds in biological and artificial neural networks (Plenary Speaker) | SueYeon Chung
Fri 5:15 p.m. - 6:00 p.m. | Contributed talks 2 (Contributed talks) | Simon Du · Wei Huang · Yuandong Tian
Fri 6:00 p.m. - 6:30 p.m. | Coffee Break
Fri 6:30 p.m. - 7:15 p.m. | A strong implicit bias in SGD dynamics towards much simpler subnetworks through stochastic collapse to invariant sets (Plenary Speaker) | Surya Ganguli
Fri 7:15 p.m. - 8:00 p.m. | Solving overparametrized systems of random equations (Plenary Speaker) | Andrea Montanari
Fri 7:59 p.m. - 8:00 p.m. | Closing remarks (Remarks)
Learning to Plan in Multi-dimensional Stochastic Differential Equations (Poster) | Mohamad Sadegh Shirani Faradonbeh · Mohamad Kazem Shirani Faradonbeh
Elephant Neural Networks: Born to Be a Continual Learner (Poster) | Qingfeng Lan · Rupam Mahmood
Investigating the Edge of Stability Phenomenon in Reinforcement Learning (Poster) | Rares Iordan · Mihaela Rosca · Marc Deisenroth
Deep Neural Networks Extrapolate Cautiously in High Dimensions (Poster) | Katie Kang · Amrith Setlur · Claire Tomlin · Sergey Levine
Implicit regularisation in stochastic gradient descent: from single-objective to two-player games (Poster) | Mihaela Rosca · Marc Deisenroth
How to escape sharp minima (Poster) | Kwangjun Ahn · Ali Jadbabaie · Suvrit Sra
Adapting to Gradual Distribution Shifts with Continual Weight Averaging (Poster) | Jared Fernandez · Saujas Vaduguru · Sanket Vaibhav Mehta · Yonatan Bisk · Emma Strubell
On the Problem of Transferring Learning Trajectories Between Neural Networks (Poster) | Daiki Chijiwa
Neural Collapse in the Intermediate Hidden Layers of Classification Neural Networks (Poster) | Liam Parker
Flatter, Faster: Scaling Momentum for Optimal Speedup of SGD (Poster) | Aditya Cowsik · Tankut Can · Paolo Glorioso
Which Features are Learned by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression (Poster) | Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman
An improved residual based random forest for robust prediction (Poster) | Mingyan Li
How Does Adaptive Optimization Impact Local Neural Network Geometry? (Poster) | Kaiqi Jiang · Dhruv Malik · Yuanzhi Li
Effects of Overparameterization on Sharpness-Aware Minimization: A Preliminary Investigation (Poster) | Sungbin Shin · Dongyeop Lee · Namhoon Lee
High-dimensional Learning Dynamics of Deep Neural Nets in the Neural Tangent Regime (Poster) | Yongqi Du · Zenan Ling · Robert Qiu · Zhenyu Liao
On the Equivalence Between Implicit and Explicit Neural Networks: A High-dimensional Viewpoint (Poster) | Zenan Ling · Zhenyu Liao · Robert Qiu
Hyperparameter Tuning using Loss Landscape (Poster) | Jianlong Chen · Qinxue Cao · Yefan Zhou · Konstantin Schürholt · Yaoqing Yang
Sharpness-Aware Minimization Leads to Low-Rank Features (Poster) | Maksym Andriushchenko · Dara Bahri · Hossein Mobahi · Nicolas Flammarion
Layerwise Linear Mode Connectivity (Poster) | Linara Adilova · Asja Fischer · Martin Jaggi
Does Double Descent Occur in Self-Supervised Learning? (Poster) | Alisia Lupidi · Yonatan Gideoni · Dulhan Jayalath
On the Universality of Linear Recurrences Followed by Nonlinear Projections (Poster) | Antonio Orvieto · Soham De · Razvan Pascanu · Caglar Gulcehre · Samuel Smith
Network Degeneracy as an Indicator of Training Performance: Comparing Finite and Infinite Width Angle Predictions (Poster) | Cameron Jakub · Mihai Nica
Implicitly Learned Invariance and Equivariance in Linear Regression (Poster) | Yonatan Gideoni
Latent State Transitions in Training Dynamics (Poster) | Michael Hu · Angelica Chen · Naomi Saphra · Kyunghyun Cho
Hessian Inertia in Neural Networks (Poster) | Xuchan Bao · Alberto Bietti · Aaron Defazio · Vivien Cabannes
Generalization and Stability of Interpolating Neural Networks with Minimal Width (Poster) | Hossein Taheri · Christos Thrampoulidis
The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold (Poster) | Jialin Mao · Han Kheng Teoh · Itay Griniasty · Rahul Ramesh · Rubing Yang · Mark Transtrum · James Sethna · Pratik Chaudhari
An Adaptive Method for Minimizing Non-negative Losses (Poster) | Antonio Orvieto · Lin Xiao
The Marginal Value of Momentum for Small Learning Rate SGD (Poster) | Runzhe Wang · Sadhika Malladi · Tianhao Wang · Kaifeng Lyu · Zhiyuan Li
Spectral Evolution and Invariance in Linear-width Neural Networks (Poster) | Zhichao Wang · Andrew Engel · Anand Sarwate · Ioana Dumitriu · Tony Chiang
On the Joint Interaction of Models, Data, and Features (Poster) | YiDing Jiang · Christina Baek · Zico Kolter
Predictive Sparse Manifold Transform (Poster) | Yujia Xie · Xinhui Li · Vince Calhoun
Margin Maximization in Attention Mechanism (Poster) | Davoud Ataee Tarzanagh · Yingcong Li · Xuechen Zhang · Samet Oymak
Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters (Poster) | Ganesh Ramachandra Kini · Vala Vakilian · Tina Behnia · Jaidev Gill · Christos Thrampoulidis
Characterizing and Improving Transformer Solutions for Dyck Grammars (Poster) | Kaiyue Wen · Yuchen Li · Bingbin Liu · Andrej Risteski
Benign Overfitting of Two-Layer Neural Networks under Inputs with Intrinsic Dimension (Poster) | Shunta Akiyama · Kazusato Oko · Taiji Suzuki
Implicit regularization of multi-task learning and finetuning in overparameterized neural networks (Poster) | Samuel Lippl · Jack Lindsey
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization (Poster) | Kaiyue Wen · Tengyu Ma · Zhiyuan Li
The phases of large learning rate gradient descent through effective parameters (Poster) | Lawrence Wang · Stephen Roberts
On Privileged and Convergent Bases in Neural Network Representations (Poster) | Davis Brown · Nikhil Vyas · Yamini Bansal
On the Effectiveness of Sharpness-Aware Minimization with Large Mini-batches (Poster) | Jinseok Chung · Seonghwan Park · Jaeho Lee · Namhoon Lee
Fast Test Error Rates for Gradient-based Algorithms on Separable Data (Poster) | Puneesh Deora · Bhavya Vasudeva · Vatsal Sharan · Christos Thrampoulidis
On the Advantage of Lion Compared to signSGD with Momentum (Poster) | Alessandro Noiato · Luca Biggio · Antonio Orvieto
On the Training and Generalization Dynamics of Multi-head Attention (Poster) | Puneesh Deora · Rouzbeh Ghaderi · Hossein Taheri · Christos Thrampoulidis
Learning Stochastic Dynamical Systems as an Implicit Regularization with Graph Neural Network (Poster) | Jin Guo · Ting Gao · Yufu Lan · Peng Zhang · Sikun Yang · Jinqiao Duan
Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective (Oral) | Wei Huang · Yuan Cao · Haonan Wang · Xin Cao · Taiji Suzuki
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron (Oral) | Weihang Xu · Simon Du
Sharp predictions for mini-batched prox-linear iterations in rank one matrix sensing (Oral) | Mengqi Lou · Kabir Chandrasekher · Ashwin Pananjady
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer (Oral) | Yuandong Tian · Yiping Wang · Beidi Chen · Simon Du
Learning in the Presence of Low-dimensional Structure: A Spiked Random Matrix Perspective (Oral) | Jimmy Ba · Murat Erdogdu · Taiji Suzuki · Zhichao Wang · Denny Wu