Fri 12:00 p.m. - 12:01 p.m. | Opening Remarks (Remarks)
Fri 12:00 p.m. - 12:45 p.m. | Feature Learning in Two-layer Neural Networks under Structured Data (Plenary Speaker) | Murat Erdogdu
Fri 12:45 p.m. - 1:15 p.m. | Contributed talks 1 (Contributed talks) | Mengqi Lou · Zhichao Wang
Fri 1:15 p.m. - 2:15 p.m. | Poster Session / Coffee Break
Fri 2:15 p.m. - 3:00 p.m. | High-dimensional Optimization in the Age of ChatGPT (Plenary Speaker) | Sanjeev Arora
Fri 3:00 p.m. - 4:30 p.m. | Lunch
Fri 4:30 p.m. - 5:15 p.m. | Multi-level theory of neural representations: Capacity of neural manifolds in biological and artificial neural networks (Plenary Speaker) | SueYeon Chung
Fri 5:15 p.m. - 6:00 p.m. | Contributed talks 2 (Contributed talks) | Simon Du · Wei Huang · Yuandong Tian
Fri 6:00 p.m. - 6:30 p.m. | Coffee Break
Fri 6:30 p.m. - 7:15 p.m. | A strong implicit bias in SGD dynamics towards much simpler subnetworks through stochastic collapse to invariant sets (Plenary Speaker) | Surya Ganguli
Fri 7:15 p.m. - 8:00 p.m. | Solving overparametrized systems of random equations (Plenary Speaker) | Andrea Montanari
Fri 7:59 p.m. - 8:00 p.m. | Closing Remarks (Remarks)
- | Learning to Plan in Multi-dimensional Stochastic Differential Equations (Poster) | Mohamad Sadegh Shirani Faradonbeh · Mohamad Kazem Shirani Faradonbeh
- | Elephant Neural Networks: Born to Be a Continual Learner (Poster) | Qingfeng Lan · Rupam Mahmood
- | Investigating the Edge of Stability Phenomenon in Reinforcement Learning (Poster) | Rares Iordan · Mihaela Rosca · Marc Deisenroth
- | Deep Neural Networks Extrapolate Cautiously in High Dimensions (Poster) | Katie Kang · Amrith Setlur · Claire Tomlin · Sergey Levine
- | Implicit regularisation in stochastic gradient descent: from single-objective to two-player games (Poster) | Mihaela Rosca · Marc Deisenroth
- | How to escape sharp minima (Poster) | Kwangjun Ahn · Ali Jadbabaie · Suvrit Sra
- | Adapting to Gradual Distribution Shifts with Continual Weight Averaging (Poster) | Jared Fernandez · Saujas Vaduguru · Sanket Vaibhav Mehta · Yonatan Bisk · Emma Strubell
- | On the Problem of Transferring Learning Trajectories Between Neural Networks (Poster) | Daiki Chijiwa
- | Neural Collapse in the Intermediate Hidden Layers of Classification Neural Networks (Poster) | Liam Parker
- | Flatter, Faster: Scaling Momentum for Optimal Speedup of SGD (Poster) | Aditya Cowsik · Tankut Can · Paolo Glorioso
- | Which Features are Learned by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression (Poster) | Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman
- | An improved residual based random forest for robust prediction (Poster) | Mingyan Li
- | How Does Adaptive Optimization Impact Local Neural Network Geometry? (Poster) | Kaiqi Jiang · Dhruv Malik · Yuanzhi Li
- | Effects of Overparameterization on Sharpness-Aware Minimization: A Preliminary Investigation (Poster) | Sungbin Shin · Dongyeop Lee · Namhoon Lee
- | High-dimensional Learning Dynamics of Deep Neural Nets in the Neural Tangent Regime (Poster) | Yongqi Du · Zenan Ling · Robert Qiu · Zhenyu Liao
- | On the Equivalence Between Implicit and Explicit Neural Networks: A High-dimensional Viewpoint (Poster) | Zenan Ling · Zhenyu Liao · Robert Qiu
- | Hyperparameter Tuning using Loss Landscape (Poster) | Jianlong Chen · Qinxue Cao · Yefan Zhou · Konstantin Schürholt · Yaoqing Yang
- | Sharpness-Aware Minimization Leads to Low-Rank Features (Poster) | Maksym Andriushchenko · Dara Bahri · Hossein Mobahi · Nicolas Flammarion
- | Layerwise Linear Mode Connectivity (Poster) | Linara Adilova · Asja Fischer · Martin Jaggi
- | Does Double Descent Occur in Self-Supervised Learning? (Poster) | Alisia Lupidi · Yonatan Gideoni · Dulhan Jayalath
- | On the Universality of Linear Recurrences Followed by Nonlinear Projections (Poster) | Antonio Orvieto · Soham De · Razvan Pascanu · Caglar Gulcehre · Samuel Smith
- | Network Degeneracy as an Indicator of Training Performance: Comparing Finite and Infinite Width Angle Predictions (Poster) | Cameron Jakub · Mihai Nica
- | Implicitly Learned Invariance and Equivariance in Linear Regression (Poster) | Yonatan Gideoni
- | Latent State Transitions in Training Dynamics (Poster) | Michael Hu · Angelica Chen · Naomi Saphra · Kyunghyun Cho
- | Hessian Inertia in Neural Networks (Poster) | Xuchan Bao · Alberto Bietti · Aaron Defazio · Vivien Cabannes
- | Generalization and Stability of Interpolating Neural Networks with Minimal Width (Poster) | Hossein Taheri · Christos Thrampoulidis
- | The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold (Poster) | Jialin Mao · Han Kheng Teoh · Itay Griniasty · Rahul Ramesh · Rubing Yang · Mark Transtrum · James Sethna · Pratik Chaudhari
- | An Adaptive Method for Minimizing Non-negative Losses (Poster) | Antonio Orvieto · Lin Xiao
- | The Marginal Value of Momentum for Small Learning Rate SGD (Poster) | Runzhe Wang · Sadhika Malladi · Tianhao Wang · Kaifeng Lyu · Zhiyuan Li
- | Spectral Evolution and Invariance in Linear-width Neural Networks (Poster) | Zhichao Wang · Andrew Engel · Anand Sarwate · Ioana Dumitriu · Tony Chiang
- | On the Joint Interaction of Models, Data, and Features (Poster) | YiDing Jiang · Christina Baek · Zico Kolter
- | Predictive Sparse Manifold Transform (Poster) | Yujia Xie · Xinhui Li · Vince Calhoun
- | Margin Maximization in Attention Mechanism (Poster) | Davoud Ataee Tarzanagh · Yingcong Li · Xuechen Zhang · Samet Oymak
- | Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters (Poster) | Ganesh Ramachandra Kini · Vala Vakilian · Tina Behnia · Jaidev Gill · Christos Thrampoulidis
- | Characterizing and Improving Transformer Solutions for Dyck Grammars (Poster) | Kaiyue Wen · Yuchen Li · Bingbin Liu · Andrej Risteski
- | Benign Overfitting of Two-Layer Neural Networks under Inputs with Intrinsic Dimension (Poster) | Shunta Akiyama · Kazusato Oko · Taiji Suzuki
- | Implicit regularization of multi-task learning and finetuning in overparameterized neural networks (Poster) | Samuel Lippl · Jack Lindsey
- | Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization (Poster) | Kaiyue Wen · Tengyu Ma · Zhiyuan Li
- | The phases of large learning rate gradient descent through effective parameters (Poster) | Lawrence Wang · Stephen Roberts
- | On Privileged and Convergent Bases in Neural Network Representations (Poster) | Davis Brown · Nikhil Vyas · Yamini Bansal
- | On the Effectiveness of Sharpness-Aware Minimization with Large Mini-batches (Poster) | Jinseok Chung · Seonghwan Park · Jaeho Lee · Namhoon Lee
- | Fast Test Error Rates for Gradient-based Algorithms on Separable Data (Poster) | Puneesh Deora · Bhavya Vasudeva · Vatsal Sharan · Christos Thrampoulidis
- | On the Advantage of Lion Compared to signSGD with Momentum (Poster) | Alessandro Noiato · Luca Biggio · Antonio Orvieto
- | On the Training and Generalization Dynamics of Multi-head Attention (Poster) | Puneesh Deora · Rouzbeh Ghaderi · Hossein Taheri · Christos Thrampoulidis
- | Learning Stochastic Dynamical Systems as an Implicit Regularization with Graph Neural Network (Poster) | Jin Guo · Ting Gao · Yufu Lan · Peng Zhang · Sikun Yang · Jinqiao Duan
- | Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective (Oral) | Wei Huang · Yuan Cao · Haonan Wang · Xin Cao · Taiji Suzuki
- | Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron (Oral) | Weihang Xu · Simon Du
- | Sharp predictions for mini-batched prox-linear iterations in rank one matrix sensing (Oral) | Mengqi Lou · Kabir Chandrasekher · Ashwin Pananjady
- | Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer (Oral) | Yuandong Tian · Yiping Wang · Beidi Chen · Simon Du
- | Learning in the Presence of Low-dimensional Structure: A Spiked Random Matrix Perspective (Oral) | Jimmy Ba · Murat Erdogdu · Taiji Suzuki · Zhichao Wang · Denny Wu