Timezone: America/Los_Angeles

SUN 17 JUL

6 a.m.

7 a.m.

(ends 4:00 PM)

9 a.m.

9:30 a.m.

Expo Talk Panel:

(ends 10:15 AM)

10:20 a.m.

Expo Demonstration:

(ends 11:20 AM)

11 a.m.

Cancelled:

(ends 11:45 AM)

11:30 a.m.

Coffee Break

12:15 p.m.

Expo Demonstration:

(ends 1:00 PM)

1:15 p.m.

2:30 p.m.

MON 18 JUL

4 a.m.

(ends 3:00 PM)

5 a.m.

5:30 a.m.

6 a.m.

6:30 a.m.

Tutorial:

(ends 8:45 AM)

7 a.m.

Coffee Break

8 a.m.

9 a.m.

Lunch Break - on your own

10 a.m.

Tutorial:

(ends 12:00 PM)

11 a.m.

noon

Coffee Break

12:30 p.m.

Tutorial:

(ends 2:50 PM)

4 p.m.

TUE 19 JUL

3:30 a.m.

Breakfast on your own

4 a.m.

(ends 4:00 PM)

5:45 a.m.

6 a.m.

7 a.m.

Coffee Break

7:30 a.m.

Spotlights
7:30-8:05

[7:30]
Differentially Private Approximate Quantiles

[7:35]
Fairness Interventions as (Dis)Incentives for Strategic Manipulation

[7:40]
Robust Models Are More Interpretable Because Attributions Look Normal

[7:45]
Sequential Covariate Shift Detection Using Classifier Two-Sample Tests

[7:50]
A Joint Exponential Mechanism For Differentially Private Top-$k$

[7:55]
Transfer Learning In Differential Privacy's Hybrid-Model

[8:00]
Robust Kernel Density Estimation with Median-of-Means principle

Orals
8:05-8:25

[8:05]
Bounding Training Data Reconstruction in Private (Deep) Learning

Spotlights
8:25-9:00

[8:25]
Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks

[8:30]
FriendlyCore: Practical Differentially Private Aggregation

[8:35]
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder

[8:40]
Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification

[8:45]
Public Data-Assisted Mirror Descent for Private Model Training

[8:50]
Low-Complexity Deep Convolutional Neural Networks on Fully Homomorphic Encryption Using Multiplexed Parallel Convolutions

[8:55]
Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data

(ends 9:00 AM)

Orals
7:30-7:50

[7:30]
Tackling covariate shift with node-based Bayesian neural networks

Spotlights
7:50-8:10

[7:50]
Why the Rich Get Richer? On the Balancedness of Random Partition Models

[7:55]
A Completely Tuning-Free and Robust Approach to Sparse Precision Matrix Estimation

[8:00]
Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems

[8:05]
Calibrated Learning to Defer with One-vs-All Classifiers

Orals
8:10-8:30

[8:10]
Tractable Uncertainty for Structure Learning

Spotlights
8:30-8:55

[8:30]
DNA: Domain Generalization with Diversified Neural Averaging

[8:35]
Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces

[8:40]
DynaMixer: A Vision MLP Architecture with Dynamic Mixing

[8:45]
Channel Importance Matters in Few-Shot Image Classification

[8:50]
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization

(ends 9:00 AM)

Spotlights
7:30-8:00

[7:30]
Dynamic Regret of Online Markov Decision Processes

[7:35]
On the Impossibility of Learning to Cooperate with Adaptive Partner Strategies in Repeated Games

[7:40]
Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning

[7:45]
Provable Reinforcement Learning with a Short-Term Memory

[7:50]
Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer

[7:55]
Mirror Learning: A Unifying Framework of Policy Optimisation

Orals
8:00-8:20

[8:00]
Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP

Spotlights
8:20-8:50

[8:20]
Learning Infinite-horizon Average-reward Markov Decision Process with Constraints

[8:25]
A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning

[8:30]
Langevin Monte Carlo for Contextual Bandits

[8:35]
Prompting Decision Transformer for Few-Shot Policy Generalization

[8:40]
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning

[8:45]
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation

(ends 9:00 AM)

Orals
7:30-7:50

[7:30]
Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them

Spotlights
7:50-8:10

[7:50]
ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks

[7:55]
Provably Adversarially Robust Nearest Prototype Classifiers

[8:00]
Certifying Out-of-Domain Generalization for Blackbox Functions

[8:05]
Intriguing Properties of Input-Dependent Randomized Smoothing

Orals
8:10-8:30

[8:10]
To Smooth or Not? When Label Smoothing Meets Noisy Labels

Spotlights
8:30-8:55

[8:30]
Evaluating the Adversarial Robustness of Adaptive Test-time Defenses

[8:35]
On the Generalization Analysis of Adversarial Learning

[8:40]
Demystifying the Adversarial Robustness of Random Transformation Defenses

[8:45]
Double Sampling Randomized Smoothing

[8:50]
TPC: Transformation-Specific Smoothing for Point Cloud Models

(ends 9:00 AM)

Spotlights
7:30-8:00

[7:30]
Certified Robustness Against Natural Language Attacks by Causal Intervention

[7:35]
A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

[7:40]
On the Learning of Non-Autoregressive Transformers

[7:45]
Latent Diffusion Energy-Based Model for Interpretable Text Modelling

[7:50]
UNIREX: A Unified Learning Framework for Language Model Rationale Extraction

[7:55]
Black-Box Tuning for Language-Model-as-a-Service

Orals
8:00-8:20

[8:00]
Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information

Spotlights
8:20-9:00

[8:20]
Co-training Improves Prompt-based Learning for Large Language Models

[8:25]
Directed Acyclic Transformer for Non-Autoregressive Machine Translation

[8:30]
StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models

[8:35]
Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology

[8:40]
Generative Cooperative Networks for Natural Language Generation

[8:45]
What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?

[8:50]
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

[8:55]
ROCK: Causal Inference Principles for Reasoning about Commonsense Causality

(ends 9:00 AM)

Orals
7:30-7:50

[7:30]
Exact Optimal Accelerated Complexity for Fixed-Point Iterations

Spotlights
7:50-8:15

[7:50]
Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions

[7:55]
NysADMM: faster composite convex optimization via low-rank approximation

[8:00]
FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning

[8:05]
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers

[8:10]
Pairwise Conditional Gradients without Swap Steps and Sparser Kernel Herding

Orals
8:15-8:35

[8:15]
Continuous-Time Analysis of Accelerated Gradient Methods via Conservation Laws in Dilated Coordinate Systems

Spotlights
8:35-9:00

[8:35]
Only tails matter: Average-Case Universality and Robustness in the Convex Regime

[8:40]
Batch Greenkhorn Algorithm for Entropic-Regularized Multimarginal Optimal Transport: Linear Rate of Convergence and Iteration Complexity

[8:45]
Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets

[8:50]
Neural Fisher Discriminant Analysis: Optimal Neural Network Embeddings in Polynomial Time

[8:55]
Active Sampling for Min-Max Fairness

(ends 9:00 AM)

Orals
7:30-7:50

[7:30]
Online Learning for Min Sum Set Cover and Pandora's Box

Spotlights
7:50-8:15

[7:50]
Smoothed Adversarial Linear Contextual Bandits with Knapsacks

[7:55]
Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback

[8:00]
Thompson Sampling for (Combinatorial) Pure Exploration

[8:05]
Revisiting Online Submodular Minimization: Gap-Dependent Regret Bounds, Best of Both Worlds and Adversarial Robustness

[8:10]
Rotting Infinitely Many-Armed Bandits

Orals
8:15-8:35

[8:15]
Batched Dueling Bandits

Spotlights
8:35-9:00

[8:35]
Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent

[8:40]
Consistent Polyhedral Surrogates for Top-k Classification and Variants

[8:45]
Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models

[8:50]
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits

[8:55]
Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
Multi-Task Learning as a Bargaining Game

[7:35]
Frustratingly Easy Transferability Estimation

[7:40]
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling

[7:45]
A Difference Standardization Method for Mutual Transfer Learning

[7:50]
Improving Task-free Continual Learning by Distributionally Robust Memory Evolution

[7:55]
A Multi-objective / Multi-task Learning Framework Induced by Pareto Stationarity

[8:00]
Sparse Invariant Risk Minimization

Orals
8:05-8:25

[8:05]
Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning

Spotlights
8:25-9:00

[8:25]
A Closer Look at Smoothness in Domain Adversarial Training

[8:30]
Balancing Discriminability and Transferability for Source-Free Domain Adaptation

[8:35]
Model Agnostic Sample Reweighting for Out-of-Distribution Learning

[8:40]
Zero-shot AutoML with Pretrained Models

[8:45]
Efficient Variance Reduction for Meta-learning

[8:50]
Generalizing to Evolving Domains with Latent Structure-Aware Sequential Autoencoder

[8:55]
Partial disentanglement for domain adaptation

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
Structural Entropy Guided Graph Hierarchical Pooling

[7:35]
Self-Supervised Representation Learning via Latent Graph Prediction

[7:40]
DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural Network for Traffic Flow Forecasting

[7:45]
Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets

[7:50]
Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning

[7:55]
Analyzing and Mitigating Interference in Neural Architecture Search

[8:00]
Reverse Engineering $\ell_p$ attacks: A block-sparse optimization approach with recovery guarantees

Orals
8:05-8:25

[8:05]
Unified Scaling Laws for Routed Language Models

Spotlights
8:25-9:00

[8:25]
DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks

[8:30]
A deep convolutional neural network that is invariant to time rescaling

[8:35]
LyaNet: A Lyapunov Framework for Training Neural ODEs

[8:40]
Transfer and Marginalize: Explaining Away Label Noise with Privileged Information

[8:45]
On Collective Robustness of Bagging Against Data Poisoning

[8:50]
Hindering Adversarial Attacks with Implicit Neural Representations

[8:55]
From Noisy Prediction to True Label: Noisy Prediction Calibration via Generative Model

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
Exploring and Exploiting Hubness Priors for High-Quality GAN Latent Sampling

[7:35]
ButterflyFlow: Building Invertible Layers with Butterfly Matrices

[7:40]
Controlling Conditional Language Models without Catastrophic Forgetting

[7:45]
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

[7:50]
Structure-preserving GANs

[7:55]
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

[8:00]
Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models

Orals
8:05-8:25

[8:05]
Equivariant Diffusion for Molecule Generation in 3D

Spotlights
8:25-9:00

[8:25]
Forward Operator Estimation in Generative Models with Kernel Transfer Operators

[8:30]
Conditional GANs with Auxiliary Discriminative Classifier

[8:35]
Improved StyleGAN-v2 based Inversion for Out-of-Distribution Images

[8:40]
Matching Normalizing Flows and Probability Paths on Manifolds

[8:45]
Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization

[8:50]
Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization

[8:55]
Region-Based Semantic Factorization in GANs

(ends 9:00 AM)

9 a.m.

Lunch Break - on your own

10:30 a.m.

Spotlights
10:30-11:00

[10:30]
Online Continual Learning through Mutual Information Maximization

[10:35]
Learning Iterative Reasoning through Energy Minimization

[10:40]
DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

[10:45]
PoF: Post-Training of Feature Extractor for Improving Generalization

[10:50]
Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation

[10:55]
Set Based Stochastic Subsampling

Orals
11:00-11:20

[11:00]
Monarch: Expressive Structured Matrices for Efficient and Accurate Training

Spotlights
11:20-11:55

[11:20]
Generalizing to New Physical Systems via Context-Informed Dynamics Model

[11:25]
Self-conditioning Pre-Trained Language Models

[11:30]
TAM: Topology-Aware Margin Loss for Class-Imbalanced Node Classification

[11:35]
Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization

[11:40]
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

[11:45]
Knowledge Base Question Answering by Case-based Reasoning over Subgraphs

[11:50]
When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee

(ends 12:00 PM)

Spotlights
10:30-11:05

[10:30]
Meaningfully debugging model mistakes using conceptual counterfactual explanations

[10:35]
Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments

[10:40]
Robust Counterfactual Explanations for Tree-Based Ensembles

[10:45]
A Rigorous Study of Integrated Gradients Method and Extensions to Internal Neuron Attributions

[10:50]
Estimating and Penalizing Induced Preference Shifts in Recommender Systems

[10:55]
Framework for Evaluating Faithfulness of Local Explanations

[11:00]
A Consistent and Efficient Evaluation Strategy for Attribution Methods

Orals
11:05-11:25

[11:05]
Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four

Spotlights
11:25-12:00

[11:25]
Label-Descriptive Patterns and Their Application to Characterizing Classification Errors

[11:30]
XAI for Transformers: Better Explanations through Conservative Propagation

[11:35]
Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding

[11:40]
Interpretable Off-Policy Learning via Hyperbox Search

[11:45]
Neuron Dependency Graphs: A Causal Abstraction of Neural Networks

[11:50]
On the Adversarial Robustness of Causal Algorithmic Recourse

[11:55]
Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations

(ends 12:00 PM)

Spotlights
10:30-11:00

[10:30]
Robust Group Synchronization via Quadratic Programming

[10:35]
UAST: Uncertainty-Aware Siamese Tracking

[10:40]
You Only Cut Once: Boosting Data Augmentation with a Single Cut

[10:45]
Generative Modeling for Multi-task Visual Learning

[10:50]
HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning

[10:55]
Parametric Visual Program Induction with Function Modularization

Orals
11:00-11:20

[11:00]
Path-Gradient Estimators for Continuous Normalizing Flows

Spotlights
11:20-11:55

[11:20]
Variational Feature Pyramid Networks

[11:25]
Deep Neural Network Fusion via Graph Matching with Applications to Model Ensemble and Federated Learning

[11:30]
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix

[11:35]
Neural Implicit Dictionary Learning via Mixture-of-Expert Training

[11:40]
Time Is MattEr: Temporal Self-supervision for Video Transformers

[11:45]
Benchmarking and Analyzing Point Cloud Classification under Corruptions

[11:50]
Understanding The Robustness in Vision Transformers

(ends 12:00 PM)

Orals
10:30-10:50

[10:30]
Learning Mixtures of Linear Dynamical Systems

Spotlights
10:50-11:15

[10:50]
Massively Parallel $k$-Means Clustering for Perturbation Resilient Instances

[10:55]
Residual-Based Sampling for Online Outlier-Robust PCA

[11:00]
Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times

[11:05]
Streaming Algorithms for Support-Aware Histograms

[11:10]
Power-Law Escape Rate of SGD

Orals
11:15-11:35

[11:15]
Generalized Results for the Existence and Consistency of the MLE in the Bradley-Terry-Luce Model

Spotlights
11:35-12:00

[11:35]
Faster Algorithms for Learning Convex Functions

[11:40]
Feature selection using e-values

[11:45]
ActiveHedge: Hedge meets Active Learning

[11:50]
One-Pass Algorithms for MAP Inference of Nonsymmetric Determinantal Point Processes

[11:55]
Deciphering Lasso-based Classification Through a Large Dimensional Analysis of the Iterative Soft-Thresholding Algorithm

(ends 12:00 PM)

Spotlights
10:30-11:00

[10:30]
An iterative clustering algorithm for the Contextual Stochastic Block Model with optimality guarantees

[10:35]
Smoothed Adaptive Weighting for Imbalanced Semi-Supervised Learning: Improve Reliability Against Unknown Distribution Data

[10:40]
Class-Imbalanced Semi-Supervised Learning with Adaptive Thresholding

[10:50]
Meta-Learning Hypothesis Spaces for Sequential Decision-making

[10:55]
A Tighter Analysis of Spectral Clustering, and Beyond

Orals
11:00-11:20

[11:00]
Online Active Regression

Spotlights
11:20-11:55

[11:20]
On Finite-Sample Identifiability of Contrastive Learning-Based Nonlinear Independent Component Analysis

[11:25]
Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework

[11:30]
Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets

[11:35]
Confidence Score for Source-Free Unsupervised Domain Adaptation

[11:40]
Gradient Based Clustering

[11:45]
Global Optimization of K-Center Clustering

[11:50]
Latent Outlier Exposure for Anomaly Detection with Contaminated Data

(ends 12:00 PM)

Spotlights
10:30-11:00

[10:30]
Additive Gaussian Processes Revisited

[10:35]
Probabilistic ODE Solutions in Millions of Dimensions

[10:40]
Adaptive Gaussian Process Change Point Detection

[10:45]
Volatility Based Kernels and Moving Average Means for Accurate Forecasting with Gaussian Processes

[10:50]
Fenrir: Physics-Enhanced Regression for Initial Value Problems

[10:55]
Variational nearest neighbor Gaussian process

Orals
11:00-11:20

[11:00]
Preconditioning for Scalable Gaussian Process Hyperparameter Optimization

Spotlights
11:20-11:50

[11:20]
Spectral Representation of Robustness Measures for Optimization Under Input Uncertainty

[11:25]
Bayesian Optimization under Stochastic Delayed Feedback

[11:30]
Bayesian Optimization for Distributionally Robust Chance-constrained Problem

[11:35]
Efficient Distributionally Robust Bayesian Optimization with Worst-case Sensitivity

[11:40]
Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning

[11:45]
Scalable First-Order Bayesian Optimization via Structured Automatic Differentiation

(ends 12:00 PM)

Orals
10:30-10:50

[10:30]
Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

Spotlights
10:50-11:15

[10:50]
AnyMorph: Learning Transferable Policies By Inferring Agent Morphology

[10:55]
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations

[11:00]
Stabilizing Off-Policy Deep Reinforcement Learning from Pixels

[11:05]
Influence-Augmented Local Simulators: a Scalable Solution for Fast Deep RL in Large Networked Systems

[11:10]
CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer

Orals
11:15-11:35

[11:15]
Offline RL Policies Should Be Trained to be Adaptive

Spotlights
11:35-12:00

[11:35]
Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

[11:40]
PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration

[11:45]
Supervised Off-Policy Ranking

[11:50]
The Primacy Bias in Deep Reinforcement Learning

[11:55]
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

(ends 12:00 PM)

Orals
10:30-10:50

[10:30]
Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning

Spotlights
10:50-11:10

[10:50]
Stochastic Reweighted Gradient Descent

[10:55]
Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood

[11:00]
Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging

[11:05]
FedNL: Making Newton-Type Methods Applicable to Federated Learning

Orals
11:10-11:30

[11:10]
Solving Stackelberg Prediction Game with Least Squares Loss via Spherically Constrained Least Squares Reformulation

Spotlights
11:30-11:55

[11:30]
Dimension-free Complexity Bounds for High-order Nonconvex Finite-sum Optimization

[11:35]
Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

[11:40]
Probabilistic Bilevel Coreset Selection

[11:45]
Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs

[11:50]
On Implicit Bias in Overparameterized Bilevel Optimization

(ends 12:00 PM)

Spotlights
10:30-11:05

[10:30]
pathGCN: Learning General Graph Spatial Operators from Paths

[10:35]
Graph-Coupled Oscillator Networks

[10:40]
HousE: Knowledge Graph Embedding with Householder Parameterization

[10:45]
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism

[10:50]
ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning

[10:55]
G$^2$CN: Graph Gaussian Convolution Networks with Concentrated Graph Filters

[11:00]
SpeqNets: Sparsity-aware permutation-equivariant graph networks

Orals
11:05-11:25

[11:05]
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

Spotlights
11:25-11:55

[11:25]
Position Prediction as an Effective Pretraining Strategy

[11:30]
Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering

[11:35]
Deep and Flexible Graph Neural Architecture Search

[11:40]
GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks

[11:45]
Large-Scale Graph Neural Architecture Search

[11:50]
Optimization-Induced Graph Implicit Nonlinear Diffusion

(ends 12:00 PM)

Orals
10:30-10:50

[10:30]
Robustness Implies Generalization via Data-Dependent Generalization Bounds

Spotlights
10:50-11:15

[10:50]
Learning to Hash Robustly, Guaranteed

[10:55]
Policy Gradient Method For Robust Reinforcement Learning

[11:00]
A query-optimal algorithm for finding counterfactuals

[11:05]
Linear Bandit Algorithms with Sublinear Time Complexity

[11:10]
Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra

Orals
11:15-11:35

[11:15]
Individual Preference Stability for Clustering

Spotlights
11:35-12:00

[11:35]
Correlated Quantization for Distributed Mean Estimation and Optimization

[11:40]
Multiple-Play Stochastic Bandits with Shareable Finite-Capacity Arms

[11:45]
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms

[11:50]
The Algebraic Path Problem for Graph Metrics

[11:55]
Steerable 3D Spherical Neurons

(ends 12:00 PM)

11 a.m.

noon

Coffee Break

12:30 p.m.

1 p.m.

Short Break

1:15 p.m.

Spotlights
1:15-1:50

[1:15]
Prototype Based Classification from Hierarchy to Fairness

[1:20]
Neural-Symbolic Models for Logical Queries on Knowledge Graphs

[1:25]
Deep Probability Estimation

[1:30]
Uncertainty Modeling in Generative Compressed Sensing

[1:35]
Going Deeper into Permutation-Sensitive Graph Neural Networks

[1:40]
Learning from Counterfactual Links for Link Prediction

[1:45]
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator

Orals
1:50-2:10

[1:50]
Correct-N-Contrast: a Contrastive Approach for Improving Robustness to Spurious Correlations

Spotlights
2:10-2:45

[2:10]
Principal Component Flows

[2:15]
Bit Prioritization in Variational Autoencoders via Progressive Coding

[2:20]
Generative Flow Networks for Discrete Probabilistic Modeling

[2:25]
Diffusion bridges vector quantized variational autoencoders

[2:30]
Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization

[2:35]
Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation

[2:40]
Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack

(ends 2:45 PM)

Spotlights
1:15-1:50

[1:15]
Coordinated Double Machine Learning

[1:20]
Exploiting Independent Instruments: Identification and Distribution Generalization

[1:25]
Partial Counterfactual Identification from Observational and Experimental Data

[1:30]
On Measuring Causal Contributions via do-interventions

[1:35]
The Role of Deconfounding in Meta-learning

[1:40]
CITRIS: Causal Identifiability from Temporal Intervened Sequences

[1:45]
Online Balanced Experimental Design

Orals
1:50-2:10

[1:50]
Minimum Cost Intervention Design for Causal Effect Identification

Spotlights
2:10-2:45

[2:10]
Causal structure-based root cause analysis of outliers

[2:15]
Instrumental Variable Regression with Confounder Balancing

[2:20]
Causal Transformer for Estimating Counterfactual Outcomes

[2:25]
Causal Inference Through the Structural Causal Marginal Problem

[2:30]
Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions

[2:35]
Matching Learned Causal Effects of Neural Networks with Domain Priors

[2:40]
Inferring Cause and Effect in the Presence of Heteroscedastic Noise

(ends 2:45 PM)

Orals
1:15-1:35

[1:15]
POEM: Out-of-Distribution Detection with Posterior Sampling

Spotlights
1:35-1:55

[1:35]
Selective Network Linearization for Efficient Private Inference

[1:40]
Efficient Computation of Higher-Order Subgraph Attribution via Message Passing

[1:45]
A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization

[1:50]
Modular Conformal Calibration

Orals
1:55-2:15

[1:55]
Rethinking Image-Scaling Attacks: The Interplay Between Vulnerabilities in Machine Learning Systems

Spotlights
2:15-2:40

[2:15]
Context-Aware Drift Detection

[2:20]
Accelerating Shapley Explanation via Contributive Cooperator Selection

[2:25]
An Equivalence Between Data Poisoning and Byzantine Gradient Attacks

[2:30]
DAVINZ: Data Valuation using Deep Neural Networks at Initialization

[2:35]
Sample Efficient Learning of Predictors that Complement Humans

(ends 2:45 PM)

Orals
1:15-1:35

[1:15]
H-Consistency Bounds for Surrogate Loss Minimizers

Spotlights
1:35-2:00

[1:35]
Learning General Halfspaces with Adversarial Label Noise via Online Gradient Descent

[1:40]
The Teaching Dimension of Regularized Kernel Learners

[1:45]
Sparse Mixed Linear Regression with Guarantees: Taming an Intractable Problem with Invex Relaxation

[1:50]
TURF: Two-Factor, Universal, Robust, Fast Distribution Learning Algorithm

[1:55]
Multiclass learning with margin: exponential rates with no bias-variance trade-off

Orals
2:00-2:20

[2:00]
Refined Convergence Rates for Maximum Likelihood Estimation under Finite Mixture Models

Spotlights
2:20-2:45

[2:20]
High Probability Guarantees for Nonconvex Stochastic Gradient Descent with Heavy Tails

[2:25]
An Initial Alignment between Neural Network and Target is Needed for Gradient Descent to Learn

[2:30]
Inductive Biases and Variable Creation in Self-Attention Mechanisms

[2:35]
Topology-aware Generalization of Decentralized SGD

[2:40]
Understanding Gradient Descent on the Edge of Stability in Deep Learning

(ends 2:45 PM)

Spotlights
1:15-1:45

[1:15]
Bayesian Nonparametric Learning for Point Processes with Spatial Homogeneity: A Spatial Analysis of NBA Shot Locations

[1:20]
On the Effects of Artificial Data Modification

[1:25]
Deep Squared Euclidean Approximation to the Levenshtein Distance for DNA Storage

[1:30]
How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models

[1:35]
Error-driven Input Modulation: Solving the Credit Assignment Problem without a Backward Pass

[1:40]
How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective

Orals
1:45-2:05

[1:45]
Contrastive Mixture of Posteriors for Counterfactual Inference, Data Integration and Fairness

Spotlights
2:05-2:45

[2:05]
Describing Differences between Text Distributions with Natural Language

[2:10]
Distinguishing rule- and exemplar-based generalization in learning systems

[2:15]
Burst-Dependent Plasticity and Dendritic Amplification Support Target-Based Learning and Hierarchical Imitation Learning

[2:20]
A Deep Learning Approach for the Segmentation of Electroencephalography Data in Eye Tracking Applications

[2:25]
Minimizing Control for Credit Assignment with Strong Feedback

[2:30]
Self-Supervised Models of Audio Effectively Explain Human Cortical Responses to Speech

[2:35]
Towards Scaling Difference Target Propagation by Learning Backprop Targets

[2:40]
Content Addressable Memory Without Catastrophic Forgetting by Heteroassociation with a Fixed Scaffold

(ends 2:45 PM)

Orals
1:15-1:35

[1:15]
Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes

Spotlights
1:35-2:00

[1:35]
Robust SDE-Based Variational Formulations for Solving Linear PDEs via Deep Learning

[1:40]
Hessian-Free High-Resolution Nesterov Acceleration For Sampling

[1:45]
LSB: Local Self-Balancing MCMC in Discrete Spaces

[1:50]
A Langevin-like Sampler for Discrete Distributions

[1:55]
Scalable Spike-and-Slab

Orals
2:00-2:20

[2:00]
Nonparametric Involutive Markov Chain Monte Carlo

Spotlights
2:20-2:45

[2:20]
Continual Repeated Annealed Flow Transport Monte Carlo

[2:25]
Algorithms for the Communication of Samples

[2:30]
Low-Precision Stochastic Gradient Langevin Dynamics

[2:35]
Fast Relative Entropy Coding with A* coding

[2:40]
Accurate Quantization of Measures via Interacting Particle-based Optimization

(ends 2:45 PM)

Spotlights
1:15-1:45

[1:15]
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective

[1:20]
Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering

[1:25]
Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity

[1:30]
Understanding the unstable convergence of gradient descent

[1:35]
Federated Minimax Optimization: Improved Convergence Analyses and Algorithms

[1:40]
Inductive Matrix Completion: No Bad Local Minima and a Fast Algorithm

Orals
1:45-2:05

[1:45]
FedNest: Federated Bilevel, Minimax, and Compositional Optimization

Spotlights
2:05-2:35

[2:05]
AdaGrad Avoids Saddle Points

[2:10]
Fast and Provable Nonconvex Tensor RPCA

[2:15]
On Convergence of Gradient Descent Ascent: A Tight Local Analysis

[2:20]
Convergence Rates of Non-Convex Stochastic Gradient Descent Under a Generic Lojasiewicz Condition and Local Smoothness

[2:25]
A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization

[2:30]
Anticorrelated Noise Injection for Improved Generalization

(ends 2:45 PM)

Spotlights
1:15-1:45

[1:15]
Model-Free Opponent Shaping

[1:20]
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning

[1:25]
Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation

[1:30]
Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning

[1:35]
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games

[1:40]
Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning

Orals
1:45-2:05

[1:45]
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence

Spotlights
2:05-2:45

[2:05]
Self-Organized Polynomial-Time Coordination Graphs

[2:10]
Individual Reward Assisted Multi-Agent Reinforcement Learning

[2:15]
Generalized Beliefs for Cooperative AI

[2:20]
Greedy when Sure and Conservative when Uncertain about the Opponents

[2:25]
Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning

[2:30]
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy

[2:35]
Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games

[2:40]
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis

(ends 2:45 PM)

Spotlights
1:15-1:45

[1:15]
Modeling Irregular Time Series with Continuous Recurrent Units

[1:20]
TACTiS: Transformer-Attentional Copulas for Time Series

[1:25]
CerDEQ: Certifiable Deep Equilibrium Model

[1:30]
Approximately Equivariant Networks for Imperfectly Symmetric Dynamics

[1:35]
IDYNO: Learning Nonparametric DAGs from Interventional Dynamic Data

[1:40]
GSmooth: Certified Robustness against Semantic Transformations via Generalized Randomized Smoothing

Orals
1:45-2:05

[1:45]
Neural Laplace: Learning diverse classes of differential equations in the Laplace domain

Spotlights
2:05-2:45

[2:05]
Improving Language Models by Retrieving from Trillions of Tokens

[2:10]
Closed-Form Diffeomorphic Transformations for Time Series Alignment

[2:15]
Removing Batch Normalization Boosts Adversarial Training

[2:20]
Forget-free Continual Learning with Winning Subnetworks

[2:25]
FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting

[2:30]
Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers

[2:35]
On the Practicality of Deterministic Epistemic Uncertainty

[2:40]
Combining Diverse Feature Priors

(ends 2:45 PM)

Orals
1:15-1:35

[1:15]
Cooperative Online Learning in Stochastic and Adversarial MDPs

Spotlights
1:35-2:00

[1:35]
Simple and near-optimal algorithms for hidden stratification and multi-group learning

[1:40]
Being Properly Improper

[1:45]
Neural Network Pruning Denoises the Features and Makes Local Connectivity Emerge in Visual Tasks

[1:50]
On the Finite-Time Complexity and Practical Computation of Approximate Stationarity Concepts of Lipschitz Functions

[1:55]
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

Orals
2:00-2:20

[2:00]
Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces

Spotlights
2:20-2:45

[2:20]
Minimax M-estimation under Adversarial Contamination

[2:25]
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits

[2:30]
Efficiently Learning the Topology and Behavior of a Networked Dynamical System Via Active Queries

[2:35]
Boosting Graph Structure Learning with Dummy Nodes

[2:40]
Lazy Estimation of Variable Importance for Large Neural Networks

(ends 2:45 PM)

3:30 p.m.

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations

(ends 5:30 PM)

4 p.m.

WED 20 JUL

3:30 a.m.

Breakfast on your own

4 a.m.

(ends 4:00 PM)

6 a.m.

Invited Talk:

Regina Barzilay

(ends 7:00 AM)

7 a.m.

Coffee Break

7:30 a.m.

Spotlights
7:30-8:05

[7:30]
Towards understanding how momentum improves generalization in deep learning

[7:35]
What Can Linear Interpolation of Neural Network Loss Landscapes Tell Us?

[7:40]
Deep equilibrium networks are sensitive to initialization statistics

[7:45]
Scaling-up Diverse Orthogonal Convolutional Networks by a Paraunitary Framework

[7:50]
Stability Based Generalization Bounds for Exponential Family Langevin Dynamics

[7:55]
Local Augmentation for Graph Neural Networks

[8:00]
On Non-local Convergence Analysis of Deep Linear Networks

Orals
8:05-8:25

[8:05]
Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum

Spotlights
8:25-9:00

[8:25]
Diversified Adversarial Attacks based on Conjugate Gradient Method

[8:30]
On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features

[8:35]
On the Equivalence Between Temporal and Static Equivariant Graph Representations

[8:40]
Robust Training under Label Noise by Over-parameterization

[8:45]
Implicit Bias of the Step Size in Linear Diagonal Neural Networks

[8:50]
Extended Unconstrained Features Model for Exploring Deep Neural Collapse

[8:55]
Score-Guided Intermediate Level Optimization: Fast Langevin Mixing for Inverse Problems

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
Weisfeiler-Lehman Meets Gromov-Wasserstein

[7:35]
GenLabel: Mixup Relabeling using Generative Models

[7:40]
When and How Mixup Improves Calibration

[7:45]
On Transportation of Mini-batches: A Hierarchical Approach

[7:50]
VariGrow: Variational Architecture Growing for Task-Agnostic Continual Learning based on Bayesian Novelty

[7:55]
Beyond Images: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features

[8:00]
A Model-Agnostic Randomized Learning Framework based on Random Hypothesis Subspace Sampling

Orals
8:05-8:25

[8:05]
Stable Conformal Prediction Sets

Spotlights
8:25-9:00

[8:25]
Rethinking Fano's Inequality in Ensemble Learning

[8:30]
FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers

[8:35]
Improving Mini-batch Optimal Transport via Partial Transportation

[8:40]
Near-optimal rate of consistency for linear models with missing values

[8:45]
Permutation Search of Tensor Network Structures via Local Sampling

[8:50]
Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?

[8:55]
DNNR: Differential Nearest Neighbors Regression

(ends 9:00 AM)

Spotlights
7:30-8:00

[7:30]
Learning Domain Adaptive Object Detection with Probabilistic Teacher

[7:35]
Adaptive Data Analysis with Correlated Observations

[7:40]
Efficient PAC Learning from the Crowd with Pairwise Comparisons

[7:45]
On the Statistical Benefits of Curriculum Learning

[7:50]
Feature and Parameter Selection in Stochastic Linear Bandits

[7:55]
Disentangled Federated Learning for Tackling Attributes Skew via Invariant Aggregation and Diversity Transferring

Orals
8:00-8:20

[8:00]
A new similarity measure for covariate shift with applications to nonparametric regression

Spotlights
8:20-9:00

[8:20]
Contextual Bandits with Large Action Spaces: Made Practical

[8:25]
Identifiability Conditions for Domain Adaptation

[8:30]
Streaming Algorithms for High-Dimensional Robust Statistics

[8:35]
Popular decision tree algorithms are provably noise tolerant

[8:40]
Understanding and Improving Knowledge Graph Embedding for Entity Alignment

[8:45]
Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning

[8:50]
Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees

[8:55]
Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification

[7:35]
One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams

[7:40]
Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration

[7:45]
ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases

[7:50]
Variational Mixtures of ODEs for Inferring Cellular Gene Expression Dynamics

[7:55]
Bayesian Imitation Learning for End-to-End Mobile Manipulation

[8:00]
De novo mass spectrometry peptide sequencing with a transformer model

Orals
8:05-8:25

[8:05]
Learning inverse folding from millions of predicted structures

Spotlights
8:25-9:00

[8:25]
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance

[8:30]
MAE-DET: Revisiting Maximum Entropy Principle in Zero-Shot NAS for Efficient Object Detection

[8:35]
Proximal Exploration for Model-guided Protein Sequence Design

[8:40]
Tranception: Protein Fitness Prediction with Autoregressive Transformers and Inference-time Retrieval

[8:45]
How to Fill the Optimum Set? Population Gradient Descent with Harmless Diversity

[8:50]
Examining Scaling and Transfer of Language Model Architectures for Machine Translation

[8:55]
State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks

(ends 9:00 AM)

Orals
7:30-7:50

[7:30]
How Tempering Fixes Data Augmentation in Bayesian Neural Networks

Spotlights
7:50-8:15

[7:50]
Surrogate Likelihoods for Variational Annealed Importance Sampling

[7:55]
Nonparametric Sparse Tensor Factorization with Hierarchical Gamma Processes

[8:00]
Fat–Tailed Variational Inference with Anisotropic Tail Adaptive Flows

[8:05]
Variational Sparse Coding with Learned Thresholding

[8:10]
Structured Stochastic Gradient MCMC

Orals
8:15-8:35

[8:15]
BAMDT: Bayesian Additive Semi-Multivariate Decision Trees for Nonparametric Regression

Spotlights
8:35-8:50

[8:35]
Variational Inference with Locally Enhanced Bounds for Hierarchical Models

[8:40]
Centroid Approximation for Bootstrap: Improving Particle Quality at Inference

[8:45]
Deep Reference Priors: What is the best way to pretrain a model?

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
Modeling Strong and Human-Like Gameplay with KL-Regularized Search

[7:35]
Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters

[7:40]
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning

[7:45]
Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search

[7:50]
Generalized Data Distribution Iteration

[7:55]
Optimizing Tensor Network Contraction Using Reinforcement Learning

[8:00]
History Compression via Language Models in Reinforcement Learning

Orals
8:05-8:25

[8:05]
REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer

Spotlights
8:25-9:00

[8:25]
LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation

[8:30]
Efficient Learning for AlphaZero via Path Consistency

[8:35]
A data-driven approach for learning to control computers

[8:40]
Zero-Shot Reward Specification via Grounded Natural Language

[8:45]
How to Stay Curious while avoiding Noisy TVs using Aleatoric Uncertainty Estimation

[8:50]
Model-Value Inconsistency as a Signal for Epistemic Uncertainty

[8:55]
Improving Policy Optimization with Generalist-Specialist Learning

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
On Numerical Integration in Neural Ordinary Differential Equations

[7:35]
Reverse Engineering the Neural Tangent Kernel

[7:40]
Principled Knowledge Extrapolation with GANs

[7:45]
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity

[7:50]
Data Augmentation as Feature Manipulation

[7:55]
Convolutional and Residual Networks Provably Contain Lottery Tickets

[8:00]
Feature Learning and Signal Propagation in Deep Neural Networks

Orals
8:05-8:25

[8:05]
Robust Training of Neural Networks Using Scale Invariant Architectures

Spotlights
8:25-9:00

[8:25]
Understanding Contrastive Learning Requires Incorporating Inductive Biases

[8:30]
Implicit Regularization with Polynomial Growth in Deep Tensor Factorization

[8:35]
Deep Network Approximation in Terms of Intrinsic Parameters

[8:40]
Coin Flipping Neural Networks

[8:45]
Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint

[8:50]
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize

[8:55]
SE(3) Equivariant Graph Neural Networks with Complete Local Frames

(ends 9:00 AM)

Spotlights
7:30-8:05

[7:30]
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings

[7:35]
Label-Free Explainability for Unsupervised Models

[7:40]
Towards Theoretical Analysis of Transformation Complexity of ReLU DNNs

[7:45]
A Study of Face Obfuscation in ImageNet

[7:50]
Fair Representation Learning through Implicit Path Alignment

[7:55]
Mitigating Neural Network Overconfidence with Logit Normalization

[8:00]
Learning fair representation with a parametric integral probability metric

Orals
8:05-8:25

[8:05]
Privacy for Free: How does Dataset Condensation Help Privacy?

Spotlights
8:25-9:00

[8:25]
Fair Generalized Linear Models with a Convex Penalty

[8:30]
HyperPrompt: Prompt-based Task-Conditioning of Transformers

[8:35]
Validating Causal Inference Methods

[8:40]
The Multivariate Community Hawkes Model for Dependent Relational Events in Continuous-time Networks

[8:45]
Scalable Deep Gaussian Markov Random Fields for General Graphs

[8:50]
Anytime Information Cascade Popularity Prediction via Self-Exciting Processes

[8:55]
Deep Variational Graph Convolutional Recurrent Network for Multivariate Time Series Anomaly Detection

(ends 9:00 AM)

Orals
7:30-7:50

[7:30]
Adapting to Mixing Time in Stochastic Optimization with Markovian Data

Spotlights
7:50-8:15

[7:50]
Fast Composite Optimization and Statistical Recovery in Federated Learning

[7:55]
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity

[8:00]
Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning

[8:05]
Optimal Algorithms for Stochastic Multi-Level Compositional Optimization

[8:10]
Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications

Orals
8:15-8:35

[8:15]
Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent

Spotlights
8:35-9:00

[8:35]
Statistical inference with implicit SGD: proximal Robbins-Monro vs. Polyak-Ruppert

[8:40]
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!

[8:45]
Communication-Efficient Adaptive Federated Learning

[8:50]
RECAPP: Crafting a More Efficient Catalyst for Convex Optimization

[8:55]
Kill a Bird with Two Stones: Closing the Convergence Gaps in Non-Strongly Convex Optimization by Directly Accelerated SVRG with Double Compensation and Snapshots

(ends 9:00 AM)

Orals
7:30-7:50

[7:30]
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes

Spotlights
7:50-8:10

[7:50]
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces

[7:55]
Extracting Latent State Representations with Linear Dynamics from Rich Observations

[8:00]
For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria

[8:05]
Consensus Multiplicative Weights Update: Learning to Learn using Projector-based Game Signatures

Orals
8:10-8:30

[8:10]
Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits

Spotlights
8:30-8:55

[8:30]
Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses

[8:35]
Learning to Infer Structures of Network Games

[8:40]
Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation

[8:45]
Near-Optimal Learning of Extensive-Form Games with Imperfect Information

[8:50]
Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation

(ends 9:00 AM)

8 a.m.

9 a.m.

Lunch Break - on your own

10:15 a.m.

Spotlights
10:15-10:50

[10:15]
From data to functa: Your data point is a function and you can treat it like one

[10:20]
DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training

[10:25]
Differentiable Top-k Classification Learning

[10:30]
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks

[10:35]
Characterizing and Overcoming the Greedy Nature of Learning in Multi-modal Deep Neural Networks

[10:40]
Training Your Sparse Neural Network Better with Any Mask

[10:45]
Federated Learning with Positive and Unlabeled Data

Orals
10:50-11:10

[10:50]
Generating 3D Molecules for Target Protein Binding

Spotlights
11:10-11:45

[11:10]
Sparse Double Descent: Where Network Pruning Aggravates Overfitting

[11:15]
Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs

[11:20]
Revisiting Consistency Regularization for Deep Partial Label Learning

[11:25]
Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification

[11:30]
A Unified Weight Initialization Paradigm for Tensorial Convolutional Neural Networks

[11:35]
PLATINUM: Semi-Supervised Model Agnostic Meta-Learning using Submodular Mutual Information

[11:40]
Multicoated Supermasks Enhance Hidden Networks

(ends 11:45 AM)

Spotlights
10:15-10:50

[10:15]
Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits

[10:20]
On the Finite-Time Performance of the Knowledge Gradient Algorithm

[10:25]
Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control

[10:30]
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers

[10:35]
No-Regret Learning in Time-Varying Zero-Sum Games

[10:40]
Achieving Minimax Rates in Pool-Based Batch Active Learning

[10:45]
Active Multi-Task Representation Learning

Orals
10:50-11:10

[10:50]
Active fairness auditing

Spotlights
11:10-11:45

[11:10]
Metric-Fair Active Learning

[11:15]
Metric-Fair Classifier Derandomization

[11:20]
Interactively Learning Preference Constraints in Linear Bandits

[11:25]
Convergence of Uncertainty Sampling for Active Learning

[11:30]
Thompson Sampling for Robust Transfer in Multi-Task Bandits

[11:35]
Constants Matter: The Performance Gains of Active Learning

[11:40]
Cross-Space Active Learning on Graph Convolutional Networks

(ends 11:45 AM)

Spotlights
10:15-10:45

[10:15]
MemSR: Training Memory-efficient Lightweight Model for Image Super-Resolution

[10:20]
PINs: Progressive Implicit Networks for Multi-Scale Neural Representations

[10:25]
Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders

[10:30]
Generative Coarse-Graining of Molecular Conformations

[10:35]
LIMO: Latent Inceptionism for Targeted Molecule Generation

[10:40]
Learning to Separate Voices by Spatial Regions

Orals
10:45-11:05

[10:45]
3DLinker: An E(3) Equivariant Variational Autoencoder for Molecular Linker Design

Spotlights
11:05-11:40

[11:05]
3D Infomax improves GNNs for Molecular Property Prediction

[11:10]
Biological Sequence Design with GFlowNets

[11:15]
Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets

[11:20]
Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer

[11:25]
Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense

[11:30]
Path-Aware and Structure-Preserving Generation of Synthetically Accessible Molecules

[11:35]
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction

(ends 11:45 AM)

Spotlights
10:15-10:45

[10:15]
Decomposing Temporal High-Order Interactions via Latent ODEs

[10:20]
Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets

[10:25]
DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck

[10:30]
End-to-End Balancing for Causal Continuous Treatment-Effect Estimation

[10:35]
Role-based Multiplex Network Embedding

[10:40]
Measure Estimation in the Barycentric Coding Model

Orals
10:45-11:05

[10:45]
RieszNet and ForestRiesz: Automatic Debiased Machine Learning with Neural Nets and Random Forests

Spotlights
11:05-11:35

[11:05]
Counterfactual Transportability: A Formal Approach

[11:10]
Identification of Linear Non-Gaussian Latent Hierarchical Structure

[11:15]
COAT: Measuring Object Compositionality in Emergent Representations

[11:20]
Generalization and Robustness Implications in Object-Centric Learning

[11:25]
NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning

[11:30]
Action-Sufficient State Representation Learning for Control with Structural Constraints

(ends 11:45 AM)

Orals
10:15-10:35

[10:15]
Bayesian Continuous-Time Tucker Decomposition

Spotlights
10:35-11:00

[10:35]
Approximate Bayesian Computation with Domain Expert in the Loop

[10:40]
Discrete Probabilistic Inverse Optimal Transport

[10:45]
Easy Variational Inference for Categorical Models via an Independent Binary Approximation

[10:50]
Streaming Inference for Infinite Feature Models

[10:55]
Optimizing Sequential Experimental Design with Deep Reinforcement Learning

Orals
11:00-11:20

[11:00]
Function-space Inference with Sparse Implicit Processes

Spotlights
11:20-11:45

[11:20]
Variational Inference for Infinitely Deep Neural Networks

[11:25]
Personalized Federated Learning via Variational Bayesian Inference

[11:30]
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling

[11:35]
Bayesian Deep Embedding Topic Meta-Learner

[11:40]
Efficient Approximate Inference for Stationary Kernel on Frequency Domain

(ends 11:45 AM)

Spotlights
10:15-10:45

[10:15]
Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning

[10:20]
Analysis of Stochastic Processes through Replay Buffers

[10:25]
Cascaded Gaps: Towards Logarithmic Regret for Risk-Sensitive Reinforcement Learning

[10:30]
Communicating via Markov Decision Processes

[10:35]
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation

[10:40]
DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning

Orals
10:45-11:05

[10:45]
Planning with Diffusion for Flexible Behavior Synthesis

Spotlights
11:05-11:40

[11:05]
A Temporal-Difference Approach to Policy Gradient Estimation

[11:10]
MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer

[11:15]
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency

[11:20]
Actor-Critic based Improper Reinforcement Learning

[11:25]
On the Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs

[11:30]
The Geometry of Robust Value Functions

[11:35]
Denoised MDPs: Learning World Models Better Than the World Itself

(ends 11:45 AM)

Orals
10:15-10:35

[10:15]
Tight and Robust Private Mean Estimation with Few Users

Spotlights
10:35-11:00

[10:35]
QSFL: A Two-Level Uplink Communication Optimization Framework for Federated Learning

[10:40]
Robustness and Accuracy Could Be Reconcilable by (Proper) Definition

[10:45]
Sanity Simulations for Saliency Methods

[10:50]
Out-of-Distribution Detection with Deep Nearest Neighbors

[10:55]
Differentially Private Maximal Information Coefficients

Orals
11:00-11:20

[11:00]
Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data

Spotlights
11:20-11:45

[11:20]
On the Difficulty of Defending Self-Supervised Learning against Model Extraction

[11:25]
Adversarial Attack and Defense for Non-Parametric Two-Sample Tests

[11:30]
Certified Adversarial Robustness Under the Bounded Support Set

[11:35]
Predicting Out-of-Distribution Error with the Projection Norm

[11:40]
Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization

(ends 11:45 AM)

Spotlights
10:15-10:50

[10:15]
Generating Distributional Adversarial Examples to Evade Statistical Detectors

[10:20]
Improving Out-of-Distribution Robustness via Selective Augmentation

[10:25]
Modeling Adversarial Noise for Adversarial Training

[10:30]
Improving Adversarial Robustness via Mutual Information Estimation

[10:35]
FOCUS: Familiar Objects in Common and Uncommon Settings

[10:40]
Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization

[10:45]
Test-Time Training Can Close the Natural Distribution Shift Performance Gap in Deep Learning Based Compressed Sensing

Orals
10:50-11:10

[10:50]
A Dynamical System Perspective for Lipschitz Neural Networks

Spotlights
11:10-11:45

[11:10]
Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)

[11:15]
Neurotoxin: Durable Backdoors in Federated Learning

[11:20]
Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense

[11:25]
Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching

[11:30]
Fast Lossless Neural Compression with Integer-Only Discrete Flows

[11:35]
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

[11:40]
SCHA-VAE: Hierarchical Context Aggregation for Few-Shot Generation

(ends 11:45 AM)

Orals
10:15-10:35

[10:15]
Generative Trees: Adversarial and Copycat

Spotlights
10:35-11:00

[10:35]
A Resilient Distributed Boosting Algorithm

[10:40]
Online Learning and Pricing with Reusable Resources: Linear Bandits with Sub-Exponential Rewards

[10:45]
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation

[10:50]
Congested Bandits: Optimal Routing via Short-term Resets

[10:55]
Stochastic Rising Bandits

Orals
11:00-11:20

[11:00]
Agnostic Learnability of Halfspaces via Logistic Loss

Spotlights
11:20-11:45

[11:20]
Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension

[11:25]
PDE-Based Optimal Strategy for Unconstrained Online Learning

[11:30]
Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Lojasiewicz Functions when the Non-Convexity is Averaged-Out

[11:35]
On Learning Mixture of Linear Regressions in the Non-Realizable Setting

[11:40]
Random Forest Density Estimation

(ends 11:45 AM)

Spotlights
10:15-10:50

[10:15]
DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning

[10:20]
Unsupervised Time-Series Representation Learning with Iterative Bilinear Temporal-Spectral Fusion

[10:25]
RetrievalGuard: Provably Robust 1-Nearest Neighbor Image Retrieval

[10:30]
Modeling Structure with Undirected Neural Networks

[10:35]
Certified Neural Network Watermarks with Randomized Smoothing

[10:40]
Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation

[10:45]
Adversarial Vulnerability of Randomized Ensembles

Orals
10:50-11:10

[10:50]
Robustness Verification for Contrastive Learning

Spotlights
11:10-11:45

[11:10]
The CLRS Algorithmic Reasoning Benchmark

[11:15]
Finding Global Homophily in Graph Neural Networks When Meeting Heterophily

[11:20]
Understanding Robust Generalization in Learning Regular Languages

[11:25]
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification

[11:30]
AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems

[11:35]
A Modern Self-Referential Weight Matrix That Learns to Modify Itself

[11:40]
Short-Term Plasticity Neurons Learning to Learn and Forget

(ends 11:45 AM)

11:45 a.m.

Coffee Break

12:15 p.m.

1 p.m.

1:15 p.m.

Short Break

1:30 p.m.

Spotlights
1:30-2:05

[1:30]
$p$-Laplacian Based Graph Neural Networks

[1:35]
Equivariant Quantum Graph Circuits

[1:40]
A Theoretical Comparison of Graph Neural Network Extensions

[1:45]
Variational On-the-Fly Personalization

[1:50]
Deep symbolic regression for recurrence prediction

[1:55]
Geometric Multimodal Contrastive Representation Learning

[2:00]
Universality of Winning Tickets: A Renormalization Group Perspective

Orals
2:05-2:25

[2:05]
Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition

Spotlights
2:25-3:00

[2:25]
Loss Function Learning for Domain Generalization by Implicit Gradient

[2:30]
GraphFM: Improving Large-Scale GNN Training via Feature Momentum

[2:35]
Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling

[2:40]
A Differential Entropy Estimator for Training Neural Networks

[2:45]
Scaling Out-of-Distribution Detection for Real-World Settings

[2:50]
Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations

[2:55]
SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators

(ends 3:00 PM)

Spotlights
1:30-2:05

[1:30]
The dynamics of representation learning in shallow, non-linear autoencoders

[1:35]
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

[1:40]
Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing

[1:45]
Failure and success of the spectral bias prediction for Laplace Kernel Ridge Regression: the case of low-dimensional data

[1:50]
Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

[1:55]
Universal Joint Approximation of Manifolds and Densities by Simple Injective Flows

[2:00]
Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis

Orals
2:05-2:25

[2:05]
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

Spotlights
2:25-3:00

[2:25]
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks

[2:30]
Efficient Learning of CNNs using Patch Based Features

[2:35]
Neural Tangent Kernel Analysis of Deep Narrow Neural Networks

[2:40]
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)

[2:45]
Fully-Connected Network on Noncompact Symmetric Space and Ridgelet Transform based on Helgason-Fourier Analysis

[2:50]
Non-Vacuous Generalisation Bounds for Shallow Neural Networks

[2:55]
Maslow's Hammer in Catastrophic Forgetting: Node Re-Use vs. Node Activation

(ends 3:00 PM)

Spotlights
1:30-2:05

[1:30]
SoQal: Selective Oracle Questioning for Consistency Based Active Learning of Cardiac Signals

[1:35]
Matching Structure for Dual Learning

[1:40]
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

[1:45]
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone

[1:50]
Inducing Causal Structure for Interpretable Neural Networks

[1:55]
SDQ: Stochastic Differentiable Quantization with Mixed Precision

[2:00]
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages

Orals
2:05-2:25

[2:05]
Re-evaluating Word Mover's Distance

Spotlights
2:25-3:00

[2:25]
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation

[2:30]
Robust alignment of cross-session recordings of neural population activity by behaviour via unsupervised domain adaptation

[2:35]
Symmetric Machine Theory of Mind

[2:40]
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance

[2:45]
LCANets: Lateral Competition Improves Robustness Against Corruption and Attack

[2:50]
Reconstructing Nonlinear Dynamical Systems from Multi-Modal Time Series

[2:55]
Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps

(ends 3:00 PM)