Timezone: America/Los_Angeles
SUN 17 JUL
7 a.m.  (ends 4:00 PM)
9:30 a.m.  Expo Talk Panel: (ends 10:15 AM)
10:20 a.m.  Expo Demonstration: (ends 11:20 AM)
11 a.m.  Cancelled: (ends 11:45 AM)
11:30 a.m.  Coffee Break
12:15 p.m.  Expo Demonstration: (ends 1:00 PM)
MON 18 JUL
4 a.m.  (ends 3:00 PM)
6:30 a.m.  Tutorial: (ends 8:45 AM)
7 a.m.  Coffee Break
9 a.m.  Lunch Break - on your own
10 a.m.  Tutorial: (ends 12:00 PM)
noon  Coffee Break
12:30 p.m.  Tutorial: (ends 2:50 PM)
TUE 19 JUL
3:30 a.m.  Breakfast on your own
4 a.m.  (ends 4:00 PM)
7 a.m.  Coffee Break
7:30 a.m.
Spotlights 7:30-8:05
[7:30]
Differentially Private Approximate Quantiles
[7:35]
Fairness Interventions as (Dis)Incentives for Strategic Manipulation
[7:40]
Robust Models Are More Interpretable Because Attributions Look Normal
[7:45]
Sequential Covariate Shift Detection Using Classifier Two-Sample Tests
[7:50]
A Joint Exponential Mechanism For Differentially Private Top-$k$
[7:55]
Transfer Learning In Differential Privacy's Hybrid-Model
[8:00]
Robust Kernel Density Estimation with Median-of-Means principle
Orals 8:05-8:25
[8:05]
Bounding Training Data Reconstruction in Private (Deep) Learning
Spotlights 8:25-9:00
[8:25]
Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks
[8:30]
FriendlyCore: Practical Differentially Private Aggregation
[8:35]
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder
[8:40]
Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification
[8:45]
Public Data-Assisted Mirror Descent for Private Model Training
[8:50]
Low-Complexity Deep Convolutional Neural Networks on Fully Homomorphic Encryption Using Multiplexed Parallel Convolutions
[8:55]
Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
Tackling covariate shift with node-based Bayesian neural networks
Spotlights 7:50-8:10
[7:50]
Why the Rich Get Richer? On the Balancedness of Random Partition Models
[7:55]
A Completely Tuning-Free and Robust Approach to Sparse Precision Matrix Estimation
[8:00]
Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems
[8:05]
Calibrated Learning to Defer with One-vs-All Classifiers
Orals 8:10-8:30
[8:10]
Tractable Uncertainty for Structure Learning
Spotlights 8:30-8:55
[8:30]
DNA: Domain Generalization with Diversified Neural Averaging
[8:35]
Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces
[8:40]
DynaMixer: A Vision MLP Architecture with Dynamic Mixing
[8:45]
Channel Importance Matters in Few-Shot Image Classification
[8:50]
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
(ends 9:00 AM)
Spotlights 7:30-8:00
[7:30]
Dynamic Regret of Online Markov Decision Processes
[7:35]
On the Impossibility of Learning to Cooperate with Adaptive Partner Strategies in Repeated Games
[7:40]
Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning
[7:45]
Provable Reinforcement Learning with a Short-Term Memory
[7:50]
Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer
[7:55]
Mirror Learning: A Unifying Framework of Policy Optimisation
Orals 8:00-8:20
[8:00]
Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP
Spotlights 8:20-8:50
[8:20]
Learning Infinite-horizon Average-reward Markov Decision Process with Constraints
[8:25]
A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning
[8:30]
Langevin Monte Carlo for Contextual Bandits
[8:35]
Prompting Decision Transformer for Few-Shot Policy Generalization
[8:40]
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
[8:45]
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them
Spotlights 7:50-8:10
[7:50]
ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
[7:55]
Provably Adversarially Robust Nearest Prototype Classifiers
[8:00]
Certifying Out-of-Domain Generalization for Blackbox Functions
[8:05]
Intriguing Properties of Input-Dependent Randomized Smoothing
Orals 8:10-8:30
[8:10]
To Smooth or Not? When Label Smoothing Meets Noisy Labels
Spotlights 8:30-8:55
[8:30]
Evaluating the Adversarial Robustness of Adaptive Test-time Defenses
[8:35]
On the Generalization Analysis of Adversarial Learning
[8:40]
Demystifying the Adversarial Robustness of Random Transformation Defenses
[8:45]
Double Sampling Randomized Smoothing
[8:50]
TPC: Transformation-Specific Smoothing for Point Cloud Models
(ends 9:00 AM)
Spotlights 7:30-8:00
[7:30]
Certified Robustness Against Natural Language Attacks by Causal Intervention
[7:35]
A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
[7:40]
On the Learning of Non-Autoregressive Transformers
[7:45]
Latent Diffusion Energy-Based Model for Interpretable Text Modelling
[7:50]
UNIREX: A Unified Learning Framework for Language Model Rationale Extraction
[7:55]
Black-Box Tuning for Language-Model-as-a-Service
Orals 8:00-8:20
[8:00]
Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information
Spotlights 8:20-9:00
[8:20]
Co-training Improves Prompt-based Learning for Large Language Models
[8:25]
Directed Acyclic Transformer for Non-Autoregressive Machine Translation
[8:30]
StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models
[8:35]
Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology
[8:40]
Generative Cooperative Networks for Natural Language Generation
[8:45]
What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?
[8:50]
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
[8:55]
ROCK: Causal Inference Principles for Reasoning about Commonsense Causality
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
Exact Optimal Accelerated Complexity for Fixed-Point Iterations
Spotlights 7:50-8:15
[7:50]
Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
[7:55]
NysADMM: faster composite convex optimization via low-rank approximation
[8:00]
FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning
[8:05]
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers
[8:10]
Pairwise Conditional Gradients without Swap Steps and Sparser Kernel Herding
Orals 8:15-8:35
[8:15]
Continuous-Time Analysis of Accelerated Gradient Methods via Conservation Laws in Dilated Coordinate Systems
Spotlights 8:35-9:00
[8:35]
Only tails matter: Average-Case Universality and Robustness in the Convex Regime
[8:40]
Batch Greenkhorn Algorithm for Entropic-Regularized Multimarginal Optimal Transport: Linear Rate of Convergence and Iteration Complexity
[8:45]
Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets
[8:50]
Neural Fisher Discriminant Analysis: Optimal Neural Network Embeddings in Polynomial Time
[8:55]
Active Sampling for Min-Max Fairness
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
Online Learning for Min Sum Set Cover and Pandora’s Box
Spotlights 7:50-8:15
[7:50]
Smoothed Adversarial Linear Contextual Bandits with Knapsacks
[7:55]
Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback
[8:00]
Thompson Sampling for (Combinatorial) Pure Exploration
[8:05]
Revisiting Online Submodular Minimization: Gap-Dependent Regret Bounds, Best of Both Worlds and Adversarial Robustness
[8:10]
Rotting Infinitely Many-Armed Bandits
Orals 8:15-8:35
[8:15]
Batched Dueling Bandits
Spotlights 8:35-9:00
[8:35]
Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent
[8:40]
Consistent Polyhedral Surrogates for Top-k Classification and Variants
[8:45]
Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models
[8:50]
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
[8:55]
Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Multi-Task Learning as a Bargaining Game
[7:35]
Frustratingly Easy Transferability Estimation
[7:40]
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
[7:45]
A Difference Standardization Method for Mutual Transfer Learning
[7:50]
Improving Task-free Continual Learning by Distributionally Robust Memory Evolution
[7:55]
A Multi-objective / Multi-task Learning Framework Induced by Pareto Stationarity
[8:00]
Sparse Invariant Risk Minimization
Orals 8:05-8:25
[8:05]
Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
Spotlights 8:25-9:00
[8:25]
A Closer Look at Smoothness in Domain Adversarial Training
[8:30]
Balancing Discriminability and Transferability for Source-Free Domain Adaptation
[8:35]
Model Agnostic Sample Reweighting for Out-of-Distribution Learning
[8:40]
Zero-shot AutoML with Pretrained Models
[8:45]
Efficient Variance Reduction for Meta-learning
[8:50]
Generalizing to Evolving Domains with Latent Structure-Aware Sequential Autoencoder
[8:55]
Partial disentanglement for domain adaptation
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Structural Entropy Guided Graph Hierarchical Pooling
[7:35]
Self-Supervised Representation Learning via Latent Graph Prediction
[7:40]
DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural Network for Traffic Flow Forecasting
[7:45]
Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets
[7:50]
Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning
[7:55]
Analyzing and Mitigating Interference in Neural Architecture Search
[8:00]
Reverse Engineering $\ell_p$ attacks: A block-sparse optimization approach with recovery guarantees
Orals 8:05-8:25
[8:05]
Unified Scaling Laws for Routed Language Models
Spotlights 8:25-9:00
[8:25]
DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks
[8:30]
A deep convolutional neural network that is invariant to time rescaling
[8:35]
LyaNet: A Lyapunov Framework for Training Neural ODEs
[8:40]
Transfer and Marginalize: Explaining Away Label Noise with Privileged Information
[8:45]
On Collective Robustness of Bagging Against Data Poisoning
[8:50]
Hindering Adversarial Attacks with Implicit Neural Representations
[8:55]
From Noisy Prediction to True Label: Noisy Prediction Calibration via Generative Model
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Exploring and Exploiting Hubness Priors for High-Quality GAN Latent Sampling
[7:35]
ButterflyFlow: Building Invertible Layers with Butterfly Matrices
[7:40]
Controlling Conditional Language Models without Catastrophic Forgetting
[7:45]
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
[7:50]
Structure-preserving GANs
[7:55]
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
[8:00]
Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models
Orals 8:05-8:25
[8:05]
Equivariant Diffusion for Molecule Generation in 3D
Spotlights 8:25-9:00
[8:25]
Forward Operator Estimation in Generative Models with Kernel Transfer Operators
[8:30]
Conditional GANs with Auxiliary Discriminative Classifier
[8:35]
Improved StyleGAN-v2 based Inversion for Out-of-Distribution Images
[8:40]
Matching Normalizing Flows and Probability Paths on Manifolds
[8:45]
Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization
[8:50]
Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization
[8:55]
Region-Based Semantic Factorization in GANs
(ends 9:00 AM)
9 a.m.  Lunch Break - on your own
10:30 a.m.
Spotlights 10:30-11:00
[10:30]
Online Continual Learning through Mutual Information Maximization
[10:35]
Learning Iterative Reasoning through Energy Minimization
[10:40]
DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks
[10:45]
PoF: Post-Training of Feature Extractor for Improving Generalization
[10:50]
Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation
[10:55]
Set Based Stochastic Subsampling
Orals 11:00-11:20
[11:00]
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Spotlights 11:20-11:55
[11:20]
Generalizing to New Physical Systems via Context-Informed Dynamics Model
[11:25]
Self-conditioning Pre-Trained Language Models
[11:30]
TAM: Topology-Aware Margin Loss for Class-Imbalanced Node Classification
[11:35]
Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization
[11:40]
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
[11:45]
Knowledge Base Question Answering by Case-based Reasoning over Subgraphs
[11:50]
When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee
(ends 12:00 PM)
Spotlights 10:30-11:05
[10:30]
Meaningfully debugging model mistakes using conceptual counterfactual explanations
[10:35]
Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments
[10:40]
Robust Counterfactual Explanations for Tree-Based Ensembles
[10:45]
A Rigorous Study of Integrated Gradients Method and Extensions to Internal Neuron Attributions
[10:50]
Estimating and Penalizing Induced Preference Shifts in Recommender Systems
[10:55]
Framework for Evaluating Faithfulness of Local Explanations
[11:00]
A Consistent and Efficient Evaluation Strategy for Attribution Methods
Orals 11:05-11:25
[11:05]
Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four
Spotlights 11:25-12:00
[11:25]
Label-Descriptive Patterns and Their Application to Characterizing Classification Errors
[11:30]
XAI for Transformers: Better Explanations through Conservative Propagation
[11:35]
Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding
[11:40]
Interpretable Off-Policy Learning via Hyperbox Search
[11:45]
Neuron Dependency Graphs: A Causal Abstraction of Neural Networks
[11:50]
On the Adversarial Robustness of Causal Algorithmic Recourse
[11:55]
Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations
(ends 12:00 PM)
Spotlights 10:30-11:00
[10:30]
Robust Group Synchronization via Quadratic Programming
[10:35]
UAST: Uncertainty-Aware Siamese Tracking
[10:40]
You Only Cut Once: Boosting Data Augmentation with a Single Cut
[10:45]
Generative Modeling for Multi-task Visual Learning
[10:50]
HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
[10:55]
Parametric Visual Program Induction with Function Modularization
Orals 11:00-11:20
[11:00]
Path-Gradient Estimators for Continuous Normalizing Flows
Spotlights 11:20-11:55
[11:20]
Variational Feature Pyramid Networks
[11:25]
Deep Neural Network Fusion via Graph Matching with Applications to Model Ensemble and Federated Learning
[11:30]
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
[11:35]
Neural Implicit Dictionary Learning via Mixture-of-Expert Training
[11:40]
Time Is MattEr: Temporal Self-supervision for Video Transformers
[11:45]
Benchmarking and Analyzing Point Cloud Classification under Corruptions
[11:50]
Understanding The Robustness in Vision Transformers
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Learning Mixtures of Linear Dynamical Systems
Spotlights 10:50-11:15
[10:50]
Massively Parallel $k$-Means Clustering for Perturbation Resilient Instances
[10:55]
Residual-Based Sampling for Online Outlier-Robust PCA
[11:00]
Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times
[11:05]
Streaming Algorithms for Support-Aware Histograms
[11:10]
Power-Law Escape Rate of SGD
Orals 11:15-11:35
[11:15]
Generalized Results for the Existence and Consistency of the MLE in the Bradley-Terry-Luce Model
Spotlights 11:35-12:00
[11:35]
Faster Algorithms for Learning Convex Functions
[11:40]
Feature selection using e-values
[11:45]
ActiveHedge: Hedge meets Active Learning
[11:50]
One-Pass Algorithms for MAP Inference of Nonsymmetric Determinantal Point Processes
[11:55]
Deciphering Lasso-based Classification Through a Large Dimensional Analysis of the Iterative Soft-Thresholding Algorithm
(ends 12:00 PM)
Spotlights 10:30-11:00
[10:30]
An iterative clustering algorithm for the Contextual Stochastic Block Model with optimality guarantees
[10:35]
Smoothed Adaptive Weighting for Imbalanced Semi-Supervised Learning: Improve Reliability Against Unknown Distribution Data
[10:40]
Class-Imbalanced Semi-Supervised Learning with Adaptive Thresholding
[10:50]
Meta-Learning Hypothesis Spaces for Sequential Decision-making
[10:55]
A Tighter Analysis of Spectral Clustering, and Beyond
Orals 11:00-11:20
[11:00]
Online Active Regression
Spotlights 11:20-11:55
[11:20]
On Finite-Sample Identifiability of Contrastive Learning-Based Nonlinear Independent Component Analysis
[11:25]
Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework
[11:30]
Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets
[11:35]
Confidence Score for Source-Free Unsupervised Domain Adaptation
[11:40]
Gradient Based Clustering
[11:45]
Global Optimization of K-Center Clustering
[11:50]
Latent Outlier Exposure for Anomaly Detection with Contaminated Data
(ends 12:00 PM)
Spotlights 10:30-11:00
[10:30]
Additive Gaussian Processes Revisited
[10:35]
Probabilistic ODE Solutions in Millions of Dimensions
[10:40]
Adaptive Gaussian Process Change Point Detection
[10:45]
Volatility Based Kernels and Moving Average Means for Accurate Forecasting with Gaussian Processes
[10:50]
Fenrir: Physics-Enhanced Regression for Initial Value Problems
[10:55]
Variational nearest neighbor Gaussian process
Orals 11:00-11:20
[11:00]
Preconditioning for Scalable Gaussian Process Hyperparameter Optimization
Spotlights 11:20-11:50
[11:20]
Spectral Representation of Robustness Measures for Optimization Under Input Uncertainty
[11:25]
Bayesian Optimization under Stochastic Delayed Feedback
[11:30]
Bayesian Optimization for Distributionally Robust Chance-constrained Problem
[11:35]
Efficient Distributionally Robust Bayesian Optimization with Worst-case Sensitivity
[11:40]
Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning
[11:45]
Scalable First-Order Bayesian Optimization via Structured Automatic Differentiation
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
Spotlights 10:50-11:15
[10:50]
AnyMorph: Learning Transferable Polices By Inferring Agent Morphology
[10:55]
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations
[11:00]
Stabilizing Off-Policy Deep Reinforcement Learning from Pixels
[11:05]
Influence-Augmented Local Simulators: a Scalable Solution for Fast Deep RL in Large Networked Systems
[11:10]
CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer
Orals 11:15-11:35
[11:15]
Offline RL Policies Should Be Trained to be Adaptive
Spotlights 11:35-12:00
[11:35]
Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control
[11:40]
PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration
[11:45]
Supervised Off-Policy Ranking
[11:50]
The Primacy Bias in Deep Reinforcement Learning
[11:55]
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning
Spotlights 10:50-11:10
[10:50]
Stochastic Reweighted Gradient Descent
[10:55]
Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood
[11:00]
Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging
[11:05]
FedNL: Making Newton-Type Methods Applicable to Federated Learning
Orals 11:10-11:30
[11:10]
Solving Stackelberg Prediction Game with Least Squares Loss via Spherically Constrained Least Squares Reformulation
Spotlights 11:30-11:55
[11:30]
Dimension-free Complexity Bounds for High-order Nonconvex Finite-sum Optimization
[11:35]
Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems
[11:40]
Probabilistic Bilevel Coreset Selection
[11:45]
Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs
[11:50]
On Implicit Bias in Overparameterized Bilevel Optimization
(ends 12:00 PM)
Spotlights 10:30-11:05
[10:30]
pathGCN: Learning General Graph Spatial Operators from Paths
[10:35]
Graph-Coupled Oscillator Networks
[10:40]
HousE: Knowledge Graph Embedding with Householder Parameterization
[10:45]
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism
[10:50]
ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning
[10:55]
G$^2$CN: Graph Gaussian Convolution Networks with Concentrated Graph Filters
[11:00]
SpeqNets: Sparsity-aware permutation-equivariant graph networks
Orals 11:05-11:25
[11:05]
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Spotlights 11:25-11:55
[11:25]
Position Prediction as an Effective Pretraining Strategy
[11:30]
Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering
[11:35]
Deep and Flexible Graph Neural Architecture Search
[11:40]
GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks
[11:45]
Large-Scale Graph Neural Architecture Search
[11:50]
Optimization-Induced Graph Implicit Nonlinear Diffusion
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Robustness Implies Generalization via Data-Dependent Generalization Bounds
Spotlights 10:50-11:15
[10:50]
Learning to Hash Robustly, Guaranteed
[10:55]
Policy Gradient Method For Robust Reinforcement Learning
[11:00]
A query-optimal algorithm for finding counterfactuals
[11:05]
Linear Bandit Algorithms with Sublinear Time Complexity
[11:10]
Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra
Orals 11:15-11:35
[11:15]
Individual Preference Stability for Clustering
Spotlights 11:35-12:00
[11:35]
Correlated Quantization for Distributed Mean Estimation and Optimization
[11:40]
Multiple-Play Stochastic Bandits with Shareable Finite-Capacity Arms
[11:45]
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms
[11:50]
The Algebraic Path Problem for Graph Metrics
[11:55]
Steerable 3D Spherical Neurons
(ends 12:00 PM)
noon  Coffee Break
1 p.m.  Short Break
1:15 p.m.
Spotlights 1:15-1:50
[1:15]
Prototype Based Classification from Hierarchy to Fairness
[1:20]
Neural-Symbolic Models for Logical Queries on Knowledge Graphs
[1:25]
Deep Probability Estimation
[1:30]
Uncertainty Modeling in Generative Compressed Sensing
[1:35]
Going Deeper into Permutation-Sensitive Graph Neural Networks
[1:40]
Learning from Counterfactual Links for Link Prediction
[1:45]
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
Orals 1:50-2:10
[1:50]
Correct-N-Contrast: a Contrastive Approach for Improving Robustness to Spurious Correlations
Spotlights 2:10-2:45
[2:10]
Principal Component Flows
[2:15]
Bit Prioritization in Variational Autoencoders via Progressive Coding
[2:20]
Generative Flow Networks for Discrete Probabilistic Modeling
[2:25]
Diffusion bridges vector quantized variational autoencoders
[2:30]
Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization
[2:35]
Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation
[2:40]
Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack
(ends 2:45 PM)
Spotlights 1:15-1:50
[1:15]
Coordinated Double Machine Learning
[1:20]
Exploiting Independent Instruments: Identification and Distribution Generalization
[1:25]
Partial Counterfactual Identification from Observational and Experimental Data
[1:30]
On Measuring Causal Contributions via do-interventions
[1:35]
The Role of Deconfounding in Meta-learning
[1:40]
CITRIS: Causal Identifiability from Temporal Intervened Sequences
[1:45]
Online Balanced Experimental Design
Orals 1:50-2:10
[1:50]
Minimum Cost Intervention Design for Causal Effect Identification
Spotlights 2:10-2:45
[2:10]
Causal structure-based root cause analysis of outliers
[2:15]
Instrumental Variable Regression with Confounder Balancing
[2:20]
Causal Transformer for Estimating Counterfactual Outcomes
[2:25]
Causal Inference Through the Structural Causal Marginal Problem
[2:30]
Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions
[2:35]
Matching Learned Causal Effects of Neural Networks with Domain Priors
[2:40]
Inferring Cause and Effect in the Presence of Heteroscedastic Noise
(ends 2:45 PM)
Orals 1:15-1:35
[1:15]
POEM: Out-of-Distribution Detection with Posterior Sampling
Spotlights 1:35-1:55
[1:35]
Selective Network Linearization for Efficient Private Inference
[1:40]
Efficient Computation of Higher-Order Subgraph Attribution via Message Passing
[1:45]
A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization
[1:50]
Modular Conformal Calibration
Orals 1:55-2:15
[1:55]
Rethinking Image-Scaling Attacks: The Interplay Between Vulnerabilities in Machine Learning Systems
Spotlights 2:15-2:40
[2:15]
Context-Aware Drift Detection
[2:20]
Accelerating Shapley Explanation via Contributive Cooperator Selection
[2:25]
An Equivalence Between Data Poisoning and Byzantine Gradient Attacks
[2:30]
DAVINZ: Data Valuation using Deep Neural Networks at Initialization
[2:35]
Sample Efficient Learning of Predictors that Complement Humans
(ends 2:45 PM)
Orals 1:15-1:35
[1:15]
H-Consistency Bounds for Surrogate Loss Minimizers
Spotlights 1:35-2:00
[1:35]
Learning General Halfspaces with Adversarial Label Noise via Online Gradient Descent
[1:40]
The Teaching Dimension of Regularized Kernel Learners
[1:45]
Sparse Mixed Linear Regression with Guarantees: Taming an Intractable Problem with Invex Relaxation
[1:50]
TURF: Two-Factor, Universal, Robust, Fast Distribution Learning Algorithm
[1:55]
Multiclass learning with margin: exponential rates with no bias-variance trade-off
Orals 2:00-2:20
[2:00]
Refined Convergence Rates for Maximum Likelihood Estimation under Finite Mixture Models
Spotlights 2:20-2:45
[2:20]
High Probability Guarantees for Nonconvex Stochastic Gradient Descent with Heavy Tails
[2:25]
An Initial Alignment between Neural Network and Target is Needed for Gradient Descent to Learn
[2:30]
Inductive Biases and Variable Creation in Self-Attention Mechanisms
[2:35]
Topology-aware Generalization of Decentralized SGD
[2:40]
Understanding Gradient Descent on the Edge of Stability in Deep Learning
(ends 2:45 PM)
Spotlights 1:15-1:45
[1:15]
Bayesian Nonparametric Learning for Point Processes with Spatial Homogeneity: A Spatial Analysis of NBA Shot Locations
[1:20]
On the Effects of Artificial Data Modification
[1:25]
Deep Squared Euclidean Approximation to the Levenshtein Distance for DNA Storage
[1:30]
How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models
[1:35]
Error-driven Input Modulation: Solving the Credit Assignment Problem without a Backward Pass
[1:40]
How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective
Orals 1:45-2:05
[1:45]
Contrastive Mixture of Posteriors for Counterfactual Inference, Data Integration and Fairness
Spotlights 2:05-2:45
[2:05]
Describing Differences between Text Distributions with Natural Language
[2:10]
Distinguishing rule- and exemplar-based generalization in learning systems
[2:15]
Burst-Dependent Plasticity and Dendritic Amplification Support Target-Based Learning and Hierarchical Imitation Learning
[2:20]
A Deep Learning Approach for the Segmentation of Electroencephalography Data in Eye Tracking Applications
[2:25]
Minimizing Control for Credit Assignment with Strong Feedback
[2:30]
Self-Supervised Models of Audio Effectively Explain Human Cortical Responses to Speech
[2:35]
Towards Scaling Difference Target Propagation by Learning Backprop Targets
[2:40]
Content Addressable Memory Without Catastrophic Forgetting by Heteroassociation with a Fixed Scaffold
(ends 2:45 PM)
Orals 1:15-1:35
[1:15]
Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes
Spotlights 1:35-2:00
[1:35]
Robust SDE-Based Variational Formulations for Solving Linear PDEs via Deep Learning
[1:40]
Hessian-Free High-Resolution Nesterov Acceleration For Sampling
[1:45]
LSB: Local Self-Balancing MCMC in Discrete Spaces
[1:50]
A Langevin-like Sampler for Discrete Distributions
[1:55]
Scalable Spike-and-Slab
Orals 2:00-2:20
[2:00]
Nonparametric Involutive Markov Chain Monte Carlo
Spotlights 2:20-2:45
[2:20]
Continual Repeated Annealed Flow Transport Monte Carlo
[2:25]
Algorithms for the Communication of Samples
[2:30]
Low-Precision Stochastic Gradient Langevin Dynamics
[2:35]
Fast Relative Entropy Coding with A* coding
[2:40]
Accurate Quantization of Measures via Interacting Particle-based Optimization
(ends 2:45 PM)
Spotlights 1:15-1:45
[1:15]
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
[1:20]
Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering
[1:25]
Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity
[1:30]
Understanding the unstable convergence of gradient descent
[1:35]
Federated Minimax Optimization: Improved Convergence Analyses and Algorithms
[1:40]
Inductive Matrix Completion: No Bad Local Minima and a Fast Algorithm
Orals 1:45-2:05
[1:45]
FedNest: Federated Bilevel, Minimax, and Compositional Optimization
Spotlights 2:05-2:35
[2:05]
AdaGrad Avoids Saddle Points
[2:10]
Fast and Provable Nonconvex Tensor RPCA
[2:15]
On Convergence of Gradient Descent Ascent: A Tight Local Analysis
[2:20]
Convergence Rates of Non-Convex Stochastic Gradient Descent Under a Generic Lojasiewicz Condition and Local Smoothness
[2:25]
A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization
[2:30]
Anticorrelated Noise Injection for Improved Generalization
(ends 2:45 PM)
Spotlights 1:15-1:45
[1:15]
Model-Free Opponent Shaping
[1:20]
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning
[1:25]
Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation
[1:30]
Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning
[1:35]
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
[1:40]
Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning
Orals 1:45-2:05
[1:45]
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
Spotlights 2:05-2:45
[2:05]
Self-Organized Polynomial-Time Coordination Graphs
[2:10]
Individual Reward Assisted Multi-Agent Reinforcement Learning
[2:15]
Generalized Beliefs for Cooperative AI
[2:20]
Greedy when Sure and Conservative when Uncertain about the Opponents
[2:25]
Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning
[2:30]
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
[2:35]
Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games
[2:40]
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
(ends 2:45 PM)
Spotlights 1:15-1:45
[1:15]
Modeling Irregular Time Series with Continuous Recurrent Units
[1:20]
TACTiS: Transformer-Attentional Copulas for Time Series
[1:25]
CerDEQ: Certifiable Deep Equilibrium Model
[1:30]
Approximately Equivariant Networks for Imperfectly Symmetric Dynamics
[1:35]
IDYNO: Learning Nonparametric DAGs from Interventional Dynamic Data
[1:40]
GSmooth: Certified Robustness against Semantic Transformations via Generalized Randomized Smoothing
Orals 1:45-2:05
[1:45]
Neural Laplace: Learning diverse classes of differential equations in the Laplace domain
Spotlights 2:05-2:45
[2:05]
Improving Language Models by Retrieving from Trillions of Tokens
[2:10]
Closed-Form Diffeomorphic Transformations for Time Series Alignment
[2:15]
Removing Batch Normalization Boosts Adversarial Training
[2:20]
Forget-free Continual Learning with Winning Subnetworks
[2:25]
FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
[2:30]
Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers
[2:35]
On the Practicality of Deterministic Epistemic Uncertainty
[2:40]
Combining Diverse Feature Priors
(ends 2:45 PM)
Orals 1:15-1:35
[1:15]
Cooperative Online Learning in Stochastic and Adversarial MDPs
Spotlights 1:35-2:00
[1:35]
Simple and near-optimal algorithms for hidden stratification and multi-group learning
[1:40]
Being Properly Improper
[1:45]
Neural Network Pruning Denoises the Features and Makes Local Connectivity Emerge in Visual Tasks
[1:50]
On the Finite-Time Complexity and Practical Computation of Approximate Stationarity Concepts of Lipschitz Functions
[1:55]
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Orals 2:00-2:20
[2:00]
Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces
Spotlights 2:20-2:45
[2:20]
Minimax M-estimation under Adversarial Contamination
[2:25]
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits
[2:30]
Efficiently Learning the Topology and Behavior of a Networked Dynamical System Via Active Queries
[2:35]
Boosting Graph Structure Learning with Dummy Nodes
[2:40]
Lazy Estimation of Variable Importance for Large Neural Networks
(ends 2:45 PM)
3:30 p.m.
Posters 3:30-5:30
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations
(ends 5:30 PM)
WED 20 JUL
3:30 a.m.
Breakfast on your own
4 a.m.
(ends 4:00 PM)
6 a.m.
Invited Talk:
Regina Barzilay
(ends 7:00 AM)
7 a.m.
Coffee Break
7:30 a.m.
Spotlights 7:30-8:05
[7:30]
Towards understanding how momentum improves generalization in deep learning
[7:35]
What Can Linear Interpolation of Neural Network Loss Landscapes Tell Us?
[7:40]
Deep equilibrium networks are sensitive to initialization statistics
[7:45]
Scaling-up Diverse Orthogonal Convolutional Networks by a Paraunitary Framework
[7:50]
Stability Based Generalization Bounds for Exponential Family Langevin Dynamics
[7:55]
Local Augmentation for Graph Neural Networks
[8:00]
On Non-local Convergence Analysis of Deep Linear Networks
Orals 8:05-8:25
[8:05]
Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum
Spotlights 8:25-9:00
[8:25]
Diversified Adversarial Attacks based on Conjugate Gradient Method
[8:30]
On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features
[8:35]
On the Equivalence Between Temporal and Static Equivariant Graph Representations
[8:40]
Robust Training under Label Noise by Over-parameterization
[8:45]
Implicit Bias of the Step Size in Linear Diagonal Neural Networks
[8:50]
Extended Unconstrained Features Model for Exploring Deep Neural Collapse
[8:55]
Score-Guided Intermediate Level Optimization: Fast Langevin Mixing for Inverse Problems
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Weisfeiler-Lehman Meets Gromov-Wasserstein
[7:35]
GenLabel: Mixup Relabeling using Generative Models
[7:40]
When and How Mixup Improves Calibration
[7:45]
On Transportation of Mini-batches: A Hierarchical Approach
[7:50]
VariGrow: Variational Architecture Growing for Task-Agnostic Continual Learning based on Bayesian Novelty
[7:55]
Beyond Images: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features
[8:00]
A Model-Agnostic Randomized Learning Framework based on Random Hypothesis Subspace Sampling
Orals 8:05-8:25
[8:05]
Stable Conformal Prediction Sets
Spotlights 8:25-9:00
[8:25]
Rethinking Fano’s Inequality in Ensemble Learning
[8:30]
FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers
[8:35]
Improving Mini-batch Optimal Transport via Partial Transportation
[8:40]
Near-optimal rate of consistency for linear models with missing values
[8:45]
Permutation Search of Tensor Network Structures via Local Sampling
[8:50]
Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?
[8:55]
DNNR: Differential Nearest Neighbors Regression
(ends 9:00 AM)
Spotlights 7:30-8:00
[7:30]
Learning Domain Adaptive Object Detection with Probabilistic Teacher
[7:35]
Adaptive Data Analysis with Correlated Observations
[7:40]
Efficient PAC Learning from the Crowd with Pairwise Comparisons
[7:45]
On the Statistical Benefits of Curriculum Learning
[7:50]
Feature and Parameter Selection in Stochastic Linear Bandits
[7:55]
Disentangled Federated Learning for Tackling Attributes Skew via Invariant Aggregation and Diversity Transferring
Orals 8:00-8:20
[8:00]
A new similarity measure for covariate shift with applications to nonparametric regression
Spotlights 8:20-9:00
[8:20]
Contextual Bandits with Large Action Spaces: Made Practical
[8:25]
Identifiability Conditions for Domain Adaptation
[8:30]
Streaming Algorithms for High-Dimensional Robust Statistics
[8:35]
Popular decision tree algorithms are provably noise tolerant
[8:40]
Understanding and Improving Knowledge Graph Embedding for Entity Alignment
[8:45]
Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning
[8:50]
Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
[8:55]
Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification
[7:35]
One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams
[7:40]
Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration
[7:45]
ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases
[7:50]
Variational Mixtures of ODEs for Inferring Cellular Gene Expression Dynamics
[7:55]
Bayesian Imitation Learning for End-to-End Mobile Manipulation
[8:00]
De novo mass spectrometry peptide sequencing with a transformer model
Orals 8:05-8:25
[8:05]
Learning inverse folding from millions of predicted structures
Spotlights 8:25-9:00
[8:25]
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
[8:30]
MAE-DET: Revisiting Maximum Entropy Principle in Zero-Shot NAS for Efficient Object Detection
[8:35]
Proximal Exploration for Model-guided Protein Sequence Design
[8:40]
Tranception: Protein Fitness Prediction with Autoregressive Transformers and Inference-time Retrieval
[8:45]
How to Fill the Optimum Set? Population Gradient Descent with Harmless Diversity
[8:50]
Examining Scaling and Transfer of Language Model Architectures for Machine Translation
[8:55]
State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
How Tempering Fixes Data Augmentation in Bayesian Neural Networks
Spotlights 7:50-8:15
[7:50]
Surrogate Likelihoods for Variational Annealed Importance Sampling
[7:55]
Nonparametric Sparse Tensor Factorization with Hierarchical Gamma Processes
[8:00]
Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows
[8:05]
Variational Sparse Coding with Learned Thresholding
[8:10]
Structured Stochastic Gradient MCMC
Orals 8:15-8:35
[8:15]
BAMDT: Bayesian Additive Semi-Multivariate Decision Trees for Nonparametric Regression
Spotlights 8:35-8:50
[8:35]
Variational Inference with Locally Enhanced Bounds for Hierarchical Models
[8:40]
Centroid Approximation for Bootstrap: Improving Particle Quality at Inference
[8:45]
Deep Reference Priors: What is the best way to pretrain a model?
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Modeling Strong and Human-Like Gameplay with KL-Regularized Search
[7:35]
Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters
[7:40]
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning
[7:45]
Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search
[7:50]
Generalized Data Distribution Iteration
[7:55]
Optimizing Tensor Network Contraction Using Reinforcement Learning
[8:00]
History Compression via Language Models in Reinforcement Learning
Orals 8:05-8:25
[8:05]
REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer
Spotlights 8:25-9:00
[8:25]
LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation
[8:30]
Efficient Learning for AlphaZero via Path Consistency
[8:35]
A data-driven approach for learning to control computers
[8:40]
Zero-Shot Reward Specification via Grounded Natural Language
[8:45]
How to Stay Curious while avoiding Noisy TVs using Aleatoric Uncertainty Estimation
[8:50]
Model-Value Inconsistency as a Signal for Epistemic Uncertainty
[8:55]
Improving Policy Optimization with Generalist-Specialist Learning
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
On Numerical Integration in Neural Ordinary Differential Equations
[7:35]
Reverse Engineering the Neural Tangent Kernel
[7:40]
Principled Knowledge Extrapolation with GANs
[7:45]
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
[7:50]
Data Augmentation as Feature Manipulation
[7:55]
Convolutional and Residual Networks Provably Contain Lottery Tickets
[8:00]
Feature Learning and Signal Propagation in Deep Neural Networks
Orals 8:05-8:25
[8:05]
Robust Training of Neural Networks Using Scale Invariant Architectures
Spotlights 8:25-9:00
[8:25]
Understanding Contrastive Learning Requires Incorporating Inductive Biases
[8:30]
Implicit Regularization with Polynomial Growth in Deep Tensor Factorization
[8:35]
Deep Network Approximation in Terms of Intrinsic Parameters
[8:40]
Coin Flipping Neural Networks
[8:45]
Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint
[8:50]
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
[8:55]
SE(3) Equivariant Graph Neural Networks with Complete Local Frames
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings
[7:35]
Label-Free Explainability for Unsupervised Models
[7:40]
Towards Theoretical Analysis of Transformation Complexity of ReLU DNNs
[7:45]
A Study of Face Obfuscation in ImageNet
[7:50]
Fair Representation Learning through Implicit Path Alignment
[7:55]
Mitigating Neural Network Overconfidence with Logit Normalization
[8:00]
Learning fair representation with a parametric integral probability metric
Orals 8:05-8:25
[8:05]
Privacy for Free: How does Dataset Condensation Help Privacy?
Spotlights 8:25-9:00
[8:25]
Fair Generalized Linear Models with a Convex Penalty
[8:30]
HyperPrompt: Prompt-based Task-Conditioning of Transformers
[8:35]
Validating Causal Inference Methods
[8:40]
The Multivariate Community Hawkes Model for Dependent Relational Events in Continuous-time Networks
[8:45]
Scalable Deep Gaussian Markov Random Fields for General Graphs
[8:50]
Anytime Information Cascade Popularity Prediction via Self-Exciting Processes
[8:55]
Deep Variational Graph Convolutional Recurrent Network for Multivariate Time Series Anomaly Detection
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
Adapting to Mixing Time in Stochastic Optimization with Markovian Data
Spotlights 7:50-8:15
[7:50]
Fast Composite Optimization and Statistical Recovery in Federated Learning
[7:55]
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
[8:00]
Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning
[8:05]
Optimal Algorithms for Stochastic Multi-Level Compositional Optimization
[8:10]
Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications
Orals 8:15-8:35
[8:15]
Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Spotlights 8:35-9:00
[8:35]
Statistical inference with implicit SGD: proximal Robbins-Monro vs. Polyak-Ruppert
[8:40]
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
[8:45]
Communication-Efficient Adaptive Federated Learning
[8:50]
RECAPP: Crafting a More Efficient Catalyst for Convex Optimization
[8:55]
Kill a Bird with Two Stones: Closing the Convergence Gaps in Non-Strongly Convex Optimization by Directly Accelerated SVRG with Double Compensation and Snapshots
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
Spotlights 7:50-8:10
[7:50]
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
[7:55]
Extracting Latent State Representations with Linear Dynamics from Rich Observations
[8:00]
For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria
[8:05]
Consensus Multiplicative Weights Update: Learning to Learn using Projector-based Game Signatures
Orals 8:10-8:30
[8:10]
Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits
Spotlights 8:30-8:55
[8:30]
Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses
[8:35]
Learning to Infer Structures of Network Games
[8:40]
Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation
[8:45]
Near-Optimal Learning of Extensive-Form Games with Imperfect Information
[8:50]
Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
(ends 9:00 AM)
9 a.m.
Lunch Break - on your own
10:15 a.m.
Spotlights 10:15-10:50
[10:15]
From data to functa: Your data point is a function and you can treat it like one
[10:20]
DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training
[10:25]
Differentiable Top-k Classification Learning
[10:30]
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks
[10:35]
Characterizing and Overcoming the Greedy Nature of Learning in Multi-modal Deep Neural Networks
[10:40]
Training Your Sparse Neural Network Better with Any Mask
[10:45]
Federated Learning with Positive and Unlabeled Data
Orals 10:50-11:10
[10:50]
Generating 3D Molecules for Target Protein Binding
Spotlights 11:10-11:45
[11:10]
Sparse Double Descent: Where Network Pruning Aggravates Overfitting
[11:15]
Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs
[11:20]
Revisiting Consistency Regularization for Deep Partial Label Learning
[11:25]
Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification
[11:30]
A Unified Weight Initialization Paradigm for Tensorial Convolutional Neural Networks
[11:35]
PLATINUM: Semi-Supervised Model Agnostic Meta-Learning using Submodular Mutual Information
[11:40]
Multicoated Supermasks Enhance Hidden Networks
(ends 11:45 AM)
Spotlights 10:15-10:50
[10:15]
Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits
[10:20]
On the Finite-Time Performance of the Knowledge Gradient Algorithm
[10:25]
Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control
[10:30]
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
[10:35]
No-Regret Learning in Time-Varying Zero-Sum Games
[10:40]
Achieving Minimax Rates in Pool-Based Batch Active Learning
[10:45]
Active Multi-Task Representation Learning
Orals 10:50-11:10
[10:50]
Active fairness auditing
Spotlights 11:10-11:45
[11:10]
Metric-Fair Active Learning
[11:15]
Metric-Fair Classifier Derandomization
[11:20]
Interactively Learning Preference Constraints in Linear Bandits
[11:25]
Convergence of Uncertainty Sampling for Active Learning
[11:30]
Thompson Sampling for Robust Transfer in Multi-Task Bandits
[11:35]
Constants Matter: The Performance Gains of Active Learning
[11:40]
Cross-Space Active Learning on Graph Convolutional Networks
(ends 11:45 AM)
Spotlights 10:15-10:45
[10:15]
MemSR: Training Memory-efficient Lightweight Model for Image Super-Resolution
[10:20]
PINs: Progressive Implicit Networks for Multi-Scale Neural Representations
[10:25]
Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders
[10:30]
Generative Coarse-Graining of Molecular Conformations
[10:35]
LIMO: Latent Inceptionism for Targeted Molecule Generation
[10:40]
Learning to Separate Voices by Spatial Regions
Orals 10:45-11:05
[10:45]
3DLinker: An E(3) Equivariant Variational Autoencoder for Molecular Linker Design
Spotlights 11:05-11:40
[11:05]
3D Infomax improves GNNs for Molecular Property Prediction
[11:10]
Biological Sequence Design with GFlowNets
[11:15]
Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets
[11:20]
Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer
[11:25]
Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense
[11:30]
Path-Aware and Structure-Preserving Generation of Synthetically Accessible Molecules
[11:35]
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
(ends 11:45 AM)
Spotlights 10:15-10:45
[10:15]
Decomposing Temporal High-Order Interactions via Latent ODEs
[10:20]
Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets
[10:25]
DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck
[10:30]
End-to-End Balancing for Causal Continuous Treatment-Effect Estimation
[10:35]
Role-based Multiplex Network Embedding
[10:40]
Measure Estimation in the Barycentric Coding Model
Orals 10:45-11:05
[10:45]
RieszNet and ForestRiesz: Automatic Debiased Machine Learning with Neural Nets and Random Forests
Spotlights 11:05-11:35
[11:05]
Counterfactual Transportability: A Formal Approach
[11:10]
Identification of Linear Non-Gaussian Latent Hierarchical Structure
[11:15]
COAT: Measuring Object Compositionality in Emergent Representations
[11:20]
Generalization and Robustness Implications in Object-Centric Learning
[11:25]
NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning
[11:30]
Action-Sufficient State Representation Learning for Control with Structural Constraints
(ends 11:45 AM)
Orals 10:15-10:35
[10:15]
Bayesian Continuous-Time Tucker Decomposition
Spotlights 10:35-11:00
[10:35]
Approximate Bayesian Computation with Domain Expert in the Loop
[10:40]
Discrete Probabilistic Inverse Optimal Transport
[10:45]
Easy Variational Inference for Categorical Models via an Independent Binary Approximation
[10:50]
Streaming Inference for Infinite Feature Models
[10:55]
Optimizing Sequential Experimental Design with Deep Reinforcement Learning
Orals 11:00-11:20
[11:00]
Function-space Inference with Sparse Implicit Processes
Spotlights 11:20-11:45
[11:20]
Variational Inference for Infinitely Deep Neural Networks
[11:25]
Personalized Federated Learning via Variational Bayesian Inference
[11:30]
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
[11:35]
Bayesian Deep Embedding Topic Meta-Learner
[11:40]
Efficient Approximate Inference for Stationary Kernel on Frequency Domain
(ends 11:45 AM)
Spotlights 10:15-10:45
[10:15]
Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning
[10:20]
Analysis of Stochastic Processes through Replay Buffers
[10:25]
Cascaded Gaps: Towards Logarithmic Regret for Risk-Sensitive Reinforcement Learning
[10:30]
Communicating via Markov Decision Processes
[10:35]
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
[10:40]
DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning
Orals 10:45-11:05
[10:45]
Planning with Diffusion for Flexible Behavior Synthesis
Spotlights 11:05-11:40
[11:05]
A Temporal-Difference Approach to Policy Gradient Estimation
[11:10]
MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
[11:15]
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
[11:20]
Actor-Critic based Improper Reinforcement Learning
[11:25]
On the Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs
[11:30]
The Geometry of Robust Value Functions
[11:35]
Denoised MDPs: Learning World Models Better Than the World Itself
(ends 11:45 AM)
Orals 10:15-10:35
[10:15]
Tight and Robust Private Mean Estimation with Few Users
Spotlights 10:35-11:00
[10:35]
QSFL: A Two-Level Uplink Communication Optimization Framework for Federated Learning
[10:40]
Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
[10:45]
Sanity Simulations for Saliency Methods
[10:50]
Out-of-Distribution Detection with Deep Nearest Neighbors
[10:55]
Differentially Private Maximal Information Coefficients
Orals 11:00-11:20
[11:00]
Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data
Spotlights 11:20-11:45
[11:20]
On the Difficulty of Defending Self-Supervised Learning against Model Extraction
[11:25]
Adversarial Attack and Defense for Non-Parametric Two-Sample Tests
[11:30]
Certified Adversarial Robustness Under the Bounded Support Set
[11:35]
Predicting Out-of-Distribution Error with the Projection Norm
[11:40]
Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization
(ends 11:45 AM)
Spotlights 10:15-10:50
[10:15]
Generating Distributional Adversarial Examples to Evade Statistical Detectors
[10:20]
Improving Out-of-Distribution Robustness via Selective Augmentation
[10:25]
Modeling Adversarial Noise for Adversarial Training
[10:30]
Improving Adversarial Robustness via Mutual Information Estimation
[10:35]
FOCUS: Familiar Objects in Common and Uncommon Settings
[10:40]
Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization
[10:45]
Test-Time Training Can Close the Natural Distribution Shift Performance Gap in Deep Learning Based Compressed Sensing
Orals 10:50-11:10
[10:50]
A Dynamical System Perspective for Lipschitz Neural Networks
Spotlights 11:10-11:45
[11:10]
Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
[11:15]
Neurotoxin: Durable Backdoors in Federated Learning
[11:20]
Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense
[11:25]
Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching
[11:30]
Fast Lossless Neural Compression with Integer-Only Discrete Flows
[11:35]
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization
[11:40]
SCHA-VAE: Hierarchical Context Aggregation for Few-Shot Generation
(ends 11:45 AM)
Orals 10:15-10:35
[10:15]
Generative Trees: Adversarial and Copycat
Spotlights 10:35-11:00
[10:35]
A Resilient Distributed Boosting Algorithm
[10:40]
Online Learning and Pricing with Reusable Resources: Linear Bandits with Sub-Exponential Rewards
[10:45]
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
[10:50]
Congested Bandits: Optimal Routing via Short-term Resets
[10:55]
Stochastic Rising Bandits
Orals 11:00-11:20
[11:00]
Agnostic Learnability of Halfspaces via Logistic Loss
Spotlights 11:20-11:45
[11:20]
Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension
[11:25]
PDE-Based Optimal Strategy for Unconstrained Online Learning
[11:30]
Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out
[11:35]
On Learning Mixture of Linear Regressions in the Non-Realizable Setting
[11:40]
Random Forest Density Estimation
(ends 11:45 AM)
Spotlights 10:15-10:50
[10:15]
DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning
[10:20]
Unsupervised Time-Series Representation Learning with Iterative Bilinear Temporal-Spectral Fusion
[10:25]
RetrievalGuard: Provably Robust 1-Nearest Neighbor Image Retrieval
[10:30]
Modeling Structure with Undirected Neural Networks
[10:35]
Certified Neural Network Watermarks with Randomized Smoothing
[10:40]
Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation
[10:45]
Adversarial Vulnerability of Randomized Ensembles
Orals 10:50-11:10
[10:50]
Robustness Verification for Contrastive Learning
Spotlights 11:10-11:45
[11:10]
The CLRS Algorithmic Reasoning Benchmark
[11:15]
Finding Global Homophily in Graph Neural Networks When Meeting Heterophily
[11:20]
Understanding Robust Generalization in Learning Regular Languages
[11:25]
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification
[11:30]
AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems
[11:35]
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
[11:40]
Short-Term Plasticity Neurons Learning to Learn and Forget
(ends 11:45 AM)
11:45 a.m.
Coffee Break
1:15 p.m.
Short Break
1:30 p.m.
Spotlights 1:30-2:05
[1:30]
$p$-Laplacian Based Graph Neural Networks
[1:35]
Equivariant Quantum Graph Circuits
[1:40]
A Theoretical Comparison of Graph Neural Network Extensions
[1:45]
Variational On-the-Fly Personalization
[1:50]
Deep symbolic regression for recurrence prediction
[1:55]
Geometric Multimodal Contrastive Representation Learning
[2:00]
Universality of Winning Tickets: A Renormalization Group Perspective
Orals 2:05-2:25
[2:05]
Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition
Spotlights 2:25-3:00
[2:25]
Loss Function Learning for Domain Generalization by Implicit Gradient
[2:30]
GraphFM: Improving Large-Scale GNN Training via Feature Momentum
[2:35]
Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling
[2:40]
A Differential Entropy Estimator for Training Neural Networks
[2:45]
Scaling Out-of-Distribution Detection for Real-World Settings
[2:50]
Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations
[2:55]
SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators
(ends 3:00 PM)
Spotlights 1:30-2:05
[1:30]
The dynamics of representation learning in shallow, non-linear autoencoders
[1:35]
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
[1:40]
Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing
[1:45]
Failure and success of the spectral bias prediction for Laplace Kernel Ridge Regression: the case of low-dimensional data
[1:50]
Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
[1:55]
Universal Joint Approximation of Manifolds and Densities by Simple Injective Flows
[2:00]
Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis
Orals 2:05-2:25
[2:05]
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
Spotlights 2:25-3:00
[2:25]
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
[2:30]
Efficient Learning of CNNs using Patch Based Features
[2:35]
Neural Tangent Kernel Analysis of Deep Narrow Neural Networks
[2:40]
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
[2:45]
Fully-Connected Network on Noncompact Symmetric Space and Ridgelet Transform based on Helgason-Fourier Analysis
[2:50]
Non-Vacuous Generalisation Bounds for Shallow Neural Networks
[2:55]
Maslow's Hammer in Catastrophic Forgetting: Node Re-Use vs. Node Activation
(ends 3:00 PM)
Spotlights 1:30-2:05
[1:30]
SoQal: Selective Oracle Questioning for Consistency Based Active Learning of Cardiac Signals
[1:35]
Matching Structure for Dual Learning
[1:40]
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
[1:45]
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone
[1:50]
Inducing Causal Structure for Interpretable Neural Networks
[1:55]
SDQ: Stochastic Differentiable Quantization with Mixed Precision
[2:00]
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
Orals 2:05-2:25
[2:05]
Re-evaluating Word Mover's Distance
Spotlights 2:25-3:00
[2:25]
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
[2:30]
Robust alignment of cross-session recordings of neural population activity by behaviour via unsupervised domain adaptation
[2:35]
Symmetric Machine Theory of Mind
[2:40]
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
[2:45]
LCANets: Lateral Competition Improves Robustness Against Corruption and Attack
[2:50]
Reconstructing Nonlinear Dynamical Systems from Multi-Modal Time Series
[2:55]
Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps
(ends 3:00 PM)
Spotlights 1:30-2:05
[1:30]
Greedy based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning
[1:35]
Bayesian Nonparametrics for Offline Skill Discovery
[1:40]
Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime
[1:45]
Curriculum Reinforcement Learning via Constrained Optimal Transport
[1:50]
Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
[1:55]
Stabilizing Q-learning with Linear Architectures for Provable Efficient Learning
[2:00]
Constrained Offline Policy Optimization
Orals 2:05-2:25
[2:05]
Causal Dynamics Learning for Task-Independent State Abstraction
Spotlights 2:25-3:05
[2:25]
Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity
[2:30]
Reinforcement Learning with Action-Free Pre-Training from Videos
[2:35]
Towards Evaluating Adaptivity of Model-Based Reinforcement Learning Methods
[2:40]
Delayed Reinforcement Learning by Imitation
[2:45]
Reachability Constrained Reinforcement Learning
[2:50]
Adaptive Model Design for Markov Decision Process
[2:55]
Goal Misgeneralization in Deep Reinforcement Learning
[3:00]
Translating Robot Skills: Learning Unsupervised Skill Correspondences Across Robots
(ends 3:05 PM)
Spotlights 1:30-2:05
[1:30]
The Infinite Contextual Graph Markov Model
[1:35]
RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression
[1:40]
Detached Error Feedback for Distributed SGD with Random Sparsification
[1:45]
Training OOD Detectors in their Natural Habitats
[1:50]
Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks
[1:55]
Neural Tangent Kernel Empowered Federated Learning
[2:00]
Probabilistically Robust Learning: Balancing Average- and Worst-case Performance
Orals 2:05-2:25
[2:05]
Adversarially trained neural representations are already as robust as biological neural representations
Spotlights 2:25-3:00
[2:25]
Feature Space Particle Inference for Neural Network Ensembles
[2:30]
A Study on the Ramanujan Graph Property of Winning Lottery Tickets
[2:35]
PAC-Net: A Model Pruning Approach to Inductive Transfer Learning
[2:40]
EDEN: Communication-Efficient and Robust Distributed Mean Estimation for Federated Learning
[2:45]
Fisher SAM: Information Geometry and Sharpness Aware Minimisation
[2:50]
Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry
[2:55]
Towards Understanding Sharpness-Aware Minimization
(ends 3:00 PM)
Spotlights 1:30-2:05
[1:30]
Improved Regret for Differentially Private Exploration in Linear MDP
[1:35]
Differentially Private Community Detection for Stochastic Block Models
[1:40]
Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy
[1:45]
Hermite Polynomial Features for Private Data Generation
[1:50]
How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection
[1:55]
Deduplicating Training Data Mitigates Privacy Risks in Language Models
[2:00]
Private frequency estimation via projective geometry
Orals 2:05-2:25
[2:05]
The Poisson Binomial Mechanism for Unbiased Federated Learning with Secure Aggregation
Spotlights 2:25-3:00
[2:25]
Faster Privacy Accounting via Evolving Discretization
[2:30]
The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning
[2:35]
Private Adaptive Optimization with Side information
[2:40]
Secure Quantized Training for Deep Learning
[2:45]
Private optimization in the interpolation regime: faster rates and hardness results
[2:50]
Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
[2:55]
Private Streaming SCO in $\ell_p$ geometry with Applications in High Dimensional Online Decision Making
(ends 3:00 PM)
Spotlights 1:30-2:05
[1:30]
Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization
[1:35]
Implicit Bias of Linear Equivariant Networks
[1:40]
The State of Sparse Training in Deep Reinforcement Learning
[1:45]
Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
[1:50]
Datamodels: Understanding Predictions with Data and Data with Predictions
[1:55]
Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization
[2:00]
Deep Causal Metric Learning
Orals 2:05-2:25
[2:05]
Not All Poisons are Created Equal: Robust Training against Data Poisoning
Spotlights 2:25-3:00
[2:25]
Learning Symmetric Embeddings for Equivariant World Models
[2:30]
Accelerated Federated Learning with Decoupled Adaptive Optimization
[2:35]
Byzantine Machine Learning Made Easy By Resilient Averaging of Momentums
[2:40]
TSPipe: Learn from Teacher Faster with Pipelines
[2:45]
Personalized Federated Learning through Local Memorization
[2:50]
Three-stage Evolution and Fast Equilibrium for SGD with Non-degenerate Critical Points
[2:55]
Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training
(ends 3:00 PM)
Spotlights 1:30-2:05
[1:30]
Gradient Descent on Neurons and its Link to Approximate Second-order Optimization
[1:35]
A Tree-based Model Averaging Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources
[1:40]
Efficient Online ML API Selection for Multi-Label Classification Tasks
[1:45]
Entropic Causal Inference: Graph Identifiability
[1:50]
Architecture Agnostic Federated Learning for Neural Networks
[1:55]
Conformal Prediction Sets with Limited False Positives
[2:00]
Scalable Computation of Causal Bounds
Orals 2:05-2:25
[2:05]
LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood
Spotlights 2:25-3:00
[2:25]
Learning Pseudometric-based Action Representations for Offline Reinforcement Learning
[2:30]
A Statistical Manifold Framework for Point Cloud Data
[2:35]
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection
[2:40]
A Natural Actor-Critic Framework for Zero-Sum Markov Games
[2:45]
Distributionally Robust $Q$-Learning
[2:50]
Sparsity in Partially Controllable Linear Systems
[2:55]
Saute RL: Almost Surely Safe Reinforcement Learning Using State Augmentation
(ends 3:00 PM)
Spotlights 1:30-2:00
[1:30]
NISPA: Neuro-Inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks
[1:35]
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
[1:40]
Auxiliary Learning with Joint Task and Data Scheduling
[1:45]
Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence
[1:50]
Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers
[1:55]
Generalizing Gaussian Smoothing for Random Search
Orals 2:00-2:20
[2:00]
A General Recipe for Likelihood-free Bayesian Optimization
Spotlights 2:20-2:55
[2:20]
Constrained Discrete Black-Box Optimization using Mixed-Integer Programming
[2:25]
Risk-Averse No-Regret Learning in Online Convex Games
[2:30]
Improve Single-Point Zeroth-Order Optimization Using High-Pass and Low-Pass Filters
[2:35]
Robust Multi-Objective Bayesian Optimization Under Input Noise
[2:40]
Gradient-Free Method for Heavily Constrained Nonconvex Optimization
[2:45]
Sequential- and Parallel- Constrained Max-value Entropy Search via Information Lower Bound
[2:50]
The power of first-order smooth optimization for black-box non-smooth problems
(ends 3:00 PM)
Spotlights 1:30-2:00
[1:30]
A New Perspective on the Effects of Spectrum in Graph Neural Networks
[1:35]
Molecular Representation Learning via Heterogeneous Motif Graph Neural Networks
[1:40]
Partial Label Learning via Label Influence Function
[1:45]
Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees
[1:50]
Understanding Robust Overfitting of Adversarial Training and Beyond
[1:55]
A Random Matrix Analysis of Data Stream Clustering: Coping With Limited Memory Resources
Orals 2:00-2:20
[2:00]
Hierarchical Shrinkage: Improving the accuracy and interpretability of tree-based models
Spotlights 2:20-2:55
[2:20]
Supervised Learning with General Risk Functionals
[2:25]
Locally Sparse Neural Networks for Tabular Biomedical Data
[2:30]
Dual Perspective of Label-Specific Feature Learning for Multi-Label Classification
[2:35]
Detecting Corrupted Labels Without Training a Model to Predict
[2:40]
Prototype-Anchored Learning for Learning with Imperfect Annotations
[2:45]
Learning to Predict Graphs with Fused Gromov-Wasserstein Barycenters
[2:50]
Deep Safe Incomplete Multi-view Clustering: Theorem and Algorithm
(ends 3:00 PM)
3:30 p.m.
4 p.m.
THU 21 JUL
3:30 a.m.
Breakfast on your own
6 a.m.
7 a.m.
Coffee Break
7:30 a.m.
Spotlights 7:30-8:00
[7:30]
Does the Data Induce Capacity Control in Deep Learning?
[7:35]
Fighting Fire with Fire: Avoiding DNN Shortcuts through Priming
[7:40]
Memory-Based Model Editing at Scale
[7:45]
Winning the Lottery Ahead of Time: Efficient Early Network Pruning
[7:50]
Active Learning on a Budget: Opposite Strategies Suit High and Low Budgets
[7:55]
AutoSNN: Towards Energy-Efficient Spiking Neural Networks
Orals 8:00-8:20
[8:00]
Overcoming Oscillations in Quantization-Aware Training
Spotlights 8:20-8:55
[8:20]
Dataset Condensation via Efficient Synthetic-Data Parameterization
[8:25]
Searching for BurgerFormer with Micro-Meso-Macro Space Design
[8:30]
Multi-scale Feature Learning Dynamics: Insights for Double Descent
[8:35]
Dataset Condensation with Contrastive Signals
[8:40]
Equivariant Priors for compressed sensing with unknown orientation
[8:45]
Injecting Logical Constraints into Neural Networks via Straight-Through Estimators
[8:50]
Prioritized Training on Points that are Learnable, Worth Learning, and not yet Learnt
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Spotlights 7:50-8:10
[7:50]
Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more
[7:55]
Shuffle Private Linear Contextual Bandits
[8:00]
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
[8:05]
Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes
Orals 8:10-8:30
[8:10]
Label Ranking through Nonparametric Regression
Spotlights 8:30-9:00
[8:30]
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
[8:35]
A Simple Unified Framework for High Dimensional Bandit Problems
[8:40]
A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds
[8:45]
Branching Reinforcement Learning
[8:50]
Fast rates for noisy interpolation require rethinking the effect of inductive bias
[8:55]
Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Structure Preserving Neural Networks: A Case Study in the Entropy Closure of the Boltzmann Equation
[7:35]
Composing Partial Differential Equations with Physics-Aware Neural Networks
[7:40]
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
[7:45]
Towards Coherent and Consistent Use of Entities in Narrative Generation
[7:50]
Pure Noise to the Rescue of Insufficient Data: Improving Imbalanced Classification by Training on Random Noise Images
[7:55]
Optimally Controllable Perceptual Lossy Compression
[8:00]
Learning to Solve PDE-constrained Inverse Problems with Graph Networks
Orals 8:05-8:25
[8:05]
ModLaNets: Learning Generalisable Dynamics via Modularity and Physical Inductive Bias
Spotlights 8:25-9:00
[8:25]
Learning to Estimate and Refine Fluid Motion with Physical Dynamics
[8:30]
Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems
[8:35]
An Intriguing Property of Geophysics Inversion
[8:40]
Particle Transformer for Jet Tagging
[8:45]
BabelTower: Learning to Auto-parallelized Program Translation
[8:50]
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
[8:55]
On Distribution Shift in Learning-based Bug Detectors
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
The Importance of Non-Markovianity in Maximum State Entropy Exploration
Spotlights 7:50-8:15
[7:50]
Continuous Control with Action Quantization from Demonstrations
[7:55]
Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization
[8:00]
Inverse Contextual Bandits: Learning How Behavior Evolves over Time
[8:05]
Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning
[8:10]
Towards Uniformly Superhuman Autonomy via Subdominance Minimization
Orals 8:15-8:35
[8:15]
Causal Imitation Learning under Temporally Correlated Noise
Spotlights 8:35-9:00
[8:35]
Interactive Inverse Reinforcement Learning for Cooperative Games
[8:40]
A Hierarchical Bayesian Approach to Inverse Reinforcement Learning with Symbolic Reward Machines
[8:45]
Robust Imitation Learning against Variations in Environment Dynamics
[8:50]
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
[8:55]
Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
A Neural Tangent Kernel Perspective of GANs
[7:35]
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
[7:40]
Neural Inverse Transform Sampler
[7:45]
Antibody-Antigen Docking and Design via Hierarchical Structure Refinement
[7:50]
Diffusion Models for Adversarial Purification
[7:55]
Gaussian Mixture Variational Autoencoder with Contrastive Learning for Multi-Label Classification
[8:00]
VarScene: A Deep Generative Model for Realistic Scene Graph Synthesis
Orals 8:05-8:25
[8:05]
It’s Raw! Audio Generation with State-Space Models
Spotlights 8:25-9:00
[8:25]
Unsupervised Image Representation Learning with Deep Latent Particles
[8:30]
Learning Efficient and Robust Ordinary Differential Equations via Invertible Neural Networks
[8:35]
Neuro-Symbolic Hierarchical Rule Induction
[8:40]
General-purpose, long-context autoregressive modeling with Perceiver AR
[8:45]
Marginal Tail-Adaptive Normalizing Flows
[8:50]
SkexGen: Autoregressive Generation of CAD Construction Sequences with Disentangled Codebooks
[8:55]
NeuroFluid: Fluid Dynamics Grounding with Particle-Driven Neural Radiance Fields
(ends 9:00 AM)
Orals 7:30-7:50
[7:30]
Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling
Spotlights 7:50-8:15
[7:50]
Entropic Gromov-Wasserstein between Gaussian Distributions
[7:55]
No-Regret Learning in Partially-Informed Auctions
[8:00]
On Last-Iterate Convergence Beyond Zero-Sum Games
[8:05]
Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games
[8:10]
Fictitious Play and Best-Response Dynamics in Identical Interest and Zero-Sum Stochastic Games
Orals 8:15-8:35
[8:15]
On the Convergence of Inexact Predictor-Corrector Methods for Linear Programming
Spotlights 8:35-9:00
[8:35]
Nested Bandits
[8:40]
Information Discrepancy in Strategic Learning
[8:45]
A Psychological Theory of Explainability
[8:50]
Task-aware Privacy Preservation for Multi-dimensional Data
[8:55]
Strategic Representation
(ends 9:00 AM)
Spotlights 7:30-7:55
[7:30]
Estimating Instance-dependent Bayes-label Transition Matrix using a Deep Neural Network
[7:35]
Invariant Ancestry Search
[7:40]
Unaligned Supervision for Automatic Music Transcription in The Wild
[7:45]
Fourier Learning with Cyclical Data
[7:50]
Linear Adversarial Concept Erasure
Orals 7:55-8:15
[7:55]
Score Matching Enables Causal Discovery of Nonlinear Additive Noise Models
Spotlights 8:15-8:50
[8:15]
Provable Domain Generalization via Invariant-Feature Subspace Recovery
[8:20]
Subspace Learning for Effective Meta-Learning
[8:25]
Continual Learning via Sequential Function-Space Variational Inference
[8:30]
Efficient Test-Time Model Adaptation without Forgetting
[8:35]
Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications
[8:40]
Input Dependent Sparse Gaussian Processes
[8:45]
AutoIP: A United Framework to Integrate Physics into Gaussian Processes
(ends 9:00 AM)
Spotlights 7:30-8:00
[7:30]
Equivariance versus Augmentation for Spherical Images
[7:35]
Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training
[7:40]
Neural Network Poisson Models for Behavioural and Neural Spike Train Data
[7:45]
A Branch and Bound Framework for Stronger Adversarial Attacks of ReLU Networks
[7:50]
GACT: Activation Compressed Training for Generic Network Architectures
[7:55]
Fast Finite Width Neural Tangent Kernel
Orals 8:00-8:20
[8:00]
G-Mixup: Graph Data Augmentation for Graph Classification
Spotlights 8:20-9:00
[8:20]
Universal Hopfield Networks: A General Framework for Single-Shot Associative Memory Models
[8:25]
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts
[8:30]
Continual Learning with Guarantees via Weight Interval Constraints
[8:35]
Faster Fundamental Graph Algorithms via Learned Predictions
[8:40]
Practical Almost-Linear-Time Approximation Algorithms for Hybrid and Overlapping Graph Clustering
[8:45]
Fair and Fast k-Center Clustering for Data Summarization
[8:50]
Online and Consistent Correlation Clustering
[8:55]
Generalized Leverage Scores: Geometric Interpretation and Applications
(ends 9:00 AM)
Spotlights 7:30-8:00
[7:30]
Blurs Behave Like Ensembles: Spatial Smoothings to Improve Accuracy, Uncertainty, and Robustness
[7:35]
Breaking Down Out-of-Distribution Detection: Many Methods Based on OOD Training Data Estimate a Combination of the Same Core Quantities
[7:40]
Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning
[7:45]
Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness
[7:50]
A Hierarchical Transitive-Aligned Graph Kernel for Un-attributed Graphs
[7:55]
Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time
Orals 8:00-8:20
[8:00]
Random Gegenbauer Features for Scalable Kernel Methods
Spotlights 8:20-8:55
[8:20]
Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile
[8:25]
Functional Output Regression with Infimal Convolution: Exploring the Huber and $\epsilon$-insensitive Losses
[8:30]
Measuring dissimilarity with diffeomorphism invariance
[8:35]
Importance Weighted Kernel Bayes' Rule
[8:40]
An Asymptotic Test for Conditional Independence using Analytic Kernel Embeddings
[8:45]
Nyström Kernel Mean Embeddings
[8:50]
Distribution Regression with Sliced Wasserstein Kernels
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30]
Adapting k-means Algorithms for Outliers
[7:35]
Accelerated, Optimal and Parallel: Some results on model-based stochastic optimization
[7:40]
Online Algorithms with Multiple Predictions
[7:45]
Parsimonious Learning-Augmented Caching
[7:50]
RUMs from Head-to-Head Contests
[7:55]
Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features
[8:00]
Robustness in Multi-Objective Submodular Optimization: a Quantile Approach
Orals 8:05-8:25
[8:05]
The Unsurprising Effectiveness of Pre-Trained Vision Models for Control
Spotlights 8:25-9:00
[8:25]
COLA: Consistent Learning with Opponent-Learning Awareness
[8:30]
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
[8:35]
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
[8:40]
Learning Stochastic Shortest Path with Linear Function Approximation
[8:45]
Difference Advantage Estimation for Multi-Agent Policy Gradients
[8:50]
Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification
[8:55]
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
(ends 9:00 AM)
9 a.m.
Lunch Break On Your Own
10:30 a.m.
Spotlights 10:30-11:05
[10:30]
Adversarial Masking for Self-Supervised Learning
[10:35]
Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance
[10:40]
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
[10:45]
Multirate Training of Neural Networks
[10:50]
Variational Wasserstein gradient flow
[10:55]
Building Robust Ensembles via Margin Boosting
[11:00]
Investigating Generalization by Controlling Normalized Margin
Orals 11:05-11:25
[11:05]
Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised Domain Adaptation
Spotlights 11:25-12:00
[11:25]
VLUE: A Multi-Task Multi-Dimension Benchmark for Evaluating Vision-Language Pre-training
[11:30]
Let Invariant Rationale Discovery Inspire Graph Contrastive Learning
[11:35]
Graph Neural Architecture Search Under Distribution Shifts
[11:40]
How Powerful are Spectral Graph Neural Networks
[11:45]
Constraint-based graph network simulator
[11:50]
PACE: A Parallelizable Computation Encoder for Directed Acyclic Graphs
[11:55]
Structure-Aware Transformer for Graph Representation Learning
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
UnderGrad: A Universal Black-Box Optimization Method with Almost Dimension-Free Convergence Rate Guarantees
Spotlights 10:50-11:15
[10:50]
Safe Learning in Tree-Form Sequential Decision Making: Handling Hard and Soft Constraints
[10:55]
A Marriage between Adversarial Team Games and 2-player Games: Enabling Abstractions, No-regret Learning, and Subgame Solving
[11:00]
Exact Learning of Preference Structure: Single-peaked Preferences and Beyond
[11:05]
Selling Data To a Machine Learner: Pricing via Costly Signaling
[11:10]
Hardness and Algorithms for Robust and Sparse Optimization
Orals 11:15-11:35
[11:15]
A Convergent and Dimension-Independent Min-Max Optimization Algorithm
Spotlights 11:35-12:00
[11:35]
Stochastic Continuous Submodular Maximization: Boosting via Non-oblivious Function
[11:40]
Accelerated Gradient Methods for Geodesically Convex Optimization: Tractable Algorithms and Convergence Analysis
[11:45]
The Complexity of k-Means Clustering when Little is Known
[11:50]
Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime
[11:55]
3PC: Three Point Compressors for Communication-Efficient Distributed Training and a Better Theory for Lazy Aggregation
(ends 12:00 PM)
Spotlights 10:30-11:05
[10:30]
Ripple Attention for Visual Perception with Sub-quadratic Complexity
[10:35]
Self-supervised Models are Good Teaching Assistants for Vision Transformers
[10:40]
Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
[10:45]
In defense of dual-encoders for neural ranking
[10:50]
From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers
[10:55]
Linear Complexity Randomized Self-attention Mechanism
[11:00]
Efficient Representation Learning via Adaptive Context Pooling
Orals 11:05-11:25
[11:05]
Toward Compositional Generalization in Object-Oriented World Modeling
Spotlights 11:25-12:00
[11:25]
Fast Population-Based Reinforcement Learning on a Single Machine
[11:30]
NeuralEF: Deconstructing Kernels by Deep Neural Networks
[11:35]
Visual Attention Emerges from Recurrent Sparse Reconstruction
[11:40]
Transformer Quality in Linear Time
[11:45]
What Dense Graph Do You Need for Self-Attention?
[11:50]
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
[11:55]
Multi Resolution Analysis (MRA) for Approximate Self-Attention
(ends 12:00 PM)
Spotlights 10:30-11:05
[10:30]
A Context-Integrated Transformer-Based Neural Network for Auction Design
[10:35]
Domain Adaptation for Time Series Forecasting via Attention Sharing
[10:40]
Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations
[10:45]
Disentangling Disease-related Representation from Obscure for Disease Prediction
[10:50]
Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization
[10:55]
Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning
[11:00]
Learning of Cluster-based Feature Importance for Electronic Health Record Time-series
Orals 11:05-11:25
[11:05]
Do Differentiable Simulators Give Better Policy Gradients?
Spotlights 11:25-12:00
[11:25]
Adaptive Conformal Predictions for Time Series
[11:30]
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
[11:35]
Rethinking Graph Neural Networks for Anomaly Detection
[11:40]
Fast Aquatic Swimmer Optimization with Differentiable Projective Dynamics and Neural Network Hydrodynamic Models
[11:45]
Proving Theorems using Incremental Learning and Hindsight Experience Replay
[11:50]
Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning
[11:55]
Neural Inverse Kinematics
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Learning Bellman Complete Representations for Offline Policy Evaluation
Spotlights 10:50-11:15
[10:50]
Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning
[10:55]
A Simple Reward-free Approach to Constrained Reinforcement Learning
[11:00]
Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching
[11:05]
Temporal Difference Learning for Model Predictive Control
[11:10]
Model Selection in Batch Policy Optimization
Orals 11:15-11:35
[11:15]
Adversarially Trained Actor Critic for Offline Reinforcement Learning
Spotlights 11:35-12:00
[11:35]
Optimal Estimation of Policy Gradient via Double Fitted Iteration
[11:40]
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes
[11:45]
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
[11:50]
Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)
[11:55]
On the Role of Discount Factor in Offline Reinforcement Learning
(ends 12:00 PM)
Spotlights 10:30-11:05
[10:30]
Learning Stable Classifiers by Transferring Unstable Features
[10:35]
Data-Efficient Double-Win Lottery Tickets from Robust Pre-training
[10:40]
Attentional Meta-learners for Few-shot Polythetic Classification
[10:45]
C*-algebra Net: A New Approach Generalizing Neural Network Parameters to C*-algebra
[10:50]
Nonlinear Feature Diffusion on Hypergraphs
[10:55]
Kernel Methods for Radial Transformed Compositional Data with Many Zeros
[11:00]
Robust Task Representations for Offline Meta-Reinforcement Learning via Contrastive Learning
Orals 11:05-11:25
[11:05]
Causal Conceptions of Fairness and their Consequences
Spotlights 11:25-11:55
[11:25]
Fairness with Adaptive Weights
[11:30]
Understanding Instance-Level Impact of Fairness Constraints
[11:35]
Achieving Fairness at No Utility Cost via Data Reweighing with Influence
[11:40]
Mitigating Gender Bias in Face Recognition using the von Mises-Fisher Mixture Model
[11:45]
Selective Regression under Fairness Criteria
[11:50]
Input-agnostic Certified Group Fairness via Gaussian Parameter Smoothing
(ends 12:00 PM)
Spotlights 10:30-11:00
[10:30]
Dynamic Topic Models for Temporal Document Networks
[10:35]
A Functional Information Perspective on Model Interpretation
[10:40]
Be Like Water: Adaptive Floating Point for Machine Learning
[10:45]
Lie Point Symmetry Data Augmentation for Neural PDE Solvers
[10:50]
Fast Provably Robust Decision Trees and Boosting
[10:55]
Order Constraints in Optimal Transport
Orals 11:00-11:20
[11:00]
Sublinear-Time Clustering Oracle for Signed Graphs
Spotlights 11:20-11:55
[11:20]
PAC-Bayesian Bounds on Rate-Efficient Classifiers
[11:25]
More Efficient Sampling for Tensor Decomposition With Worst-Case Guarantees
[11:30]
Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning
[11:35]
On the Convergence of Local Stochastic Compositional Gradient Descent with Momentum
[11:40]
SPDY: Accurate Pruning with Speedup Guarantees
[11:45]
Flashlight: Enabling Innovation in Tools for Machine Learning
[11:50]
On the Robustness of CountSketch to Adaptive Inputs
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Streaming Algorithm for Monotone k-Submodular Maximization with Cardinality Constraints
Spotlights 10:50-11:10
[10:50]
Adaptive Accelerated (Extra-)Gradient Methods with Variance Reduction
[10:55]
Adaptive Second Order Coresets for Data-efficient Machine Learning
[11:00]
Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
[11:05]
Efficient Low Rank Convex Bounds for Pairwise Discrete Graphical Models
Orals 11:10-11:30
[11:10]
Deletion Robust Submodular Maximization over Matroids
Spotlights 11:30-11:55
[11:30]
The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks
[11:35]
Instance Dependent Regret Analysis of Kernelized Bandits
[11:40]
EAT-C: Environment-Adversarial sub-Task Curriculum for Efficient Reinforcement Learning
[11:45]
Tell me why! Explanations support learning relational and causal structure
[11:50]
Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Generalised Policy Improvement with Geometric Policy Composition
Spotlights 10:50-11:15
[10:50]
Offline Meta-Reinforcement Learning with Online Self-Supervision
[10:55]
Divergence-Regularized Multi-Agent Actor-Critic
[11:00]
Understanding Policy Gradient Algorithms: A Sensitivity-Based Approach
[11:05]
Off-Policy Reinforcement Learning with Delayed Rewards
[11:10]
Direct Behavior Specification via Constrained Reinforcement Learning
Orals 11:15-11:35
[11:15]
Large Batch Experience Replay
Spotlights 11:35-12:00
[11:35]
Evolving Curricula with Regret-Based Environment Design
[11:40]
Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum
[11:45]
Transformers are Meta-Reinforcement Learners
[11:50]
Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks
[11:55]
Constrained Variational Policy Optimization for Safe Reinforcement Learning
(ends 12:00 PM)
Orals 10:30-10:50
[10:30]
Stochastic Deep Networks with Linear Competing Units for Model-Agnostic Meta-Learning
Spotlights 10:50-11:10
[10:50]
Nonparametric Factor Trajectory Learning for Dynamic Tensor Decomposition
[10:55]
Nonparametric Embeddings of Sparse High-Order Interaction Events
[11:00]
Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
[11:05]
NOMU: Neural Optimization-based Model Uncertainty
Orals 11:10-11:30
[11:10]
Bayesian Model Selection, the Marginal Likelihood, and Generalization
Spotlights 11:30-12:00
[11:30]
Fast-Rate PAC-Bayesian Generalization Bounds for Meta-Learning
[11:35]
Wide Neural Networks Forget Less Catastrophically
[11:40]
A Unified View on PAC-Bayes Bounds for Meta-Learning
[11:45]
MAML and ANIL Provably Learn Representations
[11:50]
C-MinHash: Improving Minwise Hashing with Circulant Permutation
[11:55]
Proximal Denoiser for Convergent Plug-and-Play Optimization with Nonconvex Regularization
(ends 12:00 PM)
11 a.m.
(ends 10:00 PM)
noon
Coffee Break
12:30 p.m.
Spotlights 12:30-1:05
[12:30]
Bregman Neural Networks
[12:35]
Quantifying and Learning Linear Symmetry-Based Disentanglement
[12:40]
Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups
[12:45]
PDO-s3DCNNs: Partial Differential Operator Based Steerable 3D CNNs
[12:50]
Utilizing Expert Features for Contrastive Learning of Time-Series Representations
[12:55]
(Non-)Convergence Results for Predictive Coding Networks
[1:00]
Representation Topology Divergence: A Method for Comparing Neural Network Representations
Orals 1:05-1:25
[1:05]
Measuring Representational Robustness of Neural Networks Through Shared Invariances
Spotlights 1:25-2:00
[1:25]
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
[1:30]
Flowformer: Linearizing Transformers with Conservation Flows
[1:35]
Spatial-Channel Token Distillation for Vision MLPs
[1:40]
Neurocoder: General-Purpose Computation Using Stored Neural Programs
[1:45]
Improving Transformers with Probabilistic Attention Keys
[1:50]
Rethinking Attention-Model Explainability through Faithfulness Violation Test
[1:55]
AGNAS: Attention-Guided Micro- and Macro-Architecture Search
(ends 2:00 PM)
Spotlights 12:30-1:05
[12:30]
Nearly Optimal Catoni’s M-estimator for Infinite Variance
[12:35]
Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk
[12:40]
Local Linear Convergence of Douglas-Rachford for Linear Programming: a Probabilistic Analysis
[12:45]
Contextual Information-Directed Sampling
[12:50]
Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits
[12:55]
Universal and data-adaptive algorithms for model selection in linear contextual bandits
[1:00]
Regret Minimization with Performative Feedback
Orals 1:05-1:25
[1:05]
A Simple yet Universal Strategy for Online Convex Optimization
Spotlights 1:25-2:00
[1:25]
Deep Hierarchy in Bandits
[1:30]
Distributionally-Aware Kernelized Bandit Problems for Risk Aversion
[1:35]
Asymptotically-Optimal Gaussian Bandits with Side Observations
[1:40]
Learning from a Learning User for Optimal Recommendations
[1:45]
Thresholded Lasso Bandit
[1:50]
Versatile Dueling Bandits: Best-of-both World Analyses for Learning from Relative Preferences
[1:55]
Decentralized Online Convex Optimization in Networked Systems
(ends 2:00 PM)
Spotlights 12:30-1:05
[12:30]
Convergence of Invariant Graph Networks
[12:35]
Rich Feature Construction for the Optimization-Generalization Dilemma
[12:40]
NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework
[12:45]
Resilient and Communication Efficient Learning for Heterogeneous Federated Systems
[12:50]
Augment with Care: Contrastive Learning for Combinatorial Problems
[12:55]
Cycle Representation Learning for Inductive Relation Prediction
[1:00]
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Orals 1:05-1:25
[1:05]
Do More Negative Samples Necessarily Hurt In Contrastive Learning?
Spotlights 1:25-2:00
[1:25]
MetAug: Contrastive Learning via Meta Feature Augmentation
[1:30]
Investigating Why Contrastive Learning Benefits Robustness against Label Noise
[1:35]
Contrastive Learning with Boosted Memorization
[1:40]
Identity-Disentangled Adversarial Augmentation for Self-supervised Learning
[1:45]
Interventional Contrastive Learning with Meta Semantic Regularizer
[1:50]
On the Surrogate Gap between Contrastive and Supervised Losses
[1:55]
Exploring the Gap between Collapsed & Whitened Features in Self-Supervised Learning
(ends 2:00 PM)
Spotlights 12:30-1:00
[12:30]
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
[12:35]
Deploying Convolutional Networks on Untrusted Platforms Using 2D Holographic Reduced Representations
[12:40]
Object Permanence Emerges in a Random Walk along Memory
[12:45]
Flow-Guided Sparse Transformer for Video Deblurring
[12:50]
N-Penetrate: Active Learning of Neural Collision Handler for Complex 3D Mesh Deformations
[12:55]
Staged Training for Transformer Language Models
Orals 1:00-1:20
[1:00]
Near-Exact Recovery for Tomographic Inverse Problems via Deep Learning
Spotlights 1:20-2:00
[1:20]
Self-supervised learning with random-projection quantizer for speech recognition
[1:25]
Learning Multiscale Transformer Models for Sequence Generation
[1:30]
NP-Match: When Neural Processes meet Semi-Supervised Learning
[1:35]
Proximal and Federated Random Reshuffling
[1:40]
Federated Learning with Partial Model Personalization
[1:45]
A Stochastic Multi-Rate Control Framework For Modeling Distributed Optimization Algorithms
[1:50]
Tackling Data Heterogeneity: A New Unified Framework for Decentralized SGD with Sample-induced Topology
[1:55]
Iterative Double Sketching for Faster Least-Squares Optimization
(ends 2:00 PM)
Spotlights 12:30-1:05
[12:30]
Revisiting End-to-End Speech-to-Text Translation From Scratch
[12:35]
Data Scaling Laws in NMT: The Effect of Noise and Architecture
[12:40]
Dialog Inpainting: Turning Documents into Dialogs
[12:45]
Safe Exploration for Efficient Policy Evaluation and Comparison
[12:50]
Adversarial Attacks on Gaussian Process Bandits
[12:55]
GALAXY: Graph-based Active Learning at the Extreme
[1:00]
When Are Linear Stochastic Bandits Attackable?
Orals 1:05-1:25
[1:05]
UniRank: Unimodal Bandit Algorithms for Online Ranking
Spotlights 1:25-2:00
[1:25]
Correlation Clustering via Strong Triadic Closure Labeling: Fast Approximation Algorithms and Practical Lower Bounds
[1:30]
Interactive Correlation Clustering with Existential Cluster Constraints
[1:35]
Simultaneous Graph Signal Clustering and Graph Learning
[1:40]
Bregman Power k-Means for Clustering Exponential Family Data
[1:45]
SpaceMAP: Visualizing High-Dimensional Data by Space Expansion
[1:50]
Unsupervised Ground Metric Learning Using Wasserstein Singular Vectors
[1:55]
Understanding Doubly Stochastic Clustering
(ends 2:00 PM)
Spotlights 12:30-1:05
[12:30]
Learning to Cut by Looking Ahead: Cutting Plane Selection via Imitation Learning
[12:35]
A Regret Minimization Approach to Multi-Agent Control
[12:40]
Multi-slots Online Matching with High Entropy
[12:45]
Decision-Focused Learning: Through the Lens of Learning to Rank
[12:50]
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
[12:55]
Asking for Knowledge (AFK): Training RL Agents to Query External Knowledge Using Language
[1:00]
Addressing Optimism Bias in Sequence Modeling for Reinforcement Learning
Orals 1:05-1:25
[1:05]
An Analytical Update Rule for General Policy Optimization
Spotlights 1:25-2:00
[1:25]
Making Linear MDPs Practical via Contrastive Representation Learning
[1:30]
Flow-based Recurrent Belief State Learning for POMDPs
[1:35]
A Parametric Class of Approximate Gradient Updates for Policy Optimization
[1:40]
Retrieval-Augmented Reinforcement Learning
[1:45]
Robust Policy Learning over Multiple Uncertainty Sets
[1:50]
Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL
[1:55]
Learning Dynamics and Generalization in Deep Reinforcement Learning
(ends 2:00 PM)
Orals 12:30-12:50
[12:30]
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
Spotlights 12:50-1:15
[12:50]
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
[12:55]
EqR: Equivariant Representations for Data-Efficient Reinforcement Learning
[1:00]
Imitation Learning by Estimating Expertise of Demonstrators
[1:05]
Cliff Diving: Exploring Reward Surfaces in Reinforcement Learning Environments
[1:10]
Off-Policy Evaluation for Large Action Spaces via Embeddings
Orals 1:15-1:35
[1:15]
Online Decision Transformer
Spotlights 1:35-2:00
[1:35]
Learning-based Optimisation of Particle Accelerators Under Partial Observability Without Real-World Training
[1:40]
How to Leverage Unlabeled Data in Offline Reinforcement Learning
[1:45]
Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning
[1:50]
Lightweight Projective Derivative Codes for Compressed Asynchronous Gradient Descent
[1:55]
Compressed-VFL: Communication-Efficient Learning with Vertically Partitioned Data
(ends 2:00 PM)
Orals 12:30-12:50
[12:30]
Generalized Strategic Classification and the Case of Aligned Incentives
Spotlights 12:50-1:10
[12:50]
Improving Screening Processes via Calibrated Subset Selection
[12:55]
On the Convergence of the Shapley Value in Parametric Bayesian Learning Games
[1:00]
Data-SUITE: Data-centric identification of in-distribution incongruous examples
[1:05]
Counterfactual Prediction for Outcome-Oriented Treatments
Orals 1:10-1:30
[1:10]
Optimal Algorithms for Mean Estimation under Local Differential Privacy
Spotlights 1:30-1:55
[1:30]
Least Squares Estimation using Sketched Data with Heteroskedastic Errors
[1:35]
Debiaser Beware: Pitfalls of Centering Regularized Transport Maps
[1:40]
Bregman Proximal Langevin Monte Carlo via Bregman–Moreau Envelopes
[1:45]
Active Nearest Neighbor Regression Through Delaunay Refinement
[1:50]
A Convergence Theory for SVGD in the Population Limit under Talagrand's Inequality T1
(ends 2:00 PM)
Spotlights 12:30-1:00
[12:30]
ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training
[12:35]
Federated Learning with Label Distribution Skew via Logits Calibration
[12:40]
Adaptive Random Walk Gradient Descent for Decentralized Optimization
[12:45]
POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging
[12:50]
Secure Distributed Training at Scale
[12:55]
ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD
Orals 1:00-1:20
[1:00]
Anarchic Federated Learning
Spotlights 1:20-1:55
[1:20]
Virtual Homogeneity Learning: Defending against Data Heterogeneity in Federated Learning
[1:25]
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach
[1:30]
Sketching Algorithms and Lower Bounds for Ridge Regression
[1:35]
On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
[1:40]
Utility Theory for Sequential Decision Making
[1:45]
Online Learning with Knapsacks: the Best of Both Worlds
[1:50]
Optimal Clustering with Noisy Queries via Multi-Armed Bandit
(ends 2:00 PM)
Spotlights 12:30-1:00
[12:30]
Global Optimization Networks
[12:35]
Generalized Federated Learning via Sharpness Aware Minimization
[12:40]
Delay-Adaptive Step-sizes for Asynchronous Learning
[12:45]
FedScale: Benchmarking Model and System Performance of Federated Learning at Scale
[12:50]
Learning Augmented Binary Search Trees
[12:55]
Communication-efficient Distributed Learning for Large Batch Optimization
Orals 1:00-1:20
[1:00]
Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization
Spotlights 1:20-1:55
[1:20]
A Simple Guard for Learned Optimizers
[1:25]
An Exact Symbolic Reduction of Linear Smart Predict+Optimize to Mixed Integer Linear Programming
[1:30]
Multi-Level Branched Regularization for Federated Learning
[1:35]
Revisiting the Effects of Stochasticity for Hamiltonian Samplers
[1:40]
Scaling Structured Inference with Randomization
[1:45]
Discrete Tree Flows via Tree-Structured Permutations
[1:50]
Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation
(ends 2:00 PM)
FRI 22 JUL
3:30 a.m.
Breakfast on your own
4 a.m.
(ends 4:00 PM)
7 a.m.
Coffee Break
9 a.m.
Lunch
noon
Coffee Break
4 p.m.
SAT 23 JUL
3:30 a.m.
Breakfast on your own
4 a.m.
(ends 9:00 AM)
5:50 a.m.
Workshop:
(ends 2:30 PM)
7 a.m.
Coffee Break
9 a.m.
Lunch on your own
noon
Coffee Break