
SUN 18 JUL

5 a.m.

6 a.m.

7 a.m.

Expo Talk Panel:

(ends 8:00 AM)

8 a.m.

Expo Talk Panel:

(ends 9:00 AM)

9 a.m.

10 a.m.

5 p.m.

Expo Talk Panel:

(ends 5:59 PM)

7 p.m.

9 p.m.

MON 19 JUL

6 a.m.

8 a.m.

Tutorial:

(ends 11:15 AM)

Tutorial:

(ends 10:59 AM)

Tutorial:

(ends 11:00 AM)

noon

5 p.m.

6 p.m.

8 p.m.

TUE 20 JUL

5 a.m.

Orals 5:00-5:20

[5:00]
Scalable Computations of Wasserstein Barycenter via Input Convex Neural Networks

Spotlights 5:20-5:50

[5:20]
Outlier-Robust Optimal Transport

[5:25]
Dataset Dynamics via Gradient Flows in Probability Space

[5:30]
Sliced Iterative Normalizing Flows

[5:35]
Low-Rank Sinkhorn Factorization

[5:40]
Unbalanced minibatch Optimal Transport; applications to Domain Adaptation

[5:45]
Making transport more robust and interpretable by moving data through a small number of anchor points

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Attention is not all you need: pure attention loses rank doubly exponentially with depth

Spotlights 5:20-5:50

[5:20]
Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation

[5:25]
Efficient Generative Modelling of Protein Structure Fragments using a Deep Markov Model

[5:30]
Exploiting structured data for learning contagious diseases under incomplete testing

[5:35]
Strategic Classification Made Practical

[5:40]
Large-Margin Contrastive Learning with Distance Polarization Regularizer

[5:45]
SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Phasic Policy Gradient

Spotlights 5:20-5:50

[5:20]
Reinforcement Learning with Prototypical Representations

[5:25]
Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration

[5:30]
Muesli: Combining Improvements in Policy Optimization

[5:35]
Unsupervised Learning of Visual 3D Keypoints for Control

[5:40]
Learning Task Informed Abstractions

[5:45]
State Entropy Maximization with Random Encoders for Efficient Exploration

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
BORE: Bayesian Optimization by Density-Ratio Estimation

Spotlights 5:20-5:45

[5:20]
AutoSampling: Search for Effective Data Sampling Schedules

[5:25]
HardCoRe-NAS: Hard Constrained diffeRentiable Neural Architecture Search

[5:30]
Bias-Robust Bayesian Optimization via Dueling Bandits

[5:35]
Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging

[5:40]
Sparsifying Networks via Subdifferential Inclusion

Q&As 5:45-5:50

[5:45]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

Spotlights 5:20-5:45

[5:20]
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning

[5:25]
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning

[5:30]
Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

[5:35]
PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

[5:40]
Imitation by Predicting Observations

Q&As 5:45-5:50

[5:45]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Relative Positional Encoding for Transformers with Linear Complexity

Spotlights 5:20-5:50

[5:20]
A Free Lunch From ANN: Towards Efficient, Accurate Spiking Neural Networks Calibration

[5:25]
A Unified Lottery Ticket Hypothesis for Graph Neural Networks

[5:30]
Generative Adversarial Transformers

[5:35]
Evolving Attention with Residual Convolutions

[5:40]
Zoo-Tuning: Adaptive Transfer from A Zoo of Models

[5:45]
UnICORNN: A recurrent model for learning very long time dependencies

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Size-Invariant Graph Representations for Graph Classification Extrapolations

Spotlights 5:20-5:50

[5:20]
Consistent Nonparametric Methods for Network Assisted Covariate Estimation

[5:25]
Explainable Automated Graph Representation Learning with Hyperparameter Importance

[5:30]
Breaking the Limits of Message Passing Graph Neural Networks

[5:35]
From Local Structures to Size Generalization in Graph Neural Networks

[5:40]
Interpretable Stability Bounds for Spectral Graph Filters

[5:45]
Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Deeply-Debiased Off-Policy Interval Estimation

Spotlights 5:20-5:45

[5:20]
Offline Contextual Bandits with Overparameterized Models

[5:25]
Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation

[5:30]
A New Representation of Successor Features for Transfer across Dissimilar Environments

[5:35]
Preferential Temporal Difference Learning

[5:40]
On the Optimality of Batch Policy Optimization Algorithms

Q&As 5:45-5:50

[5:45]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Optimal Complexity in Decentralized Training

Spotlights 5:20-5:50

[5:20]
Stochastic Sign Descent Methods: New Algorithms and Better Theory

[5:25]
Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning

[5:30]
A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization

[5:35]
Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction

[5:40]
Newton Method over Networks is Fast up to the Statistical Precision

[5:45]
Federated Learning under Arbitrary Communication Patterns

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

6 a.m.

Orals 6:00-6:20

[6:00]
What Are Bayesian Neural Network Posteriors Really Like?

Spotlights 6:20-6:50

[6:20]
Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning

[6:25]
Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation

[6:30]
Deep kernel processes

[6:35]
Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes

[6:40]
Bayesian Deep Learning via Subnetwork Inference

[6:45]
Generative Particle Variational Inference via Estimation of Functional Gradients

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Let's Agree to Degree: Comparing Graph Convolutional Networks in the Message-Passing Framework

Spotlights 6:20-6:50

[6:20]
Fundamental Tradeoffs in Distributionally Adversarial Training

[6:25]
Towards Understanding Learning in Neural Networks with Linear Teachers

[6:30]
Continual Learning in the Teacher-Student Setup: Impact of Task Similarity

[6:35]
A Functional Perspective on Learning Symmetric Functions with Neural Networks

[6:40]
Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks

[6:45]
On the Random Conjugate Kernel and Neural Tangent Kernel

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach

Spotlights 6:20-6:50

[6:20]
Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity

[6:25]
Neuro-algorithmic Policies Enable Fast Combinatorial Generalization

[6:30]
PID Accelerated Value Iteration Algorithm

[6:35]
Provably Efficient Learning of Transferable Rewards

[6:40]
Reinforcement Learning for Cost-Aware Markov Decision Processes

[6:45]
Value Alignment Verification

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

Spotlights 6:20-6:50

[6:20]
Projection Robust Wasserstein Barycenters

[6:25]
Efficient Message Passing for 0–1 ILPs with Binary Decision Diagrams

[6:30]
Distributionally Robust Optimization with Markovian Data

[6:35]
Acceleration via Fractal Learning Rate Schedules

[6:40]
A Novel Sequential Coreset Method for Gradient Descent Algorithms

[6:45]
Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums

Spotlights 6:20-6:50

[6:20]
Dueling Convex Optimization

[6:25]
Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs

[6:30]
Parameter-free Locally Accelerated Conditional Gradients

[6:35]
Principal Component Hierarchy for Sparse Quadratic Programs

[6:40]
One-sided Frank-Wolfe algorithms for saddle problems

[6:45]
ConvexVST: A Convex Optimization Approach to Variance-stabilizing Transformation

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

Spotlights 6:20-6:50

[6:20]
Multiscale Invertible Generative Networks for High-Dimensional Bayesian Inference

[6:25]
GraphDF: A Discrete Flow Model for Molecular Graph Generation

[6:30]
Hierarchical VAEs Know What They Don’t Know

[6:35]
Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation

[6:40]
Generative Video Transformer: Can Objects be the Words?

[6:45]
Poisson-Randomised DirBN: Large Mutation is Needed in Dirichlet Belief Networks

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Leveraging Sparse Linear Layers for Debuggable Deep Networks

Spotlights 6:20-6:50

[6:20]
Voice2Series: Reprogramming Acoustic Models for Time Series Classification

[6:25]
Self-Tuning for Data-Efficient Deep Learning

[6:30]
How Framelets Enhance Graph Neural Networks

[6:35]
Federated Continual Learning with Weighted Inter-client Transfer

[6:40]
Self Normalizing Flows

[6:45]
Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Principled Simplicial Neural Networks for Trajectory Prediction

Spotlights 6:20-6:50

[6:20]
Efficient Differentiable Simulation of Articulated Bodies

[6:25]
On Monotonic Linear Interpolation of Neural Network Parameters

[6:30]
Connecting Sphere Manifolds Hierarchically for Regularization

[6:35]
Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks

[6:40]
Thinking Like Transformers

[6:45]
Federated Learning of User Verification Models Without Sharing Embeddings

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Neural Architecture Search without Training

Spotlights 6:20-6:50

[6:20]
Is Space-Time Attention All You Need for Video Understanding?

[6:25]
A Probabilistic Approach to Neural Network Pruning

[6:30]
KNAS: Green Neural Architecture Search

[6:35]
Efficient Lottery Ticket Finding: Less Data is More

[6:40]
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

[6:45]
Provably Strict Generalisation Benefit for Equivariant Models

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

7 a.m.

Orals 7:00-7:20

[7:00]
World Model as a Graph: Learning Latent Landmarks for Planning

Spotlights 7:20-7:45

[7:20]
Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research

[7:25]
Deep Reinforcement Learning amidst Continual Structured Non-Stationarity

[7:30]
Offline Reinforcement Learning with Pseudometric Learning

[7:35]
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

[7:40]
Decision-Making Under Selective Labels: Optimal Finite-Domain Policies and Beyond

Q&As 7:45-7:50

[7:45]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Directional Graph Networks

Spotlights 7:20-7:50

[7:20]
Winograd Algorithm for AdderNet

[7:25]
LieTransformer: Equivariant Self-Attention for Lie Groups

[7:30]
"Hey, that's not an ODE": Faster ODE Adjoints via Seminorms

[7:35]
Graph Mixture Density Networks

[7:40]
Momentum Residual Neural Networks

[7:45]
Better Training using Weight-Constrained Stochastic Dynamics

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

Spotlights 7:20-7:50

[7:20]
Learning Curves for Analysis of Deep Networks

[7:25]
GLSearch: Maximum Common Subgraph Detection via Learning to Search

[7:30]
Learning Intra-Batch Connections for Deep Metric Learning

[7:35]
Simultaneous Similarity-based Self-Distillation for Deep Metric Learning

[7:40]
Unifying Vision-and-Language Tasks via Text Generation

[7:45]
DeepWalking Backwards: From Embeddings Back to Graphs

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Skill Discovery for Exploration and Planning using Deep Skill Graphs

Spotlights 7:20-7:50

[7:20]
Learning Routines for Effective Off-Policy Reinforcement Learning

[7:25]
PODS: Policy Optimization via Differentiable Simulation

[7:30]
Learning and Planning in Complex Action Spaces

[7:35]
Model-Based Reinforcement Learning via Latent-Space Collocation

[7:40]
Vector Quantized Models for Planning

[7:45]
LTL2Action: Generalizing LTL Instructions for Multi-Task RL

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Spectral Smoothing Unveils Phase Transitions in Hierarchical Variational Autoencoders

Spotlights 7:20-7:50

[7:20]
Riemannian Convex Potential Maps

[7:25]
Autoencoding Under Normalization Constraints

[7:30]
PixelTransformer: Sample Conditioned Signal Generation

[7:35]
Generative Adversarial Networks for Markovian Temporal Dynamics: Stochastic Continuous Data Generation

[7:40]
Autoencoder Image Interpolation by Shaping the Latent Space

[7:45]
Improved Denoising Diffusion Probabilistic Models

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
OmniNet: Omnidirectional Representations from Transformers

Spotlights 7:20-7:45

[7:20]
Boosting the Throughput and Accelerator Utilization of Specialized CNN Inference Beyond Increasing Batch Size

[7:25]
E(n) Equivariant Graph Neural Networks

[7:30]
Grid-Functioned Neural Networks

[7:35]
MSA Transformer

[7:40]
Parallelizing Legendre Memory Unit Training

Q&As 7:45-7:50

[7:45]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Not All Memories are Created Equal: Learning to Forget by Expiring

Spotlights 7:20-7:50

[7:20]
Learning Bounds for Open-Set Learning

[7:25]
Perceiver: General Perception with Iterative Attention

[7:30]
Synthesizer: Rethinking Self-Attention for Transformer Models

[7:35]
Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks

[7:40]
What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules

[7:45]
Neural-Pull: Learning Signed Distance Function from Point clouds by Learning to Pull Space onto Surface

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Stability and Convergence of Stochastic Gradient Clipping: Beyond Lipschitz Continuity and Smoothness

Spotlights 7:20-7:50

[7:20]
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

[7:25]
Variational Data Assimilation with a Learned Inverse Observation Operator

[7:30]
Fast Projection Onto Convex Smooth Constraints

[7:35]
Decomposable Submodular Function Minimization via Maximum Flow

[7:40]
Multiplicative Noise and Heavy Tails in Stochastic Optimization

[7:45]
Distributed Second Order Methods with Fast Rates and Compressed Communication

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition

Spotlights 7:20-7:50

[7:20]
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

[7:25]
A New Formalism, Method and Open Issues for Zero-Shot Coordination

[7:30]
Targeted Data Acquisition for Evolving Negotiation Agents

[7:35]
Inverse Constrained Reinforcement Learning

[7:40]
Counterfactual Credit Assignment in Model-Free Reinforcement Learning

[7:45]
Interactive Learning from Activity Description

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

8 a.m.

9 a.m.

(ends 11:00 AM)

11 a.m.

5 p.m.

Orals 5:00-5:20

[5:00]
A Tale of Two Efficient and Informative Negative Sampling Distributions

Spotlights 5:20-5:50

[5:20]
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models

[5:25]
Quantization Algorithms for Random Fourier Features

[5:30]
Rethinking Neural vs. Matrix-Factorization Collaborative Filtering: the Theoretical Perspectives

[5:35]
Concentric mixtures of Mallows models for top-$k$ rankings: sampling and identifiability

[5:40]
Heterogeneity for the Win: One-Shot Federated Clustering

[5:45]
Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups

Spotlights 5:20-5:50

[5:20]
Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework

[5:25]
The Earth Mover's Pinball Loss: Quantiles for Histogram-Valued Regression

[5:30]
Signatured Deep Fictitious Play for Mean Field Games with Common Noise

[5:35]
Equivariant message passing for the prediction of tensorial properties and molecular spectra

[5:40]
Improving Breadth-Wise Backpropagation in Graph Neural Networks Helps Learning Long-Range Dependencies

[5:45]
LARNet: Lie Algebra Residual Network for Face Recognition

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

Spotlights 5:20-5:50

[5:20]
Safe Reinforcement Learning with Linear Function Approximation

[5:25]
Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

[5:30]
Offline Reinforcement Learning with Fisher Divergence Critic Regularization

[5:35]
Recomposing the Reinforcement Learning Building Blocks with Hypernetworks

[5:40]
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation

[5:45]
Discovering symbolic policies with deep reinforcement learning

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Characterizing Structural Regularities of Labeled Data in Overparameterized Models

Spotlights 5:20-5:50

[5:20]
Stabilizing Equilibrium Models by Jacobian Regularization

[5:25]
On the Predictability of Pruning Across Scales

[5:30]
Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?

[5:35]
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning

[5:40]
Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset

[5:45]
Learning Neural Network Subspaces

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
On the price of explainability for some clustering problems

Spotlights 5:20-5:50

[5:20]
Instance Specific Approximations for Submodular Maximization

[5:25]
Adapting to Delays and Data in Adversarial Multi-Armed Bandits

[5:30]
Structured Convolutional Kernel Networks for Airline Crew Scheduling

[5:35]
Online Graph Dictionary Learning

[5:40]
Stochastic Iterative Graph Matching

[5:45]
Training Quantized Neural Networks to Global Optimality via Semidefinite Programming

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Robust Asymmetric Learning in POMDPs

Spotlights 5:20-5:50

[5:20]
Differentiable Spatial Planning using Transformers

[5:25]
Convex Regularization in Monte-Carlo Tree Search

[5:30]
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion

[5:35]
Multi-Task Reinforcement Learning with Context-based Representations

[5:40]
High Confidence Generalization for Reinforcement Learning

[5:45]
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning

Spotlights 5:20-5:50

[5:20]
Re-understanding Finite-State Representations of Recurrent Policy Networks

[5:25]
Emergent Social Learning via Multi-agent Reinforcement Learning

[5:30]
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization

[5:35]
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills

[5:40]
Trajectory Diversity for Zero-Shot Coordination

[5:45]
FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
NeRF-VAE: A Geometry Aware 3D Scene Generative Model

Spotlights 5:20-5:50

[5:20]
Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding

[5:25]
Soft then Hard: Rethinking the Quantization in Neural Image Compression

[5:30]
Improved Contrastive Divergence Training of Energy-Based Models

[5:35]
Deep Generative Learning via Schrödinger Bridge

[5:40]
Partially Observed Exchangeable Modeling

[5:45]
Understanding Failures in Out-of-Distribution Detection with Deep Generative Models

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
CATE: Computation-aware Neural Architecture Encoding with Transformers

Spotlights 5:20-5:50

[5:20]
What Does Rotation Prediction Tell Us about Classifier Accuracy under Varying Testing Environments?

[5:25]
Towards Domain-Agnostic Contrastive Learning

[5:30]
Joining datasets via data augmentation in the label space for neural networks

[5:35]
Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision

[5:40]
Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation

[5:45]
Poolingformer: Long Document Modeling with Pooling Attention

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

6 p.m.

Orals 6:00-6:20

[6:00]
Network Inference and Influence Maximization from Samples

Spotlights 6:20-6:50

[6:20]
Regularized Submodular Maximization at Scale

[6:25]
Marginal Contribution Feature Importance - an Axiomatic Approach for Explaining Data

[6:30]
Connecting Interpretability and Robustness in Decision Trees through Separation

[6:35]
Light RUMs

[6:40]
Submodular Maximization subject to a Knapsack Constraint: Combinatorial Algorithms with Near-optimal Adaptive Complexity

[6:45]
CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
A Wasserstein Minimax Framework for Mixed Linear Regression

Spotlights 6:20-6:50

[6:20]
Weight-covariance alignment for adversarially robust neural networks

[6:25]
Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss

[6:30]
Communication-Efficient Distributed SVD via Local Power Iterations

[6:35]
A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance

[6:40]
Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions

[6:45]
Leveraging Language to Learn Program Abstractions and Search Heuristics

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
Decoupling Value and Policy for Generalization in Reinforcement Learning

Spotlights 6:20-6:50

[6:20]
Prioritized Level Replay

[6:25]
SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

[6:30]
GMAC: A Distributional Perspective on Actor-Critic Framework

[6:35]
Goal-Conditioned Reinforcement Learning with Imagined Subgoals

[6:40]
Policy Gradient Bayesian Robust Optimization for Imitation Learning

[6:45]
Reinforcement Learning of Implicit and Explicit Control Flow Instructions

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies

Spotlights 6:20-6:45

[6:20]
EfficientNetV2: Smaller Models and Faster Training

[6:25]
Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

[6:30]
LAMDA: Label Matching Deep Domain Adaptation

[6:35]
Temporally Correlated Task Scheduling for Sequence Learning

[6:40]
Information Obfuscation of Graph Neural Networks

Q&As 6:45-6:50

[6:45]
Q&A

(ends 7:00 PM)

Spotlights 6:00-6:15

[6:00]
iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients

[6:05]
Accurate Post Training Quantization With Small Calibration Sets

[6:10]
Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search

Orals 6:15-6:35

[6:15]
Few-Shot Neural Architecture Search

Spotlights 6:35-6:50

[6:35]
AutoAttend: Automated Attention Representation Search

[6:40]
Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces

[6:45]
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
The Emergence of Individuality

Spotlights 6:20-6:45

[6:20]
DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning

[6:25]
From Local to Global Norm Emergence: Dissolving Self-reinforcing Substructures with Incremental Social Instruments

[6:30]
Learning While Playing in Mean-Field Games: Convergence and Optimality

[6:35]
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning

[6:40]
Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment

Q&As 6:45-6:50

[6:45]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

Spotlights 6:20-6:50

[6:20]
Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

[6:25]
Keyframe-Focused Visual Imitation Learning

[6:30]
Learning and Planning in Average-Reward Markov Decision Processes

[6:35]
Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing

[6:40]
Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision

[6:45]
Emphatic Algorithms for Deep Reinforcement Learning

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
The Power of Adaptivity for Stochastic Submodular Cover

Spotlights 6:20-6:50

[6:20]
The Heavy-Tail Phenomenon in SGD

[6:25]
Federated Composite Optimization

[6:30]
On Estimation in Latent Variable Models

[6:35]
Asynchronous Distributed Learning: Adapting to Gradient Delays without Prior Knowledge

[6:40]
Randomized Algorithms for Submodular Function Maximization with a $k$-System Constraint

[6:45]
BASGD: Buffered Asynchronous SGD for Byzantine Learning

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
Generating images with sparse representations

Spotlights 6:20-6:50

[6:20]
An Identifiable Double VAE For Disentangled Representations

[6:25]
A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention

[6:30]
On Characterizing GAN Convergence Through Proximal Duality Gap

[6:35]
Scalable Normalizing Flows for Permutation Invariant Densities

[6:40]
Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics

[6:45]
Zero-Shot Text-to-Image Generation

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

7 p.m.

Orals 7:00-7:20

[7:00]
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

Spotlights 7:20-7:50

[7:20]
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

[7:25]
Learning to Weight Imperfect Demonstrations

[7:30]
DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

[7:35]
MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

[7:40]
RRL: Resnet as representation for Reinforcement Learning

[7:45]
SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
AlphaNet: Improved Training of Supernets with Alpha-Divergence

Spotlights 7:20-7:50

[7:20]
Catformer: Designing Stable Transformers via Sensitivity Analysis

[7:25]
A Receptor Skeleton for Capsule Neural Networks

[7:30]
Explore Visual Concept Formation for Image Classification

[7:35]
K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets

[7:40]
High-Performance Large-Scale Image Recognition Without Normalization

[7:45]
Lipschitz normalization for self-attention layers with application to graph neural networks

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
Accelerated Algorithms for Smooth Convex-Concave Minimax Problems with O(1/k^2) Rate on Squared Gradient Norm

Spotlights 7:20-7:50

[7:20]
Communication-Efficient Distributed Optimization with Quantized Preconditioners

[7:25]
Optimal regret algorithm for Pseudo-1d Bandit Convex Optimization

[7:30]
Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction

[7:35]
Moreau-Yosida $f$-divergences

[7:40]
Affine Invariant Analysis of Frank-Wolfe on Strongly Convex Sets

[7:45]
On a Combination of Alternating Minimization and Nesterov's Momentum

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
Inverse Decision Modeling: Learning Interpretable Representations of Behavior

Spotlights 7:20-7:50

[7:20]
On Proximal Policy Optimization's Heavy-tailed Gradients

[7:25]
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning

[7:30]
Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning

[7:35]
Is Pessimism Provably Efficient for Offline RL?

[7:40]
Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization

[7:45]
Density Constrained Reinforcement Learning

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts

Spotlights 7:20-7:50

[7:20]
Oblivious Sketching-based Central Path Method for Linear Programming

[7:25]
Bayesian Optimization over Hybrid Spaces

[7:30]
Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models

[7:35]
Compositional Video Synthesis with Action Graphs

[7:40]
Neural Pharmacodynamic State Space Modeling

[7:45]
Three Operator Splitting with a Nonconvex Loss Function

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

Spotlights 7:20-7:50

[7:20]
Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers

[7:25]
Accumulated Decoupled Learning with Gradient Staleness Mitigation for Convolutional Neural Networks

[7:30]
Training Graph Neural Networks with 1000 Layers

[7:35]
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed

[7:40]
Federated Deep AUC Maximization for Hetergeneous Data with a Constant Communication Complexity

[7:45]
Ditto: Fair and Robust Federated Learning Through Personalization

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
Out-of-Distribution Generalization via Risk Extrapolation (REx)

Spotlights 7:20-7:50

[7:20]
What Makes for End-to-End Object Detection?

[7:25]
On Explainability of Graph Neural Networks via Subgraph Explorations

[7:30]
Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks

[7:35]
Data Augmentation for Meta-Learning

[7:40]
Understanding Invariance via Feedforward Inversion of Discriminatively Trained Classifiers

[7:45]
Neural Symbolic Regression that scales

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
Hyperparameter Selection for Imitation Learning

Spotlights 7:20-7:50

[7:20]
Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning

[7:25]
Monotonic Robust Policy Optimization with Model Discrepancy

[7:30]
Taylor Expansion of Discount Factors

[7:35]
Generalizable Episodic Memory for Deep Reinforcement Learning

[7:40]
Representation Matters: Offline Pretraining for Sequential Decision Making

[7:45]
Reinforcement Learning Under Moral Uncertainty

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

Orals 7:00-7:20

[7:00]
Just Train Twice: Improving Group Robustness without Training Group Information

Spotlights 7:20-7:50

[7:20]
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

[7:25]
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training

[7:30]
A Bit More Bayesian: Domain-Invariant Learning with Uncertainty

[7:35]
Neural Rough Differential Equations for Long Time Series

[7:40]
Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization

[7:45]
Data augmentation for deep learning based accelerated MRI reconstruction with limited data

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 PM)

8 p.m.

Invited Talk:

Xiao Cunde, Qin Dahe

(ends 9:00 PM)

9 p.m.

(ends 11:00 PM)

WED 21 JUL

5 a.m.

Orals 5:00-5:20

[5:00]
Cross-domain Imitation from Observations

Spotlights 5:20-5:50

[5:20]
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

[5:25]
Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices

[5:30]
Active Feature Acquisition with Generative Surrogate Models

[5:35]
Characterizing the Gap Between Actor-Critic and Policy Gradient

[5:40]
Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective

[5:45]
Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Near Optimal Reward-Free Reinforcement Learning

Spotlights 5:20-5:50

[5:20]
Batch Value-function Approximation with Only Realizability

[5:25]
Adversarial Combinatorial Bandits with General Non-linear Reward Functions

[5:30]
Model-Free and Model-Based Policy Evaluation when Causality is Uncertain

[5:35]
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference

[5:40]
On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting

[5:45]
Spectral vertex sparsifiers and pair-wise spanners over distributed graphs

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
On Energy-Based Models with Overparametrized Shallow Neural Networks

Spotlights 5:20-5:50

[5:20]
Uncertainty Principles of Encoding GANs

[5:25]
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths

[5:30]
Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks

[5:35]
Functional Space Analysis of Local GAN Convergence

[5:40]
Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models

[5:45]
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
APS: Active Pretraining with Successor Features

Spotlights 5:20-5:50

[5:20]
Guided Exploration with Proximal Policy Optimization using a Single Demonstration

[5:25]
Self-Paced Context Evaluation for Contextual Reinforcement Learning

[5:30]
Unsupervised Skill Discovery with Bottleneck Option Learning

[5:35]
TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL

[5:40]
Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning

[5:45]
Data-efficient Hindsight Off-policy Option Learning

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets

Spotlights 5:20-5:50

[5:20]
Theory of Spectral Method for Union of Subspaces-Based Random Geometry Graph

[5:25]
Approximating a Distribution Using Weight Queries

[5:30]
Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion

[5:35]
Revenue-Incentive Tradeoffs in Dynamic Reserve Pricing

[5:40]
Towards the Unification and Robustness of Perturbation and Gradient Based Explanations

[5:45]
Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Optimizing persistent homology based functions

Spotlights 5:20-5:50

[5:20]
Debiasing a First-order Heuristic for Approximate Bi-level Optimization

[5:25]
SMG: A Shuffling Gradient-Based Method with Momentum

[5:30]
Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach

[5:35]
MARINA: Faster Non-Convex Distributed Learning with Compression

[5:40]
Bilevel Optimization: Convergence Analysis and Enhanced Design

[5:45]
Learning from History for Byzantine Robust Optimization

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
When All We Need is a Piece of the Pie: A Generic Framework for Optimizing Two-way Partial AUC

Spotlights 5:20-5:50

[5:20]
SiameseXML: Siamese Networks meet Extreme Classifiers with 100M Labels

[5:25]
Disentangling Sampling and Labeling Bias for Learning in Large-output Spaces

[5:30]
Learning Randomly Perturbed Structured Predictors for Direct Loss Minimization

[5:35]
Improving Molecular Graph Neural Network Explainability with Orthonormalization and Induced Sparsity

[5:40]
Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable?

[5:45]
Meta-learning Hyperparameter Performance Prediction with Neural Processes

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding

Spotlights 5:20-5:50

[5:20]
Sawtooth Factorial Topic Embeddings Guided Gamma Belief Network

[5:25]
Kernel Continual Learning

[5:30]
XOR-CD: Linearly Convergent Constrained Structure Generation

[5:35]
ARMS: Antithetic-REINFORCE-Multi-Sample Gradient for Binary Variables

[5:40]
Composing Normalizing Flows for Inverse Problems

[5:45]
Nonparametric Hamiltonian Monte Carlo

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

Orals 5:00-5:20

[5:00]
Robust Density Estimation from Batches: The Best Things in Life are (Nearly) Free

Spotlights 5:20-5:50

[5:20]
Generalization Bounds in the Presence of Outliers: a Median-of-Means Study

[5:25]
Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation

[5:30]
Robust Inference for High-Dimensional Linear Models via Residual Randomization

[5:35]
Don’t Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification

[5:40]
Generalization Guarantees for Neural Architecture Search with Train-Validation Split

[5:45]
Optimal Estimation of High Dimensional Smooth Additive Function Based on Noisy Observations

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 AM)

6 a.m.

Orals 6:00-6:20

[6:00]
Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints

Spotlights 6:20-6:50

[6:20]
Near-Optimal Confidence Sequences for Bounded Random Variables

[6:25]
Joint Online Learning and Decision-making via Dual Mirror Descent

[6:30]
Online A-Optimal Design and Active Linear Regression

[6:35]
Fairness and Bias in Online Selection

[6:40]
ChaCha for Online AutoML

[6:45]
An Algorithm for Stochastic and Adversarial Bandits with Switching Costs

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Elastic Graph Neural Networks

Spotlights 6:20-6:50

[6:20]
How could Neural Networks understand Programs?

[6:25]
ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations

[6:30]
How Do Adam and Training Strategies Help BNNs Optimization

[6:35]
Quantifying and Reducing Bias in Maximum Likelihood Estimation of Structured Anomalies

[6:40]
Learning from Nested Data with Ornstein Auto-Encoders

[6:45]
Kernel-Based Reinforcement Learning: A Finite-Time Analysis

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

Spotlights 6:20-6:50

[6:20]
Two-way kernel matrix puncturing: towards resource-efficient PCA and spectral clustering

[6:25]
A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning

[6:30]
Estimation and Quantization of Expected Persistence Diagrams

[6:35]
Post-selection inference with HSIC-Lasso

[6:40]
Provable Robustness of Adversarial Training for Learning Halfspaces with Noise

[6:45]
Distribution-Free Calibration Guarantees for Histogram Binning without Sample Splitting

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Reserve Price Optimization for First Price Auctions in Display Advertising

Spotlights 6:20-6:50

[6:20]
Align, then memorise: the dynamics of learning with feedback alignment

[6:25]
Connecting Optimal Ex-Ante Collusion in Teams to Extensive-Form Correlation: Faster Algorithms and Positive Complexity Results

[6:30]
Learning to Price Against a Moving Target

[6:35]
Fast Algorithms for Stackelberg Prediction Game with Least Squares Loss

[6:40]
Approximate Group Fairness for Clustering

[6:45]
Incentivizing Compliance with Algorithmic Instruments

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Bilinear Classes: A Structural Framework for Provable Generalization in RL

Spotlights 6:20-6:50

[6:20]
Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

[6:25]
Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret

[6:30]
Reward Identification in Inverse Reinforcement Learning

[6:35]
Online Optimization in Games via Control Theory: Connecting Regret, Passivity and Poincaré Recurrence

[6:40]
Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations

[6:45]
Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

Spotlights 6:20-6:50

[6:20]
Megaverse: Simulating Embodied Agents at One Million Experiences per Second

[6:25]
Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing

[6:30]
Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning

[6:35]
Off-Belief Learning

[6:40]
On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP

[6:45]
Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Dynamic Game Theoretic Neural Optimizer

Spotlights 6:20-6:50

[6:20]
Zero-Shot Knowledge Distillation from a Decision-Based Black-Box Model

[6:25]
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network

[6:30]
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers

[6:35]
Tractable structured natural-gradient descent using local parameterizations

[6:40]
Towards Rigorous Interpretations: a Formalisation of Feature Attribution

[6:45]
Distributed Nyström Kernel Learning with Communications

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling

Spotlights 6:20-6:50

[6:20]
Principled Exploration via Optimistic Bootstrapping and Backward Induction

[6:25]
Ensemble Bootstrapping for Q-Learning

[6:30]
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm

[6:35]
A Regret Minimization Approach to Iterative Learning Control

[6:40]
TempoRL: Learning When to Act

[6:45]
State Relevance for Off-Policy Evaluation

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

Orals 6:00-6:20

[6:00]
Understanding Instance-Level Label Noise: Disparate Impacts and Treatments

Spotlights 6:20-6:50

[6:20]
Selecting Data Augmentation for Simulating Interventions

[6:25]
Training Data Subset Selection for Regression with Controlled Generalization Error

[6:30]
Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

[6:35]
Learning from Noisy Labels with No Change to the Training Process

[6:40]
What does LIME really see in images?

[6:45]
Narrow Margins: Classification, Margins and Fat Tails

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 AM)

6:40 a.m.

7 a.m.

Orals 7:00-7:20

[7:00]
High-dimensional Experimental Design and Kernel Bandits

Spotlights 7:20-7:50

[7:20]
Dichotomous Optimistic Search to Quantify Human Perception

[7:25]
Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits

[7:30]
Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions

[7:35]
Deciding What to Learn: A Rate-Distortion Approach

[7:40]
No-regret Algorithms for Capturing Events in Poisson Point Processes

[7:45]
Parametric Graph for Unimodal Ranking Bandit

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:40

[7:00]
The Logical Options Framework

[7:20]
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

Spotlights 7:40-7:50

[7:40]
Adversarial Option-Aware Hierarchical Imitation Learning

[7:45]
Value Iteration in Continuous Actions, States and Time

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
PAC-Learning for Strategic Classification

Spotlights 7:20-7:50

[7:20]
Learning from Biased Data: A Semi-Parametric Approach

[7:25]
Learning in Nonzero-Sum Stochastic Games with Potentials

[7:30]
Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

[7:35]
Large-Scale Multi-Agent Deep FBSDEs

[7:40]
Multi-group Agnostic PAC Learnability

[7:45]
One for One, or All for All: Equilibria and Optimality of Collaboration in Federated Learning

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Spotlights 7:00-7:45

[7:00]
Instabilities of Offline RL with Pre-Trained Neural Representation

[7:05]
Path Planning using Neural A* Search

[7:10]
Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

[7:15]
Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning

[7:20]
Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning

[7:25]
Continuous-time Model-based Reinforcement Learning

[7:30]
Bayesian Optimistic Optimisation with Exponentially Decaying Regret

[7:35]
Best Model Identification: A Rested Bandit Formulation

[7:40]
Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time

Q&As 7:45-7:50

[7:45]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Modelling Behavioural Diversity for Learning in Open-Ended Games

Spotlights 7:20-7:50

[7:20]
Follow-the-Regularized-Leader Routes to Chaos in Routing Games

[7:25]
How to Learn when Data Reacts to Your Model: Performative Gradient Descent

[7:30]
Continuous Coordination As a Realistic Scenario for Lifelong Learning

[7:35]
Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

[7:40]
Collaborative Bayesian Optimization with Fair Regret

[7:45]
Exponentially Many Local Minima in Quantum Neural Networks

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent

Spotlights 7:20-7:50

[7:20]
A statistical perspective on distillation

[7:25]
The Lipschitz Constant of Self-Attention

[7:30]
Revealing the Structure of Deep Neural Networks via Convex Duality

[7:35]
Representational aspects of depth and conditioning in normalizing flows

[7:40]
Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning

[7:45]
The Hintons in your Neural Network: a Quantum Field Theory View of Deep Learning

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Inferring Latent Dynamics Underlying Neural Population Activity via Neural Differential Equations

Spotlights 7:20-7:50

[7:20]
Learning Queueing Policies for Organ Transplantation Allocation using Interpretable Counterfactual Survival Analysis

[7:25]
Deep Continuous Networks

[7:30]
SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks

[7:35]
Factor-analytic inverse regression for high-dimension, small-sample dimensionality reduction

[7:40]
On-Off Center-Surround Receptive Fields for Accurate and Robust Image Classification

[7:45]
AGENT: A Benchmark for Core Psychological Reasoning

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Inferring serial correlation with dynamic backgrounds

Spotlights 7:20-7:50

[7:20]
Variance Reduced Training with Stratified Sampling for Forecasting Models

[7:25]
Necessary and sufficient conditions for causal feature selection in time series with latent common causes

[7:30]
Multiplying Matrices Without Multiplying

[7:35]
The Power of Log-Sum-Exp: Sequential Density Ratio Matrix Estimation for Speed-Accuracy Optimization

[7:40]
Data-driven Prediction of General Hamiltonian Dynamics via Learning Exactly-Symplectic Maps

[7:45]
Learning Stochastic Behaviour from Aggregate Data

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Kernel Stein Discrepancy Descent

Spotlights 7:20-7:50

[7:20]
Conditional Distributional Treatment Effect with Kernel Conditional Mean Embeddings and U-Statistic Regression

[7:25]
Generalised Lipschitz Regularisation Equals Distributional Robustness

[7:30]
Interpretable Stein Goodness-of-fit Tests on Riemannian Manifold

[7:35]
An exact solver for the Weston-Watkins SVM subproblem

[7:40]
Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction

[7:45]
Faster Kernel Matrix Algebra via Density Estimation

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

Orals 7:00-7:20

[7:00]
Measuring Robustness in Deep Learning Based Compressive Sensing

Spotlights 7:20-7:50

[7:20]
Instance-Optimal Compressed Sensing via Posterior Sampling

[7:25]
A Nullspace Property for Subspace-Preserving Recovery

[7:30]
Homomorphic Sensing: Sparsity and Noise

[7:35]
Active Deep Probabilistic Subsampling

[7:40]
Prior Image-Constrained Reconstruction using Style-Based Generative Models

[7:45]
Intermediate Layer Optimization for Inverse Problems using Deep Generative Models

Q&As 7:50-7:55

[7:50]
Q&A

(ends 8:00 AM)

8 a.m.

Invited Talk:

Esther Duflo

(ends 9:00 AM)

9 a.m.

(ends 11:00 AM)

1 p.m.

3:30 p.m.

5 p.m.

Orals 5:00-5:20

[5:00]
Learning Optimal Auctions with Correlated Valuations from Samples

Spotlights 5:20-5:50

[5:20]
Alternative Microfoundations for Strategic Classification

[5:25]
Multi-Receiver Online Bayesian Persuasion

[5:30]
Online Learning for Load Balancing of Unknown Monotone Resource Allocation Games

[5:35]
Compressed Maximum Likelihood

[5:40]
Consistent regression when oblivious outliers overwhelm

[5:45]
Asymptotics of Ridge Regression in Convolutional Models

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning

Spotlights 5:20-5:50

[5:20]
Near-Optimal Linear Regression under Distribution Shift

[5:25]
Detection of Signal in the Spiked Rectangular Models

[5:30]
A Distribution-dependent Analysis of Meta Learning

[5:35]
How Important is the Train-Validation Split in Meta-Learning?

[5:40]
Robust Unsupervised Learning via L-statistic Minimization

[5:45]
A Theory of Label Propagation for Subpopulation Shift

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Label Distribution Learning Machine

Spotlights 5:20-5:50

[5:20]
Representation Matters: Assessing the Importance of Subgroup Allocations in Training Data

[5:25]
Heterogeneous Risk Minimization

[5:30]
Optimizing Black-box Metrics with Iterative Example Weighting

[5:35]
A theory of high dimensional regression with arbitrary correlations between input features and target functions: sample complexity, multiple descent curves and a hierarchy of phase transitions

[5:40]
How Does Loss Function Affect Generalization Performance of Deep Learning? Application to Human Age Estimation

[5:45]
Implicit rate-constrained optimization of non-decomposable objectives

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks

Spotlights 5:20-5:50

[5:20]
Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections

[5:25]
Understanding Noise Injection in GANs

[5:30]
FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Analysis

[5:35]
Improved OOD Generalization via Adversarial Training and Pretraining

[5:40]
WGAN with an Infinitely Wide Generator Has No Spurious Stationary Points

[5:45]
Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
UCB Momentum Q-learning: Correcting the bias without forgetting

Spotlights 5:20-5:45

[5:20]
Non-Exponentially Weighted Aggregation: Regret Bounds for Unbounded Loss Functions

[5:25]
Adversarial Dueling Bandits

[5:30]
Fast active learning for pure exploration in reinforcement learning

[5:35]
Leveraging Non-uniformity in First-order Non-convex Optimization

[5:40]
Probabilistic Programs with Stochastic Conditioning

Q&As 5:45-5:50

[5:45]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Understanding self-supervised learning dynamics without contrastive pairs

Spotlights 5:20-5:50

[5:20]
Learning by Turning: Neural Architecture Aware Optimisation

[5:25]
Consensus Control for Decentralized Deep Learning

[5:30]
Selfish Sparse RNN Training

[5:35]
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization

[5:40]
Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

[5:45]
Understanding the Dynamics of Gradient Flow in Overparameterized Linear models

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL

Spotlights 5:20-5:50

[5:20]
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping

[5:25]
Confidence-Budget Matching for Sequential Budgeted Learning

[5:30]
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity

[5:35]
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

[5:40]
Robust Policy Gradient against Strong Data Corruption

[5:45]
Logarithmic Regret for Reinforcement Learning with Linear Function Approximation

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
Online Unrelated Machine Load Balancing with Predictions Revisited

Spotlights 5:20-5:50

[5:20]
MOTS: Minimax Optimal Thompson Sampling

[5:25]
Regularized Online Allocation Problems: Fairness and Beyond

[5:30]
Near-Optimal Representation Learning for Linear Bandits and Linear RL

[5:35]
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

[5:40]
DriftSurf: Stable-State / Reactive-State Learning under Concept Drift

[5:45]
Online Submodular Resource Allocation with Applications to Rebalancing Shared Mobility Systems

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

Orals 5:00-5:20

[5:00]
The Symmetry between Arms and Knapsacks: A Primal-Dual Approach for Bandits with Knapsacks

Spotlights 5:20-5:50

[5:20]
Dynamic Planning and Learning under Recovering Rewards

[5:25]
Best Arm Identification in Graphical Bilinear Bandits

[5:30]
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

[5:35]
Incentivized Bandit Learning with Self-Reinforcing User Preferences

[5:40]
Approximation Theory Based Methods for RKHS Bandits

[5:45]
Dynamic Balancing for Model Selection in Bandits and RL

Q&As 5:50-5:55

[5:50]
Q&A

(ends 6:00 PM)

5:15 p.m.

6 p.m.

Orals 6:00-6:20

[6:00]
Dissecting Supervised Contrastive Learning

Spotlights 6:20-6:50

[6:20]
Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent

[6:25]
Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks

[6:30]
Scaling Properties of Deep Residual Networks

[6:35]
Contrastive Learning Inverts the Data Generating Process

[6:40]
Tensor Programs IIb: Architectural Universality Of Neural Tangent Kernel Training Dynamics

[6:45]
Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
Stability and Generalization of Stochastic Gradient Methods for Minimax Problems

Spotlights 6:20-6:50

[6:20]
Outside the Echo Chamber: Optimizing the Performative Risk

[6:25]
Asymptotic Normality and Confidence Intervals for Prediction Risk of the Min-Norm Least Squares Estimator

[6:30]
Provable Meta-Learning of Linear Representations

[6:35]
Sample Complexity of Robust Linear Classification on Separated Data

[6:40]
The Impact of Record Linkage on Learning from Feature Partitioned Data

[6:45]
Train simultaneously, generalize better: Stability of gradient-based minimax learners

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
Cyclically Equivariant Neural Decoders for Cyclic Codes

Spotlights 6:20-6:50

[6:20]
KO codes: inventing nonlinear encoding and decoding for reliable wireless communication via deep-learning

[6:25]
An Information-Geometric Distance on the Space of Tasks

[6:30]
On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework

[6:35]
Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information

[6:40]
A Novel Method to Solve Neural Knapsack Problems

[6:45]
Chebyshev Polynomial Codes: Task Entanglement-based Coding for Distributed Matrix Multiplication

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism

Spotlights 6:20-6:50

[6:20]
Optimal Streaming Algorithms for Multi-Armed Bandits

[6:25]
Top-k eXtreme Contextual Bandits with Arm Hierarchy

[6:30]
Improved Regret Bounds of Bilinear Bandits using Action Space Analysis

[6:35]
Interaction-Grounded Learning

[6:40]
Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits

[6:45]
Pure Exploration and Regret Minimization in Matching Bandits

Q&As 6:50-6:55

[6:50]
Q&A

(ends 7:00 PM)

Orals 6:00-6:20

[6:00]
Task-Optimal Exploration in Linear Dynamical Systems

Spotlights 6:20-6:45

[6:20]
Gaussian Process-Based Real-Time Learning for Safety Critical Applications

[6:25]
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee

[6:30]
Randomized Exploration in Reinforcement Learning with General Value Function Approximation

[6:35]
Deep Coherent Exploration for Continuous Control

[6:40]
Towards Distraction-Robust Active Visual Tracking

Q&As 6:45-6:50

[6:45]
Q&A