

Timezone: America/Los_Angeles

SUN 18 JUL

TUE 20 JUL
5 a.m.
Orals 5:00-5:20
[5:00] BORE: Bayesian Optimization by Density-Ratio Estimation
Spotlights 5:20-5:45
[5:20] AutoSampling: Search for Effective Data Sampling Schedules
[5:25] HardCoRe-NAS: Hard Constrained diffeRentiable Neural Architecture Search
[5:30] Bias-Robust Bayesian Optimization via Dueling Bandits
[5:35] Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging
[5:40] Sparsifying Networks via Subdifferential Inclusion
Q&As 5:45-5:50
[5:45] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Relative Positional Encoding for Transformers with Linear Complexity
Spotlights 5:20-5:50
[5:20] A Free Lunch From ANN: Towards Efficient, Accurate Spiking Neural Networks Calibration
[5:25] A Unified Lottery Ticket Hypothesis for Graph Neural Networks
[5:30] Generative Adversarial Transformers
[5:35] Evolving Attention with Residual Convolutions
[5:40] Zoo-Tuning: Adaptive Transfer from A Zoo of Models
[5:45] UnICORNN: A recurrent model for learning very long time dependencies
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Attention is not all you need: pure attention loses rank doubly exponentially with depth
Spotlights 5:20-5:50
[5:20] Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation
[5:25] Efficient Generative Modelling of Protein Structure Fragments using a Deep Markov Model
[5:30] Exploiting structured data for learning contagious diseases under incomplete testing
[5:35] Strategic Classification Made Practical
[5:40] Large-Margin Contrastive Learning with Distance Polarization Regularizer
[5:45] SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Size-Invariant Graph Representations for Graph Classification Extrapolations
Spotlights 5:20-5:50
[5:20] Consistent Nonparametric Methods for Network Assisted Covariate Estimation
[5:25] Explainable Automated Graph Representation Learning with Hyperparameter Importance
[5:30] Breaking the Limits of Message Passing Graph Neural Networks
[5:35] From Local Structures to Size Generalization in Graph Neural Networks
[5:40] Interpretable Stability Bounds for Spectral Graph Filters
[5:45] Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot
Spotlights 5:20-5:45
[5:20] UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning
[5:25] A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
[5:30] Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
[5:35] PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
[5:40] Imitation by Predicting Observations
Q&As 5:45-5:50
[5:45] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Scalable Computations of Wasserstein Barycenter via Input Convex Neural Networks
Spotlights 5:20-5:50
[5:20] Outlier-Robust Optimal Transport
[5:25] Dataset Dynamics via Gradient Flows in Probability Space
[5:30] Sliced Iterative Normalizing Flows
[5:35] Low-Rank Sinkhorn Factorization
[5:40] Unbalanced minibatch Optimal Transport; applications to Domain Adaptation
[5:45] Making transport more robust and interpretable by moving data through a small number of anchor points
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Optimal Complexity in Decentralized Training
Spotlights 5:20-5:50
[5:20] Stochastic Sign Descent Methods: New Algorithms and Better Theory
[5:25] Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning
[5:30] A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization
[5:35] Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction
[5:40] Newton Method over Networks is Fast up to the Statistical Precision
[5:45] Federated Learning under Arbitrary Communication Patterns
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Phasic Policy Gradient
Spotlights 5:20-5:50
[5:20] Reinforcement Learning with Prototypical Representations
[5:25] Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration
[5:30] Muesli: Combining Improvements in Policy Optimization
[5:35] Unsupervised Learning of Visual 3D Keypoints for Control
[5:40] Learning Task Informed Abstractions
[5:45] State Entropy Maximization with Random Encoders for Efficient Exploration
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Deeply-Debiased Off-Policy Interval Estimation
Spotlights 5:20-5:45
[5:20] Offline Contextual Bandits with Overparameterized Models
[5:25] Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation
[5:30] A New Representation of Successor Features for Transfer across Dissimilar Environments
[5:35] Preferential Temporal Difference Learning
[5:40] On the Optimality of Batch Policy Optimization Algorithms
Q&As 5:45-5:50
[5:45] Q&A
(ends 6:00 AM)
6 a.m.
Orals 6:00-6:20
[6:00] Neural Architecture Search without Training
Spotlights 6:20-6:50
[6:20] Is Space-Time Attention All You Need for Video Understanding?
[6:25] A Probabilistic Approach to Neural Network Pruning
[6:30] KNAS: Green Neural Architecture Search
[6:35] Efficient Lottery Ticket Finding: Less Data is More
[6:40] ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
[6:45] Provably Strict Generalisation Benefit for Equivariant Models
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] Leveraging Sparse Linear Layers for Debuggable Deep Networks
Spotlights 6:20-6:50
[6:20] Voice2Series: Reprogramming Acoustic Models for Time Series Classification
[6:25] Self-Tuning for Data-Efficient Deep Learning
[6:30] How Framelets Enhance Graph Neural Networks
[6:35] Federated Continual Learning with Weighted Inter-client Transfer
[6:40] Self Normalizing Flows
[6:45] Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] What Are Bayesian Neural Network Posteriors Really Like?
Spotlights 6:20-6:50
[6:20] Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning
[6:25] Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation
[6:30] Deep kernel processes
[6:35] Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes
[6:40] Bayesian Deep Learning via Subnetwork Inference
[6:45] Generative Particle Variational Inference via Estimation of Functional Gradients
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] Principled Simplicial Neural Networks for Trajectory Prediction
Spotlights 6:20-6:50
[6:20] Efficient Differentiable Simulation of Articulated Bodies
[6:25] On Monotonic Linear Interpolation of Neural Network Parameters
[6:30] Connecting Sphere Manifolds Hierarchically for Regularization
[6:35] Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks
[6:40] Thinking Like Transformers
[6:45] Federated Learning of User Verification Models Without Sharing Embeddings
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] Oops I Took A Gradient: Scalable Sampling for Discrete Distributions
Spotlights 6:20-6:50
[6:20] Multiscale Invertible Generative Networks for High-Dimensional Bayesian Inference
[6:25] GraphDF: A Discrete Flow Model for Molecular Graph Generation
[6:30] Hierarchical VAEs Know What They Don’t Know
[6:35] Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation
[6:40] Generative Video Transformer: Can Objects be the Words?
[6:45] Poisson-Randomised DirBN: Large Mutation is Needed in Dirichlet Belief Networks
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] Let's Agree to Degree: Comparing Graph Convolutional Networks in the Message-Passing Framework
Spotlights 6:20-6:50
[6:20] Fundamental Tradeoffs in Distributionally Adversarial Training
[6:25] Towards Understanding Learning in Neural Networks with Linear Teachers
[6:30] Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
[6:35] A Functional Perspective on Learning Symmetric Functions with Neural Networks
[6:40] Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks
[6:45] On the Random Conjugate Kernel and Neural Tangent Kernel
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
Spotlights 6:20-6:50
[6:20] Projection Robust Wasserstein Barycenters
[6:25] Efficient Message Passing for 0–1 ILPs with Binary Decision Diagrams
[6:30] Distributionally Robust Optimization with Markovian Data
[6:35] Acceleration via Fractal Learning Rate Schedules
[6:40] A Novel Sequential Coreset Method for Gradient Descent Algorithms
[6:45] Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums
Spotlights 6:20-6:50
[6:20] Dueling Convex Optimization
[6:25] Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs
[6:30] Parameter-free Locally Accelerated Conditional Gradients
[6:35] Principal Component Hierarchy for Sparse Quadratic Programs
[6:40] One-sided Frank-Wolfe algorithms for saddle problems
[6:45] ConvexVST: A Convex Optimization Approach to Variance-stabilizing Transformation
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
Orals 6:00-6:20
[6:00] Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
Spotlights 6:20-6:50
[6:20] Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity
[6:25] Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
[6:30] PID Accelerated Value Iteration Algorithm
[6:35] Provably Efficient Learning of Transferable Rewards
[6:40] Reinforcement Learning for Cost-Aware Markov Decision Processes
[6:45] Value Alignment Verification
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 AM)
7 a.m.
Orals 7:00-7:20
[7:00] OmniNet: Omnidirectional Representations from Transformers
Spotlights 7:20-7:45
[7:20] Boosting the Throughput and Accelerator Utilization of Specialized CNN Inference Beyond Increasing Batch Size
[7:25] E(n) Equivariant Graph Neural Networks
[7:30] Grid-Functioned Neural Networks
[7:35] MSA Transformer
[7:40] Parallelizing Legendre Memory Unit Training
Q&As 7:45-7:50
[7:45] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Spotlights 7:20-7:50
[7:20] Learning Curves for Analysis of Deep Networks
[7:25] GLSearch: Maximum Common Subgraph Detection via Learning to Search
[7:30] Learning Intra-Batch Connections for Deep Metric Learning
[7:35] Simultaneous Similarity-based Self-Distillation for Deep Metric Learning
[7:40] Unifying Vision-and-Language Tasks via Text Generation
[7:45] DeepWalking Backwards: From Embeddings Back to Graphs
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] Spectral Smoothing Unveils Phase Transitions in Hierarchical Variational Autoencoders
Spotlights 7:20-7:50
[7:20] Riemannian Convex Potential Maps
[7:25] Autoencoding Under Normalization Constraints
[7:30] PixelTransformer: Sample Conditioned Signal Generation
[7:35] Generative Adversarial Networks for Markovian Temporal Dynamics: Stochastic Continuous Data Generation
[7:40] Autoencoder Image Interpolation by Shaping the Latent Space
[7:45] Improved Denoising Diffusion Probabilistic Models
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] Directional Graph Networks
Spotlights 7:20-7:50
[7:20] Winograd Algorithm for AdderNet
[7:25] LieTransformer: Equivariant Self-Attention for Lie Groups
[7:30] "Hey, that's not an ODE": Faster ODE Adjoints via Seminorms
[7:35] Graph Mixture Density Networks
[7:40] Momentum Residual Neural Networks
[7:45] Better Training using Weight-Constrained Stochastic Dynamics
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
Spotlights 7:20-7:50
[7:20] Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
[7:25] A New Formalism, Method and Open Issues for Zero-Shot Coordination
[7:30] Targeted Data Acquisition for Evolving Negotiation Agents
[7:35] Inverse Constrained Reinforcement Learning
[7:40] Counterfactual Credit Assignment in Model-Free Reinforcement Learning
[7:45] Interactive Learning from Activity Description
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] Stability and Convergence of Stochastic Gradient Clipping: Beyond Lipschitz Continuity and Smoothness
Spotlights 7:20-7:50
[7:20] Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
[7:25] Variational Data Assimilation with a Learned Inverse Observation Operator
[7:30] Fast Projection Onto Convex Smooth Constraints
[7:35] Decomposable Submodular Function Minimization via Maximum Flow
[7:40] Multiplicative Noise and Heavy Tails in Stochastic Optimization
[7:45] Distributed Second Order Methods with Fast Rates and Compressed Communication
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] Not All Memories are Created Equal: Learning to Forget by Expiring
Spotlights 7:20-7:50
[7:20] Learning Bounds for Open-Set Learning
[7:25] Perceiver: General Perception with Iterative Attention
[7:30] Synthesizer: Rethinking Self-Attention for Transformer Models
[7:35] Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks
[7:40] What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules
[7:45] Neural-Pull: Learning Signed Distance Function from Point clouds by Learning to Pull Space onto Surface
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] World Model as a Graph: Learning Latent Landmarks for Planning
Spotlights 7:20-7:45
[7:20] Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
[7:25] Deep Reinforcement Learning amidst Continual Structured Non-Stationarity
[7:30] Offline Reinforcement Learning with Pseudometric Learning
[7:35] EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
[7:40] Decision-Making Under Selective Labels: Optimal Finite-Domain Policies and Beyond
Q&As 7:45-7:50
[7:45] Q&A
(ends 8:00 AM)
Orals 7:00-7:20
[7:00] Skill Discovery for Exploration and Planning using Deep Skill Graphs
Spotlights 7:20-7:50
[7:20] Learning Routines for Effective Off-Policy Reinforcement Learning
[7:25] PODS: Policy Optimization via Differentiable Simulation
[7:30] Learning and Planning in Complex Action Spaces
[7:35] Model-Based Reinforcement Learning via Latent-Space Collocation
[7:40] Vector Quantized Models for Planning
[7:45] LTL2Action: Generalizing LTL Instructions for Multi-Task RL
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 AM)
8 a.m.
Affinity Workshop:
(ends 5:00 PM)
Invited Talk:
Daphne Koller
(ends 9:00 AM)
9 a.m.
Posters 9:00-11:00
(ends 11:00 AM)
11 a.m.
Town Hall:
(ends 12:00 PM)
5 p.m.
Orals 5:00-5:20
[5:00] A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Spotlights 5:20-5:50
[5:20] Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework
[5:25] The Earth Mover's Pinball Loss: Quantiles for Histogram-Valued Regression
[5:30] Signatured Deep Fictitious Play for Mean Field Games with Common Noise
[5:35] Equivariant message passing for the prediction of tensorial properties and molecular spectra
[5:40] Improving Breadth-Wise Backpropagation in Graph Neural Networks Helps Learning Long-Range Dependencies
[5:45] LARNet: Lie Algebra Residual Network for Face Recognition
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] Characterizing Structural Regularities of Labeled Data in Overparameterized Models
Spotlights 5:20-5:50
[5:20] Stabilizing Equilibrium Models by Jacobian Regularization
[5:25] On the Predictability of Pruning Across Scales
[5:30] Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?
[5:35] LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning
[5:40] Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset
[5:45] Learning Neural Network Subspaces
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] CATE: Computation-aware Neural Architecture Encoding with Transformers
Spotlights 5:20-5:50
[5:20] What Does Rotation Prediction Tell Us about Classifier Accuracy under Varying Testing Environments?
[5:25] Towards Domain-Agnostic Contrastive Learning
[5:30] Joining datasets via data augmentation in the label space for neural networks
[5:35] Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision
[5:40] Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation
[5:45] Poolingformer: Long Document Modeling with Pooling Attention
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] NeRF-VAE: A Geometry Aware 3D Scene Generative Model
Spotlights 5:20-5:50
[5:20] Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding
[5:25] Soft then Hard: Rethinking the Quantization in Neural Image Compression
[5:30] Improved Contrastive Divergence Training of Energy-Based Models
[5:35] Deep Generative Learning via Schrödinger Bridge
[5:40] Partially Observed Exchangeable Modeling
[5:45] Understanding Failures in Out-of-Distribution Detection with Deep Generative Models
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Spotlights 5:20-5:50
[5:20] Re-understanding Finite-State Representations of Recurrent Policy Networks
[5:25] Emergent Social Learning via Multi-agent Reinforcement Learning
[5:30] From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
[5:35] Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills
[5:40] Trajectory Diversity for Zero-Shot Coordination
[5:45] FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] On the price of explainability for some clustering problems
Spotlights 5:20-5:50
[5:20] Instance Specific Approximations for Submodular Maximization
[5:25] Adapting to Delays and Data in Adversarial Multi-Armed Bandits
[5:30] Structured Convolutional Kernel Networks for Airline Crew Scheduling
[5:35] Online Graph Dictionary Learning
[5:40] Stochastic Iterative Graph Matching
[5:45] Training Quantized Neural Networks to Global Optimality via Semidefinite Programming
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] A Tale of Two Efficient and Informative Negative Sampling Distributions
Spotlights 5:20-5:50
[5:20] TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
[5:25] Quantization Algorithms for Random Fourier Features
[5:30] Rethinking Neural vs. Matrix-Factorization Collaborative Filtering: the Theoretical Perspectives
[5:35] Concentric mixtures of Mallows models for top-$k$ rankings: sampling and identifiability
[5:40] Heterogeneity for the Win: One-Shot Federated Clustering
[5:45] Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
Spotlights 5:20-5:50
[5:20] Safe Reinforcement Learning with Linear Function Approximation
[5:25] Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
[5:30] Offline Reinforcement Learning with Fisher Divergence Critic Regularization
[5:35] Recomposing the Reinforcement Learning Building Blocks with Hypernetworks
[5:40] OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
[5:45] Discovering symbolic policies with deep reinforcement learning
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
Orals 5:00-5:20
[5:00] Robust Asymmetric Learning in POMDPs
Spotlights 5:20-5:50
[5:20] Differentiable Spatial Planning using Transformers
[5:25] Convex Regularization in Monte-Carlo Tree Search
[5:30] On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
[5:35] Multi-Task Reinforcement Learning with Context-based Representations
[5:40] High Confidence Generalization for Reinforcement Learning
[5:45] Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 PM)
6 p.m.
Spotlights 6:00-6:15
[6:00] iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients
[6:05] Accurate Post Training Quantization With Small Calibration Sets
[6:10] Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search
Orals 6:15-6:35
[6:15] Few-Shot Neural Architecture Search
Spotlights 6:35-6:50
[6:35] AutoAttend: Automated Attention Representation Search
[6:40] Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces
[6:45] Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies
Spotlights 6:20-6:45
[6:20] EfficientNetV2: Smaller Models and Faster Training
[6:25] Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning
[6:30] LAMDA: Label Matching Deep Domain Adaptation
[6:35] Temporally Correlated Task Scheduling for Sequence Learning
[6:40] Information Obfuscation of Graph Neural Networks
Q&As 6:45-6:50
[6:45] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] Generating images with sparse representations
Spotlights 6:20-6:50
[6:20] An Identifiable Double VAE For Disentangled Representations
[6:25] A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention
[6:30] On Characterizing GAN Convergence Through Proximal Duality Gap
[6:35] Scalable Normalizing Flows for Permutation Invariant Densities
[6:40] Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics
[6:45] Zero-Shot Text-to-Image Generation
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] The Emergence of Individuality
Spotlights 6:20-6:45
[6:20] DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning
[6:25] From Local to Global Norm Emergence: Dissolving Self-reinforcing Substructures with Incremental Social Instruments
[6:30] Learning While Playing in Mean-Field Games: Convergence and Optimality
[6:35] Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
[6:40] Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment
Q&As 6:45-6:50
[6:45] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] Network Inference and Influence Maximization from Samples
Spotlights 6:20-6:50
[6:20] Regularized Submodular Maximization at Scale
[6:25] Marginal Contribution Feature Importance - an Axiomatic Approach for Explaining Data
[6:30] Connecting Interpretability and Robustness in Decision Trees through Separation
[6:35] Light RUMs
[6:40] Submodular Maximization subject to a Knapsack Constraint: Combinatorial Algorithms with Near-optimal Adaptive Complexity
[6:45] CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] The Power of Adaptivity for Stochastic Submodular Cover
Spotlights 6:20-6:50
[6:20] The Heavy-Tail Phenomenon in SGD
[6:25] Federated Composite Optimization
[6:30] On Estimation in Latent Variable Models
[6:35] Asynchronous Distributed Learning: Adapting to Gradient Delays without Prior Knowledge
[6:40] Randomized Algorithms for Submodular Function Maximization with a $k$-System Constraint
[6:45] BASGD: Buffered Asynchronous SGD for Byzantine Learning
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] A Wasserstein Minimax Framework for Mixed Linear Regression
Spotlights 6:20-6:50
[6:20] Weight-covariance alignment for adversarially robust neural networks
[6:25] Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss
[6:30] Communication-Efficient Distributed SVD via Local Power Iterations
[6:35] A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance
[6:40] Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions
[6:45] Leveraging Language to Learn Program Abstractions and Search Heuristics
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] Decoupling Value and Policy for Generalization in Reinforcement Learning
Spotlights 6:20-6:50
[6:20] Prioritized Level Replay
[6:25] SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
[6:30] GMAC: A Distributional Perspective on Actor-Critic Framework
[6:35] Goal-Conditioned Reinforcement Learning with Imagined Subgoals
[6:40] Policy Gradient Bayesian Robust Optimization for Imitation Learning
[6:45] Reinforcement Learning of Implicit and Explicit Control Flow Instructions
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 PM)
Orals 6:00-6:20
[6:00] PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Spotlights 6:20-6:50
[6:20] Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
[6:25] Keyframe-Focused Visual Imitation Learning
[6:30] Learning and Planning in Average-Reward Markov Decision Processes
[6:35] Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing
[6:40] Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision
[6:45] Emphatic Algorithms for Deep Reinforcement Learning
Q&As 6:50-6:55
[6:50] Q&A
(ends 7:00 PM)
7 p.m.
Orals 7:00-7:20
[7:00] AlphaNet: Improved Training of Supernets with Alpha-Divergence
Spotlights 7:20-7:50
[7:20] Catformer: Designing Stable Transformers via Sensitivity Analysis
[7:25] A Receptor Skeleton for Capsule Neural Networks
[7:30] Explore Visual Concept Formation for Image Classification
[7:35] K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets
[7:40] High-Performance Large-Scale Image Recognition Without Normalization
[7:45] Lipschitz normalization for self-attention layers with application to graph neural networks
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] Out-of-Distribution Generalization via Risk Extrapolation (REx)
Spotlights 7:20-7:50
[7:20] What Makes for End-to-End Object Detection?
[7:25] On Explainability of Graph Neural Networks via Subgraph Explorations
[7:30] Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks
[7:35] Data Augmentation for Meta-Learning
[7:40] Understanding Invariance via Feedforward Inversion of Discriminatively Trained Classifiers
[7:45] Neural Symbolic Regression that scales
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] Just Train Twice: Improving Group Robustness without Training Group Information
Spotlights 7:20-7:50
[7:20] Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
[7:25] GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
[7:30] A Bit More Bayesian: Domain-Invariant Learning with Uncertainty
[7:35] Neural Rough Differential Equations for Long Time Series
[7:40] Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization
[7:45] Data augmentation for deep learning based accelerated MRI reconstruction with limited data
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts
Spotlights 7:20-7:50
[7:20] Oblivious Sketching-based Central Path Method for Linear Programming
[7:25] Bayesian Optimization over Hybrid Spaces
[7:30] Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models
[7:35] Compositional Video Synthesis with Action Graphs
[7:40] Neural Pharmacodynamic State Space Modeling
[7:45] Three Operator Splitting with a Nonconvex Loss Function
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] Accelerated Algorithms for Smooth Convex-Concave Minimax Problems with O(1/k^2) Rate on Squared Gradient Norm
Spotlights 7:20-7:50
[7:20] Communication-Efficient Distributed Optimization with Quantized Preconditioners
[7:25] Optimal regret algorithm for Pseudo-1d Bandit Convex Optimization
[7:30] Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction
[7:35] Moreau-Yosida $f$-divergences
[7:40] Affine Invariant Analysis of Frank-Wolfe on Strongly Convex Sets
[7:45] On a Combination of Alternating Minimization and Nesterov's Momentum
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Spotlights 7:20-7:50
[7:20] Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers
[7:25] Accumulated Decoupled Learning with Gradient Staleness Mitigation for Convolutional Neural Networks
[7:30] Training Graph Neural Networks with 1000 Layers
[7:35] 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
[7:40] Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity
[7:45] Ditto: Fair and Robust Federated Learning Through Personalization
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] Inverse Decision Modeling: Learning Interpretable Representations of Behavior
Spotlights 7:20-7:50
[7:20] On Proximal Policy Optimization's Heavy-tailed Gradients
[7:25] Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
[7:30] Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning
[7:35] Is Pessimism Provably Efficient for Offline RL?
[7:40] Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization
[7:45] Density Constrained Reinforcement Learning
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
Spotlights 7:20-7:50
[7:20] A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
[7:25] Learning to Weight Imperfect Demonstrations
[7:30] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning
[7:35] MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
[7:40] RRL: Resnet as representation for Reinforcement Learning
[7:45] SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
Orals 7:00-7:20
[7:00] Hyperparameter Selection for Imitation Learning
Spotlights 7:20-7:50
[7:20] Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning
[7:25] Monotonic Robust Policy Optimization with Model Discrepancy
[7:30] Taylor Expansion of Discount Factors
[7:35] Generalizable Episodic Memory for Deep Reinforcement Learning
[7:40] Representation Matters: Offline Pretraining for Sequential Decision Making
[7:45] Reinforcement Learning Under Moral Uncertainty
Q&As 7:50-7:55
[7:50] Q&A
(ends 8:00 PM)
8 p.m.
Invited Talk:
Xiao Cunde · Qin Dahe
(ends 9:00 PM)
9 p.m.
Posters 9:00-11:00
(ends 11:00 PM)

WED 21 JUL
5 a.m.
Orals 5:00-5:20
[5:00] Cross-domain Imitation from Observations
Spotlights 5:20-5:50
[5:20] SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
[5:25] Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices
[5:30] Active Feature Acquisition with Generative Surrogate Models
[5:35] Characterizing the Gap Between Actor-Critic and Policy Gradient
[5:40] Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective
[5:45] Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Near Optimal Reward-Free Reinforcement Learning
Spotlights 5:20-5:50
[5:20] Batch Value-function Approximation with Only Realizability
[5:25] Adversarial Combinatorial Bandits with General Non-linear Reward Functions
[5:30] Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
[5:35] Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
[5:40] On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
[5:45] Spectral vertex sparsifiers and pair-wise spanners over distributed graphs
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] On Energy-Based Models with Overparametrized Shallow Neural Networks
Spotlights 5:20-5:50
[5:20] Uncertainty Principles of Encoding GANs
[5:25] On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
[5:30] Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
[5:35] Functional Space Analysis of Local GAN Convergence
[5:40] Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
[5:45] Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] APS: Active Pretraining with Successor Features
Spotlights 5:20-5:50
[5:20] Guided Exploration with Proximal Policy Optimization using a Single Demonstration
[5:25] Self-Paced Context Evaluation for Contextual Reinforcement Learning
[5:30] Unsupervised Skill Discovery with Bottleneck Option Learning
[5:35] TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL
[5:40] Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning
[5:45] Data-efficient Hindsight Off-policy Option Learning
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets
Spotlights 5:20-5:50
[5:20] Theory of Spectral Method for Union of Subspaces-Based Random Geometry Graph
[5:25] Approximating a Distribution Using Weight Queries
[5:30] Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion
[5:35] Revenue-Incentive Tradeoffs in Dynamic Reserve Pricing
[5:40] Towards the Unification and Robustness of Perturbation and Gradient Based Explanations
[5:45] Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Orals 5:00-5:20
[5:00] Optimizing persistent homology based functions
Spotlights 5:20-5:50
[5:20] Debiasing a First-order Heuristic for Approximate Bi-level Optimization