Timezone: America/Los_Angeles
SUN 23 JUL
1 p.m.
(ends 8:00 PM)
2:30 p.m.
3:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 4:30 PM)
4:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 5:30 PM)
5 p.m.
5:30 p.m.
6 p.m.
7 p.m.
8 p.m.
MON 24 JUL
11 a.m.
(ends 10:00 PM)
11:30 a.m.
11:45 a.m.
12:30 p.m.
Tutorial:
(ends 3:00 PM)
1 p.m.
1:30 p.m.
3 p.m.
4:30 p.m.
Tutorial:
(ends 6:30 PM)
Tutorial:
(ends 6:30 PM)
Tutorial:
(ends 6:30 PM)
6:30 p.m.
7 p.m.
Tutorial:
(ends 9:00 PM)
9:15 p.m.
9:30 p.m.
TUE 25 JUL
11 a.m.
(ends 9:00 PM)
noon
12:15 p.m.
1 p.m.
1:30 p.m.
2 p.m.
(ends 3:30 PM)
3:30 p.m.
5 p.m.
(ends 6:30 PM)
6:30 p.m.
7 p.m.
8 p.m.
8:30 p.m.
Orals 8:30-9:50
[8:30]
Bayesian Design Principles for Frequentist Sequential Learning
[8:38]
Towards Theoretical Understanding of Inverse Reinforcement Learning
[8:46]
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
[8:54]
Delayed Feedback in Kernel Bandits
[9:02]
Provably Learning Object-Centric Representations
[9:10]
Task-specific experimental design for treatment effect estimation
[9:18]
Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.
[9:26]
Interventional Causal Representation Learning
[9:34]
Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
[9:42]
Sequential Underspecified Instrument Selection for Cause-Effect Estimation
(ends 10:00 PM)
Orals 8:30-9:50
[8:30]
Raising the Cost of Malicious AI-Powered Image Editing
[8:38]
Dynamics-inspired Neuromorphic Visual Representation Learning
[8:46]
Scaling Vision Transformers to 22 Billion Parameters
[8:54]
Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction
[9:02]
Fourmer: An Efficient Global Modeling Paradigm for Image Restoration
[9:10]
Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping
[9:18]
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
[9:26]
Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
[9:34]
SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
[9:42]
Fast Inference from Transformers via Speculative Decoding
(ends 10:00 PM)
Orals 8:30-9:58
[8:30]
Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains
[8:38]
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
[8:46]
Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
[8:54]
Tighter Information-Theoretic Generalization Bounds from Supersamples
[9:02]
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
[9:10]
Bayes-optimal Learning of Deep Random Networks of Extensive-width
[9:18]
Why does Throwing Away Data Improve Worst-Group Error?
[9:26]
Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations
[9:34]
Sharper Bounds for $\ell_p$ Sensitivity Sampling
[9:42]
AdaBoost is not an Optimal Weak to Strong Learner
[9:50]
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
(ends 10:00 PM)
Orals 8:30-9:50
[8:30]
AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
[8:38]
Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
[8:46]
Graphically Structured Diffusion Models
[8:54]
Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?
[9:02]
Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models
[9:10]
Diffusion Models are Minimax Optimal Distribution Estimators
[9:18]
GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration
[9:26]
OCD: Learning to Overfit with Conditional Diffusion Models
[9:34]
Denoising MCMC for Accelerating Diffusion-Based Generative Models
[9:42]
Cones: Concept Neurons in Diffusion Models for Customized Generation
(ends 10:00 PM)
Orals 8:30-9:50
[8:30]
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
[8:38]
Information-Theoretic State Space Model for Multi-View Reinforcement Learning
[8:46]
Reparameterized Policy Learning for Multimodal Trajectory Optimization
[8:54]
Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
[9:02]
Subequivariant Graph Reinforcement Learning in 3D Environments
[9:10]
A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
[9:18]
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
[9:26]
Efficient RL via Disentangled Environment and Agent Representations
[9:34]
Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
[9:42]
On the Statistical Benefits of Temporal Difference Learning
(ends 10:00 PM)
Orals 8:30-9:50
[8:30]
Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
[8:38]
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
[8:46]
Reinforcement Learning from Passive Data via Latent Intentions
[8:54]
Best of Both Worlds Policy Optimization
[9:02]
Exponential Smoothing for Off-Policy Learning
[9:10]
Quantile Credit Assignment
[9:18]
Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
[9:26]
Hierarchies of Reward Machines
[9:34]
Human-Timescale Adaptation in an Open-Ended Task Space
[9:42]
Settling the Reward Hypothesis
(ends 10:00 PM)
WED 26 JUL
11 a.m.
(ends 9:00 PM)
12:30 p.m.
Invited Talk:
Jennifer Doudna
(ends 1:30 PM)
1 p.m.
1:30 p.m.
2 p.m.
(ends 3:30 PM)
3:30 p.m.
5 p.m.
Posters 5:00-6:30
Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
(ends 6:30 PM)
6:30 p.m.
7 p.m.
Orals 7:00-8:12
[7:00]
When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
[7:08]
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
[7:16]
Whose Opinions Do Language Models Reflect?
[7:24]
A Watermark for Large Language Models
[7:32]
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
[7:40]
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
[7:48]
Inflow, Outflow, and Reciprocity in Machine Learning
[7:56]
Structure-informed Language Models Are Protein Designers
[8:04]
Transformers Learn In-Context by Gradient Descent
(ends 8:30 PM)
Orals 7:00-8:12
[7:00]
Pretraining Language Models with Human Preferences
[7:08]
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
[7:16]
Specializing Smaller Language Models towards Multi-Step Reasoning
[7:24]
SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
[7:32]
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
[7:40]
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
[7:48]
BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
[7:56]
Tractable Control for Autoregressive Language Generation
[8:04]
Equivariant Architectures for Learning in Deep Weight Spaces
(ends 8:30 PM)
Orals 7:00-8:20
[7:00]
Nonparametric Extensions of Randomized Response for Private Confidence Sets
[7:08]
Differentially Private Hierarchical Clustering with Provable Approximation Guarantees
[7:16]
Tight Data Access Bounds for Private Top-$k$ Selection
[7:24]
JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift
[7:32]
Active Ranking of Experts Based on their Performances in Many Tasks
[7:40]
The Price of Differential Privacy under Continual Observation
[7:48]
HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption
[7:56]
Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting
[8:04]
Fast Private Kernel Density Estimation via Locality Sensitive Quantization
[8:12]
Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
(ends 8:30 PM)
Orals 7:00-8:20
[7:00]
Adversarial Policies Beat Superhuman Go AIs
[7:08]
Adapting to game trees in zero-sum imperfect information games
[7:16]
Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.
[7:24]
Delving into Noisy Label Detection with Clean Data
[7:32]
Robustly Learning a Single Neuron via Sharpness
[7:40]
Data Feedback Loops: Model-driven Amplification of Dataset Biases
[7:48]
Towards Reliable Neural Specifications
[7:56]
Do Perceptually Aligned Gradients Imply Robustness?
[8:04]
ODS: Test-Time Adaptation in the Presence of Open-World Data Shift
[8:12]
Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation
(ends 8:30 PM)
Orals 7:00-8:20
[7:00]
RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
[7:08]
Evaluating Self-Supervised Learning via Risk Decomposition
[7:16]
BEATs: Audio Pre-Training with Acoustic Tokenizers
[7:24]
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
[7:32]
Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions
[7:40]
TRAK: Attributing Model Behavior at Scale
[7:48]
Understanding Plasticity in Neural Networks
[7:56]
Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods
[8:04]
Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice
[8:12]
Brauer's Group Equivariant Neural Networks
(ends 8:30 PM)
8:45 p.m.
10 p.m.
THU 27 JUL
11 a.m.
(ends 8:00 PM)
11:30 a.m.
12:30 p.m.
Invited Talk:
John Schulman
(ends 1:30 PM)
1 p.m.
1:30 p.m.
3 p.m.
4:30 p.m.
6 p.m.
Orals 6:00-7:20
[6:00]
Mimetic Initialization of Self-Attention Layers
[6:08]
Difference of submodular minimization via DC programming
[6:16]
Simplex Random Features
[6:24]
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
[6:32]
Tilted Sparse Additive Models
[6:40]
Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
[6:48]
Hyena Hierarchy: Towards Larger Convolutional Language Models
[6:56]
Direct Parameterization of Lipschitz-Bounded Deep Networks
[7:12]
Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
(ends 7:30 PM)
Orals 6:00-7:20
[6:00]
Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
[6:08]
Self-Interpretable Time Series Prediction with Counterfactual Explanations
[6:16]
Resurrecting Recurrent Neural Networks for Long Sequences
[6:24]
Inferring Relational Potentials in Interacting Systems
[6:32]
Memory-Based Dual Gaussian Processes for Sequential Learning
[6:40]
H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features
[6:48]
Generalized Teacher Forcing for Learning Chaotic Dynamics
[6:56]
Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients
[7:04]
Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere
[7:12]
Learning Control-Oriented Dynamical Structure from Data
(ends 7:30 PM)
Orals 6:00-7:20
[6:00]
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
[6:08]
Calibrating Multimodal Learning
[6:16]
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
[6:24]
ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
[6:32]
Cross-Modal Fine-Tuning: Align then Refine
[6:40]
Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models
[6:48]
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
[6:56]
Pre-training for Speech Translation: CTC Meets Optimal Transport
[7:04]
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
[7:12]
Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes
(ends 7:30 PM)
Orals 6:00-7:12
[6:00]
Second-Order Optimization with Lazy Hessians
[6:08]
Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions
[6:16]
Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization
[6:24]
Continuation Path Learning for Homotopy Optimization
[6:32]
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
[6:40]
Buying Information for Stochastic Optimization
[6:48]
A Fully First-Order Method for Stochastic Bilevel Optimization
[6:56]
Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
[7:04]
Learning-Rate-Free Learning by D-Adaptation
(ends 7:30 PM)
Orals 6:00-7:04
[6:00]
Learning Mixtures of Markov Chains and MDPs
[6:08]
Uncertain Evidence in Probabilistic Models and Stochastic Simulators
[6:16]
How Bad is Top-$K$ Recommendation under Competing Content Creators?
[6:24]
Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees
[6:32]
Equivariant Polynomials for Graph Neural Networks
[6:40]
Taming graph kernels with random features
[6:48]
Robust Budget Pacing with a Single Sample
[6:56]
Multicalibration as Boosting for Regression
(ends 7:30 PM)
7:30 p.m.
8:45 p.m.
FRI 28 JUL
10 a.m.
11 a.m.
(ends 7:00 PM)
11:50 a.m.
11:55 a.m.
noon
Workshop:
(ends 8:00 PM)
Workshop:
(ends 8:00 PM)
Workshop:
(ends 8:00 PM)
Workshop:
(ends 8:00 PM)
12:15 p.m.
1 p.m.
3 p.m.
6 p.m.
SAT 29 JUL
11 a.m.
(ends 2:00 PM)
11:50 a.m.
11:55 a.m.
noon
Workshop:
(ends 8:15 PM)
12:15 p.m.
1 p.m.
3 p.m.
6 p.m.