Timezone: America/Los_Angeles
 

SUN 23 JUL
1 p.m.
(ends 8:00 PM)
2:30 p.m.
Lunch (On Your Own):
(ends 3:30 PM)
3:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 4:30 PM)
4:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 5:30 PM)
Expo Talk Panel:
(ends 5:30 PM)
5 p.m.
Break:
(ends 9:00 PM)
5:30 p.m.
Break:
(ends 6:00 PM)
6 p.m.
Expo Talk Panel:
(ends 7:00 PM)
7 p.m.
Coffee Break:
(ends 8:00 PM)
8 p.m.
Expo Talk Panel:
(ends 9:00 PM)

MON 24 JUL
11 a.m.
(ends 10:00 PM)
11:30 a.m.
Panel:
(ends 12:30 PM)
11:45 a.m.
Affinity Workshop:
(ends 8:00 PM)
1 p.m.
Exhibit Hall Open:
(ends 11:00 PM)
1:30 p.m.
Coffee Break:
(ends 2:00 PM)
3 p.m.
Lunch (On Your Own):
(ends 4:30 PM)
6:30 p.m.
Coffee Break:
(ends 7:00 PM)
9:15 p.m.
Welcome Reception:
(ends 11:00 PM)
9:30 p.m.
EXPO Attendee Raffle Prize Giveaway:
(ends 9:45 PM)

TUE 25 JUL
11 a.m.
(ends 9:00 PM)
noon
Opening Remarks:
(ends 12:15 PM)
12:15 p.m.
Invited Talk:
Marzyeh Ghassemi
(ends 1:30 PM)
1 p.m.
Exhibit Hall Open:
(ends 9:00 PM)
1:30 p.m.
Coffee Break:
(ends 2:00 PM)
2 p.m.
Posters 2:00-3:30
(ends 3:30 PM)
3:30 p.m.
Lunch (On Your Own):
(ends 5:00 PM)
5 p.m.
Posters 5:00-6:30
(ends 6:30 PM)
6:30 p.m.
Coffee Only Break:
(ends 7:00 PM)
7 p.m.
Invited Talk:
Shakir Mohamed
(ends 8:00 PM)
8 p.m.
Coffee Break:
(ends 8:30 PM)
8:30 p.m.
Orals 8:30-9:50
[8:30] Bayesian Design Principles for Frequentist Sequential Learning
[8:38] Towards Theoretical Understanding of Inverse Reinforcement Learning
[8:46] On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
[8:54] Delayed Feedback in Kernel Bandits
[9:02] Provably Learning Object-Centric Representations
[9:10] Task-specific experimental design for treatment effect estimation
[9:18] Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.
[9:26] Interventional Causal Representation Learning
[9:34] Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
[9:42] Sequential Underspecified Instrument Selection for Cause-Effect Estimation
(ends 10:00 PM)
Orals 8:30-9:50
[8:30] Raising the Cost of Malicious AI-Powered Image Editing
[8:38] Dynamics-inspired Neuromorphic Visual Representation Learning
[8:46] Scaling Vision Transformers to 22 Billion Parameters
[8:54] Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction
[9:02] Fourmer: An Efficient Global Modeling Paradigm for Image Restoration
[9:10] Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping
[9:18] Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
[9:26] Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
[9:34] SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
[9:42] Fast Inference from Transformers via Speculative Decoding
(ends 10:00 PM)
Orals 8:30-9:58
[8:30] Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains
[8:38] Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
[8:46] Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
[8:54] Tighter Information-Theoretic Generalization Bounds from Supersamples
[9:02] Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
[9:10] Bayes-optimal Learning of Deep Random Networks of Extensive-width
[9:18] Why does Throwing Away Data Improve Worst-Group Error?
[9:26] Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations
[9:34] Sharper Bounds for $\ell_p$ Sensitivity Sampling
[9:42] AdaBoost is not an Optimal Weak to Strong Learner
[9:50] Generalization on the Unseen, Logic Reasoning and Degree Curriculum
(ends 10:00 PM)
Orals 8:30-9:50
[8:30] AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
[8:38] Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
[8:46] Graphically Structured Diffusion Models
[8:54] Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?
[9:02] Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models
[9:10] Diffusion Models are Minimax Optimal Distribution Estimators
[9:18] GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration
[9:26] OCD: Learning to Overfit with Conditional Diffusion Models
[9:34] Denoising MCMC for Accelerating Diffusion-Based Generative Models
[9:42] Cones: Concept Neurons in Diffusion Models for Customized Generation
(ends 10:00 PM)
Orals 8:30-9:50
[8:30] Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
[8:38] Information-Theoretic State Space Model for Multi-View Reinforcement Learning
[8:46] Reparameterized Policy Learning for Multimodal Trajectory Optimization
[8:54] Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
[9:02] Subequivariant Graph Reinforcement Learning in 3D Environments
[9:10] A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
[9:18] Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
[9:26] Efficient RL via Disentangled Environment and Agent Representations
[9:34] Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
[9:42] On the Statistical Benefits of Temporal Difference Learning
(ends 10:00 PM)
Orals 8:30-9:50
[8:30] Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
[8:38] The Dormant Neuron Phenomenon in Deep Reinforcement Learning
[8:46] Reinforcement Learning from Passive Data via Latent Intentions
[8:54] Best of Both Worlds Policy Optimization
[9:02] Exponential Smoothing for Off-Policy Learning
[9:10] Quantile Credit Assignment
[9:18] Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
[9:26] Hierarchies of Reward Machines
[9:34] Human-Timescale Adaptation in an Open-Ended Task Space
[9:42] Settling the Reward Hypothesis
(ends 10:00 PM)

WED 26 JUL
11 a.m.
(ends 9:00 PM)
12:30 p.m.
Invited Talk:
Jennifer Doudna
(ends 1:30 PM)
1 p.m.
Exhibit Hall Open:
(ends 9:00 PM)
1:30 p.m.
Coffee Break:
(ends 2:00 PM)
2 p.m.
Posters 2:00-3:30
(ends 3:30 PM)
3:30 p.m.
Lunch (On Your Own):
(ends 5:00 PM)
5 p.m.
Posters 5:00-6:30
(ends 6:30 PM)
6:30 p.m.
Coffee Break:
(ends 7:00 PM)
7 p.m.
Orals 7:00-8:12
[7:00] When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
[7:08] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
[7:16] Whose Opinions Do Language Models Reflect?
[7:24] A Watermark for Large Language Models
[7:32] DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
[7:40] Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
[7:48] Inflow, Outflow, and Reciprocity in Machine Learning
[7:56] Structure-informed Language Models Are Protein Designers
[8:04] Transformers Learn In-Context by Gradient Descent
(ends 8:30 PM)
Orals 7:00-8:12
[7:00] Pretraining Language Models with Human Preferences
[7:08] Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
[7:16] Specializing Smaller Language Models towards Multi-Step Reasoning
[7:24] SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
[7:32] Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
[7:40] FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
[7:48] BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
[7:56] Tractable Control for Autoregressive Language Generation
[8:04] Equivariant Architectures for Learning in Deep Weight Spaces
(ends 8:30 PM)
Orals 7:00-8:20
[7:00] Nonparametric Extensions of Randomized Response for Private Confidence Sets
[7:08] Differentially Private Hierarchical Clustering with Provable Approximation Guarantees
[7:16] Tight Data Access Bounds for Private Top-$k$ Selection
[7:24] JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift
[7:32] Active Ranking of Experts Based on their Performances in Many Tasks
[7:40] The Price of Differential Privacy under Continual Observation
[7:48] HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption
[7:56] Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting
[8:04] Fast Private Kernel Density Estimation via Locality Sensitive Quantization
[8:12] Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
(ends 8:30 PM)
Orals 7:00-8:20
[7:00] Adversarial Policies Beat Superhuman Go AIs
[7:08] Adapting to game trees in zero-sum imperfect information games
[7:16] Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.
[7:24] Delving into Noisy Label Detection with Clean Data
[7:32] Robustly Learning a Single Neuron via Sharpness
[7:40] Data Feedback Loops: Model-driven Amplification of Dataset Biases
[7:48] Towards Reliable Neural Specifications
[7:56] Do Perceptually Aligned Gradients Imply Robustness?
[8:04] ODS: Test-Time Adaptation in the Presence of Open-World Data Shift
[8:12] Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation
(ends 8:30 PM)
Orals 7:00-8:20
[7:00] RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
[7:08] Evaluating Self-Supervised Learning via Risk Decomposition
[7:16] BEATs: Audio Pre-Training with Acoustic Tokenizers
[7:24] Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
[7:32] Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions
[7:40] TRAK: Attributing Model Behavior at Scale
[7:48] Understanding Plasticity in Neural Networks
[7:56] Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods
[8:04] Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice
[8:12] Brauer's Group Equivariant Neural Networks
(ends 8:30 PM)
Panel:
(ends 8:30 PM)
8:45 p.m.
Town Hall:
(ends 9:15 PM)

THU 27 JUL
11 a.m.
(ends 8:00 PM)
11:30 a.m.
Test of Time:
(ends 12:00 PM)
12:30 p.m.
Invited Talk:
John Schulman
(ends 1:30 PM)
1 p.m.
Coffee Break:
(ends 2:00 PM)
1:30 p.m.
Posters 1:30-3:00
(ends 3:00 PM)
3 p.m.
Lunch (On Your Own):
(ends 4:30 PM)
4:30 p.m.
Posters 4:30-6:00
(ends 6:00 PM)
6 p.m.
Coffee Only Break:
(ends 6:30 PM)
Orals 6:00-7:20
[6:00] Mimetic Initialization of Self-Attention Layers
[6:08] Difference of submodular minimization via DC programming
[6:16] Simplex Random Features
[6:24] Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
[6:32] Tilted Sparse Additive Models
[6:40] Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
[6:48] Hyena Hierarchy: Towards Larger Convolutional Language Models
[6:56] Direct Parameterization of Lipschitz-Bounded Deep Networks
[7:12] Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
(ends 7:30 PM)
Orals 6:00-7:20
[6:00] Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
[6:08] Self-Interpretable Time Series Prediction with Counterfactual Explanations
[6:16] Resurrecting Recurrent Neural Networks for Long Sequences
[6:24] Inferring Relational Potentials in Interacting Systems
[6:32] Memory-Based Dual Gaussian Processes for Sequential Learning
[6:40] H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features
[6:48] Generalized Teacher Forcing for Learning Chaotic Dynamics
[6:56] Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients
[7:04] Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere
[7:12] Learning Control-Oriented Dynamical Structure from Data
(ends 7:30 PM)
Orals 6:00-7:20
[6:00] Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
[6:08] Calibrating Multimodal Learning
[6:16] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
[6:24] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
[6:32] Cross-Modal Fine-Tuning: Align then Refine
[6:40] Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models
[6:48] Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
[6:56] Pre-training for Speech Translation: CTC Meets Optimal Transport
[7:04] Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
[7:12] Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes
(ends 7:30 PM)
Orals 6:00-7:12
[6:00] Second-Order Optimization with Lazy Hessians
[6:08] Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions
[6:16] Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization
[6:24] Continuation Path Learning for Homotopy Optimization
[6:32] Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
[6:40] Buying Information for Stochastic Optimization
[6:48] A Fully First-Order Method for Stochastic Bilevel Optimization
[6:56] Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
[7:04] Learning-Rate-Free Learning by D-Adaptation
(ends 7:30 PM)
Orals 6:00-7:04
[6:00] Learning Mixtures of Markov Chains and MDPs
[6:08] Uncertain Evidence in Probabilistic Models and Stochastic Simulators
[6:16] How Bad is Top-$K$ Recommendation under Competing Content Creators?
[6:24] Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees
[6:32] Equivariant Polynomials for Graph Neural Networks
[6:40] Taming graph kernels with random features
[6:48] Robust Budget Pacing with a Single Sample
[6:56] Multicalibration as Boosting for Regression
(ends 7:30 PM)
Oral C6:
(ends 7:30 PM)
7:30 p.m.
Reception:
(ends 8:45 PM)

FRI 28 JUL
10 a.m.
11 a.m.
(ends 7:00 PM)
11:55 a.m.
1 p.m.
Coffee Break:
(ends 1:30 PM)
3 p.m.
Break:
(ends 4:30 PM)
6 p.m.
Coffee Break:
(ends 6:30 PM)

SAT 29 JUL
11 a.m.
(ends 2:00 PM)
11:55 a.m.
12:15 p.m.
Affinity Workshop:
(ends 8:15 PM)
1 p.m.
Coffee Break:
(ends 1:30 PM)
3 p.m.
Break:
(ends 4:30 PM)
6 p.m.
Coffee Break:
(ends 6:30 PM)