Timezone: Pacific/Honolulu
 

SUN 23 JUL
10 a.m.
(ends 5:00 PM)
11:30 a.m.
Lunch (On Your Own):
(ends 12:30 PM)
12:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 1:30 PM)
1:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 2:30 PM)
Expo Talk Panel:
(ends 2:30 PM)
2 p.m.
Break:
(ends 6:00 PM)
2:30 p.m.
Break:
(ends 3:00 PM)
3 p.m.
Expo Talk Panel:
(ends 4:00 PM)
4 p.m.
Coffee Break:
(ends 5:00 PM)
5 p.m.
Expo Talk Panel:
(ends 6:00 PM)

MON 24 JUL
8 a.m.
(ends 7:00 PM)
8:30 a.m.
Panel:
(ends 9:30 AM)
8:45 a.m.
Affinity Workshop:
(ends 5:00 PM)
10 a.m.
Exhibit Hall Open:
(ends 8:00 PM)
10:30 a.m.
Coffee Break:
(ends 11:00 AM)
noon
Lunch (On Your Own):
(ends 1:30 PM)
3:30 p.m.
Coffee Break:
(ends 4:00 PM)
6:15 p.m.
Welcome Reception:
(ends 8:00 PM)
6:30 p.m.
EXPO Attendee Raffle Prize Giveaway:
(ends 6:45 PM)

TUE 25 JUL
8 a.m.
(ends 6:00 PM)
9 a.m.
Opening Remarks:
(ends 9:15 AM)
9:15 a.m.
Invited Talk:
Marzyeh Ghassemi
(ends 10:30 AM)
10 a.m.
Exhibit Hall Open:
(ends 6:00 PM)
10:30 a.m.
Coffee Break:
(ends 11:00 AM)
11 a.m.
Posters 11:00-12:30
(ends 12:30 PM)
12:30 p.m.
Lunch (On Your Own):
(ends 2:00 PM)
2 p.m.
Posters 2:00-3:30
(ends 3:30 PM)
3:30 p.m.
Coffee Only Break:
(ends 4:00 PM)
4 p.m.
Invited Talk:
Shakir Mohamed
(ends 5:00 PM)
5 p.m.
Coffee Break:
(ends 5:30 PM)
5:30 p.m.
Orals 5:30-6:50
[5:30] Bayesian Design Principles for Frequentist Sequential Learning
[5:38] Towards Theoretical Understanding of Inverse Reinforcement Learning
[5:46] On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
[5:54] Delayed Feedback in Kernel Bandits
[6:02] Provably Learning Object-Centric Representations
[6:10] Task-specific experimental design for treatment effect estimation
[6:18] Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.
[6:26] Interventional Causal Representation Learning
[6:34] Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
[6:42] Sequential Underspecified Instrument Selection for Cause-Effect Estimation
(ends 7:00 PM)
Orals 5:30-6:50
[5:30] Raising the Cost of Malicious AI-Powered Image Editing
[5:38] Dynamics-inspired Neuromorphic Visual Representation Learning
[5:46] Scaling Vision Transformers to 22 Billion Parameters
[5:54] Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction
[6:02] Fourmer: An Efficient Global Modeling Paradigm for Image Restoration
[6:10] Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping
[6:18] Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
[6:26] Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
[6:34] SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
[6:42] Fast Inference from Transformers via Speculative Decoding
(ends 7:00 PM)
Orals 5:30-6:58
[5:30] Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains
[5:38] Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
[5:46] Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
[5:54] Tighter Information-Theoretic Generalization Bounds from Supersamples
[6:02] Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
[6:10] Bayes-optimal Learning of Deep Random Networks of Extensive-width
[6:18] Why does Throwing Away Data Improve Worst-Group Error?
[6:26] Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations
[6:34] Sharper Bounds for $\ell_p$ Sensitivity Sampling
[6:42] AdaBoost is not an Optimal Weak to Strong Learner
[6:50] Generalization on the Unseen, Logic Reasoning and Degree Curriculum
(ends 7:00 PM)
Orals 5:30-6:50
[5:30] AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
[5:38] Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
[5:46] Graphically Structured Diffusion Models
[5:54] Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?
[6:02] Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models
[6:10] Diffusion Models are Minimax Optimal Distribution Estimators
[6:18] GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration
[6:26] OCD: Learning to Overfit with Conditional Diffusion Models
[6:34] Denoising MCMC for Accelerating Diffusion-Based Generative Models
[6:42] Cones: Concept Neurons in Diffusion Models for Customized Generation
(ends 7:00 PM)
Orals 5:30-6:50
[5:30] Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
[5:38] Information-Theoretic State Space Model for Multi-View Reinforcement Learning
[5:46] Reparameterized Policy Learning for Multimodal Trajectory Optimization
[5:54] Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
[6:02] Subequivariant Graph Reinforcement Learning in 3D Environments
[6:10] A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
[6:18] Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
[6:26] Efficient RL via Disentangled Environment and Agent Representations
[6:34] Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
[6:42] On the Statistical Benefits of Temporal Difference Learning
(ends 7:00 PM)
Orals 5:30-6:50
[5:30] Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
[5:38] The Dormant Neuron Phenomenon in Deep Reinforcement Learning
[5:46] Reinforcement Learning from Passive Data via Latent Intentions
[5:54] Best of Both Worlds Policy Optimization
[6:02] Exponential Smoothing for Off-Policy Learning
[6:10] Quantile Credit Assignment
[6:18] Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
[6:26] Hierarchies of Reward Machines
[6:34] Human-Timescale Adaptation in an Open-Ended Task Space
[6:42] Settling the Reward Hypothesis
(ends 7:00 PM)

WED 26 JUL
8 a.m.
(ends 6:00 PM)
9:30 a.m.
Invited Talk:
Jennifer Doudna
(ends 10:30 AM)
10 a.m.
Exhibit Hall Open:
(ends 6:00 PM)
10:30 a.m.
Coffee Break:
(ends 11:00 AM)
11 a.m.
Posters 11:00-12:30
(ends 12:30 PM)
12:30 p.m.
Lunch (On Your Own):
(ends 2:00 PM)
2 p.m.
Posters 2:00-3:30
(ends 3:30 PM)
3:30 p.m.
Coffee Break:
(ends 4:00 PM)
4 p.m.
Orals 4:00-5:12
[4:00] When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
[4:08] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
[4:16] Whose Opinions Do Language Models Reflect?
[4:24] A Watermark for Large Language Models
[4:32] DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
[4:40] Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
[4:48] Inflow, Outflow, and Reciprocity in Machine Learning
[4:56] Structure-informed Language Models Are Protein Designers
[5:04] Transformers Learn In-Context by Gradient Descent
(ends 5:30 PM)
Orals 4:00-5:12
[4:00] Pretraining Language Models with Human Preferences
[4:08] Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
[4:16] Specializing Smaller Language Models towards Multi-Step Reasoning
[4:24] SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
[4:32] Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
[4:40] FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
[4:48] BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
[4:56] Tractable Control for Autoregressive Language Generation
[5:04] Equivariant Architectures for Learning in Deep Weight Spaces
(ends 5:30 PM)
Orals 4:00-5:20
[4:00] Nonparametric Extensions of Randomized Response for Private Confidence Sets
[4:08] Differentially Private Hierarchical Clustering with Provable Approximation Guarantees
[4:16] Tight Data Access Bounds for Private Top-$k$ Selection
[4:24] JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift
[4:32] Active Ranking of Experts Based on their Performances in Many Tasks
[4:40] The Price of Differential Privacy under Continual Observation
[4:48] HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption
[4:56] Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting
[5:04] Fast Private Kernel Density Estimation via Locality Sensitive Quantization
[5:12] Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
(ends 5:30 PM)
Orals 4:00-5:20
[4:00] Adversarial Policies Beat Superhuman Go AIs
[4:08] Adapting to game trees in zero-sum imperfect information games
[4:16] Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.
[4:24] Delving into Noisy Label Detection with Clean Data
[4:32] Robustly Learning a Single Neuron via Sharpness
[4:40] Data Feedback Loops: Model-driven Amplification of Dataset Biases
[4:48] Towards Reliable Neural Specifications
[4:56] Do Perceptually Aligned Gradients Imply Robustness?
[5:04] ODS: Test-Time Adaptation in the Presence of Open-World Data Shift
[5:12] Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation
(ends 5:30 PM)
Orals 4:00-5:20
[4:00] RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
[4:08] Evaluating Self-Supervised Learning via Risk Decomposition
[4:16] BEATs: Audio Pre-Training with Acoustic Tokenizers
[4:24] Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
[4:32] Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions
[4:40] TRAK: Attributing Model Behavior at Scale
[4:48] Understanding Plasticity in Neural Networks
[4:56] Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods
[5:04] Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice
[5:12] Brauer's Group Equivariant Neural Networks
(ends 5:30 PM)
Panel:
(ends 5:30 PM)
5:45 p.m.
Town Hall:
(ends 6:15 PM)
7 p.m.

THU 27 JUL
8 a.m.
(ends 5:00 PM)
8:30 a.m.
Test Of Time:
(ends 9:00 AM)
9:30 a.m.
Invited Talk:
John Schulman
(ends 10:30 AM)
10 a.m.
Coffee Break:
(ends 11:00 AM)
10:30 a.m.
Posters 10:30-12:00
(ends 12:00 PM)
noon
Lunch (On Your Own):
(ends 1:30 PM)
1:30 p.m.
Posters 1:30-3:00
(ends 3:00 PM)
3 p.m.
Coffee Only Break:
(ends 3:30 PM)
Orals 3:00-4:20
[3:00] Mimetic Initialization of Self-Attention Layers
[3:08] Difference of submodular minimization via DC programming
[3:16] Simplex Random Features
[3:24] Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
[3:32] Tilted Sparse Additive Models
[3:40] Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
[3:48] Hyena Hierarchy: Towards Larger Convolutional Language Models
[3:56] Direct Parameterization of Lipschitz-Bounded Deep Networks
[4:12] Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
(ends 4:30 PM)
Orals 3:00-4:20
[3:00] Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
[3:08] Self-Interpretable Time Series Prediction with Counterfactual Explanations
[3:16] Resurrecting Recurrent Neural Networks for Long Sequences
[3:24] Inferring Relational Potentials in Interacting Systems
[3:32] Memory-Based Dual Gaussian Processes for Sequential Learning
[3:40] H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features
[3:48] Generalized Teacher Forcing for Learning Chaotic Dynamics
[3:56] Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients
[4:04] Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere
[4:12] Learning Control-Oriented Dynamical Structure from Data
(ends 4:30 PM)
Orals 3:00-4:20
[3:00] Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
[3:08] Calibrating Multimodal Learning
[3:16] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
[3:24] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
[3:32] Cross-Modal Fine-Tuning: Align then Refine
[3:40] Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models
[3:48] Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
[3:56] Pre-training for Speech Translation: CTC Meets Optimal Transport
[4:04] Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
[4:12] Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes
(ends 4:30 PM)
Orals 3:00-4:12
[3:00] Second-Order Optimization with Lazy Hessians
[3:08] Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions
[3:16] Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization
[3:24] Continuation Path Learning for Homotopy Optimization
[3:32] Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
[3:40] Buying Information for Stochastic Optimization
[3:48] A Fully First-Order Method for Stochastic Bilevel Optimization
[3:56] Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
[4:04] Learning-Rate-Free Learning by D-Adaptation
(ends 4:30 PM)
Orals 3:00-4:04
[3:00] Learning Mixtures of Markov Chains and MDPs
[3:08] Uncertain Evidence in Probabilistic Models and Stochastic Simulators
[3:16] How Bad is Top-$K$ Recommendation under Competing Content Creators?
[3:24] Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees
[3:32] Equivariant Polynomials for Graph Neural Networks
[3:40] Taming graph kernels with random features
[3:48] Robust Budget Pacing with a Single Sample
[3:56] Multicalibration as Boosting for Regression
(ends 4:30 PM)
Oral C6:
(ends 4:30 PM)
4:30 p.m.
Reception:
(ends 5:45 PM)

FRI 28 JUL
7 a.m.
8 a.m.
(ends 4:00 PM)
8:55 a.m.
10 a.m.
Coffee Break:
(ends 10:30 AM)
noon
Break:
(ends 1:30 PM)
3 p.m.
Coffee Break:
(ends 3:30 PM)

SAT 29 JUL
8 a.m.
(ends 11:00 AM)
8:55 a.m.
9:15 a.m.
Affinity Workshop:
(ends 5:15 PM)
10 a.m.
Coffee Break:
(ends 10:30 AM)
noon
Break:
(ends 1:30 PM)
3 p.m.
Coffee Break:
(ends 3:30 PM)