Timezone: Pacific/Honolulu
SUN 23 JUL
10 a.m.
(ends 5:00 PM)
11:30 a.m.
12:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 1:30 PM)
1:30 p.m.
Expo Talk Panel with Coffee & a Snack:
(ends 2:30 PM)
2 p.m.
2:30 p.m.
3 p.m.
4 p.m.
5 p.m.
MON 24 JUL
8 a.m.
(ends 7:00 PM)
8:30 a.m.
8:45 a.m.
9:30 a.m.
Tutorial:
(ends 12:00 PM)
10 a.m.
10:30 a.m.
noon
1:30 p.m.
Tutorial:
(ends 3:30 PM)
Tutorial:
(ends 3:30 PM)
Tutorial:
(ends 3:30 PM)
3:30 p.m.
4 p.m.
Tutorial:
(ends 6:00 PM)
6:15 p.m.
6:30 p.m.
TUE 25 JUL
8 a.m.
(ends 6:00 PM)
9 a.m.
9:15 a.m.
10 a.m.
10:30 a.m.
11 a.m.
(ends 12:30 PM)
12:30 p.m.
2 p.m.
(ends 3:30 PM)
3:30 p.m.
4 p.m.
5 p.m.
5:30 p.m.
Orals 5:30-6:50
[5:30]
Bayesian Design Principles for Frequentist Sequential Learning
[5:38]
Towards Theoretical Understanding of Inverse Reinforcement Learning
[5:46]
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
[5:54]
Delayed Feedback in Kernel Bandits
[6:02]
Provably Learning Object-Centric Representations
[6:10]
Task-specific experimental design for treatment effect estimation
[6:18]
Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.
[6:26]
Interventional Causal Representation Learning
[6:34]
Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
[6:42]
Sequential Underspecified Instrument Selection for Cause-Effect Estimation
(ends 7:00 PM)
Orals 5:30-6:50
[5:30]
Raising the Cost of Malicious AI-Powered Image Editing
[5:38]
Dynamics-inspired Neuromorphic Visual Representation Learning
[5:46]
Scaling Vision Transformers to 22 Billion Parameters
[5:54]
Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction
[6:02]
Fourmer: An Efficient Global Modeling Paradigm for Image Restoration
[6:10]
Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping
[6:18]
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
[6:26]
Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
[6:34]
SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
[6:42]
Fast Inference from Transformers via Speculative Decoding
(ends 7:00 PM)
Orals 5:30-6:58
[5:30]
Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains
[5:38]
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
[5:46]
Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
[5:54]
Tighter Information-Theoretic Generalization Bounds from Supersamples
[6:02]
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
[6:10]
Bayes-optimal Learning of Deep Random Networks of Extensive-width
[6:18]
Why does Throwing Away Data Improve Worst-Group Error?
[6:26]
Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations
[6:34]
Sharper Bounds for $\ell_p$ Sensitivity Sampling
[6:42]
AdaBoost is not an Optimal Weak to Strong Learner
[6:50]
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
(ends 7:00 PM)
Orals 5:30-6:50
[5:30]
AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
[5:38]
Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
[5:46]
Graphically Structured Diffusion Models
[5:54]
Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?
[6:02]
Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models
[6:10]
Diffusion Models are Minimax Optimal Distribution Estimators
[6:18]
GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration
[6:26]
OCD: Learning to Overfit with Conditional Diffusion Models
[6:34]
Denoising MCMC for Accelerating Diffusion-Based Generative Models
[6:42]
Cones: Concept Neurons in Diffusion Models for Customized Generation
(ends 7:00 PM)
Orals 5:30-6:50
[5:30]
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
[5:38]
Information-Theoretic State Space Model for Multi-View Reinforcement Learning
[5:46]
Reparameterized Policy Learning for Multimodal Trajectory Optimization
[5:54]
Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
[6:02]
Subequivariant Graph Reinforcement Learning in 3D Environments
[6:10]
A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
[6:18]
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
[6:26]
Efficient RL via Disentangled Environment and Agent Representations
[6:34]
Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
[6:42]
On the Statistical Benefits of Temporal Difference Learning
(ends 7:00 PM)
Orals 5:30-6:50
[5:30]
Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
[5:38]
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
[5:46]
Reinforcement Learning from Passive Data via Latent Intentions
[5:54]
Best of Both Worlds Policy Optimization
[6:02]
Exponential Smoothing for Off-Policy Learning
[6:10]
Quantile Credit Assignment
[6:18]
Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
[6:26]
Hierarchies of Reward Machines
[6:34]
Human-Timescale Adaptation in an Open-Ended Task Space
[6:42]
Settling the Reward Hypothesis
(ends 7:00 PM)
WED 26 JUL
8 a.m.
(ends 6:00 PM)
9:30 a.m.
Invited Talk: Jennifer Doudna
(ends 10:30 AM)
10 a.m.
10:30 a.m.
11 a.m.
(ends 12:30 PM)
12:30 p.m.
2 p.m.
Posters 2:00-3:30
Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
(ends 3:30 PM)
3:30 p.m.
4 p.m.
Orals 4:00-5:12
[4:00]
When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
[4:08]
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
[4:16]
Whose Opinions Do Language Models Reflect?
[4:24]
A Watermark for Large Language Models
[4:32]
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
[4:40]
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
[4:48]
Inflow, Outflow, and Reciprocity in Machine Learning
[4:56]
Structure-informed Language Models Are Protein Designers
[5:04]
Transformers Learn In-Context by Gradient Descent
(ends 5:30 PM)
Orals 4:00-5:12
[4:00]
Pretraining Language Models with Human Preferences
[4:08]
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
[4:16]
Specializing Smaller Language Models towards Multi-Step Reasoning
[4:24]
SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
[4:32]
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
[4:40]
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
[4:48]
BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
[4:56]
Tractable Control for Autoregressive Language Generation
[5:04]
Equivariant Architectures for Learning in Deep Weight Spaces
(ends 5:30 PM)
Orals 4:00-5:20
[4:00]
Nonparametric Extensions of Randomized Response for Private Confidence Sets
[4:08]
Differentially Private Hierarchical Clustering with Provable Approximation Guarantees
[4:16]
Tight Data Access Bounds for Private Top-$k$ Selection
[4:24]
JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift
[4:32]
Active Ranking of Experts Based on their Performances in Many Tasks
[4:40]
The Price of Differential Privacy under Continual Observation
[4:48]
HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption
[4:56]
Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting
[5:04]
Fast Private Kernel Density Estimation via Locality Sensitive Quantization
[5:12]
Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
(ends 5:30 PM)
Orals 4:00-5:20
[4:00]
Adversarial Policies Beat Superhuman Go AIs
[4:08]
Adapting to game trees in zero-sum imperfect information games
[4:16]
Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.
[4:24]
Delving into Noisy Label Detection with Clean Data
[4:32]
Robustly Learning a Single Neuron via Sharpness
[4:40]
Data Feedback Loops: Model-driven Amplification of Dataset Biases
[4:48]
Towards Reliable Neural Specifications
[4:56]
Do Perceptually Aligned Gradients Imply Robustness?
[5:04]
ODS: Test-Time Adaptation in the Presence of Open-World Data Shift
[5:12]
Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation
(ends 5:30 PM)
Orals 4:00-5:20
[4:00]
RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
[4:08]
Evaluating Self-Supervised Learning via Risk Decomposition
[4:16]
BEATs: Audio Pre-Training with Acoustic Tokenizers
[4:24]
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
[4:32]
Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions
[4:40]
TRAK: Attributing Model Behavior at Scale
[4:48]
Understanding Plasticity in Neural Networks
[4:56]
Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods
[5:04]
Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice
[5:12]
Brauer's Group Equivariant Neural Networks
(ends 5:30 PM)
5:45 p.m.
7 p.m.
THU 27 JUL
8 a.m.
(ends 5:00 PM)
8:30 a.m.
9:30 a.m.
Invited Talk: John Schulman
(ends 10:30 AM)
10 a.m.
10:30 a.m.
noon
1:30 p.m.
3 p.m.
Orals 3:00-4:20
[3:00]
Mimetic Initialization of Self-Attention Layers
[3:08]
Difference of submodular minimization via DC programming
[3:16]
Simplex Random Features
[3:24]
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
[3:32]
Tilted Sparse Additive Models
[3:40]
Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
[3:48]
Hyena Hierarchy: Towards Larger Convolutional Language Models
[3:56]
Direct Parameterization of Lipschitz-Bounded Deep Networks
[4:12]
Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
(ends 4:30 PM)
Orals 3:00-4:20
[3:00]
Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
[3:08]
Self-Interpretable Time Series Prediction with Counterfactual Explanations
[3:16]
Resurrecting Recurrent Neural Networks for Long Sequences
[3:24]
Inferring Relational Potentials in Interacting Systems
[3:32]
Memory-Based Dual Gaussian Processes for Sequential Learning
[3:40]
H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features
[3:48]
Generalized Teacher Forcing for Learning Chaotic Dynamics
[3:56]
Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients
[4:04]
Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere
[4:12]
Learning Control-Oriented Dynamical Structure from Data
(ends 4:30 PM)
Orals 3:00-4:20
[3:00]
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
[3:08]
Calibrating Multimodal Learning
[3:16]
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
[3:24]
ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
[3:32]
Cross-Modal Fine-Tuning: Align then Refine
[3:40]
Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models
[3:48]
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
[3:56]
Pre-training for Speech Translation: CTC Meets Optimal Transport
[4:04]
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
[4:12]
Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes
(ends 4:30 PM)
Orals 3:00-4:12
[3:00]
Second-Order Optimization with Lazy Hessians
[3:08]
Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions
[3:16]
Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization
[3:24]
Continuation Path Learning for Homotopy Optimization
[3:32]
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
[3:40]
Buying Information for Stochastic Optimization
[3:48]
A Fully First-Order Method for Stochastic Bilevel Optimization
[3:56]
Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
[4:04]
Learning-Rate-Free Learning by D-Adaptation
(ends 4:30 PM)
Orals 3:00-4:04
[3:00]
Learning Mixtures of Markov Chains and MDPs
[3:08]
Uncertain Evidence in Probabilistic Models and Stochastic Simulators
[3:16]
How Bad is Top-$K$ Recommendation under Competing Content Creators?
[3:24]
Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees
[3:32]
Equivariant Polynomials for Graph Neural Networks
[3:40]
Taming graph kernels with random features
[3:48]
Robust Budget Pacing with a Single Sample
[3:56]
Multicalibration as Boosting for Regression
(ends 4:30 PM)
4:30 p.m.
5:45 p.m.
FRI 28 JUL
7 a.m.
8 a.m.
(ends 4:00 PM)
8:50 a.m.
8:55 a.m.
9 a.m.
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
9:15 a.m.
10 a.m.
noon
3 p.m.
SAT 29 JUL
8 a.m.
(ends 11:00 AM)
8:50 a.m.
8:55 a.m.
9 a.m.
Workshop:
(ends 5:15 PM)
9:15 a.m.
10 a.m.
noon
3 p.m.