ICML 2023 Schedule

SUN 23 JUL

1 p.m.

Registration

(ends 8:00 PM)

2:30 p.m.

Lunch (On Your Own)

3:30 p.m.

Expo Talk Panel with Coffee & a Snack:

Artificial Generative Innovation – How AI will change how humans innovate

(ends 4:30 PM)

4:30 p.m.

Expo Talk Panel with Coffee & a Snack:

Leveraging Self-Attention Models for Logistics problems in Amazon

(ends 5:30 PM)

Expo Talk Panel:

Generative AI and Science

(ends 5:30 PM)

5 p.m.

Exhibit Hall Open

5:30 p.m.

Coffee Only Break 310, 320

6 p.m.

Expo Talk Panel:

Graph Neural Networks in TensorFlow: a Practical Guide

(ends 7:00 PM)

Expo Talk Panel:

Learning Iconic Scenes with Differential Privacy

(ends 7:00 PM)

7 p.m.

Coffee Break - Exhibit Hall 3

8 p.m.

Expo Talk Panel:

Vowpal Wabbit: year in review and looking ahead in an LLM world

(ends 9:00 PM)

Expo Talk Panel:

Colossal-AI: Breakthroughs in Efficient AI

(ends 9:00 PM)

MON 24 JUL

11 a.m.

Registration

(ends 10:00 PM)

11:30 a.m.

Panel:

ICML Education Outreach Panel

(ends 12:30 PM)

11:45 a.m.

Affinity Workshop:

LatinX in AI (LXAI) Workshop

(ends 8:00 PM)

12:30 p.m.

Tutorial:

Tutorial on Multimodal Machine Learning: Principles, Challenges, and Open Questions

(ends 3:00 PM)

Tutorial:

Reinforcement Learning from Human Feedback: A Tutorial *

(ends 3:00 PM)

Tutorial:

Optimal Transport in Learning, Control, and Dynamical Systems

(ends 3:00 PM)

1 p.m.

Exhibit Hall Open

1:30 p.m.

Coffee Break

3 p.m.

Lunch (On Your Own)

4:30 p.m.

Tutorial:

Self-Supervised Learning in Vision: from Research Advances to Best Practices

(ends 6:30 PM)

Tutorial:

Disinformation, Fake News and Computational Propaganda: Challenges and Opportunities for Machine Learning Research

(ends 6:30 PM)

Tutorial:

How to DP-fy ML: A Practical Tutorial to Machine Learning with Differential Privacy

(ends 6:30 PM)

6:30 p.m.

Coffee Break

7 p.m.

Tutorial:

Responsible AI for Generative AI in Practice: Lessons Learned and Open Challenges

(ends 9:00 PM)

Tutorial:

Recent Advances in the Generalization Theory of Neural Networks *

(ends 9:00 PM)

Tutorial:

Discovering Agent-Centric Latent States in Theory and in Practice

(ends 9:00 PM)

9:15 p.m.

Welcome Reception:

Welcome Reception

(ends 11:00 PM)

9:30 p.m.

EXPO Attendee Raffle Prize Give Away:

EXPO Attendee Raffle Prize Give Away

(ends 9:45 PM)

TUE 25 JUL

11 a.m.

Registration

(ends 9:00 PM)

noon

Opening Remarks:

Opening Remarks

(ends 12:15 PM)

12:15 p.m.

Invited Talk:

Taking the Pulse Of Ethical ML in Health

Marzyeh Ghassemi

(ends 1:30 PM)

1 p.m.

Exhibit Hall Open

1:30 p.m.

Coffee Break

2 p.m.

Poster Session 1 [2:00-3:30]

Posters 2:00-3:30

POUF: Prompt-Oriented Unsupervised Fine-tuning for Large Pre-trained Models

Faith-Shap: The Faithful Shapley Interaction Index

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization

Learning Compiler Pass Orders using Coreset and Normalized Value Prediction

TabLeak: Tabular Data Leakage in Federated Learning

Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories

Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language

Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection

Improving Fair Training under Correlation Shifts

Phase-aware Adversarial Defense for Improving Adversarial Robustness

Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting

Unlocking Slot Attention by Changing Optimal Transport Costs

Optimizing Hyperparameters with Conformal Quantile Regression

Sample and Predict Your Latent: Modality-free Sequential Disentanglement via Contrastive Estimation

Test-time Adaptation with Slot-Centric Models

Long-Tailed Recognition by Mutual Information Maximization between Latent Features and Ground-Truth Labels

Cooperative Open-ended Learning Framework for Zero-Shot Coordination

Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning

Eventual Discounting Temporal Logic Counterfactual Experience Replay

PAC Generalization via Invariant Representations

Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows

Few-Sample Feature Selection via Feature Manifold Learning

Provably Convergent Schrödinger Bridge with Applications to Probabilistic Time Series Imputation

Why does Throwing Away Data Improve Worst-Group Error?

TabDDPM: Modelling Tabular Data with Diffusion Models

A Closer Look at the Intervention Procedure of Concept Bottleneck Models

On Investigating the Conservative Property of Score-Based Generative Models

Estimating Causal Effects using a Multi-task Deep Ensemble

Differentiable and Transportable Structure Learning

Monotonic Location Attention for Length Generalization

Cramming: Training a Language Model on a single GPU in one day.

Federated Conformal Predictors for Distributed Uncertainty Quantification

From Noisy Fixed-Point Iterations to Private ADMM for Centralized and Federated Learning

Complexity of Block Coordinate Descent with Proximal Regularization and Applications to Wasserstein CP-dictionary Learning

A Model-Based Method for Minimizing CVaR and Beyond

Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients

Rotation and Translation Invariant Representation Learning with Implicit Neural Representations

Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

Adaptive Coordination in Social Embodied Rearrangement

Active Policy Improvement from Multiple Black-box Oracles

Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference

Robust Collaborative Learning with Linear Gradient Overhead

Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data

On Balancing Bias and Variance in Unsupervised Multi-Source-Free Domain Adaptation

Reprogramming Pretrained Language Models for Antibody Sequence Infilling

FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization

The Wisdom of Hindsight Makes Language Models Better Instruction Followers

Deep Regression Unlearning

Bit Allocation using Optimization

QuantumDARTS: Differentiable Quantum Architecture Search for Variational Quantum Algorithms

Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows

DRew: Dynamically Rewired Message Passing with Delay

Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

Efficient Online Reinforcement Learning with Offline Data

Infusing Lattice Symmetry Priors in Attention Mechanisms for Sample-Efficient Abstract Geometric Reasoning

Locally Regularized Neural Differential Equations: Some Black Boxes were meant to remain closed!

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

Second-order regression models exhibit progressive sharpening to the edge of stability

Graph Neural Networks with Learnable and Optimal Polynomial Bases

SE(3) diffusion model with application to protein backbone generation

Predicting Ordinary Differential Equations with Transformers

Universal Morphology Control via Contextual Modulation

Interactive Object Placement with Reinforcement Learning

GNOT: A General Neural Operator Transformer for Operator Learning

Local Vertex Colouring Graph Neural Networks

Transformers Meet Directed Graphs

NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion

Under-Counted Tensor Completion with Neural Incorporation of Attributes

Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees

Smart Initial Basis Selection for Linear Programs

Conditional Graph Information Bottleneck for Molecular Relational Learning

Discover and Cure: Concept-aware Mitigation of Spurious Correlation

Reconstructive Neuron Pruning for Backdoor Defense

ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation

Why do Nearest Neighbor Language Models Work?

Retrieval-Augmented Multimodal Language Modeling

Optimizing Mode Connectivity for Class Incremental Learning

Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

Variational Autoencoding Neural Operators

TAN Without a Burn: Scaling Laws of DP-SGD

A Coupled Flow Approach to Imitation Learning

On Kinetic Optimal Probability Paths for Generative Models

Bayesian Design Principles for Frequentist Sequential Learning

Speeding Up Bellman Ford via Minimum Violation Permutations

On User-Level Private Convex Optimization

Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback

FedAvg Converges to Zero Training Loss Linearly for Overparameterized Multi-Layer Neural Networks

Contrastive Learning Meets Homophily: Two Birds with One Stone

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

ODS: Test-Time Adaptation in the Presence of Open-World Data Shift

Fractional Denoising for 3D Molecular Pre-training

Self-Interpretable Time Series Prediction with Counterfactual Explanations

Prompting Large Language Model for Machine Translation: A Case Study

Uncertain Evidence in Probabilistic Models and Stochastic Simulators

Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting

Pricing Experimental Design: Causal Effect, Expected Revenue and Tail Risk

Tight and fast generalization error bound of graph embedding in metric space

Extrapolated Random Tree for Regression

Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes

Active causal structure learning with advice

Active Ranking of Experts Based on their Performances in Many Tasks

Sequential Changepoint Detection via Backward Confidence Sequences

Quantum Lower Bounds for Finding Stationary Points of Nonconvex Functions

The Monge Gap: A Regularizer to Learn All Transport Maps

Learning Distributions over Quantum Measurement Outcomes

Approximation Algorithms for Fair Range Clustering

Short-lived High-volume Bandits

Optimal randomized multilevel Monte Carlo for repeatedly nested expectations

Minimax estimation of discontinuous optimal transport maps: The semi-discrete case

Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function

Global Selection of Contrastive Batches via Optimization on Sample Permutations

Attention-Based Recurrence for Multi-Agent Reinforcement Learning under Stochastic Partial Observability

Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series

Topologically Faithful Image Segmentation via Induced Matching of Persistence Barcodes

MetricGAN-OKD: Multi-Metric Optimization of MetricGAN via Online Knowledge Distillation for Speech Enhancement

FedVS: Straggler-Resilient and Privacy-Preserving Vertical Federated Learning for Split Models

A Gromov-Wasserstein Geometric View of Spectrum-Preserving Graph Coarsening

Dataset Distillation with Convexified Implicit Gradients

Fast Federated Machine Unlearning with Nonlinear Functional Theory

Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models

Boosting Offline Reinforcement Learning with Action Preference Query

SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification

Uncertainty Estimation for Molecules: Desiderata and Methods

Reliable Measures of Spread in High Dimensional Latent Spaces

Causal Strategic Classification: A Tale of Two Shifts

Learning Preconditioners for Conjugate Gradient PDE Solvers

Flexible Phase Dynamics for Bio-Plausible Contrastive Learning

Learning Belief Representations for Partially Observable Deep RL

A Neural PDE Solver with Temporal Stencil Modeling

Towards Understanding Ensemble Distillation in Federated Learning

Improved Active Multi-Task Representation Learning via Lasso

Bag of Tricks for Training Data Extraction from Language Models

GFlowNet-EM for Learning Compositional Latent Variable Models

Optimizing DDPM Sampling with Shortcut Fine-Tuning

The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms

A Reinforcement Learning Framework for Dynamic Mediation Analysis

Additive Causal Bandits with Unknown Graph

VA-learning as a more efficient alternative to Q-learning

MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation

Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling

Emergence of Sparse Representations from Noise

The Benefits of Model-Based Generalization in Reinforcement Learning

K-SHAP: Policy Clustering Algorithm for Anonymous Multi-Agent State-Action Pairs

Inflow, Outflow, and Reciprocity in Machine Learning

Infinite Action Contextual Bandits with Reusable Data Exhaust

Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games

Improved Online Learning Algorithms for CTR Prediction in Ad Auctions

Bandits with Knapsacks: Advice on Time-Varying Demands

A Kernel-Based View of Language Model Fine-Tuning

Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies

Accuracy on the Curve: On the Nonlinear Correlation of ML Performance Between Data Subpopulations

Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels

Slot-VAE: Object-Centric Scene Generation with Slot Attention

CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification

Efficient Training of Language Models using Few-Shot Learning

Identifying Useful Learnwares for Heterogeneous Label Spaces

Adaptive Compositional Continual Meta-Learning

D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching

What Makes Entities Similar? A Similarity Flooding Perspective for Multi-sourced Knowledge Graph Embeddings

FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization

A theory of continuous generative flow networks

Neural signature kernels as infinite-width-depth-limits of controlled ResNets

Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference

A Robust Test for the Stationarity Assumption in Sequential Decision Making

FedCR: Personalized Federated Learning Based on Across-Client Common Representation with Conditional Mutual Information Regularization

Auxiliary Learning as an Asymmetric Bargaining Game

Retrosynthetic Planning with Dual Value Networks

Accelerated Cyclic Coordinate Dual Averaging with Extrapolation for Composite Convex Optimization

Generating Private Synthetic Data with Genetic Algorithms

AdaBoost is not an Optimal Weak to Strong Learner

Confidence and Dispersity Speak: Characterizing Prediction Matrix for Unsupervised Accuracy Estimation

Unsupervised Skill Discovery for Learning Shared Structures across Changing Environments

Learning Neural Constitutive Laws from Motion Observations for Generalizable PDE Dynamics

Learning Neural PDE Solvers with Parameter-Guided Channel Attention

SLAMB: Accelerated Large Batch Training with Sparse Communication

GuardHFL: Privacy Guardian for Heterogeneous Federated Learning

Continuation Path Learning for Homotopy Optimization

NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes

Margin-based Neural Network Watermarking

Sequential Underspecified Instrument Selection for Cause-Effect Estimation

JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift

Constrained Phi-Equilibria

Diffusion Models are Minimax Optimal Distribution Estimators

Statistical Indistinguishability of Learning Algorithms

Cluster Explanation via Polyhedral Descriptions

Geometric Clifford Algebra Networks

Multisample Flow Matching: Straightening Flows with Minibatch Couplings

Memory-Based Dual Gaussian Processes for Sequential Learning

Training-Free Neural Active Learning with Initialization-Robustness Guarantees

Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents

Correcting discount-factor mismatch in on-policy policy gradient methods

A Fast Optimistic Method for Monotone Variational Inequalities

Doubly Adversarial Federated Bandits

Demystifying Disagreement-on-the-Line in High Dimensions

ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval

Unifying Molecular and Textual Representations via Multi-task Language Modelling

Evidential Interactive Learning for Medical Image Captioning

Learning Noisy OR Bayesian Networks with Max-Product Belief Propagation

Hindsight Learning for MDPs with Exogenous Inputs

Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples

Adversarially Robust PAC Learnability of Real-Valued Functions

Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations

Truncating Trajectories in Monte Carlo Reinforcement Learning

Approximately Optimal Core Shapes for Tensor Decompositions

Data Structures for Density Estimation

Internet Explorer: Targeted Representation Learning on the Open Web

Learning Unforeseen Robustness from Out-of-distribution Data Using Equivariant Domain Translator

DRCFS: Doubly Robust Causal Feature Selection

Polarity Is All You Need to Learn and Transfer Faster

Homomorphism AutoEncoder --- Learning Group Structured Representations from Observed Transitions

RLang: A Declarative Language for Describing Partial World Knowledge to Reinforcement Learning Agents

Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation

Mitigating Propagation Failures in Physics-informed Neural Networks using Retain-Resample-Release (R3) Sampling

A Robust Optimisation Perspective on Counterexample-Guided Repair of Neural Networks

Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models

Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels

Do Machine Learning Models Learn Statistical Rules Inferred from Data?

Set-membership Belief State-based Reinforcement Learning for POMDPs

Masked Bayesian Neural Networks: Theoretical Guarantee and its Posterior Inference

Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards

Online Learning with Feedback Graphs: The True Shape of Regret

High Probability Convergence of Stochastic Gradient Methods

On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits

Sequence Modeling with Multiresolution Convolutional Memory

Dropout Reduces Underfitting

Cocktail Party Attack: Breaking Aggregation-Based Privacy in Federated Learning Using Independent Component Analysis

Uncovering Adversarial Risks of Test-Time Adaptation

Trapdoor Normalization with Irreversible Ownership Verification

Detecting Out-of-distribution Data through In-distribution Class Prior

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation

Unsupervised Out-of-Distribution Detection with Diffusion Inpainting

Markovian Gaussian Process Variational Autoencoders

On Heterogeneous Treatment Effects in Heterogeneous Causal Graphs

Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity

Large Language Models Struggle to Learn Long-Tail Knowledge

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

Global optimality for Euclidean CCCP under Riemannian convexity

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

On the Occupancy Measure of Non-Markovian Policies in Continuous MDPs

Achieving High Accuracy with PINNs via Energy Natural Gradient Descent

A Distribution Optimization Framework for Confidence Bounds of Risk Measures

A Kernel Stein Test of Goodness of Fit for Sequential Models

PCA-based Multi-Task Learning: a Random Matrix Approach

Neural Status Registers

Kernel Logistic Regression Approximation of an Understandable ReLU Neural Network

Free-Form Variational Inference for Gaussian Process State-Space Models

Sparse Learning of Dynamical Systems in RKHS: An Operator-Theoretic Approach

Rethinking Backdoor Attacks

Do Perceptually Aligned Gradients Imply Robustness?

Understanding Oversquashing in GNNs through the Lens of Effective Resistance

One-Shot Compression of Large Edge-Exchangeable Graphs using Bits-Back Coding

Solving High-Dimensional PDEs with Latent Spectral Models

Two-Scale Gradient Descent Ascent Dynamics Finds Mixed Nash Equilibria of Continuous Games: A Mean-Field Perspective

An Information-Theoretic Analysis of Nonstationary Bandit Learning

Learning Representations without Compositional Assumptions

Meta-learning Parameterized Skills

On the Convergence Rate of Gaussianization with Random Rotations

Intrinsic Sliced Wasserstein Distances for Comparing Collections of Probability Distributions on Manifolds and Graphs

Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets

TGRL: An Algorithm for Teacher Guided Reinforcement Learning

The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing

FeDXL: Provable Federated Learning for Deep X-Risk Optimization

Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization

Efficient List-Decodable Regression using Batches

Approximate Stein Classes for Truncated Density Estimation

Rethinking Weak Supervision in Helping Contrastive Learning

Spatial Implicit Neural Representations for Global-Scale Species Mapping

Interventional Causal Representation Learning

Denoising MCMC for Accelerating Diffusion-Based Generative Models

Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies

Poisoning Generative Replay in Continual Learning to Promote Forgetting

Continual Task Allocation in Meta-Policy Network via Sparse Prompting

Text-To-Concept (and Back) via Cross-Model Alignment

Crafting Training Degradation Distribution for the Accuracy-Generalization Trade-off in Real-World Super-Resolution

Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming

Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers

State and parameter learning with PARIS particle Gibbs

On Computing Optimal Tree Ensembles

Opponent-Limited Online Search for Imperfect Information Games

Robust Subtask Learning for Compositional Generalization

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning

Fully Dynamic Submodular Maximization over Matroids

One-sided Matrix Completion from Two Observations Per Row

The Saddle-Point Method in Differential Privacy

Geometric Autoencoders - What You See is What You Decode

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation

Continual Learning in Linear Classification on Separable Data

The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent

Multi-channel Autobidding with Budget and ROI Constraints

Raising the Cost of Malicious AI-Powered Image Editing

(ends 3:30 PM)

3:30 p.m.

Lunch (On Your Own)

5 p.m.

Poster Session 2 [5:00-6:30]

Posters 5:00-6:30

Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach

Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima

Weakly Supervised Disentangled Generative Causal Representation Learning

Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks

Non-asymptotic Properties of Individualized Treatment Rules from Sequentially Rule-Adaptive Trials

Data-Derived Weak Universal Consistency

On the Convergence Rates of Policy Gradient Methods

Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence

Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks

Constraint Reasoning Embedded Structured Prediction

Project and Forget: Solving Large-Scale Metric Constrained Problems

CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms

Mitigating the Effects of Non-Identifiability on Inference for Bayesian Neural Networks with Latent Variables

MALTS: Matching After Learning to Stretch

Exploiting locality in high-dimensional Factorial hidden Markov models

abess: A Fast Best-Subset Selection Library in Python and R

Adversarial Classification: Necessary Conditions and Geometric Flows

On Generalizations of Some Distance Based Classifiers for HDLSS Data

Selective Machine Learning of the Average Treatment Effect with an Invalid Instrumental Variable

XAI Beyond Classification: Interpretable Neural Clustering

Conditions and Assumptions for Constraint-based Causal Structure Learning

Flexible Model Aggregation for Quantile Regression

A General Theory for Federated Optimization with Asynchronous and Heterogeneous Clients Updates

Deep linear networks can benignly overfit when shallow ones do

Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization

Knowledge Hypergraph Embedding Meets Relational Algebra

A Likelihood Approach to Nonparametric Estimation of a Singular Distribution Using Deep Generative Models

Minimal Width for Universal Property of Deep RNN

Learning Optimal Group-structured Individualized Treatment Rules with Many Treatments

Towards Learning to Imitate from a Single Video Demonstration

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

The multimarginal optimal transport formulation of adversarial multiclass classification

Sampling random graph homomorphisms and applications to network data analysis

Cluster-Specific Predictions with Multi-Task Gaussian Processes

CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations

Lifelong Language Pretraining with Distribution-Specialized Experts

PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search

Random Shuffle Transformer for Image Restoration

Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection

Investigating the Role of Model-Based Learning in Exploration and Transfer

Tractable Control for Autoregressive Language Generation

Scaling of Class-wise Training Losses for Post-hoc Calibration

The Benefits of Mixup for Feature Learning

SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning

Probabilistic Attention-to-Influence Neural Models for Event Sequences

Conditional Tree Matching for Inference-Time Adaptation of Tree Prediction Models

Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation

Disentangled Multi-Fidelity Deep Bayesian Active Learning

Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations

Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least

Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning

Transformers as Algorithms: Generalization and Stability in In-context Learning

Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features

Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach

Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs

Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation

Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup

SOM-CPC: Unsupervised Contrastive Learning with Self-Organizing Maps for Structured Representations of High-Rate Time Series

Towards Trustworthy Explanation: On Causal Rationalization

Wasserstein Barycenter Matching for Graph Size Generalization of Message Passing Neural Networks

Neural Algorithmic Reasoning with Causal Regularisation

Beam Tree Recursive Cells

Bigger, Better, Faster: Human-level Atari with human-level efficiency

How to Trust Your Diffusion Model: A Convex Optimization Approach to Conformal Risk Control

Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments

Constrained Efficient Global Optimization of Expensive Black-box Functions

Learning-Rate-Free Learning by D-Adaptation

Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems

DUET: 2D Structured and Approximately Equivariant Representations

Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection

PaLM-E: An Embodied Multimodal Language Model

Language Instructed Reinforcement Learning for Human-AI Coordination

ContraBAR: Contrastive Bayes-Adaptive Deep RL

Flash: Concept Drift Adaptation in Federated Learning

Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees

Perturbation Analysis of Neural Collapse

Towards Robust Graph Incremental Learning on Evolving Graphs

Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation

Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication

Interpretable Neural-Symbolic Concept Reasoning

Adversarial Collaborative Learning on Non-IID Features

Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming

Gradient-based Wang-Landau Algorithm: A Novel Sampler for Output Distribution of Neural Networks over the Input Space

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC

Dual Propagation: Accelerating Contrastive Hebbian Learning with Dyadic Neurons

SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning

A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs

Ewald-based Long-Range Message Passing for Molecular Graphs

A Theoretical Analysis of the Learning Dynamics under Class Imbalance

How much does Initialization Affect Generalization?

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond

Towards Reliable Neural Specifications

FAENet: Frame Averaging Equivariant GNN for Materials Modeling

Feature Programming for Multivariate Time Series Prediction

Neural Latent Aligner: Cross-trial Alignment for Learning Representations of Complex, Naturalistic Neural Data

Rethinking Visual Reconstruction: Experience-Based Content Completion Guided by Visual Cues

LazyGNN: Large-Scale Graph Neural Networks via Lazy Propagation

Graph Positional Encoding via Random Feature Propagation

Structure-informed Language Models Are Protein Designers

Robust Camera Pose Refinement for Multi-Resolution Hash Encoding

Causal Structure Learning for Latent Intervened Non-stationary Data

Robust Situational Reinforcement Learning in Face of Context Disturbances

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing

Non-stationary Reinforcement Learning under General Function Approximation

Iterative Approximate Cross-Validation

A Critical View of Vision-Based Long-Term Dynamics Prediction Under Environment Misalignment

Data Poisoning Attacks Against Multimodal Encoders

Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples

Multi-View Masked World Models for Visual Robotic Manipulation

Exploring the Benefits of Training Expert Language Models over Instruction Tuning

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

Robust Weak Supervision with Variational Auto-Encoders

Bidirectional Learning for Offline Model-based Biological Sequence Design

Fast Inference from Transformers via Speculative Decoding

Action Matching: Learning Stochastic Dynamics from Samples

Private Federated Learning with Autotuned Compression

Random Matrix Analysis to Balance between Supervised and Unsupervised Learning under the Low Density Separation Assumption

A theory of representation learning gives a deep generalisation of kernel methods

Optimistic Planning by Regularized Dynamic Programming

Fast Combinatorial Algorithms for Min Max Correlation Clustering

Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.

Global Optimization with Parametric Function Approximation

Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks

Surrogate Module Learning: Reduce the Gradient Error Accumulation in Training Spiking Neural Networks

One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

Which is Better for Learning with Noisy Labels: The Semi-supervised Method or Modeling Label Noise?

Active Learning based Structural Inference

Superhuman Fairness

PWSHAP: A Path-Wise Explanation Model for Targeted Variables

Covariate balancing using the integral probability metric for causal inference

B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding

Collaborative Causal Inference with Fair Incentives

WL meet VC

Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for Multi-Block Bilevel Optimization

Refined Regret for Adversarial MDPs with Linear Function Approximation

User-level Private Stochastic Convex Optimization with Optimal Rates

Delayed Bandits: When Do Intermediate Observations Help?

Bandit Online Linear Optimization with Hints and Queries

Dimensionality Reduction for General KDE Mode Finding

The Power of Uniform Sampling for k-Median

Monge, Bregman and Occam: Interpretable Optimal Transport in High-Dimensions with Feature-Sparse Maps

Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization

Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance

Reward-Mixing MDPs with Few Latent Contexts are Learnable

Near-Optimal Quantum Coreset Construction Algorithms for Clustering

The Fast Johnson-Lindenstrauss Transform Is Even Faster

Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation

Towards Theoretical Understanding of Inverse Reinforcement Learning

Nearly Optimal Competitive Ratio for Online Allocation Problems with Two-sided Resource Constraints and Finite Requests

Lowering the Pre-training Tax for Gradient-based Subset Training: A Lightweight Distributed Pre-Training Toolkit

Learning Temporally Abstract World Models without Online Experimentation

Learning to Decouple Complex Systems

Regularization-free Diffeomorphic Temporal Alignment Nets

Deep Perturbation Learning: Enhancing the Network Performance via Image Perturbations

Optimizing the Collaboration Structure in Cross-Silo Federated Learning

InfoOT: Information Maximizing Optimal Transport

Stabilizing Transformer Training by Preventing Attention Entropy Collapse

Diffusion Models for Black-Box Optimization

Learning Mixtures of Markov Chains and MDPs

Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction

Importance Weighted Expectation-Maximization for Protein Sequence Design

Offline Meta Reinforcement Learning with In-Distribution Online Adaptation

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization

Data-Copying in Generative Models: A Formal Framework

Learning to Suggest Breaks: Sustainable Optimization of Long-Term User Engagement

The Persistent Laplacian for Data Science: Evaluating Higher-Order Persistent Spectral Representations of Data

MG-GNN: Multigrid Graph Neural Networks for Learning Multilevel Domain Decomposition Methods

Harmonic Neural Networks

Representation-Driven Reinforcement Learning

Directed Chain Generative Adversarial Networks

Weakly Supervised Regression with Interval Targets

Weighted Sampling without Replacement for Deep Top-$k$ Classification

Are Diffusion Models Vulnerable to Membership Inference Attacks?

Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

Quantile Credit Assignment

Statistical Inference on Multi-armed Bandits with Delayed Feedback

Multi-User Reinforcement Learning with Low Rank Rewards

Model-based Offline Reinforcement Learning with Count-based Conservatism

TIDE: Time Derivative Diffusion for Deep Learning on Graphs

A Picture of the Space of Typical Learnable Tasks

Analyzing Diffusion as Serial Reproduction

Formalizing Preferences Over Runtime Distributions

Principled Offline RL in the Presence of Rich Exogenous Information

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

Inverse Reinforcement Learning without Reinforcement Learning

Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization

Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model

How Bad is Top-$K$ Recommendation under Competing Content Creators?

Task-Specific Skill Localization in Fine-tuned Language Models

What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective

A Three-regime Model of Network Pruning

Stochastic Gradient Descent-Induced Drift of Representation in a Two-Layer Neural Network

SNeRL: Semantic-aware Neural Radiance Fields for Reinforcement Learning

Uncertainty Estimation by Fisher Information-based Evidential Deep Learning

Linkless Link Prediction via Relational Distillation

Controlled Text Generation with Natural Language Instructions

Efficient Personalized Federated Learning via Sparse Model-Adaptation

From Relational Pooling to Subgraph GNNs: A Universal Framework for More Expressive Graph Neural Networks

On the Generalization of Multi-modal Contrastive Learning

On Pitfalls of Test-Time Adaptation

Counterfactual Identifiability of Bijective Causal Models

Multi-Layer Neural Networks as Trainable Ladders of Hilbert Spaces

Taming graph kernels with random features

SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance

An Instrumental Variable Approach to Confounded Off-Policy Evaluation

Fair yet Asymptotically Equal Collaborative Learning

Fair and Accurate Decision Making through Group-Aware Learning

Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity

Matrix Estimation for Individual Fairness

Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA

On the Impact of Knowledge Distillation for Model Interpretability

Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal

ClimaX: A foundation model for weather and climate

Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory

No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation

Symmetry-Aware Robot Design with Structured Subgroups

Traversing Between Modes in Function Space for Fast Ensembling

Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer

Sequential Kernelized Independence Testing

Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints

On Excess Mass Behavior in Gaussian Mixture Models with Orlicz-Wasserstein Distances

Label differential privacy and private training data release

Moccasin: Efficient Tensor Rematerialization for Neural Networks

Rigid Body Flows for Sampling Molecular Crystal Structures

The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning

Training Normalizing Flows from Dependent Data

Bayesian online change point detection with Hilbert space approximate Student-t process

Input uncertainty propagation through trained neural networks

Principled Acceleration of Iterative Numerical Methods Using Machine Learning

Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits

A Two-Stage Active Learning Algorithm for k-Nearest Neighbors

Semi-Dual Unbalanced Quadratic Optimal Transport: fast statistical rates and convergent algorithm.

Contextual Conservative Interleaving Bandits

Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations

Efficient and Equivariant Graph Networks for Predicting Quantum Hamiltonian

Graph Switching Dynamical Systems

Probabilistic Concept Bottleneck Models

Function-Space Regularization in Neural Networks: A Probabilistic Perspective

Parallel Online Clustering of Bandits via Hedonic Game

Towards Constituting Mathematical Structures for Learning to Optimize

Tighter Information-Theoretic Generalization Bounds from Supersamples

Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs

CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs

Causal Bounds in Quasi-Markovian Graphs

Generalized Reductions: Making any Hierarchical Clustering Fair and Balanced with Low Cost

SpotEM: Efficient Video Search for Episodic Memory

Provably Invariant Learning without Domain Information

Evaluating Unsupervised Denoising Requires Unsupervised Metrics

Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning

Identifying Interpretable Subspaces in Image Representations

Causal Proxy Models for Concept-based Model Explanations

Neural Stochastic Differential Games for Time-series Analysis

Implicit Neural Spatial Representations for Time-dependent PDEs

FARE: Provably Fair Representation Learning with Practical Certificates

Improving Visual Prompt Tuning for Self-supervised Vision Transformers

On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline

A Closer Look at Few-shot Classification Again

Bilevel Optimization with Coupled Decision-Dependent Distributions

Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input

Federated Linear Contextual Bandits with User-level Differential Privacy

Unconstrained Online Learning with Unbounded Losses

Doubly Optimal No-Regret Learning in Monotone Games

Brainformers: Trading Simplicity for Efficiency

A Modern Look at the Relationship between Sharpness and Generalization

SRATTA: Sample Re-ATTribution Attack of Secure Aggregation in Federated Learning.

Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks

On Strengthening and Defending Graph Reconstruction Attack with Markov Chain Approximation

Progressive Purification for Instance-Dependent Partial Label Learning

LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework

Multi-Objective GFlowNets

Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling

Long-Term Rhythmic Video Soundtracker

Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift

Neural Diffusion Processes

Estimating Joint Treatment Effects by Combining Multiple Experiments

Generalized Implicit Follow-The-Regularized-Leader

PAL: Program-aided Language Models

Looped Transformers as Programmable Computers

Is Learning Summary Statistics Necessary for Likelihood-free Inference?

The Computational Complexity of Concise Hypersphere Classification

Partial Optimality in Cubic Correlation Clustering

Accelerated Stochastic Optimization Methods under Quasar-convexity

Off-Policy Average Reward Actor-Critic with Deterministic Policy Search

Stochastic Gradient Descent under Markovian Sampling Schemes

Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions

Estimation Beyond Data Reweighting: Kernel Method of Moments

spred: Solving L1 Penalty with SGD

Graph Neural Tangent Kernel: Convergence on Large Graphs

Functional Neural Networks: Shift invariant models for functional data with applications to EEG classification

Compositional Score Modeling for Simulation-Based Inference

Unearthing InSights into Mars: Unsupervised Source Separation with Limited Data

Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems

Performative Recommendation: Diversifying Content via Strategic Incentives

"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts

Block Subsampled Randomized Hadamard Transform for Nyström Approximation on Distributed Architectures

Latent Traversals in Generative Models as Potential Flows

Competitive Gradient Optimization

The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation

Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation

Human-Timescale Adaptation in an Open-Ended Task Space

Expected Gradients of Maxout Networks and Consequences to Parameter Initialization

A Category-theoretical Meta-analysis of Definitions of Disentanglement

On Preemption and Learning in Stochastic Scheduling

Adversarial Cheap Talk

Towards Stable and Efficient Adversarial Training against $l_1$ Bounded Adversarial Attacks

Improving the Model Consistency of Decentralized Federated Learning

One-Step Estimator for Permuted Sparse Recovery

Improving l1-Certified Robustness via Randomized Smoothing by Leveraging Box Constraints

Neural FIM for learning Fisher information metrics from point cloud data

Learn to Accumulate Evidence from All Training Samples: Theory and Practice

IncDSI: Incrementally Updatable Document Retrieval

Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure

The case for 4-bit precision: k-bit Inference Scaling Laws

A Toy Model of Universality: Reverse Engineering how Networks Learn Group Operations

Generated Graph Detection

LIV: Language-Image Representations and Rewards for Robotic Control

PromptBoosting: Black-Box Text Classification with Ten Forward Passes

Learning Controllable Degradation for Real-World Super-Resolution via Constrained Flows

PASTA: Pessimistic Assortment Optimization

Which Tricks are Important for Learning to Rank?

Neural Wasserstein Gradient Flows for Discrepancies with Riesz Kernels

Multicalibration as Boosting for Regression

Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation

On the Correctness of Automatic Differentiation for Neural Networks with Machine-Representable Parameters

Posterior Sampling for Deep Reinforcement Learning

Multi-Agent Best Arm Identification with Private Communications

A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel

Multi-agent Online Scheduling: MMS Allocations for Indivisible Items

Double-Weighting for Covariate Shift Adaptation

Online Local Differential Private Quantile Inference via Self-normalization

Dink-Net: Neural Clustering on Large Graphs

MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior

Fast Excess Risk Rates via Offset Rademacher Complexity

Settling the Reward Hypothesis

Transcendental Idealism of Planner: Evaluating Perception from Planning Perspective for Autonomous Driving

Algorithms for bounding contribution for histogram estimation under user-level privacy

(ends 6:30 PM)

6:30 p.m.

Coffee Only Break

7 p.m.

Invited Talk:

Machine Learning with Social Purpose

Shakir Mohamed

(ends 8:00 PM)

8 p.m.

Coffee Break

8:30 p.m.

Oral A1 Causal Learning, RL, Personalization [8:30-10:00]

Orals 8:30-9:50

[8:30] Bayesian Design Principles for Frequentist Sequential Learning

[8:38] Towards Theoretical Understanding of Inverse Reinforcement Learning

[8:46] On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness

[8:54] Delayed Feedback in Kernel Bandits

[9:02] Provably Learning Object-Centric Representations

[9:10] Task-specific experimental design for treatment effect estimation

[9:18] Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.

[9:26] Interventional Causal Representation Learning

[9:34] Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

[9:42] Sequential Underspecified Instrument Selection for Cause-Effect Estimation

(ends 10:00 PM)

Oral A2 Computer Vision and Efficient ML [8:30-10:00]

Orals 8:30-9:50

[8:30] Raising the Cost of Malicious AI-Powered Image Editing

[8:38] Dynamics-inspired Neuromorphic Visual Representation Learning

[8:46] Scaling Vision Transformers to 22 Billion Parameters

[8:54] Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction

[9:02] Fourmer: An Efficient Global Modeling Paradigm for Image Restoration

[9:10] Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping

[9:18] Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

[9:26] Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch

[9:34] SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge

[9:42] Fast Inference from Transformers via Speculative Decoding

(ends 10:00 PM)

Oral A3 ML Theory [8:30-10:00]

Orals 8:30-9:58

[8:30] Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains

[8:38] Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

[8:46] Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression

[8:54] Tighter Information-Theoretic Generalization Bounds from Supersamples

[9:02] Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels

[9:10] Bayes-optimal Learning of Deep Random Networks of Extensive-width

[9:18] Why does Throwing Away Data Improve Worst-Group Error?

[9:26] Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations

[9:34] Sharper Bounds for $\ell_p$ Sensitivity Sampling

[9:42] AdaBoost is not an Optimal Weak to Strong Learner

[9:50] Generalization on the Unseen, Logic Reasoning and Degree Curriculum

(ends 10:00 PM)

Oral A4 Diffusion [8:30-10:00]

Orals 8:30-9:50

[8:30] AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

[8:38] Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples

[8:46] Graphically Structured Diffusion Models

[8:54] Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?

[9:02] Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models

[9:10] Diffusion Models are Minimax Optimal Distribution Estimators

[9:18] GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

[9:26] OCD: Learning to Overfit with Conditional Diffusion Models

[9:34] Denoising MCMC for Accelerating Diffusion-Based Generative Models

[9:42] Cones: Concept Neurons in Diffusion Models for Customized Generation

(ends 10:00 PM)

Oral A5 Reinforcement Learning 1 [8:30-10:00]

Orals 8:30-9:50

[8:30] Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark

[8:38] Information-Theoretic State Space Model for Multi-View Reinforcement Learning

[8:46] Reparameterized Policy Learning for Multimodal Trajectory Optimization

[8:54] Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL

[9:02] Subequivariant Graph Reinforcement Learning in 3D Environments

[9:10] A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs

[9:18] Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap

[9:26] Efficient RL via Disentangled Environment and Agent Representations

[9:34] Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning

[9:42] On the Statistical Benefits of Temporal Difference Learning

(ends 10:00 PM)

Oral A6 Reinforcement Learning 2 [8:30-10:00]

Orals 8:30-9:50

[8:30] Learning GFlowNets From Partial Episodes For Improved Convergence And Stability

[8:38] The Dormant Neuron Phenomenon in Deep Reinforcement Learning

[8:46] Reinforcement Learning from Passive Data via Latent Intentions

[8:54] Best of Both Worlds Policy Optimization

[9:02] Exponential Smoothing for Off-Policy Learning

[9:10] Quantile Credit Assignment

[9:18] Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels

[9:26] Hierarchies of Reward Machines

[9:34] Human-Timescale Adaptation in an Open-Ended Task Space

[9:42] Settling the Reward Hypothesis

(ends 10:00 PM)

WED 26 JUL

11 a.m.

Registration

(ends 9:00 PM)

12:30 p.m.

Invited Talk:

The Future of ML in Biology: CRISPR for Health and Climate

Jennifer Doudna

(ends 1:30 PM)

1 p.m.

Exhibit Hall Open

1:30 p.m.

Coffee Break

2 p.m.

Poster Session 3 [2:00-3:30]

Posters 2:00-3:30

On Uni-Modal Feature Learning in Supervised Multi-Modal Learning

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

ChiPFormer: Transferable Chip Placement via Offline Decision Transformer

Machine Learning Force Fields with Data Cost Aware Training

Out-of-Domain Robustness via Targeted Augmentations

Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning

PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation

Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions

FAIRER: Fairness as Decision Rationale Alignment

ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction

Entropy-driven Unsupervised Keypoint Representation Learning in Videos

CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks

Optimization for Amortized Inverse Problems

Deep Latent State Space Models for Time-Series Generation

HyperTuning: Toward Adapting Large Language Models without Back-propagation

When does Privileged information Explain Away Label Noise?

Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning

Differentiable Multi-Target Causal Bayesian Experimental Design

From Temporal to Contemporaneous Iterative Causal Discovery in the Presence of Latent Confounders

Reinforcement Learning in Low-rank MDPs with Density Features

When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis

Temporal Label Smoothing for Early Event Prediction

Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

Extrapolative Controlled Sequence Generation via Iterative Refinement

Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction

Hyperbolic Representation Learning: Revisiting and Advancing

COLA: Orchestrating Error Coding and Learning for Robust Neural Network Inference Against Hardware Defects

Learning Prescriptive ReLU Networks

Representer Point Selection for Explaining Regularized High-dimensional Models

Parallel Neurosymbolic Integration with Concordia

SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry

Pareto Regret Analyses in Multi-objective Multi-armed Bandit

Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching

Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy

Sliced-Wasserstein on Symmetric Positive Definite Matrices for M/EEG Signals

Orthogonality-Enforced Latent Space in Autoencoders: An Approach to Learning Disentangled Representations

SinDDM: A Single Image Denoising Diffusion Model

Distilling Internet-Scale Vision-Language Models into Embodied Agents

Emergent Agentic Transformer from Chain of Hindsight Experience

Bootstrapped Representations in Reinforcement Learning

PPG Reloaded: An Empirical Study on What Matters in Phasic Policy Gradient

An SDE for Modeling SAM: Theory and Insights

Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels

MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks

FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning

Are Large Kernels Better Teachers than Transformers for ConvNets?

Neural Prediction Errors enable Analogical Visual Reasoning in Human Standard Intelligence Tests

Test-Time Style Shifting: Handling Arbitrary Styles in Domain Generalization

CrossSplit: Mitigating Label Noise Memorization through Data Splitting

Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning

Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network

On Sampling with Approximate Transport Maps

Differentiable Simulations for Enhanced Sampling of Rare Events

Scaling Spherical CNNs

Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing

Can Large Language Models Reason about Program Invariants?

A Unifying Framework to the Analysis of Interaction Methods using Synergy Functions

Scaling Vision Transformers to 22 Billion Parameters

On Regularization and Inference with Label Constraints

On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon

Efficient Approximations of Complete Interatomic Potentials for Crystal Property Prediction

End-to-end Differentiable Clustering with Associative Memories

Fourmer: An Efficient Global Modeling Paradigm for Image Restoration

HOPE: High-order Graph ODE For Modeling Interacting Dynamics

SlotGAT: Slot-based Message Passing for Heterogeneous Graphs

Graph Inductive Biases in Transformers without Message Passing

Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation

Autoregressive Diffusion Model for Graph Generation

Optimal Arms Identification with Knapsacks

Adaptive IMLE for Few-shot Pretraining-free Generative Modelling

Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching

Demystifying Uneven Vulnerability of Link Stealing Attacks against Graph Neural Networks

Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models

MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Poses

Muse: Text-To-Image Generation via Masked Generative Transformers

A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

VectorMapNet: End-to-end Vectorized HD Map Learning

A Statistical Perspective on Retrieval-Based Models

Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL

Actor-Critic Alignment for Offline-to-Online Reinforcement Learning

Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute

Universal Physics-Informed Neural Networks: Symbolic Differential Operator Discovery with Sparse Data

PAC-Bayesian Offline Contextual Bandits With Guarantees

Exponential Smoothing for Off-Policy Learning

A Kernelized Stein Discrepancy for Biological Sequences

Sequential Counterfactual Risk Minimization

Optimal LP Rounding and Linear-Time Approximation Algorithms for Clustering Edge-Colored Hypergraphs

Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models

Improved Online Conformal Prediction via Strongly Adaptive Online Learning

A Study on Transformer Configuration and Training Objective

Learning to Boost Training by Periodic Nowcasting Near Future Weights

Evolving Semantic Prototype Improves Generative Zero-Shot Learning

A Universal Unbiased Method for Classification from Aggregate Observations

LipsNet: A Smooth and Robust Neural Network with Adaptive Lipschitz Constant for High Accuracy Optimal Control

BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning

On the Effectiveness of Offline RL for Dialogue Response Generation

Generalized Disparate Impact for Configurable Fairness Solutions in ML

Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits

Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten

On the Connection Between MPNN and Graph Transformer

A Fully First-Order Method for Stochastic Bilevel Optimization

Horizon-free Learning for Markov Decision Processes and Games: Stochastically Bounded Rewards and Improved Bounds

Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits

Concurrent Shuffle Differential Privacy Under Continual Observation

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

Constant Matters: Fine-grained Error Bound on Differentially Private Continual Observation

Projected Tensor Power Method for Hypergraph Community Recovery

LegendreTron: Uprising Proper Multiclass Loss Learning

A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee

Adapting to game trees in zero-sum imperfect information games

Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

Tight Regret Bounds for Single-pass Streaming Multi-armed Bandits

Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

Fast Rates for Maximum Entropy Exploration

Achieving Linear Speedup in Non-IID Federated Bilevel Learning

Measuring the Impact of Programming Language Distribution

Hierarchical Imitation Learning with Vector Quantized Models

CataBEEM: Integrating Latent Interaction Categories in Node-wise Community Detection Models for Network Data

Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics

SAAL: Sharpness-Aware Active Learning

Structural Re-weighting Improves Graph Domain Adaptation

Total Variation Graph Neural Networks

STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition

Distributional Offline Policy Evaluation with Predictive Error Guarantees

Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Go Beyond Imagination: Maximizing Episodic Reachability with World Models

Multi-task Hierarchical Adversarial Inverse Reinforcement Learning

In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation

On Provable Copyright Protection for Generative Models

Attributing Image Generative Models using Latent Fingerprints

On the Robustness of Text Vectorizers

Neural Inverse Operators for Solving PDE Inverse Problems

Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally Coupled Oscillatory Recurrent Neural Networks

Graph Reinforcement Learning for Network Control via Bi-Level Optimization

Image Restoration with Mean-Reverting Stochastic Differential Equations

Variance Control for Distributional Reinforcement Learning

Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value

Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs

Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs

Policy Gradient in Robust MDPs with Global Convergence Guarantee

Stable Estimation of Heterogeneous Treatment Effects

Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling

Fair and Optimal Classification via Post-Processing

Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP

End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization

A Generalization of ViT/MLP-Mixer to Graphs

Model-agnostic Measure of Generalization Difficulty

The Acquisition of Physical Knowledge in Generative Neural Networks

PAC Prediction Sets for Large Language Models of Code

Deep Laplacian-based Options for Temporally-Extended Exploration

Learning-augmented private algorithms for multiple quantile release

Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization

Best Arm Identification in Multi-Agent Multi-Armed Bandits

Probably Anytime-Safe Stochastic Combinatorial Semi-Bandits

Who Needs to Know? Minimal Knowledge for Optimal Coordination

Mimetic Initialization of Self-Attention Layers

Data Representations' Study of Latent Image Manifolds

On Over-Squashing in Message Passing Neural Networks: The Impact of Width, Depth, and Topology

General Sequential Episodic Memory Model

Shape-Guided Dual-Memory Learning for 3D Anomaly Detection

Calibrating Multimodal Learning

DualHSIC: HSIC-Bottleneck and Alignment for Continual Learning

ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts

Anchor Sampling for Federated Learning with Partial Client Participation

A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests

Towards Deep Attention in Graph Neural Networks: Problems and Remedies

On the Impact of Algorithmic Recourse on Social Segregation

Memory-Based Meta-Learning on Non-Stationary Distributions

Polynomial Preconditioning for Gradient Methods

Delay-agnostic Asynchronous Coordinate Update Algorithm

Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time

Parameter-Level Soft-Masking for Continual Learning

Low Complexity Homeomorphic Projection to Ensure Neural-Network Solution Feasibility for Optimization over (Non-)Convex Set

On the Within-Group Fairness of Screening Classifiers

Provable Benefit of Mixup for Finding Optimal Decision Boundaries

Large Language Models Can Be Easily Distracted by Irrelevant Context

Resurrecting Recurrent Neural Networks for Long Sequences

NUNO: A General Framework for Learning Parametric PDEs with Non-Uniform Data

Differentially Private Optimization on Large Model at Small Cost

SpeedDETR: Speed-aware Transformers for End-to-end Object Detection

NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition

Less is More: Task-aware Layer-wise Distillation for Language Model Compression

Understanding Backdoor Attacks through the Adaptability Hypothesis

Proximal Causal Learning of Conditional Average Treatment Effects

Fast Private Kernel Density Estimation via Locality Sensitive Quantization

Approximate Causal Effect Identification under Weak Confounding

Regression with Label Permutation in Generalized Linear Model

Multi-Task Differential Privacy Under Distribution Skew

Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points

Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion

Towards Understanding and Improving GFlowNet Training

Certifying Ensembles: A General Certification Theory with S-Lipschitzness

Using Perturbation to Improve Goodness-of-Fit Tests based on Kernelized Stein Discrepancy

Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics

Variational Sparse Inverse Cholesky Approximation for Latent Gaussian Processes via Double Kullback-Leibler Minimization

BNN-DP: Robustness Certification of Bayesian Neural Networks via Dynamic Programming

EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression

Existence and Estimation of Critical Batch Size for Training Generative Adversarial Networks with Two Time-Scale Update Rule

Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts

StriderNet: A Graph Reinforcement Learning Approach to Optimize Atomic Structures on Rough Energy Landscapes

On the Functional Similarity of Robust and Non-Robust Neural Representations

Aligning Language Models with Preferences through $f$-divergence Minimization

On the Convergence of Federated Averaging with Cyclic Client Participation

Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias

Robustly Learning a Single Neuron via Sharpness

Online Restless Bandits with Unobserved States

Beyond Reward: Offline Preference-guided Policy Optimization

Streaming Submodular Maximization with Differential Privacy

Atari-5: Distilling the Arcade Learning Environment down to Five Games

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

An Effective Meaningful Way to Evaluate Survival Models

Efficient RL via Disentangled Environment and Agent Representations

Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic

Towards Understanding Generalization of Graph Neural Networks

Learning Physical Models that Can Respect Conservation Laws

Are Random Decompositions all we need in High Dimensional Bayesian Optimisation?

Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN

Curious Replay for Model-based Adaptation

Temporally Consistent Transformers for Video Generation

Adaptive Estimation of Graphical Models under Total Positivity

Nearly-tight Bounds for Deep Kernel Learning

Best of Both Worlds Policy Optimization

Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors

CRISP: Curriculum based Sequential neural decoders for Polar code family

Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

Scalable Adaptive Computation for Iterative Generation

On the Training Instability of Shuffling SGD with Batch Normalization

Reasons for the Superiority of Stochastic Estimators over Deterministic Ones: Robustness, Consistency and Perceptual Quality

On the Robustness of Randomized Ensembles to Adversarial Perturbations

Exploring Model Dynamics for Accumulative Poisoning Discovery

Adversarial Learning of Distributional Reinforcement Learning

Understanding the Complexity Gains of Single-Task RL with a Curriculum

Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

Understanding and Generalizing Contrastive Learning from the Inverse Optimal Transport Perspective

Constrained Causal Bayesian Optimization

Comparison of meta-learners for estimating multi-valued treatment heterogeneous effects

LinSATNet: The Positive Linear Satisfiability Neural Networks

Specializing Smaller Language Models towards Multi-Step Reasoning

Not all Strongly Rayleigh Distributions Have Small Probabilistic Generating Circuits

Tighter Bounds on the Expressivity of Transformer Encoders

Equivariant Architectures for Learning in Deep Weight Spaces

H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features

Fairness in Streaming Submodular Maximization over a Matroid Constraint

Difference of submodular minimization via DC programming

Scalable Safe Policy Improvement via Monte Carlo Tree Search

Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-type Samplers

Enabling First-Order Gradient-Based Learning for Equilibrium Computation in Markets

Semiparametrically Efficient Off-Policy Evaluation in Linear Markov Decision Processes

Sequential Predictive Conformal Inference for Time Series

Shedding a PAC-Bayesian Light on Adaptive Sliced-Wasserstein Distances

Generative Adversarial Symmetry Discovery

Efficient Transformed Gaussian Processes for Non-Stationary Dependent Multi-class Classification

OCD: Learning to Overfit with Conditional Diffusion Models

Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models

Conformalization of Sparse Generalized Linear Models

Protecting Language Generation Models via Invisible Watermarking

The Dormant Neuron Phenomenon in Deep Reinforcement Learning

Model Transferability with Responsive Decision Subjects

Nearly-Optimal Hierarchical Clustering for Well-Clustered Graphs

FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation

Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction

Lookahead When It Matters: Adaptive Non-causal Transformers for Streaming Neural Transducers

PAC-Bayesian Generalization Bounds for Adversarial Generative Models

How to address monotonicity for model risk management?

Causal Modeling of Policy Interventions From Treatment–Outcome Sequences

Towards Robust and Safe Reinforcement Learning with Benign Off-policy Data

Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling

Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization

On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU Network

Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction

Image generation with shortest path diffusion

On the Importance of Feature Decorrelation for Unsupervised Representation Learning in Reinforcement Learning

LESS-VFL: Communication-Efficient Feature Selection for Vertical Federated Learning

InGram: Inductive Knowledge Graph Embedding via Relation Graphs

UPSCALE: Unconstrained Channel Pruning

Path Neural Networks: Expressive and Accurate Graph Neural Networks

Rethink DARTS Search Space and Renovate a New Benchmark

Variational Open-Domain Question Answering

Gradient-Free Structured Pruning with Unlabeled Data

GREAD: Graph Neural Reaction-Diffusion Networks

Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources

Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration

Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch

The Statistical Scope of Multicalibration

Cross-Entropy Loss Functions: Theoretical Analysis and Applications

On Many-Actions Policy Gradient

The Test of Tests: A Framework for Differentially Private Hypothesis Testing

Automated Search for Conjectures on Mathematical Constants using Analysis of Integer Sequences

Bandit Multi-linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits

Dimension-independent Certified Neural Network Watermarks via Mollifier Smoothing

Nonparametric Extensions of Randomized Response for Private Confidence Sets

Revisiting Weighted Aggregation in Federated Learning with Neural Networks

SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation

Cooperation in the Latent Space: The Benefits of Adding Mixture Components in Variational Autoencoders

From Perception to Programs: Regularize, Overparameterize, and Amortize

Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice

A Hybrid Quantum-Classical Approach based on the Hadamard Transform for the Convolutional Layer

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings

(ends 3:30 PM)

3:30 p.m.

Lunch (On Your Own)

5 p.m.

Poster Session 4 [5:00-6:30]

Posters 5:00-6:30

Continual Learners are Incremental Model Generalizers

CLIPood: Generalizing CLIP to Out-of-Distributions

Effectively Using Public Data in Privacy Preserving Machine Learning

Graph Generative Model for Benchmarking Graph Neural Networks

Learning Intuitive Policies Using Action Features

Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation

Improving Adversarial Robustness Through the Contrastive-Guided Diffusion Process

Deep Graph Representation Learning and Optimization for Influence Maximization

Fascinating Supervisory Signals and Where to Find Them: Deep Anomaly Detection with Scale Learning

InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models

Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation

Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning

Consistency Models

XTab: Cross-table Pretraining for Tabular Transformers

Simple and Fast Group Robustness by Automatic Feature Reweighting

Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs

Hypervolume Knowledge Gradient: A Lookahead Approach for Multi-Objective Bayesian Optimization with Partial Information

Nonlinear Causal Discovery with Latent Confounders

Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations

MonoFlow: Rethinking Divergence GANs via the Perspective of Wasserstein Gradient Flows

Reachability-Aware Laplacian Representation in Reinforcement Learning

A Large-Scale Study of Probabilistic Calibration in Neural Network Regression

Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability

Synthetic data for model selection

Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions

Feature Expansion for Graph Neural Networks

GRAFENNE: Learning on Graphs with Heterogeneous and Dynamic Feature Sets

Predicting Rare Events by Shrinking Towards Proportional Odds

Decoding Layer Saliency in Language Transformers

Hyena Hierarchy: Towards Larger Convolutional Language Models

Towards Controlled Data Augmentations for Active Learning

Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes

Online Nonstochastic Control with Adversarial and Static Constraints

Accelerated Primal-Dual Methods for Convex-Strongly-Concave Saddle Point Problems

Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning

Restoration based Generative Models

Self-supervised learning of Split Invariant Equivariant representations

Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning

Simplified Temporal Consistency Reinforcement Learning

Reinforcement Learning from Passive Data via Latent Intentions

Differentially Private Sharpness-Aware Training

Provably and Practically Efficient Neural Contextual Bandits

Deterministic equivalent and error universality of deep random features learning

How Does Information Bottleneck Help Deep Learning?

Prototype-Sample Relation Distillation: Towards Replay-Free Continual Learning

End-to-End Full-Atom Antibody Design

A Closer Look at Self-Supervised Lightweight Vision Transformers

Trompt: Towards a Better Deep Neural Network for Tabular Data

Pre-training for Speech Translation: CTC Meets Optimal Transport

LSDS++: Dual Sampling for Accelerated k-means++

Efficient Parametric Approximations of Neural Network Function Space Distance

Why Target Networks Stabilise Temporal Difference Methods

Can Forward Gradient Match Backpropagation?

Toward Large Kernel Models

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Nugget: Neural Agglomerative Embeddings of Text

Mechanistic Mode Connectivity

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

Evaluating Self-Supervised Learning via Risk Decomposition

Width and Depth Limits Commute in Residual Networks

On the Convergence of SARSA with Linear Function Approximation

Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds

Differentiable Tree Operations Promote Compositional Generalization

Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process

Learning Subpocket Prototypes for Generalizable Structure-based Drug Design

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

Fisher Information Embedding for Node and Graph Learning

Composer: Creative and Controllable Image Synthesis with Composable Conditions

Vector Quantized Wasserstein Auto-Encoder

Tensor Gaussian Process with Contraction for Multi-Channel Imaging Analysis

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Nested Elimination: A Simple Algorithm for Best-Item Identification From Choice-Based Feedback

Regularizing Towards Soft Equivariance Under Mixed Symmetries

Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability

Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting

Future-conditioned Unsupervised Pretraining for Decision Transformer

Controllability-Aware Unsupervised Skill Discovery

Text-To-4D Dynamic Scene Generation

CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets

A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition

Policy Contrastive Imitation Learning

Supported Trust Region Optimization for Offline Reinforcement Learning

Synthetic Data, Real Errors: How (Not) to Publish and Use Synthetic Data

Controlled Differential Equations on Long Sequences via Non-standard Wavelets

Reinforcement Learning with History Dependent Dynamic Contexts

When do Minimax-fair Learning and Empirical Risk Minimization Coincide?

Sampling-Based Accuracy Testing of Posterior Estimators for General Inference

Learning to Maximize Mutual Information for Dynamic Feature Selection

Submodular Order Functions and Assortment Optimization

Federated Heavy Hitter Recovery under Linear Sketching

Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

Revisiting Discriminative vs. Generative Classifiers: Theory and Implications

Continual Vision-Language Representation Learning with Off-Diagonal Information

Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation

Loss Balancing for Fair Supervised Learning

PixelAsParam: A Gradient View on Diffusion Sampling with Guidance

Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum

Tuning Computer Vision Models With Task Rewards

Fairness in Matching under Uncertainty

Multiply Robust Off-policy Evaluation and Learning under Truncation by Death

The Ideal Continual Learner: An Agent That Never Forgets

Oscillation-free Quantization for Low-bit Vision Transformers

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games

Distributed Contextual Linear Bandits with Minimax Optimal Communication Cost

Dynamic Constrained Submodular Optimization with Polylogarithmic Update Time

$H$-Consistency Bounds for Pairwise Misranking Loss Surrogates

Differentially Private Hierarchical Clustering with Provable Approximation Guarantees

Accelerated Infeasibility Detection of Constrained Optimization and Fixed-Point Iterations

Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals

On the Optimality of Misspecified Kernel Ridge Regression

Communication-Constrained Bandits under Additive Gaussian Noise

Quantum Speedups for Zero-Sum Games via Improved Dynamic Gibbs Sampling

Nearly Optimal Algorithms with Sublinear Computational Complexity for Online Kernel Regression

Maximal Initial Learning Rates in Deep ReLU Networks

Dynamical Linear Bandits

Fast Algorithms for Distributed k-Clustering with Outliers

Adaptive Computation with Elastic Input Sequence

Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning

Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees

Wrapped Cauchy Distributed Angular Softmax for Long-Tailed Visual Recognition

FedDisco: Federated Learning with Discrepancy-Aware Collaboration

Personalized Federated Learning with Inferred Collaboration Graphs

ModelDiff: A Framework for Comparing Learning Algorithms

Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks

TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation

SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

Averaged Method of Multipliers for Bi-Level Optimization without Lower-Level Strong Convexity

SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models

One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill

GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks

Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions

Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables

A Watermark for Large Language Models

Towards Explaining Distribution Shifts

Topological Point Cloud Clustering

Learning Affinity with Hyperbolic Representation for Spatial Propagation

Continuous Spatiotemporal Transformer

Trainability, Expressivity and Interpretability in Gated Neural ODEs

DevFormer: A Symmetric Transformer for Context-Aware Device Placement

Subequivariant Graph Reinforcement Learning in 3D Environments

Tilted Sparse Additive Models

Dual Focal Loss for Calibration

A Flexible Diffusion Model

Learning Control by Iterative Inversion

On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures

Difference-in-Differences Meets Tree-based Methods: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding

Fundamental Tradeoffs in Learning with Prior Information

Reinforcement Learning Can Be More Efficient with Multiple Rewards

Minimizing Trajectory Curvature of ODE-based Generative Models

All in a Row: Compressed Convolution Networks for Graphs

Generalization on the Unseen, Logic Reasoning and Degree Curriculum

Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?

AutoCoreset: An Automatic Practical Coreset Construction Framework

Automatic Data Augmentation via Invariance-Constrained Learning

On the Privacy-Robustness-Utility Trilemma in Distributed Learning

Target-based Surrogates for Stochastic Optimization

Regret Minimization and Convergence to Equilibria in General-sum Markov Games

Online Mechanism Design for Information Acquisition

Sequential Strategic Screening

Nonlinear Advantage: Trained Networks Might Not Be As Complex as You Think

Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability

Dynamics-inspired Neuromorphic Visual Representation Learning

Prototype-oriented unsupervised anomaly detection for multivariate time series

Provable Dynamic Fusion for Low-Quality Multimodal Data

Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning

MAGANet: Achieving Combinatorial Generalization by Modeling a Group Action

Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction

Towards credible visual model interpretation with path attribution

GOAT: A Global Transformer on Large-scale Graphs

CodeIPPrompt: Intellectual Property Infringement Assessment of Code Language Models

On the Stepwise Nature of Self-Supervised Learning

Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks

Optimal Shrinkage for Distributed Second-Order Optimization

Personalized Federated Learning under Mixture of Distributions

Does Continual Learning Equally Forget All Parameters?

Alternately Optimized Graph Neural Networks

Optimizing NOTEARS Objectives via Topological Swaps

Subset Selection Based On Multiple Rankings in the Presence of Bias: Effectiveness of Fairness Constraints for Multiwinner Voting Score Functions

Hiding Data Helps: On the Benefits of Masking for Sparse Coding

Learnability and Algorithm for Continual Learning

Revisiting Structured Variational Autoencoders

Training Deep Surrogate Models with Large Scale Online Learning

KDEformer: Accelerating Transformers via Kernel Density Estimation

SpENCNN: Orchestrating Encoding and Sparsity for Fast Homomorphically Encrypted Neural Network Inference

Efficient Learning of Mesh-Based Physical Simulation with Bi-Stride Multi-Scale Graph Neural Network

Online Prototype Alignment for Few-shot Policy Transfer

Adversarial Parameter Attack on Deep Neural Networks

Mixture Proportion Estimation Beyond Irreducibility

Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting

Structure Learning of Latent Factors via Clique Search on Correlation Thresholded Graphs

Conformal Prediction with Missing Values

Multi-Task Off-Policy Learning from Bandit Feedback

Spurious Valleys and Clustering Behavior of Neural Networks

User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems

Learning GFlowNets From Partial Episodes For Improved Convergence And Stability

On Second-Order Scoring Rules for Epistemic Uncertainty Quantification

Instrumental Variable Estimation of Average Partial Causal Effects

Stabilizing GANs' Training with Brownian Motion Controller

Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems

Minimalistic Predictions to Schedule Jobs with Online Precedence Constraints

One-vs-the-Rest Loss to Focus on Important Samples in Adversarial Training

Does a Neural Network Really Encode Symbolic Concepts?

FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems

Finding the Missing-half: Graph Complementary Learning for Homophily-prone and Heterophily-prone Graphs

Concept-based Explanations for Out-of-Distribution Detectors

Men Also Do Laundry: Multi-Attribute Bias Amplification

Scaling Laws for Reward Model Overoptimization

HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

Benign Overfitting in Deep Neural Networks under Lazy Training

Gradient Descent Converges Linearly for Logistic Regression on Separable Data

Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy

Multi-Agent Learning from Learners

Neural Network Accelerated Implicit Filtering: Integrating Neural Network Surrogates With Provably Convergent Derivative Free Optimization Methods

Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback

Automatically Auditing Large Language Models via Discrete Optimization

In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation

Online Platt Scaling with Calibeating

Stratified Adversarial Robustness with Rejection

Properties of the Mallows Model Depending on the Number of Alternatives: A Warning for an Experimentalist

Motion Question Answering via Modular Motion Programs

Pretraining Language Models with Human Preferences

Improving Bi-level Optimization Based Methods with Inspiration from Humans' Classroom Study Techniques

Quantifying the Variability Collapse of Neural Networks

On the Estimation of Gaussian Mixture Copula Models

Understanding the Distillation Process from Deep Generative Models to Tractable Probabilistic Circuits

Learning Antidote Data to Individual Unfairness

Structured Cooperative Learning with Graphical Model Priors

Coordinate Descent Methods for Fractional Minimization

Optimal Convergence Rates for Agnostic Nyström Kernel Learning

Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation

Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data

Learning Mixtures of Gaussians with Censored Data

Contextual Combinatorial Bandits with Probabilistically Triggered Arms

Chemically Transferable Generative Backmapping of Coarse-Grained Proteins

Transformers Learn In-Context by Gradient Descent

Robust Explanation for Free or At the Cost of Faithfulness

Graph Contrastive Backdoor Attacks

Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score

Identification of the Adversary from a Single Adversarial Example

HarsanyiNet: Computing Accurate Shapley Values in a Single Forward Propagation

Multi-task Representation Learning for Pure Exploration in Linear Bandits

Open-Vocabulary Universal Image Segmentation with MaskCLIP

Integrating Prior Knowledge in Contrastive Learning with Kernel

Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Proper Scoring Rules for Survival Analysis

QAS-Bench: Rethinking Quantum Architecture Search and A Benchmark

On the Power of Foundation Models

Polyhedral Complex Extraction from ReLU Networks using Edge Subdivision

Causal Discovery with Latent Confounders Based on Higher-Order Cumulants

Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

Optimal Sets and Solution Paths of ReLU Networks

Provable Reset-free Reinforcement Learning by No-Regret Reduction

Gibbsian Polar Slice Sampling

Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes

Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation

Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes

I$^2$SB: Image-to-Image Schrödinger Bridge

Learning Hidden Markov Models When the Locations of Missing Observations are Unknown

GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

A Deep Conjugate Direction Method for Iteratively Solving Linear Systems

Von Mises Mixture Distributions for Molecular Conformation Generation

Forget Unlearning: Towards True Data-Deletion in Machine Learning

Understanding Plasticity in Neural Networks

Individually Fair Learning with One-Sided Feedback

Multi-class Graph Clustering via Approximated Effective $p$-Resistance

STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning

Optimal No-Regret Learning for One-Sided Lipschitz Functions

Towards a Persistence Diagram that is Robust to Noise and Varied Densities

Robust Speech Recognition via Large-Scale Weak Supervision

LookupFFN: Making Transformers Compute-lite for CPU inference

Differentially Private Distributed Bayesian Linear Regression with MCMC

Are Equivariant Equilibrium Approximators Beneficial?

The Impact of Exploration on Convergence and Performance of Multi-Agent Q-Learning Dynamics

Tight Certification of Adversarially Trained Neural Networks via Nonconvex Low-Rank Semidefinite Relaxations

Fast Online Node Labeling for Very Large Graphs

From Adaptive Query Release to Machine Unlearning

Brauer's Group Equivariant Neural Networks

Cold Analysis of Rao-Blackwellized Straight-Through Gumbel-Softmax Gradient Estimator

Estimating the Contamination Factor's Distribution in Unsupervised Anomaly Detection

Transformed Distribution Matching for Missing Value Imputation

Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models

Moderately Distributional Exploration for Domain Generalization

Improving Expert Predictions with Conformal Prediction

On the Expressive Power of Geometric Graph Neural Networks

CLUSTSEG: Clustering for Universal Segmentation

Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models

Input Perturbation Reduces Exposure Bias in Diffusion Models

RGE: A Repulsive Graph Rectification for Node Classification via Influence

What can online reinforcement learning with function approximation benefit from general coverage conditions?

Quantum Policy Gradient Algorithm with Optimized Action Decoding

A new near-linear time algorithm for k-nearest neighbor search using a compressed cover tree

Lower Bounds for Learning in Revealing POMDPs

Cut your Losses with Squentropy

Statistical Inference and A/B Testing for First-Price Pacing Equilibria

Blossom: an Anytime Algorithm for Computing Optimal Decision Trees

Quantum Ridgelet Transform: Winning Lottery Ticket of Neural Networks with Quantum Computation

Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime

Generalized-Smooth Nonconvex Optimization is As Efficient As Smooth Nonconvex Optimization

Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues

Propensity Matters: Measuring and Enhancing Balancing for Recommendation

RLEG: Vision-Language Representation Learning with Diffusion-based Embedding Generation

Disentangled Generative Models for Robust Prediction of System Dynamics

Social learning spontaneously emerges by searching optimal heuristics with deep reinforcement learning

Computational Asymmetries in Robust Classification

CircuitNet: A Generic Neural Network to Realize Universal Circuit Motif Modeling

Better Training of GFlowNets with Local Credit and Incomplete Trajectories

(ends 6:30 PM)

6:30 p.m.

Coffee Break

7 p.m.

Oral B1 Language Models: Human Impact [7:00-8:30]

Orals 7:00-8:12

[7:00] When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction

[7:08] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

[7:16] Whose Opinions Do Language Models Reflect?

[7:24] A Watermark for Large Language Models

[7:32] DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

[7:40] Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies

[7:48] Inflow, Outflow, and Reciprocity in Machine Learning

[7:56] Structure-informed Language Models Are Protein Designers

[8:04] Transformers Learn In-Context by Gradient Descent

(ends 8:30 PM)

Oral B2 Language Models: Algorithms and Architecture [7:00-8:30]

Orals 7:00-8:12

[7:00] Pretraining Language Models with Human Preferences

[7:08] Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models

[7:16] Specializing Smaller Language Models towards Multi-Step Reasoning

[7:24] SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot

[7:32] Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

[7:40] FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

[7:48] BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models

[7:56] Tractable Control for Autoregressive Language Generation

[8:04] Equivariant Architectures for Learning in Deep Weight Spaces

(ends 8:30 PM)

Oral B3 Privacy [7:00-8:30]

Orals 7:00-8:20

[7:00] Nonparametric Extensions of Randomized Response for Private Confidence Sets

[7:08] Differentially Private Hierarchical Clustering with Provable Approximation Guarantees

[7:16] Tight Data Access Bounds for Private Top-$k$ Selection

[7:24] JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift

[7:32] Active Ranking of Experts Based on their Performances in Many Tasks

[7:40] The Price of Differential Privacy under Continual Observation

[7:48] HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

[7:56] Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting

[8:04] Fast Private Kernel Density Estimation via Locality Sensitive Quantization

[8:12] Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

(ends 8:30 PM)

Oral B4 Robustness / Adversarial / RL Bandits [7:00-8:30]

Orals 7:00-8:20

[7:00] Adversarial Policies Beat Superhuman Go AIs

[7:08] Adapting to game trees in zero-sum imperfect information games

[7:16] Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.

[7:24] Delving into Noisy Label Detection with Clean Data

[7:32] Robustly Learning a Single Neuron via Sharpness

[7:40] Data Feedback Loops: Model-driven Amplification of Dataset Biases

[7:48] Towards Reliable Neural Specifications

[7:56] Do Perceptually Aligned Gradients Imply Robustness?

[8:04] ODS: Test-Time Adaptation in the Presence of Open-World Data Shift

[8:12] Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation

(ends 8:30 PM)

Oral B5 Self/Semi-Supervised Learning and Interpretability / Observing Aspects of NN [7:00-8:30]

Orals 7:00-8:20

[7:00] RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank

[7:08] Evaluating Self-Supervised Learning via Risk Decomposition

[7:16] BEATs: Audio Pre-Training with Acoustic Tokenizers

[7:24] Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

[7:32] Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions

[7:40] TRAK: Attributing Model Behavior at Scale

[7:48] Understanding Plasticity in Neural Networks

[7:56] Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods

[8:04] Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice

[8:12] Brauer's Group Equivariant Neural Networks

(ends 8:30 PM)

Panel:

AI and Marginalized Languages

(ends 8:30 PM)

8:45 p.m.

Town Hall:

ICML Business Meeting - all attendees

(ends 9:15 PM)

10 p.m.

THU 27 JUL

11 a.m.

Registration

(ends 8:00 PM)

11:30 a.m.

Test Of Time:

Learning Fair Representations

(ends 12:00 PM)

12:30 p.m.

Invited Talk:

Proxy objectives in reinforcement learning from human feedback

John Schulman

(ends 1:30 PM)

1 p.m.

Coffee Break

1:30 p.m.

Poster Session 5 [1:30-3:00]

Posters 1:30-3:00

Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

Underspecification Presents Challenges for Credibility in Modern Machine Learning

BEATs: Audio Pre-Training with Acoustic Tokenizers

OMS-DPM: Optimizing the Model Schedule for Diffusion Probabilistic Models

Run-off Election: Improved Provable Defense against Data Poisoning Attacks

Target-Aware Generative Augmentations for Single-Shot Adaptation

AbODE: Ab initio antibody design using conjoined ODEs

Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

Great Models Think Alike: Improving Model Reliability via Inter-Model Latent Agreement

Out-of-Distribution Generalization of Federated Learning via Implicit Invariant Relationships

Robust Perception through Equivariance

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

Privacy-Aware Compression for Federated Learning Through Numerical Mechanism Design

Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

Tied-Augment: Controlling Representation Similarity Improves Data Augmentation

Contextual Reliability: When Different Features Matter in Different Contexts

Cell-Free Latent Go-Explore

RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution

CO-BED: Information-Theoretic Contextual Optimization via Bayesian Experimental Design

A Model-free Closeness-of-influence Test for Features in Supervised Learning

Towards Understanding and Reducing Graph Structural Noise for GNNs

A New PHO-rmula for Improved Performance of Semi-Structured Networks

Naive imputation implicitly regularizes high-dimensional linear models

In Search for a Generalizable Method for Source Free Domain Adaptation

Applied Online Algorithms with Heterogeneous Predictors

MultiAdam: Parameter-wise Scale-invariant Optimizer for Multiscale Training of Physics-informed Neural Networks

From Hypergraph Energy Functions to Hypergraph Neural Networks

Explainability as statistical inference

Distribution Free Prediction Sets for Node Classification

Simple Hardware-Efficient Long Convolutions for Sequence Modeling

Taxonomy-Structured Domain Adaptation

Extending Kernel PCA through Dualization: Sparsity, Robustness and Fast Algorithms

Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization

Generalized Polyak Step Size for First Order Optimization with Momentum

Learning Globally Smooth Functions on Manifolds

simple diffusion: End-to-end diffusion for high resolution images

Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping

Hyperbolic Image-text Representations

MyoDex: A Generalizable Prior for Dexterous Manipulation

Predictable MDP Abstraction for Unsupervised Model-Based RL

Guiding Pretraining in Reinforcement Learning with Large Language Models

Surrogate Model Extension (SME): A Fast and Accurate Weight Update Attack on Federated Learning

Why Is Public Pretraining Necessary for Private Model Training?

Discrete Key-Value Bottleneck

Is Overfitting Necessary for Implicit Video Representation?

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation

End-to-End Multi-Object Detection with a Regularized Mixture Model

Linear optimal partial transport embedding

Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels

A Mathematical Model for Curriculum Learning for Parities

Gaussian processes at the Helm(holtz): A more fluid model for ocean currents

Quantized Distributed Training of Large Models with Convergence Guarantees

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

Efficiently predicting high resolution mass spectra with graph neural networks

Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression

Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators

The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

Bayes-optimal Learning of Deep Random Networks of Extensive-width

Special Properties of Gradient Descent with Large Learning Rates

Group Equivariant Fourier Neural Operators for Partial Differential Equations

Hierarchical Neural Coding for Controllable CAD Model Generation

PLay: Parametrically Conditioned Layout Generation using Latent Diffusion

Effective Neural Topic Modeling with Embedding Clustering Regularization

A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining

Relevant Walk Search for Explaining Graph Neural Networks

Exphormer: Sparse Transformers for Graphs

Quantum 3D Graph Learning with Applications to Molecule Embedding

Generative Graph Dictionary Learning

Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills

Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints

Multiplier Bootstrap-based Exploration

DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian Inference

Graph Mixup with Soft Alignments

Fighting Fire with Fire: Contrastive Debiasing without Bias-free Data via Generative Bias-transformation

UMD: Unsupervised Model Detection for X2X Backdoor Attacks

Repository-Level Prompt Generation for Large Language Models of Code

Domain Adaptation for Time Series Under Feature and Label Shifts

Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization

Fair Densities via Boosting the Sufficient Statistics of Exponential Families

PFNs4BO: In-Context Learning for Bayesian Optimization

Generalized Teacher Forcing for Learning Chaotic Dynamics

Towards a better understanding of representation dynamics under TD-learning

Leveraging Demonstrations to Improve Online Learning: Quality Matters

Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning

New metrics and search algorithms for weighted causal DAGs

One-Shot Federated Conformal Prediction

The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond

Finding Generalization Measures by Contrasting Signal and Noise

Magneto: A Foundation Transformer

Revisiting Pseudo-Label for Single-Positive Multi-Label Learning

Nonparametric Iterative Machine Teaching

dugMatting: Decomposed-Uncertainty-Guided Matting

For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal

Explainable Data-Driven Optimization: From Context to Decision and Back Again

Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation

Shortest Edit Path Crossover: A Theory-driven Solution to the Permutation Problem in Evolutionary Neural Architecture Search

ELSA: Efficient Label Shift Adaptation through the Lens of Semiparametric Models

Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space

Smooth Non-stationary Bandits

Revisiting Simple Regret: Fast Rates for Returning a Good Arm

Tight Data Access Bounds for Private Top-$k$ Selection

Omnipredictors for Constrained Optimization

Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano

Escaping saddle points in zeroth-order optimization: the power of two-point estimators

Attribute-Efficient PAC Learning of Low-Degree Polynomial Threshold Functions with Nasty Noise

Statistical Learning under Heterogeneous Distribution Shift

Combinatorial Neural Bandits

Does Sparsity Help in Learning Misspecified Linear Bandits?

DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning

Understanding Self-Distillation in the Presence of Label Noise

Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains

DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm

Exploring Chemical Space with Score-based Out-of-distribution Generation

Thompson Sampling with Diffusion Generative Prior

Improving Graph Generation by Restricting Graph Bandwidth

Coupled Variational Autoencoder

SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation

Half-Hop: A graph upsampling approach for slowing down message passing

Implicit Jacobian regularization weighted with impurity of probability output

Generative Pretraining for Black-Box Optimization

Learning Unnormalized Statistical Models via Compositional Optimization

Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation

MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL

Grounding Language Models to Images for Multimodal Inputs and Outputs

Change is Hard: A Closer Look at Subpopulation Shift

Task-specific experimental design for treatment effect estimation

2D-Shapley: A Framework for Fragmented Data Valuation

SurProGenes: Survival Risk-Ordered Representation of Cancer Patients and Genes for the Identification of Prognostic Genes

On the Forward Invariance of Neural ODEs

Inferring Relational Potentials in Interacting Systems

Towards Omni-generalizable Neural Methods for Vehicle Routing Problems

Lazy Agents: A New Perspective on Solving Sparse Reward Problem in Multi-agent Reinforcement Learning

Towards Understanding Generalization of Macro-AUC in Multi-label Learning

Supervised Metric Learning to Rank for Retrieval via Contextual Similarity Optimization

Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks

Reflected Diffusion Models

End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

Normalizing Flows for Interventional Density Estimation

Identifiability and Generalizability in Constrained Inverse Reinforcement Learning

Strategic Classification with Unknown User Manipulations

Fast Sampling of Diffusion Models via Operator Learning

Variational Mixture of HyperGenerators for Learning Distributions over Functions

How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding

On the Relationship Between Explanation and Prediction: A Causal View

Addressing Budget Allocation and Revenue Allocation in Data Market Environments Using an Adaptive Sampling Algorithm

How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy

Offline Learning in Markov Games with General Function Approximation

When is Realizability Sufficient for Off-Policy Reinforcement Learning?

Optimal Rates and Efficient Algorithms for Online Bayesian Persuasion

Federated Online and Bandit Convex Optimization

Learning useful representations for shifting tasks and distributions

Can Neural Network Memorization Be Localized?

Meta-Learning the Inductive Bias of Simple Neural Circuits

Feature learning in deep classifiers through Intermediate Neural Collapse

SGD with Large Step Sizes Learns Sparse Features

Generative Decoding of Visual Stimuli

Context Consistency Regularization for Label Sparsity in Time Series

Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization

Towards Unbiased Training in Federated Open-world Semi-supervised Learning

Beyond Homophily: Reconstructing Structure for Graph-agnostic Clustering

TIPS: Topologically Important Path Sampling for Anytime Neural Networks

Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions

IRNeXt: Rethinking Convolutional Network Design for Image Restoration

Whose Opinions Do Language Models Reflect?

Global optimality of Elman-type RNNs in the mean-field regime

Sample Complexity of Probability Divergences under Group Symmetry

Coin Sampling: Gradient-Based Bayesian Inference without Learning Rates

Pruning via Sparsity-indexed ODE: a Continuous Sparsity Viewpoint

DIVISION: Memory Efficient Training via Dual Activation Precision

Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization

Communication-Efficient Federated Hypergradient Computation via Aggregated Iterative Differentiation

Linearly Constrained Bilevel Optimization: A Smoothed Implicit Gradient Approach

Paging with Succinct Predictions

Identifiability of Label Noise Transition Matrix

NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation

Random Grid Neural Processes for Parametric Partial Differential Equations

MODeL: Memory Optimizations for Deep Learning

Bi-directional Masks for Efficient N:M Sparse Training

RSC: Accelerate Graph Neural Networks Training via Randomized Sparse Computations

Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition

DDGR: Continual Learning with Deep Diffusion-based Generative Replay

Regression with Sensor Data Containing Incomplete Observations

Causal Isotonic Calibration for Heterogeneous Treatment Effects

Fully-Adaptive Composition in Differential Privacy

Unveiling the Latent Space Geometry of Push-Forward Generative Models

Compressing Tabular Data via Latent Variable Estimation

Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models

Theoretical Bounds on the Network Community Profile from Low-rank Semi-definite Programming

Automatically marginalized MCMC in probabilistic programming

Long Horizon Temperature Scaling

Extending Conformal Prediction to Hidden Markov Models with Exact Validity via de Finetti's Theorem for Markov Chains

Efficient preconditioned stochastic gradient descent for estimation in latent variable models

Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons

Adaptive Annealed Importance Sampling with Constant Rate Progress

Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective

Provable Data Subset Selection For Efficient Neural Networks Training

Speed-Oblivious Online Scheduling: Knowing (Precise) Speeds is not Necessary

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models

CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis

Deep Generative Symbolic Regression with Monte-Carlo-Tree-Search

High-dimensional Clustering onto Hamiltonian Cycle

RLSbench: Domain Adaptation Under Relaxed Label Shift

Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments

Vertical Federated Graph Neural Network for Recommender System

Benign Overfitting in Two-layer ReLU Convolutional Neural Networks

On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning

Internally Rewarded Reinforcement Learning

Second-Order Optimization with Lazy Hessians

Semi-Parametric Contextual Pricing Algorithm using Cox Proportional Hazards Model

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging

Conformal Inference is (almost) Free for Neural Networks Trained with Early Stopping

System Identification of Neural Systems: If We Got It Right, Would We Know?

Modeling Dynamic Environments with Scene Graph Memory

Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

MEWL: Few-shot multimodal word learning with referential uncertainty

Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature

GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations

Quantifying the Knowledge in GNNs for Reliable Distillation into MLPs

Model-Aware Contrastive Learning: Towards Escaping the Dilemmas

Adversarial Policies Beat Superhuman Go AIs

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark

AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

Trading-Off Payments and Accuracy in Online Classification with Paid Stochastic Experts

Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions

Delayed Feedback in Kernel Bandits

A Scalable Frank-Wolfe-Based Algorithm for the Max-Cut SDP

R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents

Random Teachers are Good Teachers

Weak Proxies are Sufficient and Preferable for Fairness with Missing Sensitive Attributes

A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification

Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Semi-Offline Reinforcement Learning for Optimized Text Generation

Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data

Defects of Convolutional Decoder Networks in Frequency Representation

Cones: Concept Neurons in Diffusion Models for Customized Generation

On the Role of Attention in Prompt-tuning

Diffusion Based Representation Learning

On Data Manifolds Entailed by Structural Causal Models

Distribution-dependent McDiarmid-type Inequalities for Functions of Unbounded Interaction

Towards Learning Geometric Eigen-Lengths Crucial for Fitting Tasks

Topological Singularity Detection at Multiple Scales

Estimating Possible Causal Effects with Latent Variables via Adjustment

Exact Inference in High-order Structured Prediction

Learning Rate Schedules in the Presence of Distribution Shift

Leveraging Offline Data in Online Reinforcement Learning

MixFlows: principled variational inference via mixed flows

Randomized Gaussian Process Upper Confidence Bound with Tighter Bayesian Regret Bounds

Sketched Ridgeless Linear Regression: The Role of Downsampling

Beyond the Edge of Stability via Two-step Gradient Updates

E$(n)$ Equivariant Message Passing Simplicial Networks

Private Statistical Estimation of Many Quantiles

Learning the Dynamics of Sparsely Observed Interacting Systems

The Power of Learned Locally Linear Models for Nonlinear Policy Optimization

Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees

Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning

Controllable Neural Symbolic Regression

Robust and private stochastic linear bandits

Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs

Forward-Backward Gaussian Variational Inference via JKO in the Bures-Wasserstein Space

Buying Information for Stochastic Optimization

Hyperbolic Diffusion Embedding and Distance for Hierarchical Representation Learning

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

Robustness in Multimodal Learning under Train-Test Modality Mismatch

Revisiting Sampling for Combinatorial Optimization

Nonparametric Density Estimation under Distribution Drift

Network Effects in Performative Prediction Games

A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems

Momentum Ensures Convergence of SIGNSGD under Weaker Assumptions

DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization

Layered State Discovery for Incremental Autonomous Exploration

How Jellyfish Characterise Alternating Group Equivariant Neural Networks

Optimally-weighted Estimators of the Maximum Mean Discrepancy for Likelihood-Free Inference

Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.

Probabilistic Imputation for Time-series Classification with Missing Data

Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat

Delving into Noisy Label Detection with Clean Data

Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models

The Hessian perspective into the Nature of Convolutional Neural Networks

N$\text{A}^\text{2}$Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

Data Efficient Neural Scaling Law via Model Reusing

Beyond Lipschitz Smoothness: A Tighter Analysis for Nonconvex Optimization

Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability

Robust and Scalable Bayesian Online Changepoint Detection

On Distribution Dependent Sub-Logarithmic Query Time of Learned Indexing

Near-Optimal $\Phi$-Regret Learning in Extensive-Form Games

Multi-Objective Population Based Training

Divide and Conquer Dynamic Programming: An Almost Linear Time Change Point Detection Methodology in High Dimensions

The Regret of Exploration and the Control of Bad Episodes in Reinforcement Learning

Efficient Quantum Algorithms for Quantum Optimal Control

Understanding the Role of Feedback in Online Learning with Switching Costs

Prometheus: Taming Sample and Communication Complexities in Constrained Decentralized Stochastic Bilevel Learning

The Catalog Problem: Clustering and Ordering Variable-Sized Sets

A General Representation Learning Framework with Generalization Performance Guarantees

On the Initialization of Graph Neural Networks

EM-Network: Oracle Guided Self-distillation for Sequence Learning

Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes

Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal

Generalization Bounds using Data-Dependent Fractal Dimensions

Improving Graph Neural Networks with Learnable Propagation Operators

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling

What Can Be Learnt With Wide Convolutional Neural Networks?

(ends 3:00 PM)

3 p.m.

Lunch (On Your Own)

4:30 p.m.

Poster Session 6 [4:30-6:00]

Posters 4:30-6:00

Effective Structured Prompting by Meta-Learning and Representative Verbalizer

Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise

Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

ED-Batch: Efficient Automatic Batching of Dynamic Neural Networks via Learned Finite State Machines

MultiRobustBench: Benchmarking Robustness Against Multiple Attacks

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

Discovering Object-Centric Generalized Value Functions From Pixels

FedBR: Improving Federated Learning on Heterogeneous Data via Local Learning Bias Reduction

When Sparsity Meets Contrastive Models: Less Graph Data Can Bring Better Class-Balanced Representations

Boosting Graph Contrastive Learning via Graph Contrastive Saliency

Simple Disentanglement of Style and Content in Visual Representations

Meta Optimal Transport

Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation

Reparameterized Policy Learning for Multimodal Trajectory Optimization

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks

Controlling Type Confounding in Ad Hoc Teamwork with Instance-wise Teammate Feedback Rectification

Revisiting Domain Randomization via Relaxed State-Adversarial Policy Optimization

Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL

On the Identifiability and Estimation of Causal Location-Scale Noise Models

Improving Adversarial Robustness of Deep Equilibrium Models with Explicit Regulations Along the Neural Dynamics

Estimating Heterogeneous Treatment Effects: Mutual Information Bounds and Learning Algorithms

Margin-based sampling in high dimensions: When being active is less efficient than staying passive

Non-autoregressive Conditional Diffusion Models for Time Series Prediction

DiscoBAX - Discovery of optimal intervention sets in genomic experiment design

Implicit Graph Neural Networks: A Monotone Operator Viewpoint

Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries

Equivariance with Learned Canonicalization Functions

Graphically Structured Diffusion Models

TRAK: Attributing Model Behavior at Scale

Fast Online Value-Maximizing Prediction Sets with Conformal Cost Control

Recovery Bounds on Class-Based Optimal Transport: A Sum-of-Norms Regularization Framework

Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization

Toward Efficient Gradient-Based Value Estimation

Conditionally Strongly Log-Concave Generative Models

SinFusion: Training Diffusion Models on a Single Image or Video

Stable and Consistent Prediction of 3D Characteristic Orientation via Invariant Residual Learning

Learning Dense Correspondences between Photos and Sketches

VIMA: Robot Manipulation with Multimodal Prompts

Jump-Start Reinforcement Learning

ILLUME: Rationalizing Vision-Language Models through Human Interactions

LeadFL: Client Self-Defense against Model Poisoning in Federated Learning

Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods

Pairwise Ranking Losses of Click-Through Rates Prediction for Welfare Maximization in Ad Auctions

Learning Instance-Specific Augmentations by Capturing Local Invariances

NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning

Compositional Exemplars for In-context Learning

Provable Multi-instance Deep AUC Maximization with Stochastic Pooling

Towards Quantum Machine Learning for Constrained Combinatorial Optimization: a Quantum QAP Solver

Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models

Constrained Monotonic Neural Networks

General Covariance Data Augmentation for Neural PDE Solvers

Fast as CHITA: Neural Network Pruning with Combinatorial Optimization

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

Learning to Design Analog Circuits to Meet Threshold Specifications

Overcoming Simplicity Bias in Deep Networks using a Feature Sieve

Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single

Robust One-Class Classification with Signed Distance Function using 1-Lipschitz Neural Networks

Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space

How Powerful are Shallow Neural Networks with Bandlimited Random Weights?

Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere

A Conditional Normalizing Flow for Accelerated Multi-Coil MR Imaging

LongCoder: A Long-Range Pre-trained Language Model for Code Completion

Deep Temporal Sets with Evidential Reinforced Attentions for Unique Behavioral Pattern Discovery

Distortion and Uncertainty Aware Loss for Panoramic Depth Completion

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers

Equivariant Polynomials for Graph Neural Networks

COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models

Towards Better Graph Representation Learning with Parameterized Decomposition & Filtering

Geometric Latent Diffusion Models for 3D Molecule Generation

Constrained Decision Transformer for Offline Safe Reinforcement Learning

Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap

Robust Satisficing MDPs

Featured Graph Coarsening with Similarity Guarantees

DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models

Revisiting Data-Free Knowledge Distillation with Poisoned Teachers

Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

LEVER: Learning to Verify Language-to-Code Generation with Execution

Interval Bound Interpolation for Few-shot Learning with Few Tasks

Towards Sustainable Learning: Coresets for Data-efficient Deep Learning

Conformal Prediction Sets for Graph Neural Networks

Continuously Parameterized Mixture Models

Why Random Pruning Is All We Need to Start Sparse

Understanding Self-Predictive Learning for Reinforcement Learning

On the Statistical Benefits of Temporal Difference Learning

A Connection between One-Step RL and Critic Regularization in Reinforcement Learning

Revisiting Bellman Errors for Offline Model Selection

Predictive Flows for Faster Ford-Fulkerson

Differential Privacy has Bounded Impact on Fairness in Classification

Improved Learning-Augmented Algorithms for the Multi-Option Ski Rental Problem via Best-Possible Competitive Analysis

Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Feature Directions Matter: Long-Tailed Learning via Rotated Balanced Representation

$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation

FREDIS: A Fusion Framework of Refinement and Disambiguation for Unreliable Partial Label Learning

Which Invariance Should We Transfer? A Causal Minimax Learning Approach

Auxiliary Modality Learning with Generalized Curriculum Distillation

Adaptive Smoothing Gradient Learning for Spiking Neural Networks

An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning

Expertise Trees Resolve Knowledge Limitations in Collective Decision-Making

Regions of Reliability in the Evaluation of Multivariate Probabilistic Forecasts

A/B Testing in Network Data with Covariate-Adaptive Randomization

Alternating Local Enumeration (TnALE): Solving Tensor Network Structure Search with Fewer Evaluations

Consistency of Multiple Kernel Clustering

Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR

A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback

Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality

The Price of Differential Privacy under Continual Observation

Subset-Based Instance Optimality in Private Estimation

A Law of Robustness beyond Isoperimetry

Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective

Sample Complexity Bounds for Learning High-dimensional Simplices in Noisy Regimes

Phase Transitions in the Detection of Correlated Databases

A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints

Sharper Bounds for $\ell_p$ Sensitivity Sampling

Tighter Analysis for ProxSkip

Learning Functional Distributions with Private Labels

Stochastic Gradient Succeeds for Bandits

A Unified Optimization Framework of ANN-SNN Conversion: Towards Optimal Mapping from Activation Values to Firing Rates

Masked Trajectory Models for Prediction, Representation, and Control

Information-Theoretic State Space Model for Multi-View Reinforcement Learning

Node Embedding from Neural Hamiltonian Orbits in Graph Neural Networks

Personalized Subgraph Federated Learning

Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks

Anti-Exploration by Random Network Distillation

Distance Weighted Supervised Learning for Offline Interaction Data

Likelihood Adjusted Semidefinite Programs for Clustering Heterogeneous Data

Meta Learning of Interface Conditions for Multi-Domain Physics-Informed Neural Networks

Complementary Attention for Multi-Agent Reinforcement Learning

Multi-Modal Classifiers for Open-Vocabulary Object Detection

Hyperparameters in Reinforcement Learning and How To Tune Them

Never mind the metrics---what about the uncertainty? Visualising binary confusion matrix metric distributions to put performance in perspective

Data Feedback Loops: Model-driven Amplification of Dataset Biases

When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction

Fair Neighbor Embedding

Kernel Sufficient Dimension Reduction and Variable Selection for Compositional Data via Amalgamation

Adaptive Whitening in Neural Populations with Gain-modulating Interneurons

Hierarchies of Reward Machines

Learning Deep Time-index Models for Time Series Forecasting

What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?

Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity

OpenFE: Automated Feature Generation with Expert-level Performance

Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains

Unscented Autoencoder

Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning

Learning in POMDPs is Sample-Efficient with Hindsight Observability

On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness

Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling

GFlowOut: Dropout with Generative Flow Networks

Quantifying Human Priors over Social and Navigation Networks

Biases in Evaluation of Molecular Optimization Methods and Bias Reduction Strategies

Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing

Learning Perturbations to Explain Time Series Predictions

Incentivizing Exploration with Linear Contexts and Combinatorial Actions

From Robustness to Privacy and Back

Learning to Bid in Repeated First-Price Auctions with Budgets

Distributed Linear Bandits under Communication Constraints

Statistical Foundations of Prior-Data Fitted Networks

Neural networks trained with SGD learn distributions of increasing complexity

Scaling Laws for Multilingual Neural Machine Translation

Explaining the effects of non-convergent MCMC in the training of Energy-Based Models

Metagenomic Binning using Connectivity-constrained Variational Autoencoders

Spatial-Temporal Graph Learning with Adversarial Contrastive Adaptation

Beyond In-Domain Scenarios: Robust Density-Aware Calibration

Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning

Disentangled Multiplex Graph Representation Learning

Do Not Train It: A Linear Neural Architecture Search of Graph Neural Networks

Provably Learning Object-Centric Representations

BiBench: Benchmarking and Analyzing Network Binarization

Scaling Laws for Generative Mixed-Modal Language Models

Sampling-based Nyström Approximation and Kernel Quadrature

Adversarial robustness of amortized Bayesian inference

Discover-Then-Rank Unlabeled Support Vectors in the Dual Space for Multi-Class Active Learning

Answering Complex Logical Queries on Knowledge Graphs via Query Computation Tree Optimization

Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions

Mixing Predictions for Online Metric Algorithms

Understanding the Impact of Adversarial Robustness on Accuracy Disparity

SeedGNN: Graph Neural Network for Supervised Seeded Graph Matching

BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models

DoCoFL: Downlink Compression for Cross-Device Federated Learning

GNN&GBDT-Guided Fast Optimizing Framework for Large-scale Integer Programming

Coarse-to-Fine: a Hierarchical Diffusion Model for Molecule Generation in 3D

NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation

Data-Driven Subgroup Identification for Linear Regression

Simplex Random Features

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

Proper Losses for Discrete Generative Models

ClusterFuG: Clustering Fully connected Graphs by Multicut

Efficient Graph Field Integrators Meet Point Clouds

Dirichlet Diffusion Score Model for Biological Sequence Generation

Sequential Monte Carlo Learning for Time Series Structure Discovery

Kernel QuantTree

FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels

Model-Free Robust Average-Reward Reinforcement Learning

Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic

Direct Parameterization of Lipschitz-Bounded Deep Networks

Fast Rates in Time-Varying Strongly Monotone Games

The Value of Out-of-Distribution Data

DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

Everyone's Preference Changes Differently: A Weighted Multi-Interest Model For Retrieval

PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

Algorithmic Collective Action in Machine Learning

Probabilistic Categorical Adversarial Attack and Adversarial Training

Generalization Analysis for Contrastive Representation Learning

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

Federated Adversarial Learning: A Framework with Convergence Analysis

An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning

Solving Linear Programs with Fast Online Learning Algorithms

Robust Budget Pacing with a Single Sample

Poisoning Language Models During Instruction Tuning

RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank

Bootstrap in High Dimension with Low Computation

Hidden Symmetries of ReLU Networks

Learning to Initiate and Reason in Event-Driven Cascading Processes

Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling

QASA: Advanced Question Answering on Scientific Articles

Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes

Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning

Patch-level Contrastive Learning via Positional Query for Visual Pre-training

Learning to Optimize Differentiable Games

Generating Language Corrections for Teaching Physical Control Tasks

Behavior Contrastive Learning for Unsupervised Skill Discovery

On Penalty-based Bilevel Gradient Descent Method

Beyond Uniform Lipschitz Condition in Differentially Private Optimization

Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation

Efficient displacement convex optimization with particle gradient descent

Differentially Private Stochastic Convex Optimization under a Quantile Loss Function

Recasting Self-Attention with Holographic Reduced Representations

Coder Reviewer Reranking for Code Generation

Leveraging Proxy of Training Data for Test-Time Adaptation

Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression

NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations

Effective and Efficient Structural Inference with Reservoir Computing

Cross-Modal Fine-Tuning: Align then Refine

Global Context Vision Transformers

GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency

Modality-Agnostic Variational Compression of Implicit Neural Representations

High Fidelity Image Counterfactuals with Probabilistic Causal Models

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance

On the Complexity of Bayesian Generalization

Expectation-Complete Graph Representations with Homomorphisms

Shapley Based Residual Decomposition for Instance Analysis

Efficient Algorithms for Exact Graph Matching on Correlated Stochastic Block Models with Constant Correlation

Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss

Performative Reinforcement Learning

Last Switch Dependent Bandits with Monotone Payoff Functions

DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule

Trustworthy Policy Learning under the Counterfactual No-Harm Criterion

Are Gaussian Data All You Need? The Extents and Limits of Universality in High-Dimensional Generalized Linear Estimation

Certified Robust Neural Networks: Generalization and Corruption Resistance

Generalizing Neural Wave Functions

On the convergence of the MLE as an estimator of the learning rate in the Exp3 algorithm

Neural Markov Jump Processes

Learning Control-Oriented Dynamical Structure from Data

Bayesian Estimation of Differential Privacy

Eliminating Adversarial Noise via Information Discard and Robust Representation Restoration

The Numerical Stability of Hyperbolic Representation Learning

Improved Algorithms for White-Box Adversarial Streams

Q-Flow: Generative Modeling for Differential Equations of Open Quantum Dynamics with Normalizing Flows

On the Convergence of Gradient Flow on Multi-layer Linear Models

Online Learning in Stackelberg Games with an Omniscient Follower

Surface Snapping Optimization Layer for Single Image Object Shape Reconstruction

Streaming Active Learning with Deep Neural Networks

Unit Scaling: Out-of-the-Box Low-Precision Training

Linear Causal Disentanglement via Interventions

Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

MANSA: Learning Fast and Slow in Multi-Agent Systems

Thompson Sampling with Less Exploration is Fast and Optimal

SurCo: Learning Linear SURrogates for COmbinatorial Nonlinear Optimization Problems

Faster Rates of Convergence to Stationary Points in Differentially Private Optimization

Quantitative Universal Approximation Bounds for Deep Belief Networks

Vector-Valued Control Variates

Deep Anomaly Detection under Labeling Budget Constraints

Sequential Multi-Dimensional Self-Supervised Learning for Clinical Time Series

Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?

Better Diffusion Models Further Improve Adversarial Training

Explaining Reinforcement Learning with Shapley Values

Randomized Schur Complement Views for Graph Contrastive Learning

Hierarchical Diffusion for Offline Decision Making

The Unreasonable Effectiveness of Few-shot Learning for Machine Translation

Understand and Modularize Generator Optimization in ELECTRA-style Pretraining

Distribution Free Domain Generalization

Secure Federated Correlation Test and Entropy Estimation

Computational Doob h-transforms for Online Filtering of Discretely Observed Diffusions

Rethinking Warm-Starts with Predictions: Learning Predictions Close to Sets of Optimal Solutions for Faster $\text{L}$-/$\text{L}^\natural$-Convex Function Minimization

Partially Observable Multi-agent RL with (Quasi-)Efficiency: The Blessing of Information Sharing

Fair and Robust Estimation of Heterogeneous Treatment Effects for Policy Learning

Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression

Is Consensus Acceleration Possible in Decentralized Optimization over Slowly Time-Varying Networks?

On Coresets for Clustering in Small Dimensional Euclidean spaces

Regret-Minimizing Double Oracle for Extensive-Form Games

GC-Flow: A Graph-Based Flow Network for Effective Clustering

Mitigating Memorization of Noisy Labels by Clipping the Model Prediction

Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models

The SSL Interplay: Augmentations, Inductive Bias, and Generalization

Emergence of Adaptive Circadian Rhythms in Deep Reinforcement Learning

Theory on Forgetting and Generalization of Continual Learning

Curriculum Co-disentangled Representation Learning across Multiple Environments for Social Recommendation

Building Neural Networks on Matrix Manifolds: A Gyrovector Space Approach

(ends 6:00 PM)

6 p.m.

Coffee Only Break

Oral C1 Supervised Learning [6:00-7:30]

Orals 6:00-7:20

[6:00] Mimetic Initialization of Self-Attention Layers

[6:08] Difference of submodular minimization via DC programming

[6:16] Simplex Random Features

[6:24] Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks

[6:32] Tilted Sparse Additive Models

[6:40] Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

[6:48] Hyena Hierarchy: Towards Larger Convolutional Language Models

[6:56] Direct Parameterization of Lipschitz-Bounded Deep Networks

[7:12] Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation

(ends 7:30 PM)

Oral C2 Time Series / Dynamics / Sequences [6:00-7:30]

Orals 6:00-7:20

[6:00] Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series

[6:08] Self-Interpretable Time Series Prediction with Counterfactual Explanations

[6:16] Resurrecting Recurrent Neural Networks for Long Sequences

[6:24] Inferring Relational Potentials in Interacting Systems

[6:32] Memory-Based Dual Gaussian Processes for Sequential Learning

[6:40] H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features

[6:48] Generalized Teacher Forcing for Learning Chaotic Dynamics

[6:56] Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients

[7:04] Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere

[7:12] Learning Control-Oriented Dynamical Structure from Data

(ends 7:30 PM)

Oral C3 Multimodal and Pretraining [6:00-7:30]

Orals 6:00-7:20

[6:00] Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

[6:08] Calibrating Multimodal Learning

[6:16] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

[6:24] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts

[6:32] Cross-Modal Fine-Tuning: Align then Refine

[6:40] Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models

[6:48] Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

[6:56] Pre-training for Speech Translation: CTC Meets Optimal Transport

[7:04] Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models

[7:12] Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes

(ends 7:30 PM)

Oral C4 Optimization [6:00-7:30]

Orals 6:00-7:12

[6:00] Second-Order Optimization with Lazy Hessians

[6:08] Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions

[6:16] Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization

[6:24] Continuation Path Learning for Homotopy Optimization

[6:32] Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points

[6:40] Buying Information for Stochastic Optimization

[6:48] A Fully First-Order Method for Stochastic Bilevel Optimization

[6:56] Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference

[7:04] Learning-Rate-Free Learning by D-Adaptation

(ends 7:30 PM)

Oral C5 Misc [6:00-7:30]

Orals 6:00-7:04

[6:00] Learning Mixtures of Markov Chains and MDPs

[6:08] Uncertain Evidence in Probabilistic Models and Stochastic Simulators

[6:16] How Bad is Top-$K$ Recommendation under Competing Content Creators?

[6:24] Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees

[6:32] Equivariant Polynomials for Graph Neural Networks

[6:40] Taming graph kernels with random features

[6:48] Robust Budget Pacing with a Single Sample

[6:56] Multicalibration as Boosting for Regression

(ends 7:30 PM)

Oral C6:

The Societal Impacts of AI

(ends 7:30 PM)

7:30 p.m.

Reception:

Closing Reception

(ends 8:45 PM)

FRI 28 JUL

11 a.m.

Registration

(ends 7:00 PM)

11:50 a.m.

Workshop:

2nd ICML Workshop on New Frontiers in Adversarial Machine Learning

(ends 8:00 PM)

11:55 a.m.

Workshop:

Workshop on Theory of Mind in Communicating Agents

(ends 8:10 PM)

noon

Workshop:

Structured Probabilistic Inference and Generative Modeling

(ends 8:00 PM)

Workshop:

New Frontiers in Learning, Control, and Dynamical Systems

(ends 7:00 PM)

Workshop:

2nd Workshop on Formal Verification of Machine Learning

(ends 8:00 PM)

Workshop:

Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities

(ends 8:00 PM)

Workshop:

HiLD: High-dimensional Learning Dynamics Workshop

(ends 8:00 PM)

Workshop:

The Synergy of Scientific and Machine Learning Modelling (SynS & ML) Workshop

(ends 8:00 PM)

Workshop:

PAC-Bayes Meets Interactive Learning

(ends 7:45 PM)

Workshop:

2nd Annual Workshop on Topology, Algebra, and Geometry in Machine Learning (TAG-ML)

(ends 8:00 PM)

Workshop:

Challenges in Deployable Generative AI

(ends 8:00 PM)

Workshop:

Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators

(ends 8:00 PM)

Workshop:

The Many Facets of Preference-Based Learning

(ends 8:00 PM)

Workshop:

Knowledge and Logical Reasoning in the Era of Data-driven Learning

(ends 8:00 PM)

12:15 p.m.

Workshop:

3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH)

(ends 8:00 PM)

Affinity Workshop:

4th Women in Machine Learning (WiML) Un-Workshop

(ends 7:45 PM)

1 p.m.

Coffee Break

3 p.m.

Lunch (On Your Own)

6 p.m.

Coffee Break

SAT 29 JUL

11 a.m.

Registration

(ends 2:00 PM)

11:50 a.m.

Workshop:

The Second Workshop on Spurious Correlations, Invariance and Stability

(ends 8:00 PM)

11:55 a.m.

Workshop:

ES-FoMo: Efficient Systems for Foundation Models

(ends 12:00 AM)

noon

Workshop:

Machine Learning for Multimodal Healthcare Data

(ends 8:00 PM)

Workshop:

Interactive Learning with Implicit Human Feedback

(ends 8:00 PM)

Workshop:

“Could it have been different?” Counterfactuals in Minds and Machines

(ends 8:00 PM)

Workshop:

ICML 2023 Workshop on Computational Biology

(ends 8:00 PM)

Workshop:

Localized Learning: Decentralized Model Updates via Non-Global Objectives

(ends 8:00 PM)

Workshop:

Neural Conversational AI Workshop - What’s left to TEACH (Trustworthy, Enhanced, Adaptable, Capable and Human-centric) chatbots?

(ends 8:15 PM)

Workshop:

Generative AI and Law (GenLaw)

(ends 8:00 PM)

Workshop:

Artificial Intelligence & Human Computer Interaction

(ends 8:00 PM)

Workshop:

DMLR Workshop: Data-centric Machine Learning Research

(ends 8:00 PM)

Workshop:

Sampling and Optimization in Discrete Space

(ends 7:45 PM)

Workshop:

Duality Principles for Modern Machine Learning

(ends 8:00 PM)

Workshop:

2nd ICML Workshop on Machine Learning for Astrophysics

(ends 8:00 PM)

Workshop:

Neural Compression: From Information Theory to Applications

(ends 8:00 PM)

12:15 p.m.

Affinity Workshop:

Indigenous in AI Workshop

(ends 8:15 PM)

1 p.m.

Coffee Break

3 p.m.

Lunch (On Your Own)

6 p.m.

Coffee Break