ICML
2023
Papers
Sample Complexity Bounds for Learning High-dimensional Simplices in Noisy Regimes
Grounding Language Models to Images for Multimodal Inputs and Outputs
Training-Free Neural Active Learning with Initialization-Robustness Guarantees
Scaling Vision Transformers to 22 Billion Parameters
ED-Batch: Efficient Automatic Batching of Dynamic Neural Networks via Learned Finite State Machines
Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning
Partial Optimality in Cubic Correlation Clustering
Poisoning Language Models During Instruction Tuning
Learning Distributions over Quantum Measurement Outcomes
Recasting Self-Attention with Holographic Reduced Representations
Wasserstein Barycenter Matching for Graph Size Generalization of Message Passing Neural Networks
Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features
Fast Federated Machine Unlearning with Nonlinear Functional Theory
End-to-End Full-Atom Antibody Design
Dimension-independent Certified Neural Network Watermarks via Mollifier Smoothing
Efficient Personalized Federated Learning via Sparse Model-Adaptation
Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting
Equivariant Polynomials for Graph Neural Networks
Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames
Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction
QASA: Advanced Question Answering on Scientific Articles
Set-membership Belief State-based Reinforcement Learning for POMDPs
Towards Unbiased Training in Federated Open-world Semi-supervised Learning
NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation
Temporally Consistent Transformers for Video Generation
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
Tilted Sparse Additive Models
Test-Time Style Shifting: Handling Arbitrary Styles in Domain Generalization
Which is Better for Learning with Noisy Labels: The Semi-supervised Method or Modeling Label Noise?
A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
A Coupled Flow Approach to Imitation Learning
On Strengthening and Defending Graph Reconstruction Attack with Markov Chain Approximation
Adaptive Annealed Importance Sampling with Constant Rate Progress
DoCoFL: Downlink Compression for Cross-Device Federated Learning
Topological Point Cloud Clustering
Constant Matters: Fine-grained Error Bound on Differentially Private Continual Observation
PixelAsParam: A Gradient View on Diffusion Sampling with Guidance
Adaptive Whitening in Neural Populations with Gain-modulating Interneurons
The Hessian perspective into the Nature of Convolutional Neural Networks
Reprogramming Pretrained Language Models for Antibody Sequence Infilling
Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions
Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation
Complementary Attention for Multi-Agent Reinforcement Learning
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
Scaling Laws for Reward Model Overoptimization
Reflected Diffusion Models
LEVER: Learning to Verify Language-to-Code Generation with Execution
Learning to Bid in Repeated First-Price Auctions with Budgets
Semi-Dual Unbalanced Quadratic Optimal Transport: fast statistical rates and convergent algorithm.
Multi-task Representation Learning for Pure Exploration in Linear Bandits
Online Restless Bandits with Unobserved States
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Out-of-Domain Robustness via Targeted Augmentations
Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
Test-time Adaptation with Slot-Centric Models
Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute
Conditions and Assumptions for Constraint-based Causal Structure Learning
Fast Online Node Labeling for Very Large Graphs
Selective Machine Learning of the Average Treatment Effect with an Invalid Instrumental Variable
What Makes Entities Similar? A Similarity Flooding Perspective for Multi-sourced Knowledge Graph Embeddings
Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions
Raising the Cost of Malicious AI-Powered Image Editing
Competitive Gradient Optimization
Graph Inductive Biases in Transformers without Message Passing
Fast Sampling of Diffusion Models via Operator Learning
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling
On the convergence of the MLE as an estimator of the learning rate in the Exp3 algorithm
Revisiting Discriminative vs. Generative Classifiers: Theory and Implications
Optimizing Hyperparameters with Conformal Quantile Regression
Optimizing DDPM Sampling with Shortcut Fine-Tuning
Federated Conformal Predictors for Distributed Uncertainty Quantification
Future-conditioned Unsupervised Pretraining for Decision Transformer
Learning Subpocket Prototypes for Generalizable Structure-based Drug Design
Synthetic Data, Real Errors: How (Not) to Publish and Use Synthetic Data
Paging with Succinct Predictions
Optimal Shrinkage for Distributed Second-Order Optimization
A new near-linear time algorithm for k-nearest neighbor search using a compressed cover tree
Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels
On Distribution Dependent Sub-Logarithmic Query Time of Learned Indexing
Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons
Beyond Reward: Offline Preference-guided Policy Optimization
Cold Analysis of Rao-Blackwellized Straight-Through Gumbel-Softmax Gradient Estimator
Correcting discount-factor mismatch in on-policy policy gradient methods
The multimarginal optimal transport formulation of adversarial multiclass classification
CLIPood: Generalizing CLIP to Out-of-Distributions
A Law of Robustness beyond Isoperimetry
Differential Privacy has Bounded Impact on Fairness in Classification
Neural Status Registers
Dimensionality Reduction for General KDE Mode Finding
Which Invariance Should We Transfer? A Causal Minimax Learning Approach
Does a Neural Network Really Encode Symbolic Concepts?
AdaBoost is not an Optimal Weak to Strong Learner
The Fast Johnson-Lindenstrauss Transform Is Even Faster
Nonlinear Advantage: Trained Networks Might Not Be As Complex as You Think
Learning the Dynamics of Sparsely Observed Interacting Systems
Constrained Decision Transformer for Offline Safe Reinforcement Learning
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
LinSATNet: The Positive Linear Satisfiability Neural Networks
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
Consistency Models
Comparison of meta-learners for estimating multi-valued treatment heterogeneous effects
Self-supervised learning of Split Invariant Equivariant representations
Privacy-Aware Compression for Federated Learning Through Numerical Mechanism Design
Gaussian processes at the Helm(holtz): A more fluid model for ocean currents
A Two-Stage Active Learning Algorithm for k-Nearest Neighbors
Data-Copying in Generative Models: A Formal Framework
Generalized Reductions: Making any Hierarchical Clustering Fair and Balanced with Low Cost
Accelerated Primal-Dual Methods for Convex-Strongly-Concave Saddle Point Problems
From Noisy Fixed-Point Iterations to Private ADMM for Centralized and Federated Learning
Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
Towards Theoretical Understanding of Inverse Reinforcement Learning
Accelerated Cyclic Coordinate Dual Averaging with Extrapolation for Composite Convex Optimization
Learning Preconditioners for Conjugate Gradient PDE Solvers
Anchor Sampling for Federated Learning with Partial Client Participation
MG-GNN: Multigrid Graph Neural Networks for Learning Multilevel Domain Decomposition Methods
Human-Timescale Adaptation in an Open-Ended Task Space
Improving the Model Consistency of Decentralized Federated Learning
Men Also Do Laundry: Multi-Attribute Bias Amplification
Auxiliary Modality Learning with Generalized Curriculum Distillation
ELSA: Efficient Label Shift Adaptation through the Lens of Semiparametric Models
SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process
Prometheus: Taming Sample and Communication Complexities in Constrained Decentralized Stochastic Bilevel Learning
Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy
A General Representation Learning Framework with Generalization Performance Guarantees
SAAL: Sharpness-Aware Active Learning
Brainformers: Trading Simplicity for Efficiency
Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation
Normalizing Flows for Interventional Density Estimation
Is Overfitting Necessary for Implicit Video Representation?
Fast Combinatorial Algorithms for Min Max Correlation Clustering
Scaling Laws for Generative Mixed-Modal Language Models
DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition
Memory-Based Meta-Learning on Non-Stationary Distributions
LipsNet: A Smooth and Robust Neural Network with Adaptive Lipschitz Constant for High Accuracy Optimal Control
Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning
Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation
Margin-based Neural Network Watermarking
Vector Quantized Wasserstein Auto-Encoder
Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability
On the Importance of Feature Decorrelation for Unsupervised Representation Learning in Reinforcement Learning
Geometric Autoencoders - What You See is What You Decode
Estimating Heterogeneous Treatment Effects: Mutual Information Bounds and Learning Algorithms
FAIRER: Fairness as Decision Rationale Alignment
SE(3) diffusion model with application to protein backbone generation
Performative Recommendation: Diversifying Content via Strategic Incentives
How Does Information Bottleneck Help Deep Learning?
Momentum Ensures Convergence of SIGNSGD under Weaker Assumptions
Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions
OpenFE: Automated Feature Generation with Expert-level Performance
Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators
On the Expressive Power of Geometric Graph Neural Networks
When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis
Eliminating Adversarial Noise via Information Discard and Robust Representation Restoration
Optimal No-Regret Learning for One-Sided Lipschitz Functions
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee
Simple and Fast Group Robustness by Automatic Feature Reweighting
XTab: Cross-table Pretraining for Tabular Transformers
Controlled Text Generation with Natural Language Instructions
Adversarial robustness of amortized Bayesian inference
Explaining Reinforcement Learning with Shapley Values
Explaining the effects of non-convergent MCMC in the training of Energy-Based Models
Exploring Chemical Space with Score-based Out-of-distribution Generation
Second-Order Optimization with Lazy Hessians
Great Models Think Alike: Improving Model Reliability via Inter-Model Latent Agreement
Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?
Hindsight Learning for MDPs with Exogenous Inputs
Brauer's Group Equivariant Neural Networks
Constrained Efficient Global Optimization of Expensive Black-box Functions
Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning
A Deep Conjugate Direction Method for Iteratively Solving Linear Systems
How Powerful are Shallow Neural Networks with Bandlimited Random Weights?
Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows
The Wisdom of Hindsight Makes Language Models Better Instruction Followers
Exploring Model Dynamics for Accumulative Poisoning Discovery
High-dimensional Clustering onto Hamiltonian Cycle
Bag of Tricks for Training Data Extraction from Language Models
Unearthing InSights into Mars: Unsupervised Source Separation with Limited Data
Faith-Shap: The Faithful Shapley Interaction Index
Curriculum Co-disentangled Representation Learning across Multiple Environments for Social Recommendation
Understand and Modularize Generator Optimization in ELECTRA-style Pretraining
Vertical Federated Graph Neural Network for Recommender System
Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
Concept-based Explanations for Out-of-Distribution Detectors
Understanding Plasticity in Neural Networks
GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency
Improved Learning-Augmented Algorithms for the Multi-Option Ski Rental Problem via Best-Possible Competitive Analysis
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation
Gradient-Free Structured Pruning with Unlabeled Data
Interpretable Neural-Symbolic Concept Reasoning
XAI Beyond Classification: Interpretable Neural Clustering
Blossom: an Anytime Algorithm for Computing Optimal Decision Trees
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Probabilistic Concept Bottleneck Models
Optimizing Mode Connectivity for Class Incremental Learning
Smart Initial Basis Selection for Linear Programs
Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation
Rethink DARTS Search Space and Renovate a New Benchmark
Averaged Method of Multipliers for Bi-Level Optimization without Lower-Level Strong Convexity
Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias
Policy Contrastive Imitation Learning
Statistical Inference and A/B Testing for First-Price Pacing Equilibria
Stratified Adversarial Robustness with Rejection
RLang: A Declarative Language for Describing Partial World Knowledge to Reinforcement Learning Agents
Offline Meta Reinforcement Learning with In-Distribution Online Adaptation
Answering Complex Logical Queries on Knowledge Graphs via Query Computation Tree Optimization
Uncertainty Estimation for Molecules: Desiderata and Methods
Transformers Meet Directed Graphs
Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion
Crafting Training Degradation Distribution for the Accuracy-Generalization Trade-off in Real-World Super-Resolution
SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning
Learning Neural Constitutive Laws from Motion Observations for Generalizable PDE Dynamics
Evolving Semantic Prototype Improves Generative Zero-Shot Learning
Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language
Contextual Combinatorial Bandits with Probabilistically Triggered Arms
Dataset Distillation with Convexified Implicit Gradients
Structure-informed Language Models Are Protein Designers
Cross-Modal Fine-Tuning: Align then Refine
Prompting Large Language Model for Machine Translation: A Case Study
Patch-level Contrastive Learning via Positional Query for Visual Pre-training
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
Linear optimal partial transport embedding
Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases
Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective
Counterfactual Identifiability of Bijective Causal Models
Subequivariant Graph Reinforcement Learning in 3D Environments
Generalized Polyak Step Size for First Order Optimization with Momentum
On Investigating the Conservative Property of Score-Based Generative Models
Oscillation-free Quantization for Low-bit Vision Transformers
Leveraging Demonstrations to Improve Online Learning: Quality Matters
Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation
Causal Isotonic Calibration for Heterogeneous Treatment Effects
What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
Towards Robust and Safe Reinforcement Learning with Benign Off-policy Data
Dynamical Linear Bandits
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
Function-Space Regularization in Neural Networks: A Probabilistic Perspective
Feature Directions Matter: Long-Tailed Learning via Rotated Balanced Representation
Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics
Streaming Active Learning with Deep Neural Networks
Towards Omni-generalizable Neural Methods for Vehicle Routing Problems
Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
Escaping saddle points in zeroth-order optimization: the power of two-point estimators
UMD: Unsupervised Model Detection for X2X Backdoor Attacks
Infusing Lattice Symmetry Priors in Attention Mechanisms for Sample-Efficient Abstract Geometric Reasoning
Kernel QuantTree
Accuracy on the Curve: On the Nonlinear Correlation of ML Performance Between Data Subpopulations
Long-Term Rhythmic Video Soundtracker
Rethinking Warm-Starts with Predictions: Learning Predictions Close to Sets of Optimal Solutions for Faster $\text{L}$-/$\text{L}^\natural$-Convex Function Minimization
Memory-Based Dual Gaussian Processes for Sequential Learning
Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows
Multi-Task Differential Privacy Under Distribution Skew
Exploiting locality in high-dimensional Factorial hidden Markov models
Inflow, Outflow, and Reciprocity in Machine Learning
End-to-End Learning for Stochastic Optimization: A Bayesian Perspective
Behavior Contrastive Learning for Unsupervised Skill Discovery
Beyond Homophily: Reconstructing Structure for Graph-agnostic Clustering
Project and Forget: Solving Large-Scale Metric Constrained Problems
Target-based Surrogates for Stochastic Optimization
Transformers as Algorithms: Generalization and Stability in In-context Learning
Sequence Modeling with Multiresolution Convolutional Memory
Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence
Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime
Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs
ILLUME: Rationalizing Vision-Language Models through Human Interactions
On the Convergence Rates of Policy Gradient Methods
Quantifying Human Priors over Social and Navigation Networks
Cluster-Specific Predictions with Multi-Task Gaussian Processes
Disentangled Multiplex Graph Representation Learning
Beyond Lipschitz Smoothness: A Tighter Analysis for Nonconvex Optimization
Graph Generative Model for Benchmarking Graph Neural Networks
A New PHO-rmula for Improved Performance of Semi-Structured Networks
Federated Online and Bandit Convex Optimization
Towards Trustworthy Explanation: On Causal Rationalization
Emergent Agentic Transformer from Chain of Hindsight Experience
Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds
Rigid Body Flows for Sampling Molecular Crystal Structures
Continual Learning in Linear Classification on Separable Data
Hypervolume Knowledge Gradient: A Lookahead Approach for Multi-Objective Bayesian Optimization with Partial Information
Combinatorial Neural Bandits
Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima
Semi-Parametric Contextual Pricing Algorithm using Cox Proportional Hazards Model
When Sparsity Meets Contrastive Models: Less Graph Data Can Bring Better Class-Balanced Representations
Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions
Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching
Competing for Shareable Arms in Multi-Player Multi-Armed Bandits
Regularization-free Diffeomorphic Temporal Alignment Nets
CircuitNet: A Generic Neural Network to Realize Universal Circuit Motif Modeling
Learning Physical Models that Can Respect Conservation Laws
Sampling random graph homomorphisms and applications to network data analysis
Cooperative Open-ended Learning Framework for Zero-Shot Coordination
Lazy Agents: A New Perspective on Solving Sparse Reward Problem in Multi-agent Reinforcement Learning
Multi-Modal Classifiers for Open-Vocabulary Object Detection
Optimizing the Collaboration Structure in Cross-Silo Federated Learning
Effective Neural Topic Modeling with Embedding Clustering Regularization
Probably Anytime-Safe Stochastic Combinatorial Semi-Bandits
Shedding a PAC-Bayesian Light on Adaptive Sliced-Wasserstein Distances
LeadFL: Client Self-Defense against Model Poisoning in Federated Learning
Generative Pretraining for Black-Box Optimization
Diffusion Models for Black-Box Optimization
Robust Budget Pacing with a Single Sample
Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models
Thompson Sampling with Diffusion Generative Prior
Revisiting Pseudo-Label for Single-Positive Multi-Label Learning
Progressive Purification for Instance-Dependent Partial Label Learning
Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests
DDGR: Continual Learning with Deep Diffusion-based Generative Replay
Generalized-Smooth Nonconvex Optimization is As Efficient As Smooth Nonconvex Optimization
A General Theory for Federated Optimization with Asynchronous and Heterogeneous Clients Updates
Free-Form Variational Inference for Gaussian Process State-Space Models
Transformed Distribution Matching for Missing Value Imputation
Identifying Useful Learnwares for Heterogeneous Label Spaces
Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure
Disentangled Generative Models for Robust Prediction of System Dynamics
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
Linearly Constrained Bilevel Optimization: A Smoothed Implicit Gradient Approach
Dirichlet Diffusion Score Model for Biological Sequence Generation
Lifelong Language Pretraining with Distribution-Specialized Experts
Self-Interpretable Time Series Prediction with Counterfactual Explanations
Robust Perception through Equivariance
Is Learning Summary Statistics Necessary for Likelihood-free Inference?
Simplex Random Features
Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum
Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration
Data Poisoning Attacks Against Multimodal Encoders
Properties of the Mallows Model Depending on the Number of Alternatives: A Warning for an Experimentalist
Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value
Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation
Specializing Smaller Language Models towards Multi-Step Reasoning
SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models
Continual Task Allocation in Meta-Policy Network via Sparse Prompting
Structured Cooperative Learning with Graphical Model Priors
Does Continual Learning Equally Forget All Parameters?
InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models
Learning-Rate-Free Learning by D-Adaptation
EM-Network: Oracle Guided Self-distillation for Sequence Learning
Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets
Implicit Graph Neural Networks: A Monotone Operator Viewpoint
SurProGenes: Survival Risk-Ordered Representation of Cancer Patients and Genes for the Identification of Prognostic Genes
Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
Towards Understanding Ensemble Distillation in Federated Learning
On the Convergence of Federated Averaging with Cyclic Client Participation
One-sided Matrix Completion from Two Observations Per Row
Autoregressive Diffusion Model for Graph Generation
Fast Private Kernel Density Estimation via Locality Sensitive Quantization
Statistical Inference on Multi-armed Bandits with Delayed Feedback
Robust Explanation for Free or At the Cost of Faithfulness
Tight Data Access Bounds for Private Top-$k$ Selection
dugMatting: Decomposed-Uncertainty-Guided Matting
R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
Learning Controllable Degradation for Real-World Super-Resolution via Constrained Flows
Bandit Multi-linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits
Are Random Decompositions all we need in High Dimensional Bayesian Optimisation?
Computational Doob h-transforms for Online Filtering of Discretely Observed Diffusions
Resurrecting Recurrent Neural Networks for Long Sequences
The Value of Out-of-Distribution Data
Meta Optimal Transport
A Picture of the Space of Typical Learnable Tasks
Discover and Cure: Concept-aware Mitigation of Spurious Correlation
Estimating Joint Treatment Effects by Combining Multiple Experiments
Efficient Graph Field Integrators Meet Point Clouds
Towards Reliable Neural Specifications
On Computing Optimal Tree Ensembles
Learning Mixtures of Gaussians with Censored Data
Nonlinear Causal Discovery with Latent Confounders
Functional Neural Networks: Shift invariant models for functional data with applications to EEG classification
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Provable Benefit of Mixup for Finding Optimal Decision Boundaries
Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals
Performative Reinforcement Learning
Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal
Learning Control-Oriented Dynamical Structure from Data
Mitigating Propagation Failures in Physics-informed Neural Networks using Retain-Resample-Release (R3) Sampling
Uncertain Evidence in Probabilistic Models and Stochastic Simulators
Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization
POUF: Prompt-Oriented Unsupervised Fine-tuning for Large Pre-trained Models
Shape-Guided Dual-Memory Learning for 3D Anomaly Detection
Temporal Label Smoothing for Early Event Prediction
Identifiability and Generalizability in Constrained Inverse Reinforcement Learning
Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach
Learning Rate Schedules in the Presence of Distribution Shift
GOAT: A Global Transformer on Large-scale Graphs
Local Vertex Colouring Graph Neural Networks
A Hybrid Quantum-Classical Approach based on the Hadamard Transform for the Convolutional Layer
PAC-Bayesian Offline Contextual Bandits With Guarantees
On Balancing Bias and Variance in Unsupervised Multi-Source-Free Domain Adaptation
Multi-class Graph Clustering via Approximated Effective $p$-Resistance
A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback
Accelerated Stochastic Optimization Methods under Quasar-convexity
Personalized Subgraph Federated Learning
A Kernel Stein Test of Goodness of Fit for Sequential Models
MultiAdam: Parameter-wise Scale-invariant Optimizer for Multiscale Training of Physics-informed Neural Networks
VIMA: Robot Manipulation with Multimodal Prompts
Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks
On User-Level Private Convex Optimization
Tensor Gaussian Process with Contraction for Multi-Channel Imaging Analysis
Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling
Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.
A Neural PDE Solver with Temporal Stencil Modeling
Understanding Oversquashing in GNNs through the Lens of Effective Resistance
The Numerical Stability of Hyperbolic Representation Learning
Interval Bound Interpolation for Few-shot Learning with Few Tasks
A Model-free Closeness-of-influence Test for Features in Supervised Learning
Generalized Disparate Impact for Configurable Fairness Solutions in ML
Truncating Trajectories in Monte Carlo Reinforcement Learning
Trapdoor Normalization with Irreversible Ownership Verification
For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Underspecification Presents Challenges for Credibility in Modern Machine Learning
Doubly Optimal No-Regret Learning in Monotone Games
Kernel Sufficient Dimension Reduction and Variable Selection for Compositional Data via Amalgamation
Active Learning based Structural Inference
Curious Replay for Model-based Adaptation
From Temporal to Contemporaneous Iterative Causal Discovery in the Presence of Latent Confounders
The Power of Uniform Sampling for k-Median
Towards Understanding and Reducing Graph Structural Noise for GNNs
BEATs: Audio Pre-Training with Acoustic Tokenizers
Optimistic Planning by Regularized Dynamic Programming
FedAvg Converges to Zero Training Loss Linearly for Overparameterized Multi-Layer Neural Networks
A Closer Look at Self-Supervised Lightweight Vision Transformers
Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection
Defects of Convolutional Decoder Networks in Frequency Representation
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
Multi-agent Online Scheduling: MMS Allocations for Indivisible Items
Cooperation in the Latent Space: The Benefits of Adding Mixture Components in Variational Autoencoders
Learning Belief Representations for Partially Observable Deep RL
Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models
Entropy-driven Unsupervised Keypoint Representation Learning in Videos
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents
A Reinforcement Learning Framework for Dynamic Mediation Analysis
Subset-Based Instance Optimality in Private Estimation
Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
Automatic Data Augmentation via Invariance-Constrained Learning
Exponential Smoothing for Off-Policy Learning
Not all Strongly Rayleigh Distributions Have Small Probabilistic Generating Circuits
Adversarial Policies Beat Superhuman Go AIs
On the Impact of Knowledge Distillation for Model Interpretability
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
On the Correctness of Automatic Differentiation for Neural Networks with Machine-Representable Parameters
Revisiting Simple Regret: Fast Rates for Returning a Good Arm
simple diffusion: End-to-end diffusion for high resolution images
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation
A Fully First-Order Method for Stochastic Bilevel Optimization
Stochastic Gradient Succeeds for Bandits
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
Towards Learning Geometric Eigen-Lengths Crucial for Fitting Tasks
Parallel Online Clustering of Bandits via Hedonic Game
Long-Tailed Recognition by Mutual Information Maximization between Latent Features and Ground-Truth Labels
Model-Free Robust Average-Reward Reinforcement Learning
Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat
Special Properties of Gradient Descent with Large Learning Rates
Inverse Reinforcement Learning without Reinforcement Learning
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling
Regularizing Towards Soft Equivariance Under Mixed Symmetries
Probabilistic Imputation for Time-series Classification with Missing Data
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
Multi-User Reinforcement Learning with Low Rank Rewards
Towards Better Graph Representation Learning with Parameterized Decomposition & Filtering
Short-lived High-volume Bandits
Bidirectional Learning for Offline Model-based Biological Sequence Design
DevFormer: A Symmetric Transformer for Context-Aware Device Placement
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time
Learning Perturbations to Explain Time Series Predictions
Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits
GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks
Robust Satisficing MDPs
Robust One-Class Classification with Signed Distance Function using 1-Lipschitz Neural Networks
Weakly Supervised Disentangled Generative Causal Representation Learning
Buying Information for Stochastic Optimization
Neural FIM for learning Fisher information metrics from point cloud data
Lowering the Pre-training Tax for Gradient-based Subset Training: A Lightweight Distributed Pre-Training Toolkit
Learning Dense Correspondences between Photos and Sketches
Synthetic data for model selection
Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models
Learning to Suggest Breaks: Sustainable Optimization of Long-Term User Engagement
Feature Expansion for Graph Neural Networks
D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
Randomized Gaussian Process Upper Confidence Bound with Tighter Bayesian Regret Bounds
Scaling of Class-wise Training Losses for Post-hoc Calibration
Q-Flow: Generative Modeling for Differential Equations of Open Quantum Dynamics with Normalizing Flows
TIPS: Topologically Important Path Sampling for Anytime Neural Networks
On the Effectiveness of Offline RL for Dialogue Response Generation
Learning Antidote Data to Individual Unfairness
On the Stepwise Nature of Self-Supervised Learning
Identification of the Adversary from a Single Adversarial Example
Discover-Then-Rank Unlabeled Support Vectors in the Dual Space for Multi-Class Active Learning
Effectively Using Public Data in Privacy Preserving Machine Learning
Multiplier Bootstrap-based Exploration
Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space
Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models
Uncovering Adversarial Risks of Test-Time Adaptation
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
On Coresets for Clustering in Small Dimensional Euclidean spaces
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
Feature learning in deep classifiers through Intermediate Neural Collapse
Internet Explorer: Targeted Representation Learning on the Open Web
MetricGAN-OKD: Multi-Metric Optimization of MetricGAN via Online Knowledge Distillation for Speech Enhancement
A Category-theoretical Meta-analysis of Definitions of Disentanglement
Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice
Understanding the Role of Feedback in Online Learning with Switching Costs
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards
DIVISION: Memory Efficient Training via Dual Activation Precision
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Hiding Data Helps: On the Benefits of Masking for Sparse Coding
Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models
Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup
Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples
The Price of Differential Privacy under Continual Observation
Delayed Feedback in Kernel Bandits
Continuous Spatiotemporal Transformer
LSDS++: Dual Sampling for Accelerated k-means++
InfoOT: Information Maximizing Optimal Transport
Neural signature kernels as infinite-width-depth-limits of controlled ResNets
Regression with Label Permutation in Generalized Linear Model
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
On the Within-Group Fairness of Screening Classifiers
Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation
Achieving High Accuracy with PINNs via Energy Natural Gradient Descent
TRAK: Attributing Model Behavior at Scale
Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model
Two-Scale Gradient Descent Ascent Dynamics Finds Mixed Nash Equilibria of Continuous Games: A Mean-Field Perspective
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments
Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Rethinking Backdoor Attacks
Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies
Finding the Missing-half: Graph Complementary Learning for Homophily-prone and Heterophily-prone Graphs
$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Compositional Score Modeling for Simulation-Based Inference
Nearly Optimal Algorithms with Sublinear Computational Complexity for Online Kernel Regression
A Watermark for Large Language Models
Online Prototype Alignment for Few-shot Policy Transfer
MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks
Flexible Model Aggregation for Quantile Regression
Looped Transformers as Programmable Computers
Deep Perturbation Learning: Enhancing the Network Performance via Image Perturbations
Improving Adversarial Robustness Through the Contrastive-Guided Diffusion Process
A Gromov–Wasserstein Geometric View of Spectrum-Preserving Graph Coarsening
Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere
Double-Weighting for Covariate Shift Adaptation
NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning
How Bad is Top-$K$ Recommendation under Competing Content Creators?
Continuation Path Learning for Homotopy Optimization
Random Matrix Analysis to Balance between Supervised and Unsupervised Learning under the Low Density Separation Assumption
Neural Wasserstein Gradient Flows for Discrepancies with Riesz Kernels
A Critical View of Vision-Based Long-Term Dynamics Prediction Under Environment Misalignment
Aligning Language Models with Preferences through $f$-divergence Minimization
SpotEM: Efficient Video Search for Episodic Memory
Disentangled Multi-Fidelity Deep Bayesian Active Learning
Reinforcement Learning with History Dependent Dynamic Contexts
ModelDiff: A Framework for Comparing Learning Algorithms
MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL
Randomized Schur Complement Views for Graph Contrastive Learning
AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems
On the Convergence Rate of Gaussianization with Random Rotations
SOM-CPC: Unsupervised Contrastive Learning with Self-Organizing Maps for Structured Representations of High-Rate Time Series
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Revisiting Sampling for Combinatorial Optimization
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Consistency of Multiple Kernel Clustering
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path
Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits
Non-asymptotic Properties of Individualized Treatment Rules from Sequentially Rule-Adaptive Trials
Long Horizon Temperature Scaling
Improving l1-Certified Robustness via Randomized Smoothing by Leveraging Box Constraints
Using Perturbation to Improve Goodness-of-Fit Tests based on Kernelized Stein Discrepancy
Understanding Self-Predictive Learning for Reinforcement Learning
Online Learning in Stackelberg Games with an Omniscient Follower
Low Complexity Homeomorphic Projection to Ensure Neural-Network Solution Feasibility for Optimization over (Non-)Convex Set
High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors
Fast Inference from Transformers via Speculative Decoding
Semi-Offline Reinforcement Learning for Optimized Text Generation
DRCFS: Doubly Robust Causal Feature Selection
Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling
On the Role of Attention in Prompt-tuning
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Learning to Learn from APIs: Black-Box Data-Free Meta-Learning
Multi-Task Off-Policy Learning from Bandit Feedback
A Statistical Perspective on Retrieval-Based Models
Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints
Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization
LegendreTron: Uprising Proper Multiclass Loss Learning
Modality-Agnostic Variational Compression of Implicit Neural Representations
The Persistent Laplacian for Data Science: Evaluating Higher-Order Persistent Spectral Representations of Data
Never mind the metrics---what about the uncertainty? Visualising binary confusion matrix metric distributions to put performance in perspective
SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
Off-Policy Average Reward Actor-Critic with Deterministic Policy Search
Quantum Speedups for Zero-Sum Games via Improved Dynamic Gibbs Sampling
Improving Adversarial Robustness of Deep Equilibrium Models with Explicit Regulations Along the Neural Dynamics
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
Near-Optimal $\Phi$-Regret Learning in Extensive-Form Games
Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization
Dink-Net: Neural Clustering on Large Graphs
What Can Be Learnt With Wide Convolutional Neural Networks?
Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
SRATTA: Sample Re-ATTribution Attack of Secure Aggregation in Federated Learning.
Theory on Forgetting and Generalization of Continual Learning
Internally Rewarded Reinforcement Learning
Generalization Bounds using Data-Dependent Fractal Dimensions
Flash: Concept Drift Adaptation in Federated Learning
DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning
Tight and fast generalization error bound of graph embedding in metric space
Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems
Modeling Dynamic Environments with Scene Graph Memory
A Scalable Frank-Wolfe-Based Algorithm for the Max-Cut SDP
High Fidelity Image Counterfactuals with Probabilistic Causal Models
On the Functional Similarity of Robust and Non-Robust Neural Representations
Provably Learning Object-Centric Representations
Open-Vocabulary Universal Image Segmentation with MaskCLIP
Active Policy Improvement from Multiple Black-box Oracles
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Compositional Exemplars for In-context Learning
In Search for a Generalizable Method for Source Free Domain Adaptation
From Relational Pooling to Subgraph GNNs: A Universal Framework for More Expressive Graph Neural Networks
ODS: Test-Time Adaptation in the Presence of Open-World Data Shift
User-level Private Stochastic Convex Optimization with Optimal Rates
NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes
Prototype-oriented unsupervised anomaly detection for multivariate time series
Best of Both Worlds Policy Optimization
Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process
ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation
High Probability Convergence of Stochastic Gradient Methods
abess: A Fast Best-Subset Selection Library in Python and R
Covariate balancing using the integral probability metric for causal inference
Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
Taxonomy-Structured Domain Adaptation
NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition
Robust Weak Supervision with Variational Auto-Encoders
Delay-agnostic Asynchronous Coordinate Update Algorithm
Task-specific experimental design for treatment effect estimation
Boosting Graph Contrastive Learning via Graph Contrastive Saliency
Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization
Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
Quantized Distributed Training of Large Models with Convergence Guarantees
Cramming: Training a Language Model on a single GPU in one day.
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Towards Stable and Efficient Adversarial Training against $l_1$ Bounded Adversarial Attacks
Learning Functional Distributions with Private Labels
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective
Are Equivariant Equilibrium Approximators Beneficial?
Knowledge Hypergraph Embedding Meets Relational Algebra
When do Minimax-fair Learning and Empirical Risk Minimization Coincide?
Random Grid Neural Processes for Parametric Partial Differential Equations
SNeRL: Semantic-aware Neural Radiance Fields for Reinforcement Learning
Weighted Sampling without Replacement for Deep Top-$k$ Classification
Efficient Learning of Mesh-Based Physical Simulation with Bi-Stride Multi-Scale Graph Neural Network
One-vs-the-Rest Loss to Focus on Important Samples in Adversarial Training
Fundamental Tradeoffs in Learning with Prior Information
Model-agnostic Measure of Generalization Difficulty
Scalable Safe Policy Improvement via Monte Carlo Tree Search
Quantum Policy Gradient Algorithm with Optimized Action Decoding
Boosting Offline Reinforcement Learning with Action Preference Query
Tight Certification of Adversarially Trained Neural Networks via Nonconvex Low-Rank Semidefinite Relaxations
Projected Tensor Power Method for Hypergraph Community Recovery
Retrieval-Augmented Multimodal Language Modeling
Neural Network Accelerated Implicit Filtering: Integrating Neural Network Surrogates With Provably Convergent Derivative Free Optimization Methods
BiBench: Benchmarking and Analyzing Network Binarization
On Data Manifolds Entailed by Structural Causal Models
The Acquisition of Physical Knowledge in Generative Neural Networks
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
Federated Heavy Hitter Recovery under Linear Sketching
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Equivariance with Learned Canonicalization Functions
FedDisco: Federated Learning with Discrepancy-Aware Collaboration
Gradient-based Wang–Landau Algorithm: A Novel Sampler for Output Distribution of Neural Networks over the Input Space
Federated Adversarial Learning: A Framework with Convergence Analysis
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning
Towards Learning to Imitate from a Single Video Demonstration
Speeding Up Bellman Ford via Minimum Violation Permutations
Fully Dynamic Submodular Maximization over Matroids
On the Forward Invariance of Neural ODEs
Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection
Graph Neural Networks with Learnable and Optimal Polynomial Bases
Masked Bayesian Neural Networks: Theoretical Guarantee and its Posterior Inference
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Multi-View Masked World Models for Visual Robotic Manipulation
Multisample Flow Matching: Straightening Flows with Minibatch Couplings
Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games
Do Not Train It: A Linear Neural Architecture Search of Graph Neural Networks
Towards Controlled Data Augmentations for Active Learning
Conformal Prediction for Federated Uncertainty Quantification Under Label Shift
Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces
Differentially Private Stochastic Convex Optimization under a Quantile Loss Function
SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation
In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation
Retrosynthetic Planning with Dual Value Networks
A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification
Cones: Concept Neurons in Diffusion Models for Customized Generation
Finding Generalization Measures by Contrasting Signal and Noise
On Uni-Modal Feature Learning in Supervised Multi-Modal Learning
Personalized Federated Learning under Mixture of Distributions
Evaluating Unsupervised Denoising Requires Unsupervised Metrics
On the Robustness of Randomized Ensembles to Adversarial Perturbations
A Closer Look at the Intervention Procedure of Concept Bottleneck Models
Offline Learning in Markov Games with General Function Approximation
Decoding Layer Saliency in Language Transformers
Nested Elimination: A Simple Algorithm for Best-Item Identification From Choice-Based Feedback
Provable Dynamic Fusion for Low-Quality Multimodal Data
Block Subsampled Randomized Hadamard Transform for Nyström Approximation on Distributed Architectures
One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill
Conditional Graph Information Bottleneck for Molecular Relational Learning
Domain Adaptation for Time Series Under Feature and Label Shifts
Strategic Classification with Unknown User Manipulations
Simplified Temporal Consistency Reinforcement Learning
Image Restoration with Mean-Reverting Stochastic Differential Equations
Transcendental Idealism of Planner: Evaluating Perception from Planning Perspective for Autonomous Driving
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
Implicit Jacobian regularization weighted with impurity of probability output
Adversarial Cheap Talk
Slot-VAE: Object-Centric Scene Generation with Slot Attention
Pre-training for Speech Translation: CTC Meets Optimal Transport
H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features
Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization
Robust and private stochastic linear bandits
Algorithms for bounding contribution for histogram estimation under user-level privacy
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
A Likelihood Approach to Nonparametric Estimation of a Singular Distribution Using Deep Generative Models
IRNeXt: Rethinking Convolutional Network Design for Image Restoration
TabLeak: Tabular Data Leakage in Federated Learning
Efficient Algorithms for Exact Graph Matching on Correlated Stochastic Block Models with Constant Correlation
Bootstrap in High Dimension with Low Computation
Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting
Simple Disentanglement of Style and Content in Visual Representations
Reparameterized Policy Learning for Multimodal Trajectory Optimization
Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills
Universal Physics-Informed Neural Networks: Symbolic Differential Operator Discovery with Sparse Data
Deep Graph Representation Learning and Optimization for Influence Maximization
A Large-Scale Study of Probabilistic Calibration in Neural Network Regression
Global optimality of Elman-type RNNs in the mean-field regime
Proper Losses for Discrete Generative Models
Adversarial Collaborative Learning on Non-IID Features
Multi-Agent Learning from Learners
On Sampling with Approximate Transport Maps
State and parameter learning with PARIS particle Gibbs
Inferring Relational Potentials in Interacting Systems
BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning
The Computational Complexity of Concise Hypersphere Classification
The Test of Tests: A Framework for Differentially Private Hypothesis Testing
CO-BED: Information-Theoretic Contextual Optimization via Bayesian Experimental Design
A Generalization of ViT/MLP-Mixer to Graphs
A Study on Transformer Configuration and Training Objective
Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-type Samplers
Deep Generative Symbolic Regression with Monte-Carlo-Tree-Search
Hyperparameters in Reinforcement Learning and How To Tune Them
Learning Hidden Markov Models When the Locations of Missing Observations are Unknown
Optimizing NOTEARS Objectives via Topological Swaps
Infinite Action Contextual Bandits with Reusable Data Exhaust
Efficient Quantum Algorithms for Quantum Optimal Control
SLAMB: Accelerated Large Batch Training with Sparse Communication
Random Shuffle Transformer for Image Restoration
Conditional Tree Matching for Inference-Time Adaptation of Tree Prediction Models
Robust Situational Reinforcement Learning in Face of Context Disturbances
Can Neural Network Memorization Be Localized?
Exphormer: Sparse Transformers for Graphs
The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning
Model-based Offline Reinforcement Learning with Count-based Conservatism
Topologically Faithful Image Segmentation via Induced Matching of Persistence Barcodes
Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
Fair Neighbor Embedding
Robust Camera Pose Refinement for Multi-Resolution Hash Encoding
Effective and Efficient Structural Inference with Reservoir Computing
Ewald-based Long-Range Message Passing for Molecular Graphs
General Sequential Episodic Memory Model
End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization
Constrained Monotonic Neural Networks
SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
Random Teachers are Good Teachers
Provably Convergent Schrödinger Bridge with Applications to Probabilistic Time Series Imputation
Improving Graph Neural Networks with Learnable Propagation Operators
Nearly-Optimal Hierarchical Clustering for Well-Clustered Graphs
Nonparametric Density Estimation under Distribution Drift
Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single
Difference-in-Differences Meets Tree-based Methods: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding
Scaling Spherical CNNs
AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation
Iterative Approximate Cross-Validation
Conformal Prediction Sets for Graph Neural Networks
Locally Regularized Neural Differential Equations: Some Black Boxes were meant to remain closed!
TabDDPM: Modelling Tabular Data with Diffusion Models
Out-of-Distribution Generalization of Federated Learning via Implicit Invariant Relationships
Distributed Contextual Linear Bandits with Minimax Optimal Communication Cost
Approximate Causal Effect Identification under Weak Confounding
Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction
Probabilistic Categorical Adversarial Attack and Adversarial Training
Alternately Optimized Graph Neural Networks
Complexity of Block Coordinate Descent with Proximal Regularization and Applications to Wasserstein CP-dictionary Learning
Node Embedding from Neural Hamiltonian Orbits in Graph Neural Networks
Robust Speech Recognition via Large-Scale Weak Supervision
A/B Testing in Network Data with Covariate-Adaptive Randomization
Coarse-to-Fine: a Hierarchical Diffusion Model for Molecule Generation in 3D
MEWL: Few-shot multimodal word learning with referential uncertainty
Fair and Robust Estimation of Heterogeneous Treatment Effects for Policy Learning
Forget Unlearning: Towards True Data-Deletion in Machine Learning
Extending Conformal Prediction to Hidden Markov Models with Exact Validity via de Finetti's Theorem for Markov Chains
Gradient Descent Converges Linearly for Logistic Regression on Separable Data
Regions of Reliability in the Evaluation of Multivariate Probabilistic Forecasts
Learning-augmented private algorithms for multiple quantile release
Learning Deep Time-index Models for Time Series Forecasting
Spatial Implicit Neural Representations for Global-Scale Species Mapping
On Preemption and Learning in Stochastic Scheduling
Controlled Differential Equations on Long Sequences via Non-standard Wavelets
LookupFFN: Making Transformers Compute-lite for CPU inference
Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization
Compressing Tabular Data via Latent Variable Estimation
A Connection between One-Step RL and Critic Regularization in Reinforcement Learning
Denoising MCMC for Accelerating Diffusion-Based Generative Models
Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains
Few-Sample Feature Selection via Feature Manifold Learning
Representation-Driven Reinforcement Learning
How to Trust Your Diffusion Model: A Convex Optimization Approach to Conformal Risk Control
Taming graph kernels with random features
Causal Discovery with Latent Confounders Based on Higher-Order Cumulants
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Neural Stochastic Differential Games for Time-series Analysis
Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation
Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation
Delayed Bandits: When Do Intermediate Observations Help?
On Over-Squashing in Message Passing Neural Networks: The Impact of Width, Depth, and Topology
Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search
Adversarially Robust PAC Learnability of Real-Valued Functions
Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees
Mixing Predictions for Online Metric Algorithms
Reinforcement Learning in Low-rank MDPs with Density Features
The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing
Sequential Monte Carlo Learning for Time Series Structure Discovery
Multicalibration as Boosting for Regression
Reasons for the Superiority of Stochastic Estimators over Deterministic Ones: Robustness, Consistency and Perceptual Quality
Cut your Losses with Squentropy
Sparse Learning of Dynamical Systems in RKHS: An Operator-Theoretic Approach
K-SHAP: Policy Clustering Algorithm for Anonymous Multi-Agent State-Action Pairs
Towards Understanding Generalization of Macro-AUC in Multi-label Learning
Controllable Neural Symbolic Regression
Stable Estimation of Heterogeneous Treatment Effects
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
Constraint Reasoning Embedded Structured Prediction
Optimal Convergence Rates for Agnostic Nyström Kernel Learning
Active Ranking of Experts Based on their Performances in Many Tasks
Efficient Parametric Approximations of Neural Network Function Space Distance
Improving Expert Predictions with Conformal Prediction
Sketched Ridgeless Linear Regression: The Role of Downsampling
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
Shortest Edit Path Crossover: A Theory-driven Solution to the Permutation Problem in Evolutionary Neural Architecture Search
Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models
Fisher Information Embedding for Node and Graph Learning
Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization
Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming
Naive imputation implicitly regularizes high-dimensional linear models
Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function
Likelihood Adjusted Semidefinite Programs for Clustering Heterogeneous Data
Dual Focal Loss for Calibration
Coin Sampling: Gradient-Based Bayesian Inference without Learning Rates
Distribution Free Domain Generalization
Why Target Networks Stabilise Temporal Difference Methods
Are Gaussian Data All You Need? The Extents and Limits of Universality in High-Dimensional Generalized Linear Estimation
Machine Learning Force Fields with Data Cost Aware Training
Principled Offline RL in the Presence of Rich Exogenous Information
Bayesian online change point detection with Hilbert space approximate Student-t process
BNN-DP: Robustness Certification of Bayesian Neural Networks via Dynamic Programming
Why do Nearest Neighbor Language Models Work?
QAS-Bench: Rethinking Quantum Architecture Search and A Benchmark
Bilevel Optimization with Coupled Decision-Dependent Distributions
Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments
Variational Mixture of HyperGenerators for Learning Distributions over Functions
Nearly Optimal Competitive Ratio for Online Allocation Problems with Two-sided Resource Constraints and Finite Requests
Statistical Indistinguishability of Learning Algorithms
Graph Mixup with Soft Alignments
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
Parallel Neurosymbolic Integration with Concordia
Posterior Sampling for Deep Reinforcement Learning
Reinforcement Learning Can Be More Efficient with Multiple Rewards
2D-Shapley: A Framework for Fragmented Data Valuation
One-Shot Federated Conformal Prediction
E$(n)$ Equivariant Message Passing Simplicial Networks
Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits
DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models
Recovery Bounds on Class-Based Optimal Transport: A Sum-of-Norms Regularization Framework
From Adaptive Query Release to Machine Unlearning
Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing
Optimal LP Rounding and Linear-Time Approximation Algorithms for Clustering Edge-Colored Hypergraphs
Surrogate Module Learning: Reduce the Gradient Error Accumulation in Training Spiking Neural Networks
Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization
Automatically Auditing Large Language Models via Discrete Optimization
Efficient Transformed Gaussian Processes for Non-Stationary Dependent Multi-class Classification
Learning to Initiate and Reason in Event-Driven Cascading Processes
Robust and Scalable Bayesian Online Changepoint Detection
In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation
Graph Positional Encoding via Random Feature Propagation
Estimating Possible Causal Effects with Latent Variables via Adjustment
Benign Overfitting in Deep Neural Networks under Lazy Training
A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints
Explainability as statistical inference
Learning Prescriptive ReLU Networks
Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation
On the Robustness of Text Vectorizers
Kernel Logistic Regression Approximation of an Understandable ReLU Neural Network
Learning Compiler Pass Orders using Coreset and Normalized Value Prediction
Robustness in Multimodal Learning under Train-Test Modality Mismatch
Achieving Linear Speedup in Non-IID Federated Bilevel Learning
Estimation Beyond Data Reweighting: Kernel Method of Moments
General Covariance Data Augmentation for Neural PDE Solvers
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
PASTA: Pessimistic Assortment Optimization
Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability
Thompson Sampling with Less Exploration is Fast and Optimal
FARE: Provably Fair Representation Learning with Practical Certificates
Stabilizing Transformer Training by Preventing Attention Entropy Collapse
Optimally-weighted Estimators of the Maximum Mean Discrepancy for Likelihood-Free Inference
Tuning Computer Vision Models With Task Rewards
Quantifying the Variability Collapse of Neural Networks
Polarity Is All You Need to Learn and Transfer Faster
A Unified Optimization Framework of ANN-SNN Conversion: Towards Optimal Mapping from Activation Values to Firing Rates
Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization
Learning to Design Analog Circuits to Meet Threshold Specifications
Constrained Causal Bayesian Optimization
Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning
A Modern Look at the Relationship between Sharpness and Generalization
Learning Unnormalized Statistical Models via Compositional Optimization
Half-Hop: A graph upsampling approach for slowing down message passing
Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs
Attributing Image Generative Models using Latent Fingerprints
Last Switch Dependent Bandits with Monotone Payoff Functions
Distribution Free Prediction Sets for Node Classification
StriderNet: A Graph Reinforcement Learning Approach to Optimize Atomic Structures on Rough Energy Landscapes
Deterministic equivalent and error universality of deep random features learning
Demystifying Disagreement-on-the-Line in High Dimensions
MixFlows: principled variational inference via mixed flows
GRAFENNE: Learning on Graphs with Heterogeneous and Dynamic Feature Sets
Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization
A Kernelized Stein Discrepancy for Biological Sequences
Minimax estimation of discontinuous optimal transport maps: The semi-discrete case
Estimating Causal Effects using a Multi-task Deep Ensemble
Unveiling the Latent Space Geometry of Push-Forward Generative Models
Model-Aware Contrastive Learning: Towards Escaping the Dilemmas
Theoretical Bounds on the Network Community Profile from Low-rank Semi-definite Programming
Conditionally Strongly Log-Concave Generative Models
Nugget: Neural Agglomerative Embeddings of Text
Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning
BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
PAL: Program-aided Language Models
Causal Proxy Models for Concept-based Model Explanations
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
An Information-Theoretic Analysis of Nonstationary Bandit Learning
GuardHFL: Privacy Guardian for Heterogeneous Federated Learning
Federated Linear Contextual Bandits with User-level Differential Privacy
NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion
Context Consistency Regularization for Label Sparsity in Time Series
A Conditional Normalizing Flow for Accelerated Multi-Coil MR Imaging
Evaluating Self-Supervised Learning via Risk Decomposition
Homomorphism AutoEncoder --- Learning Group Structured Representations from Observed Transitions
Global optimality for Euclidean CCCP under Riemannian convexity
Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input
Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation
PaLM-E: An Embodied Multimodal Language Model
Universal Morphology Control via Contextual Modulation
Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning
Learning Control by Iterative Inversion
Uncertainty Estimation by Fisher Information-based Evidential Deep Learning
Do Perceptually Aligned Gradients Imply Robustness?
Analyzing Diffusion as Serial Reproduction
Certified Robust Neural Networks: Generalization and Corruption Resistance
FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models
Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions
MANSA: Learning Fast and Slow in Multi-Agent Systems
Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes
Generative Decoding of Visual Stimuli
Beyond the Edge of Stability via Two-step Gradient Updates
Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Contextual Conservative Interleaving Bandits
Efficient List-Decodable Regression using Batches
Do Machine Learning Models Learn Statistical Rules Inferred from Data?
Emergence of Adaptive Circadian Rhythms in Deep Reinforcement Learning
Deep Temporal Sets with Evidential Reinforced Attentions for Unique Behavioral Pattern Discovery
Learning Optimal Group-structured Individualized Treatment Rules with Many Treatments
Tighter Bounds on the Expressivity of Transformer Encoders
Parameter-Level Soft-Masking for Continual Learning
Learnability and Algorithm for Continual Learning
TGRL: An Algorithm for Teacher Guided Reinforcement Learning
Importance Weighted Expectation-Maximization for Protein Sequence Design
Graph Switching Dynamical Systems
Protecting Language Generation Models via Invisible Watermarking
ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval
Perturbation Analysis of Neural Collapse
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Von Mises Mixture Distributions for Molecular Conformation Generation
The Power of Learned Locally Linear Models for Nonlinear Policy Optimization
Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks
System Identification of Neural Systems: If We Got It Right, Would We Know?
Multi-Agent Best Arm Identification with Private Communications
Muse: Text-To-Image Generation via Masked Generative Transformers
Learn to Accumulate Evidence from All Training Samples: Theory and Practice
Pairwise Ranking Losses of Click-Through Rates Prediction for Welfare Maximization in Ad Auctions
Addressing Budget Allocation and Revenue Allocation in Data Market Environments Using an Adaptive Sampling Algorithm
Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees
Communication-Constrained Bandits under Additive Gaussian Noise
Everyone's Preference Changes Differently: A Weighted Multi-Interest Model For Retrieval
Poisoning Generative Replay in Continual Learning to Promote Forgetting
Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models
The Regret of Exploration and the Control of Bad Episodes in Reinforcement Learning
STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition
Supervised Metric Learning to Rank for Retrieval via Contextual Similarity Optimization
Constrained Phi-Equilibria
Optimal Rates and Efficient Algorithms for Online Bayesian Persuasion
Flexible Phase Dynamics for Bio-Plausible Contrastive Learning
How much does Initialization Affect Generalization?
Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks
ClusterFuG: Clustering Fully connected Graphs by Multicut
Motion Question Answering via Modular Motion Programs
Statistical Foundations of Prior-Data Fitted Networks
RLEG: Vision-Language Representation Learning with Diffusion-based Embedding Generation
Partially Observable Multi-agent RL with (Quasi-)Efficiency: The Blessing of Information Sharing
Linear Causal Disentanglement via Interventions
Neural Algorithmic Reasoning with Causal Regularisation
A theory of representation learning gives a deep generalisation of kernel methods
PromptBoosting: Black-Box Text Classification with Ten Forward Passes
Hierarchies of Reward Machines
Nearly-tight Bounds for Deep Kernel Learning
Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning
Fractional Denoising for 3D Molecular Pre-training
GNN&GBDT-Guided Fast Optimizing Framework for Large-scale Integer Programming
Optimization for Amortized Inverse Problems
Causal Bounds in Quasi-Markovian Graphs
spred: Solving L1 Penalty with SGD
Evidential Interactive Learning for Medical Image Captioning
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
RGE: A Repulsive Graph Rectification for Node Classification via Influence
Investigating the Role of Model-Based Learning in Exploration and Transfer
Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks
Efficient RL via Disentangled Environment and Agent Representations
Divide and Conquer Dynamic Programming: An Almost Linear Time Change Point Detection Methodology in High Dimensions
Online Mechanism Design for Information Acquisition
Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion
Directed Chain Generative Adversarial Networks
Layered State Discovery for Incremental Autonomous Exploration
Bayes-optimal Learning of Deep Random Networks of Extensive-width
CataBEEM: Integrating Latent Interaction Categories in Node-wise Community Detection Models for Network Data
Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming
Under-Counted Tensor Completion with Neural Incorporation of Attributes
Polyhedral Complex Extraction from ReLU Networks using Edge Subdivision
Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA
A Closer Look at Few-shot Classification Again
QuantumDARTS: Differentiable Quantum Architecture Search for Variational Quantum Algorithms
Sequential Multi-Dimensional Self-Supervised Learning for Clinical Time Series
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion
GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration
Quantum 3D Graph Learning with Applications to Molecule Embedding
Submodular Order Functions and Assortment Optimization
Go Beyond Imagination: Maximizing Episodic Reachability with World Models
Fascinating Supervisory Signals and Where to Find Them: Deep Anomaly Detection with Scale Learning
Bit Allocation using Optimization
Interventional Causal Representation Learning
Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity
Tied-Augment: Controlling Representation Similarity Improves Data Augmentation
Mimetic Initialization of Self-Attention Layers
Text-To-4D Dynamic Scene Generation
Towards Quantum Machine Learning for Constrained Combinatorial Optimization: a Quantum QAP Solver
Minimalistic Predictions to Schedule Jobs with Online Precedence Constraints
Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction
Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues
Subset Selection Based On Multiple Rankings in the Presence of Bias: Effectiveness of Fairness Constraints for Multiwinner Voting Score Functions
Phase-aware Adversarial Defense for Improving Adversarial Robustness
Bayesian Design Principles for Frequentist Sequential Learning
Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality
Language Instructed Reinforcement Learning for Human-AI Coordination
Generated Graph Detection
The Catalog Problem: Clustering and Ordering Variable-Sized Sets
Adaptive Smoothing Gradient Learning for Spiking Neural Networks
A Theoretical Analysis of the Learning Dynamics under Class Imbalance
PCA-based Multi-Task Learning: a Random Matrix Approach
CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification
Efficient preconditioned stochastic gradient descent for estimation in latent variable models
Multi-Layer Neural Networks as Trainable Ladders of Hilbert Spaces
Lower Bounds for Learning in Revealing POMDPs
On the Relationship Between Explanation and Prediction: A Causal View
Revisiting Domain Randomization via Relaxed State-Adversarial Policy Optimization
SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data
Expected Gradients of Maxout Networks and Consequences to Parameter Initialization
Task-Specific Skill Localization in Fine-tuned Language Models
Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction
Efficient Approximations of Complete Interatomic Potentials for Crystal Property Prediction
Integrating Prior Knowledge in Contrastive Learning with Kernel
Polynomial Preconditioning for Gradient Methods
From Robustness to Privacy and Back
Fast as CHITA: Neural Network Pruning with Combinatorial Optimization
Private Federated Learning with Autotuned Compression
Proper Scoring Rules for Survival Analysis
CRISP: Curriculum based Sequential neural decoders for Polar code family
FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation
Graphically Structured Diffusion Models
On Many-Actions Policy Gradient
Monge, Bregman and Occam: Interpretable Optimal Transport in High-Dimensions with Feature-Sparse Maps
The Statistical Scope of Multicalibration
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing
Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition
Stabilizing GANs' Training with Brownian Motion Controller
Featured Graph Coarsening with Similarity Guarantees
Biases in Evaluation of Molecular Optimization Methods and Bias Reduction Strategies
Fighting Fire with Fire: Contrastive Debiasing without Bias-free Data via Generative Bias-transformation
Improved Online Learning Algorithms for CTR Prediction in Ad Auctions
Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation
Width and Depth Limits Commute in Residual Networks
Continual Learners are Incremental Model Generalizers
Online Platt Scaling with Calibeating
SpENCNN: Orchestrating Encoding and Sparsity for Fast Homomorphically Encrypted Neural Network Inference
Data Feedback Loops: Model-driven Amplification of Dataset Biases
Variational Autoencoding Neural Operators
Meta-Learning the Inductive Bias of Simple Neural Circuits
Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization
Predictive Flows for Faster Ford-Fulkerson
Neural Latent Aligner: Cross-trial Alignment for Learning Representations of Complex, Naturalistic Neural Data
Can Large Language Models Reason about Program Invariants?
Provable Reset-free Reinforcement Learning by No-Regret Reduction
On the Impact of Algorithmic Recourse on Social Segregation
Understanding and Generalizing Contrastive Learning from the Inverse Optimal Transport Perspective
Dropout Reduces Underfitting
Towards a Persistence Diagram that is Robust to Noise and Varied Densities
Conformal Prediction with Missing Values
Diffusion Models are Minimax Optimal Distribution Estimators
DualHSIC: HSIC-Bottleneck and Alignment for Continual Learning
MODeL: Memory Optimizations for Deep Learning
Optimal Sets and Solution Paths of ReLU Networks
Traversing Between Modes in Function Space for Fast Ensembling
Fast Rates in Time-Varying Strongly Monotone Games
Tighter Information-Theoretic Generalization Bounds from Supersamples
Learning in POMDPs is Sample-Efficient with Hindsight Observability
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Atari-5: Distilling the Arcade Learning Environment down to Five Games
Data-Driven Subgroup Identification for Linear Regression
Supported Trust Region Optimization for Offline Reinforcement Learning
Multi-Objective Population Based Training
Differentially Private Sharpness-Aware Training
Nonparametric Extensions of Randomized Response for Private Confidence Sets
Statistical Learning under Heterogenous Distribution Shift
Spatial-Temporal Graph Learning with Adversarial Contrastive Adaptation
Lookahead When It Matters: Adaptive Non-causal Transformers for Streaming Neural Transducers
Adaptive Coordination in Social Embodied Rearrangement
Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
MonoFlow: Rethinking Divergence GANs via the Perspective of Wasserstein Gradient Flows
Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods
Bigger, Better, Faster: Human-level Atari with human-level efficiency
PLay: Parametrically Conditioned Layout Generation using Latent Diffusion
Repository-Level Prompt Generation for Large Language Models of Code
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
Bandits with Knapsacks: Advice on Time-Varying Demands
Does Sparsity Help in Learning Misspecified Linear Bandits?
The SSL Interplay: Augmentations, Inductive Bias, and Generalization
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Improved Algorithms for White-Box Adversarial Streams
GREAD: Graph Neural Reaction-Diffusion Networks
Structural Re-weighting Improves Graph Domain Adaptation
One-Step Estimator for Permuted Sparse Recovery
Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems
PAC Generalization via Invariant Representations
Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts
Adaptive IMLE for Few-shot Pretraining-free Generative Modelling
Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits
On Excess Mass Behavior in Gaussian Mixture Models with Orlicz-Wasserstein Distances
Toward Large Kernel Models
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs
Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes
Efficient Online Reinforcement Learning with Offline Data
Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains
Moccasin: Efficient Tensor Rematerialization for Neural Networks
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts
Wrapped Cauchy Distributed Angular Softmax for Long-Tailed Visual Recognition
Change is Hard: A Closer Look at Subpopulation Shift
Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints
Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano
Cocktail Party Attack: Breaking Aggregation-Based Privacy in Federated Learning Using Independent Component Analysis
On Penalty-based Bilevel Gradient Descent Method
Fairness in Matching under Uncertainty
Meta Learning of Interface Conditions for Multi-Domain Physics-Informed Neural Networks
Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning
Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models
On Regularization and Inference with Label Constraints
Why does Throwing Away Data Improve Worst-Group Error?
Unsupervised Skill Discovery for Learning Shared Structures across Changing Environments
Leveraging Proxy of Training Data for Test-Time Adaptation
Generating Language Corrections for Teaching Physical Control Tasks
Predictable MDP Abstraction for Unsupervised Model-Based RL
Variance Control for Distributional Reinforcement Learning
Anti-Exploration by Random Network Distillation
Revisiting Bellman Errors for Offline Model Selection
Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?
Interactive Object Placement with Reinforcement Learning
Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data
GC-Flow: A Graph-Based Flow Network for Effective Clustering
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
Coordinate Descent Methods for Fractional Minimization
Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
Network Effects in Performative Prediction Games
On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU Network
Best Arm Identification in Multi-Agent Multi-Armed Bandits
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Approximation Algorithms for Fair Range Clustering
Probabilistic Attention-to-Influence Neural Models for Event Sequences
RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy
Improving Fair Training under Correlation Shifts
ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging
Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
End-to-End Multi-Object Detection with a Regularized Mixture Model
Confidence and Dispersity Speak: Characterizing Prediction Matrix for Unsupervised Accuracy Estimation
Towards Explaining Distribution Shifts
RSC: Accelerate Graph Neural Networks Training via Randomized Sparse Computations
Delving into Noisy Label Detection with Clean Data
Smooth Non-stationary Bandits
On Kinetic Optimal Probability Paths for Generative Models
Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry
Controlling Type Confounding in Ad Hoc Teamwork with Instance-wise Teammate Feedback Rectification
Restoration based Generative Models
MAGANet: Achieving Combinatorial Generalization by Modeling a Group Action
Feature Programming for Multivariate Time Series Prediction
Reliable Measures of Spread in High Dimensional Latent Spaces
Bayesian Estimation of Differential Privacy
Learning useful representations for shifting tasks and distributions
Toward Efficient Gradient-Based Value Estimation
All in a Row: Compressed Convolution Networks for Graphs
Dynamics-inspired Neuromorphic Visual Representation Learning
Stable and Consistent Prediction of 3D Characteristic Orientation via Invariant Residual Learning
Neural networks trained with SGD learn distributions of increasing complexity
Rethinking Weak Supervision in Helping Contrastive Learning
Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Learning Intuitive Policies Using Action Features
DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design
Sharper Bounds for $\ell_p$ Sensitivity Sampling
Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients
Formalizing Preferences Over Runtime Distributions
Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories
Pricing Experimental Design: Causal Effect, Expected Revenue and Tail Risk
Differentiable and Transportable Structure Learning
Proximal Causal Learning of Conditional Average Treatment Effects
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
Weakly Supervised Regression with Interval Targets
Deep Latent State Space Models for Time-Series Generation
Implicit Neural Spatial Representations for Time-dependent PDEs
Improving Bi-level Optimization Based Methods with Inspiration from Humans' Classroom Study Techniques
DiscoBAX - Discovery of optimal intervention sets in genomic experiment design
Sample Complexity of Probability Divergences under Group Symmetry
Learning Instance-Specific Augmentations by Capturing Local Invariances
HarsanyiNet: Computing Accurate Shapley Values in a Single Forward Propagation
Learning Temporally Abstract World Models without Online Experimentation
Input uncertainty propagation through trained neural networks
Sequential Counterfactual Risk Minimization
Applied Online Algorithms with Heterogeneous Predictors
Omnipredictors for Constrained Optimization
Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees
What can online reinforcement learning with function approximation benefit from general coverage conditions?
An Effective Meaningful Way to Evaluate Survival Models
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization
Counterfactual Analysis in Dynamic Latent State Models
On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization
Fully-Adaptive Composition in Differential Privacy
Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
Settling the Reward Hypothesis
Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
TAN Without a Burn: Scaling Laws of DP-SGD
Quantile Credit Assignment
The Benefits of Model-Based Generalization in Reinforcement Learning
SpeedDETR: Speed-aware Transformers for End-to-end Object Detection
Provably and Practically Efficient Neural Contextual Bandits
Quantum Ridgelet Transform: Winning Lottery Ticket of Neural Networks with Quantum Computation
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Distributed Linear Bandits under Communication Constraints
A Unifying Framework to the Analysis of Interaction Methods using Synergy Functions
Sequential Kernelized Independence Testing
Sequential Strategic Screening
Automated Search for Conjectures on Mathematical Constants using Analysis of Integer Sequences
Provable Multi-instance Deep AUC Maximization with Stochastic Pooling
TIDE: Time Derivative Diffusion for Deep Learning on Graphs
Geometric Latent Diffusion Models for 3D Molecule Generation
On the Statistical Benefits of Temporal Difference Learning
Information-Theoretic State Space Model for Multi-View Reinforcement Learning
Continual Vision-Language Representation Learning with Off-Diagonal Information
Private Statistical Estimation of Many Quantiles
AbODE: Ab initio antibody design using conjoined ODEs
Trustworthy Policy Learning under the Counterfactual No-Harm Criterion
Propensity Matters: Measuring and Enhancing Balancing for Recommendation
Improving Graph Generation by Restricting Graph Bandwidth
Solving Linear Programs with Fast Online Learning Algorithms
LESS-VFL: Communication-Efficient Feature Selection for Vertical Federated Learning
Robust Collaborative Learning with Linear Gradient Overhead
Towards Understanding and Improving GFlowNet Training
MALTS: Matching After Learning to Stretch
PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation
Efficient Training of Language Models using Few-Shot Learning
A Universal Unbiased Method for Classification from Aggregate Observations
On the Convergence of SARSA with Linear Function Approximation
Mitigating Memorization of Noisy Labels by Clipping the Model Prediction
PAC-Bayesian Generalization Bounds for Adversarial Generative Models
Fairness in Streaming Submodular Maximization over a Matroid Constraint
Optimal randomized multilevel Monte Carlo for repeatedly nested expectations
PAC Prediction Sets for Large Language Models of Code
Scalable Adaptive Computation for Iterative Generation
ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction
Sequential Changepoint Detection via Backward Confidence Sequences
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Distribution-dependent McDiarmid-type Inequalities for Functions of Unbounded Interaction
On the Privacy-Robustness-Utility Trilemma in Distributed Learning
Identifiability of Label Noise Transition Matrix
Model Transferability with Responsive Decision Subjects
Intrinsic Sliced Wasserstein Distances for Comparing Collections of Probability Distributions on Manifolds and Graphs
Concurrent Shuffle Differential Privacy Under Continual Observation
Weak Proxies are Sufficient and Preferable for Fairness with Missing Sensitive Attributes
Efficient displacement convex optimization with particle gradient descent
PPG Reloaded: An Empirical Study on What Matters in Phasic Policy Gradient
Collaborative Causal Inference with Fair Incentives
Fair yet Asymptotically Equal Collaborative Learning
Global Context Vision Transformers
Distortion and Uncertainty Aware Loss for Panoramic Depth Completion
A Kernel-Based View of Language Model Fine-Tuning
From Perception to Programs: Regularize, Overparameterize, and Amortize
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Geometric Clifford Algebra Networks
Efficiently predicting high resolution mass spectra with graph neural networks
Learning Mixtures of Markov Chains and MDPs
Generalized Implicit Follow-The-Regularized-Leader
Spurious Valleys and Clustering Behavior of Neural Networks
Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
A Robust Test for the Stationarity Assumption in Sequential Decision Making
Towards a better understanding of representation dynamics under TD-learning
Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation
Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching
Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity
Explainable Data-Driven Optimization: From Context to Decision and Back Again
Why Random Pruning Is All We Need to Start Sparse
Direct Parameterization of Lipschitz-Bounded Deep Networks
FREDIS: A Fusion Framework of Refinement and Disambiguation for Unreliable Partial Label Learning
Generalization Analysis for Contrastive Representation Learning
An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning
Predicting Rare Events by Shrinking Towards Proportional Odds
Online Nonstochastic Control with Adversarial and Static Constraints
Multi-task Hierarchical Adversarial Inverse Reinforcement Learning
Surface Snapping Optimization Layer for Single Image Object Shape Reconstruction
Relevant Walk Search for Explaining Graph Neural Networks
VectorMapNet: End-to-end Vectorized HD Map Learning
Trading-Off Payments and Accuracy in Online Classification with Paid Stochastic Experts
Representer Point Selection for Explaining Regularized High-dimensional Models
Estimating the Contamination Factor's Distribution in Unsupervised Anomaly Detection
Communication-Efficient Federated Hypergradient Computation via Aggregated Iterative Differentiation
Cell-Free Latent Go-Explore
Towards Understanding Generalization of Graph Neural Networks
The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent
Fair and Optimal Classification via Post-Processing
Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels
Graph Neural Tangent Kernel: Convergence on Large Graphs
Tractable Control for Autoregressive Language Generation
Speed-Oblivious Online Scheduling: Knowing (Precise) Speeds is not Necessary
Graph Contrastive Backdoor Attacks
Jump-Start Reinforcement Learning
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
The Impact of Exploration on Convergence and Performance of Multi-Agent Q-Learning Dynamics
Vector-Valued Control Variates
Algorithmic Collective Action in Machine Learning
Causal Structure Learning for Latent Intervened Non-stationary Data
Neural Inverse Operators for Solving PDE Inverse Problems
A Distribution Optimization Framework for Confidence Bounds of Risk Measures
Exact Inference in High-order Structured Prediction
On the Complexity of Bayesian Generalization
Attribute-Efficient PAC Learning of Low-Degree Polynomial Threshold Functions with Nasty Noise
SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance
On the Convergence of Gradient Flow on Multi-layer Linear Models
Unscented Autoencoder
Individually Fair Learning with One-Sided Feedback
Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction
Training Deep Surrogate Models with Large Scale Online Learning
Quantum Lower Bounds for Finding Stationary Points of Nonconvex Functions
Near-Optimal Quantum Coreset Construction Algorithms for Clustering
CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations
Differentiable Tree Operations Promote Compositional Generalization
CrossSplit: Mitigating Label Noise Memorization through Data Splitting
Generalizing Neural Wave Functions
Deep Laplacian-based Options for Temporally-Extended Exploration
Fourmer: An Efficient Global Modeling Paradigm for Image Restoration
Shapley Based Residual Decomposition for Instance Analysis
A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining
Reachability-Aware Laplacian Representation in Reinforcement Learning
Provably Invariant Learning without Domain Information
Improved Online Conformal Prediction via Strongly Adaptive Online Learning
Total Variation Graph Neural Networks
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
Hidden Symmetries of ReLU Networks
Topological Singularity Detection at Multiple Scales
Better Training of GFlowNets with Local Credit and Incomplete Trajectories
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection
Symmetry-Aware Robot Design with Structured Subgroups
Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning
Provable Data Subset Selection For Efficient Neural Networks Training
GFlowOut: Dropout with Generative Flow Networks
FedCR: Personalized Federated Learning Based on Across-Client Common Representation with Conditional Mutual Information Regularization
Controllability-Aware Unsupervised Skill Discovery
ChiPFormer: Transferable Chip Placement via Offline Decision Transformer
Towards credible visual model interpretation with path attribution
Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping
Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes
The Edge of Orthogonality: A Simple View of What Makes BYOL Tick
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
Hyperbolic Image-text Representations
LongCoder: A Long-Range Pre-trained Language Model for Code Completion
WL meet VC
Regret-Minimizing Double Oracle for Extensive-Form Games
Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions
Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning
Personalized Federated Learning with Inferred Collaboration Graphs
Run-off Election: Improved Provable Defense against Data Poisoning Attacks
Regret Minimization and Convergence to Equilibria in General-sum Markov Games
A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel
Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks
Hyperbolic Representation Learning: Revisiting and Advancing
RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution
No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation
Magneto: A Foundation Transformer
Minimizing Trajectory Curvature of ODE-based Generative Models
How Jellyfish Characterise Alternating Group Equivariant Neural Networks
Hyperbolic Diffusion Embedding and Distance for Hierarchical Representation Learning
Safe Offline Reinforcement Learning with Real-Time Budget Constraints
Improved Active Multi-Task Representation Learning via Lasso
Rethinking Visual Reconstruction: Experience-Based Content Completion Guided by Visual Cues
SlotGAT: Slot-based Message Passing for Heterogeneous Graphs
Hierarchical Diffusion for Offline Decision Making
Stochastic Gradient Descent under Markovian Sampling Schemes
Mitigating the Effects of Non-Identifiability on Inference for Bayesian Neural Networks with Latent Variables
Difference of submodular minimization via DC programming
Variational Sparse Inverse Cholesky Approximation for Latent Gaussian Processes via Double Kullback-Leibler Minimization
Nonparametric Iterative Machine Teaching
A Fast Optimistic Method for Monotone Variational Inequalities
Deep Regression Unlearning
A Robust Optimisation Perspective on Counterexample-Guided Repair of Neural Networks
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity
Existence and Estimation of Critical Batch Size for Training Generative Adversarial Networks with Two Time-Scale Update Rule
Neural Prediction Errors enable Analogical Visual Reasoning in Human Standard Intelligence Tests
Minimal Width for Universal Property of Deep RNN
Mixture Proportion Estimation Beyond Irreducibility
Sliced-Wasserstein on Symmetric Positive Definite Matrices for M/EEG Signals
Reinforcement Learning from Passive Data via Latent Intentions
Refined Regret for Adversarial MDPs with Linear Function Approximation
Label differential privacy and private training data release
Metagenomic Binning using Connectivity-constrained Variational Autoencoders
Automatically marginalized MCMC in probabilistic programming
Calibrating Multimodal Learning
On the Optimality of Misspecified Kernel Ridge Regression
Actor-Critic Alignment for Offline-to-Online Reinforcement Learning
Harmonic Neural Networks
Approximately Optimal Core Shapes for Tensor Decompositions
COLA: Orchestrating Error Coding and Learning for Robust Neural Network Inference Against Hardware Defects
A Flexible Diffusion Model
LazyGNN: Large-Scale Graph Neural Networks via Lazy Propagation
Generative Graph Dictionary Learning
Solving High-Dimensional PDEs with Latent Spectral Models
Distance Weighted Supervised Learning for Offline Interaction Data
Optimal Arms Identification with Knapsacks
Effective Structured Prompting by Meta-Learning and Representative Verbalizer
Markovian Gaussian Process Variational Autoencoders
Active causal structure learning with advice
New metrics and search algorithms for weighted causal DAGs
End-to-end Differentiable Clustering with Associative Memories
Trainability, Expressivity and Interpretability in Gated Neural ODEs
Differentially Private Hierarchical Clustering with Provable Approximation Guarantees
Adapting to game trees in zero-sum imperfect information games
Fast Excess Risk Rates via Offset Rademacher Complexity
Alternating Local Enumeration (TnALE): Solving Tensor Network Structure Search with Fewer Evaluations
Reward-Mixing MDPs with Few Latent Contexts are Learnable
Maximal Initial Learning Rates in Deep ReLU Networks
Improving Visual Prompt Tuning for Self-supervised Vision Transformers
One-Shot Compression of Large Edge-Exchangeable Graphs using Bits-Back Coding
Instrumental Variable Estimation of Average Partial Causal Effects
Path Neural Networks: Expressive and Accurate Graph Neural Networks
Action Matching: Learning Stochastic Dynamics from Samples
How to address monotonicity for model risk management?
IncDSI: Incrementally Updatable Document Retrieval
Computational Asymmetries in Robust Classification
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
GNOT: A General Neural Operator Transformer for Operator Learning
NUNO: A General Framework for Learning Parametric PDEs with Non-Uniform Data
Robust Subtask Learning for Compositional Generalization
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
DRew: Dynamically Rewired Message Passing with Delay
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication
Are Large Kernels Better Teachers than Transformers for ConvNets?
GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations
The case for 4-bit precision: k-bit Inference Scaling Laws
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Generating Private Synthetic Data with Genetic Algorithms
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Causal Strategic Classification: A Tale of Two Shifts
Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
Bootstrapped Representations in Reinforcement Learning
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms
VA-learning as a more efficient alternative to Q-learning
Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
An SDE for Modeling SAM: Theory and Insights
SeedGNN: Graph Neural Network for Supervised Seeded Graph Matching
Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift
Learning Unforeseen Robustness from Out-of-distribution Data Using Equivariant Domain Translator
Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs
Causal Modeling of Policy Interventions From Treatment–Outcome Sequences
Surrogate Model Extension (SME): A Fast and Accurate Weight Update Attack on Federated Learning
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network
The Monge Gap: A Regularizer to Learn All Transport Maps
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs
Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations
Extrapolative Controlled Sequence Generation via Iterative Refinement
Sampling-based Nyström Approximation and Kernel Quadrature
Global Optimization with Parametric Function Approximation
Towards Constituting Mathematical Structures for Learning to Optimize
On the Initialization of Graph Neural Networks
From Hypergraph Energy Functions to Hypergraph Neural Networks
Data Efficient Neural Scaling Law via Model Reusing
Learning to Optimize Differentiable Games
Cluster Explanation via Polyhedral Descriptions
Certifying Ensembles: A General Certification Theory with S-Lipschitzness
SurCo: Learning Linear SURrogates for COmbinatorial Nonlinear Optimization Problems
Rotation and Translation Invariant Representation Learning with Implicit Neural Representations
Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning
Non-stationary Reinforcement Learning under General Function Approximation
Mechanistic Mode Connectivity
Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer
PFGM++: Unlocking the Potential of Physics-Inspired Generative Models
FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning
CodeIPPrompt: Intellectual Property Infringement Assessment of Code Language Models
Differentially Private Distributed Bayesian Linear Regression with MCMC
Whose Opinions Do Language Models Reflect?
Emergence of Sparse Representations from Noise
Pareto Regret Analyses in Multi-objective Multi-armed Bandit
Transformers Learn In-Context by Gradient Descent
Which Tricks are Important for Learning to Rank?
Chemically Transferable Generative Backmapping of Coarse-Grained Proteins
Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions
Incentivizing Exploration with Linear Contexts and Combinatorial Actions
JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift
Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
HOPE: High-order Graph ODE For Modeling Interacting Dynamics
Moderately Distributional Exploration for Domain Generalization
Social learning spontaneously emerges by searching optimal heuristics with deep reinforcement learning
Unlocking Slot Attention by Changing Optimal Transport Costs
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
On Heterogeneous Treatment Effects in Heterogeneous Causal Graphs
Faster Rates of Convergence to Stationary Points in Differentially Private Optimization
Adaptive Compositional Continual Meta-Learning
Learning Affinity with Hyperbolic Representation for Spatial Propagation
Dual Propagation: Accelerating Contrastive Hebbian Learning with Dyadic Neurons
Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables
On the Occupancy Measure of Non-Markovian Policies in Continuous MDPs
Loss Balancing for Fair Supervised Learning
Target-Aware Generative Augmentations for Single-Shot Adaptation
Who Needs to Know? Minimal Knowledge for Optimal Coordination
Online Learning with Feedback Graphs: The True Shape of Regret
AutoCoreset: An Automatic Practical Coreset Construction Framework
Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression
An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning
Eventual Discounting Temporal Logic Counterfactual Experience Replay
Extrapolated Random Tree for Regression
On the Identifiability and Estimation of Causal Location-Scale Noise Models
Orthogonality-Enforced Latent Space in Autoencoders: An Approach to Learning Disentangled Representations
Understanding the Impact of Adversarial Robustness on Accuracy Disparity
MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior
Pruning via Sparsity-indexed ODE: a Continuous Sparsity Viewpoint
LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework
Adaptive Estimation of Graphical Models under Total Positivity
Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions
Additive Causal Bandits with Unknown Graph
Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal
FedVS: Straggler-Resilient and Privacy-Preserving Vertical Federated Learning for Split Models
Quantitative Universal Approximation Bounds for Deep Belief Networks
Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations
Expectation-Complete Graph Representations with Homomorphisms
Group Equivariant Fourier Neural Operators for Partial Differential Equations
SGD with Large Step Sizes Learns Sparse Features
MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Auxiliary Learning as an Asymmetric Bargaining Game
Equivariant Architectures for Learning in Deep Weight Spaces
InGram: Inductive Knowledge Graph Embedding via Relation Graphs
CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis
Reconstructive Neuron Pruning for Backdoor Defense
Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic
Unifying Molecular and Textual Representations via Multi-task Language Modelling
A Toy Model of Universality: Reverse Engineering how Networks Learn Group Operations
Revisiting Data-Free Knowledge Distillation with Poisoned Teachers
Adaptive Computation with Elastic Input Sequence
The Ideal Continual Learner: An Agent That Never Forgets
Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits
Deep Anomaly Detection under Labeling Budget Constraints
Differentiable Simulations for Enhanced Sampling of Rare Events
Coupled Variational Autoencoder
On Second-Order Scoring Rules for Epistemic Uncertainty Quantification
Dynamic Constrained Submodular Optimization with Polylogarithmic Update Time
$H$-Consistency Bounds for Pairwise Misranking Loss Surrogates
Fast Algorithms for Distributed k-Clustering with Outliers
Revisiting Weighted Aggregation in Federated Learning with Neural Networks
FedBR: Improving Federated Learning on Heterogeneous Data via Local Learning Bias Reduction
When does Privileged information Explain Away Label Noise?
On Pitfalls of Test-Time Adaptation
Distributional Offline Policy Evaluation with Predictive Error Guarantees
Distilling Internet-Scale Vision-Language Models into Embodied Agents
Forward-Backward Gaussian Variational Inference via JKO in the Bures-Wasserstein Space
Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise
On the Training Instability of Shuffling SGD with Batch Normalization
Doubly Adversarial Federated Bandits
Measuring the Impact of Programming Language Distribution
Expertise Trees Resolve Knowledge Limitations in Collective Decision-Making
DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization
Why Is Public Pretraining Necessary for Private Model Training?
Prototype-Sample Relation Distillation: Towards Replay-Free Continual Learning
Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score
Detecting Out-of-distribution Data through In-distribution Class Prior
Large Language Models Can Be Easily Distracted by Irrelevant Context
PFNs4BO: In-Context Learning for Bayesian Optimization
Meta-learning Parameterized Skills
Learning Globally Smooth Functions on Manifolds
MyoDex: A Generalizable Prior for Dexterous Manipulation
Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference
FAENet: Frame Averaging Equivariant GNN for Materials Modeling
Beyond In-Domain Scenarios: Robust Density-Aware Calibration
Learning to Decouple Complex Systems
Linkless Link Prediction via Relational Distillation
Cross-Entropy Loss Functions: Theoretical Analysis and Applications
Can Forward Gradient Match Backpropagation?
Identifying Interpretable Subspaces in Image Representations
Global Selection of Contrastive Batches via Optimization on Sample Permutations
Differentiable Multi-Target Causal Bayesian Experimental Design
Quantifying the Knowledge in GNNs for Reliable Distillation into MLPs
Bandit Online Linear Optimization with Hints and Queries
OMS-DPM: Optimizing the Model Schedule for Diffusion Probabilistic Models
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
ClimaX: A foundation model for weather and climate
TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation
Generative Adversarial Symmetry Discovery
The Benefits of Mixup for Feature Learning
Towards Robust Graph Incremental Learning on Evolving Graphs
FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
A theory of continuous generative flow networks
N$\text{A}^\text{2}$Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning
The Saddle-Point Method in Differential Privacy
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning
Unsupervised Out-of-Distribution Detection with Diffusion Inpainting
SinFusion: Training Diffusion Models on a Single Image or Video
Simple Hardware-Efficient Long Convolutions for Sequence Modeling
LIV: Language-Image Representations and Rewards for Robotic Control
Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning
Sampling-Based Accuracy Testing of Posterior Estimators for General Inference
Leveraging Offline Data in Online Reinforcement Learning
Learning Noisy OR Bayesian Networks with Max-Product Belief Propagation
RLSbench: Domain Adaptation Under Relaxed Label Shift
Learning to Boost Training by Periodic Nowcasting Near Future Weights
Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models
Differentially Private Optimization on Large Model at Small Cost
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
Predicting Ordinary Differential Equations with Transformers
DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian Inference
Understanding the Distillation Process from Deep Generative Models to Tractable Probabilistic Circuits
I$^2$SB: Image-to-Image Schrödinger Bridge
GFlowNet-EM for Learning Compositional Latent Variable Models
FeDXL: Provable Federated Learning for Deep X-Risk Optimization
Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for Multi-Block Bilevel Optimization
Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization
Text-To-Concept (and Back) via Cross-Model Alignment
Conformal Inference is (almost) Free for Neural Networks Trained with Early Stopping
Continuously Parameterized Mixture Models
FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels
Regression with Sensor Data Containing Incomplete Observations
Superhuman Fairness
Extending Kernel PCA through Dualization: Sparsity, Robustness and Fast Algorithms
PWSHAP: A Path-Wise Explanation Model for Targeted Variables
Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation
Discovering Object-Centric Generalized Value Functions From Pixels
Multi-channel Autobidding with Budget and ROI Constraints
Principled Acceleration of Iterative Numerical Methods Using Machine Learning
Beam Tree Recursive Cells
Monotonic Location Attention for Length Generalization
GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks
UPSCALE: Unconstrained Channel Pruning
Trompt: Towards a Better Deep Neural Network for Tabular Data
Gibbsian Polar Slice Sampling
Graph Reinforcement Learning for Network Control via Bi-Level Optimization
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
On the Estimation of Gaussian Mixture Copula Models
Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes
A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding
Efficient and Equivariant Graph Networks for Predicting Quantum Hamiltonian
On the Connection Between MPNN and Graph Transformer
Policy Gradient in Robust MDPs with Global Convergence Guarantee
Unit Scaling: Out-of-the-Box Low-Precision Training
Masked Trajectory Models for Prediction, Representation, and Control
A Three-regime Model of Network Pruning
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
Data Structures for Density Estimation
Training Normalizing Flows from Dependent Data
Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory
Overcoming Simplicity Bias in Deep Networks using a Feature Sieve
Multi-Objective GFlowNets
Discrete Key-Value Bottleneck
Hyena Hierarchy: Towards Larger Convolutional Language Models
EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression
Tighter Analysis for ProxSkip
High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance
KDEformer: Accelerating Transformers via Kernel Density Estimation
Streaming Submodular Maximization with Differential Privacy
Learning Neural PDE Solvers with Parameter-Guided Channel Attention
Margin-based sampling in high dimensions: When being active is less efficient than staying passive
MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Poses
Stochastic Gradient Descent-Induced Drift of Representation in a Two-Layer Neural Network
Large Language Models Struggle to Learn Long-Tail Knowledge
Sequential Predictive Conformal Inference for Time Series
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Hierarchical Neural Coding for Controllable CAD Model Generation
Opponent-Limited Online Search for Imperfect Information Games
Fair Densities via Boosting the Sufficient Statistics of Exponential Families
User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems
Matrix Estimation for Individual Fairness
Better Diffusion Models Further Improve Adversarial Training
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Variational Open-Domain Question Answering
Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees
Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning
Generalized Teacher Forcing for Learning Chaotic Dynamics
Structure Learning of Latent Factors via Clique Search on Correlation Thresholded Graphs
Understanding Self-Distillation in the Presence of Label Noise
Beyond Uniform Lipschitz Condition in Differentially Private Optimization
Sequential Underspecified Instrument Selection for Cause-Effect Estimation
Diffusion Based Representation Learning
Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
Coder Reviewer Reranking for Code Generation
Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least
Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning
Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning
Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data
Conformalization of Sparse Generalized Linear Models
Fast Rates for Maximum Entropy Exploration
The Unreasonable Effectiveness of Few-shot Learning for Machine Translation
Scaling Laws for Multilingual Neural Machine Translation
Secure Federated Correlation Test and Entropy Estimation
Image generation with shortest path diffusion
Multiply Robust Off-policy Evaluation and Learning under Truncation by Death
On Provable Copyright Protection for Generative Models
Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting
Input Perturbation Reduces Exposure Bias in Diffusion Models
Demystifying Uneven Vulnerability of Link Stealing Attacks against Graph Neural Networks
Tight Regret Bounds for Single-pass Streaming Multi-armed Bandits
Understanding Backdoor Attacks through the Adaptability Hypothesis
Learning to Maximize Mutual Information for Dynamic Feature Selection
Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling
Accelerated Infeasibility Detection of Constrained Optimization and Fixed-Point Iterations
Pretraining Language Models with Human Preferences
Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting
NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning
Fast Online Value-Maximizing Prediction Sets with Conformal Cost Control
Unconstrained Online Learning with Unbounded Losses
Sample and Predict Your Latent: Modality-free Sequential Disentanglement via Contrastive Estimation
Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback
Towards Deep Attention in Graph Neural Networks: Problems and Remedies
Adversarial Parameter Attack on Deep Neural Networks
Phase Transitions in the Detection of Correlated Databases
Understanding the Complexity Gains of Single-Task RL with a Curriculum
When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
OCD: Learning to Overfit with Conditional Diffusion Models
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten
Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally Coupled Oscillatory Recurrent Neural Networks
Latent Traversals in Generative Models as Potential Flows
DUET: 2D Structured and Approximately Equivariant Representations
ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
Contextual Reliability: When Different Features Matter in Different Contexts
HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption
Hierarchical Imitation Learning with Vector Quantized Models
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
Deep linear networks can benignly overfit when shallow ones do
Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach
Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism
Data-Derived Weak Universal Consistency
Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks
Adversarial Classification: Necessary Conditions and Geometric Flows
On Generalizations of Some Distance Based Classifiers for HDLSS Data
Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning
CLUSTSEG: Clustering for Universal Segmentation
Bi-directional Masks for Efficient N:M Sparse Training
Composer: Creative and Controllable Image Synthesis with Composable Conditions
Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning
Semiparametrically Efficient Off-Policy Evaluation in Linear Markov Decision Processes
Enabling First-Order Gradient-Based Learning for Equilibrium Computation in Markets
Non-autoregressive Conditional Diffusion Models for Time Series Prediction
On the Power of Foundation Models
Neural Diffusion Processes
Building Neural Networks on Matrix Manifolds: A Gyrovector Space Approach
Contrastive Learning Meets Homophily: Two Birds with One Stone
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
On the Generalization of Multi-modal Contrastive Learning
Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR
ContraBAR: Contrastive Bayes-Adaptive Deep RL
Are Diffusion Models Vulnerable to Membership Inference Attacks?
Data Representations' Study of Latent Image Manifolds
Is Consensus Acceleration Possible in Decentralized Optimization over Slowly Time-Varying Networks?
Benign Overfitting in Two-layer ReLU Convolutional Neural Networks
Attention-Based Recurrence for Multi-Agent Reinforcement Learning under Stochastic Partial Observability
Second-order regression models exhibit progressive sharpening to the edge of stability
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
Policy Regularization with Dataset Constraint for Offline Reinforcement Learning
Guiding Pretraining in Reinforcement Learning with Large Language Models
A Mathematical Model for Curriculum Learning for Parities
Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees
Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy
MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation
NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation
Learning Representations without Compositional Assumptions
Fair and Accurate Decision Making through Group-Aware Learning
CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks
Adversarial Learning of Distributional Reinforcement Learning
Online Local Differential Private Quantile Inference via Self-normalization
Horizon-free Learning for Markov Decision Processes and Games: Stochastically Bounded Rewards and Improved Bounds
Approximate Stein Classes for Truncated Density Estimation
SinDDM: A Single Image Denoising Diffusion Model
Robustly Learning a Single Neuron via Sharpness
Neural Markov Jump Processes
A Model-Based Method for Minimizing CVaR and Beyond
Revisiting Structured Variational Autoencoders
Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability