ICML
Select Year: (2022)
Papers
Communicating via Markov Decision Processes
Inferring Cause and Effect in the Presence of Heteroscedastic Noise
Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games
Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
Anticorrelated Noise Injection for Improved Generalization
Scalable First-Order Bayesian Optimization via Structured Automatic Differentiation
When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
The Primacy Bias in Deep Reinforcement Learning
Understanding Gradient Descent on the Edge of Stability in Deep Learning
Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised Domain Adaptation
On Distribution Shift in Learning-based Bug Detectors
SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators
Spatial-Channel Token Distillation for Vision MLPs
Understanding The Robustness in Vision Transformers
NeuroFluid: Fluid Dynamics Grounding with Particle-Driven Neural Radiance Fields
Measuring Representational Robustness of Neural Networks Through Shared Invariances
Efficient Approximate Inference for Stationary Kernel on Frequency Domain
Goal Misgeneralization in Deep Reinforcement Learning
Topology-aware Generalization of Decentralized SGD
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
Decision-Focused Learning: Through the Lens of Learning to Rank
On Implicit Bias in Overparameterized Bilevel Optimization
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond
Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs
Input-agnostic Certified Group Fairness via Gaussian Parameter Smoothing
Channel Importance Matters in Few-Shot Image Classification
RECAPP: Crafting a More Efficient Catalyst for Convex Optimization
From Noisy Prediction to True Label: Noisy Prediction Calibration via Generative Model
Supervised Off-Policy Ranking
Steerable 3D Spherical Neurons
Neural Inverse Kinematic
SE(3) Equivariant Graph Neural Networks with Complete Local Frames
DNNR: Differential Nearest Neighbors Regression
Injecting Logical Constraints into Neural Networks via Straight-Through Estimators
Active Sampling for Min-Max Fairness
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
Versatile Dueling Bandits: Best-of-both World Analyses for Learning from Relative Preferences
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks
Towards Understanding Sharpness-Aware Minimization
AGNAS: Attention-Guided Micro- and Macro-Architecture Search
Improving Policy Optimization with Generalist-Specialist Learning
Short-Term Plasticity Neurons Learning to Learn and Forget
Adaptive Model Design for Markov Decision Process
Unsupervised Ground Metric Learning Using Wasserstein Singular Vectors
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering
Matching Learned Causal Effects of Neural Networks with Domain Priors
Bayesian Deep Embedding Topic Meta-Learner
REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer
Exploring the Gap between Collapsed & Whitened Features in Self-Supervised Learning
Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
Accurate Quantization of Measures via Interacting Particle-based Optimization
Planning with Diffusion for Flexible Behavior Synthesis
An Exact Symbolic Reduction of Linear Smart Predict+Optimize to Mixed Integer Linear Programming
Equivariant Priors for compressed sensing with unknown orientation
Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps
On the Role of Discount Factor in Offline Reinforcement Learning
Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
Communication-Efficient Adaptive Federated Learning
Content Addressable Memory Without Catastrophic Forgetting by Heteroassociation with a Fixed Scaffold
Continual Learning via Sequential Function-Space Variational Inference
Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization
Do More Negative Samples Necessarily Hurt In Contrastive Learning?
Fast Population-Based Reinforcement Learning on a Single Machine
NeuralEF: Deconstructing Kernels by Deep Neural Networks
Private Streaming SCO in $\ell_p$ geometry with Applications in High Dimensional Online Decision Making
Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data
Robust Training of Neural Networks Using Scale Invariant Architectures
Generalization and Robustness Implications in Object-Centric Learning
Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations
Multicoated Supermasks Enhance Hidden Networks
Time Is MattEr: Temporal Self-supervision for Video Transformers
Privacy for Free: How does Dataset Condensation Help Privacy?
Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning
Stable Conformal Prediction Sets
Correct-N-Contrast: a Contrastive Approach for Improving Robustness to Spurious Correlations
Selective Regression under Fairness Criteria
Towards Scaling Difference Target Propagation by Learning Backprop Targets
Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?
Personalized Federated Learning through Local Memorization
SkexGen: Autoregressive Generation of CAD Construction Sequences with Disentangled Codebooks
Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Deep Safe Incomplete Multi-view Clustering: Theorem and Algorithm
On the Adversarial Robustness of Causal Algorithmic Recourse
Virtual Homogeneity Learning: Defending against Data Heterogeneity in Federated Learning
Coin Flipping Neural Networks
Neurotoxin: Durable Backdoors in Federated Learning
Dataset Condensation with Contrastive Signals
How to Stay Curious while avoiding Noisy TVs using Aleatoric Uncertainty Estimation
Deep Network Approximation in Terms of Intrinsic Parameters
A Convergent and Dimension-Independent Min-Max Optimization Algorithm
Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization
Implicit Regularization with Polynomial Growth in Deep Tensor Factorization
Demystifying the Adversarial Robustness of Random Transformation Defenses
Marginal Tail-Adaptive Normalizing Flows
Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging
NP-Match: When Neural Processes meet Semi-Supervised Learning
Representation Topology Divergence: A Method for Comparing Neural Network Representations.
Understanding Contrastive Learning Requires Incorporating Inductive Biases
Neural Implicit Dictionary Learning via Mixture-of-Expert Training
POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging
Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations
Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer
SpaceMAP: Visualizing High-Dimensional Data by Space Expansion
Meta-Learning Hypothesis Spaces for Sequential Decision-making
Lazy Estimation of Variable Importance for Large Neural Networks
A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization
Partial disentanglement for domain adaptation
Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path
Convergence Rates of Non-Convex Stochastic Gradient Descent Under a Generic Lojasiewicz Condition and Local Smoothness
Langevin Monte Carlo for Contextual Bandits
Influence-Augmented Local Simulators: a Scalable Solution for Fast Deep RL in Large Networked Systems
Tractable Uncertainty for Structure Learning
Multi-slots Online Matching with High Entropy
Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
Self-Supervised Models of Audio Effectively Explain Human Cortical Responses to Speech
GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks
Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
On the Generalization Analysis of Adversarial Learning
Extended Unconstrained Features Model for Exploring Deep Neural Collapse
Proximal Denoiser for Convergent Plug-and-Play Optimization with Nonconvex Regularization
Zero-Shot Reward Specification via Grounded Natural Language
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint
Generative Cooperative Networks for Natural Language Generation
Multi-scale Feature Learning Dynamics: Insights for Double Descent
On Convergence of Gradient Descent Ascent: A Tight Local Analysis
Reachability Constrained Reinforcement Learning
A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning
TAM: Topology-Aware Margin Loss for Class-Imbalanced Node Classification
3DLinker: An E(3) Equivariant Variational Autoencoder for Molecular Linker Design
Deep Variational Graph Convolutional Recurrent Network for Multivariate Time Series Anomaly Detection
Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
Calibrated Learning to Defer with One-vs-All Classifiers
Multiclass learning with margin: exponential rates with no bias-variance trade-off
Prompting Decision Transformer for Few-Shot Policy Generalization
TSPipe: Learn from Teacher Faster with Pipelines
Self-conditioning Pre-Trained Language Models
Fast Aquatic Swimmer Optimization with Differentiable Projective Dynamics and Neural Network Hydrodynamic Models
Subspace Learning for Effective Meta-Learning
The power of first-order smooth optimization for black-box non-smooth problems
Proving Theorems using Incremental Learning and Hindsight Experience Replay
Rethinking Attention-Model Explainability through Faithfulness Violation Test
Inductive Biases and Variable Creation in Self-Attention Mechanisms
A data-driven approach for learning to control computers
Causal Imitation Learning under Temporally Correlated Noise
Delayed Reinforcement Learning by Imitation
Minimizing Control for Credit Assignment with Strong Feedback
Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
Generalizing to Evolving Domains with Latent Structure-Aware Sequential Autoencoder
A Unified Weight Initialization Paradigm for Tensorial Convolutional Neural Networks
CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer
On the Robustness of CountSketch to Adaptive Inputs
An Initial Alignment between Neural Network and Target is Needed for Gradient Descent to Learn
Utilizing Expert Features for Contrastive Learning of Time-Series Representations
Region-Based Semantic Factorization in GANs
Evaluating the Adversarial Robustness of Adaptive Test-time Defenses
The Teaching Dimension of Regularized Kernel Learners
Selling Data To a Machine Learner: Pricing via Costly Signaling
Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization
Hindering Adversarial Attacks with Implicit Neural Representations
Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning
Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology
Context-Aware Drift Detection
Neural Fisher Discriminant Analysis: Optimal Neural Network Embeddings in Polynomial Time
Efficient Learning for AlphaZero via Path Consistency
Zero-shot AutoML with Pretrained Models
Offline RL Policies Should Be Trained to be Adaptive
Improving Out-of-Distribution Robustness via Selective Augmentation
Feature Learning and Signal Propagation in Deep Neural Networks
Learning to Predict Graphs with Fused Gromov-Wasserstein Barycenters
LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation
Continuous-Time Analysis of Accelerated Gradient Methods via Conservation Laws in Dilated Coordinate Systems
Stabilizing Off-Policy Deep Reinforcement Learning from Pixels
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations
Bayesian Model Selection, the Marginal Likelihood, and Generalization
Structure-Aware Transformer for Graph Representation Learning
Generalized Leverage Scores: Geometric Interpretation and Applications
Flow-Guided Sparse Transformer for Video Deblurring
FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models
Learning from a Learning User for Optimal Recommendations
Towards Evaluating Adaptivity of Model-Based Reinforcement Learning Methods
Modular Conformal Calibration
PACE: A Parallelizable Computation Encoder for Directed Acyclic Graphs
Private optimization in the interpolation regime: faster rates and hardness results
Neuron Dependency Graphs: A Causal Abstraction of Neural Networks
Searching for BurgerFormer with Micro-Meso-Macro Space Design
Rethinking Graph Neural Networks for Anomaly Detection
The Geometry of Robust Value Functions
A Tighter Analysis of Spectral Clustering, and Beyond
Directed Acyclic Transformer for Non-Autoregressive Machine Translation
Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)
Model-Value Inconsistency as a Signal for Epistemic Uncertainty
Exact Learning of Preference Structure: Single-peaked Preferences and Beyond
Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four
Boosting Graph Structure Learning with Dummy Nodes
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Learning General Halfspaces with Adversarial Label Noise via Online Gradient Descent
Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
Sequential- and Parallel- Constrained Max-value Entropy Search via Information Lower Bound
Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets
A Statistical Manifold Framework for Point Cloud Data
Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification
Iterative Double Sketching for Faster Least-Squares Optimization
Test-Time Training Can Close the Natural Distribution Shift Performance Gap in Deep Learning Based Compressed Sensing
The Poisson Binomial Mechanism for Unbiased Federated Learning with Secure Aggregation
Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization
Optimizing Tensor Network Contraction Using Reinforcement Learning
Distribution Regression with Sliced Wasserstein Kernels
Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding
Understanding Doubly Stochastic Clustering
Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions
On the Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs
Latent Outlier Exposure for Anomaly Detection with Contaminated Data
Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback
Set Based Stochastic Subsampling
Neural Tangent Kernel Analysis of Deep Narrow Neural Networks
Fully-Connected Network on Noncompact Symmetric Space and Ridgelet Transform based on Helgason-Fourier Analysis
Tackling Data Heterogeneity: A New Unified Framework for Decentralized SGD with Sample-induced Topology
Co-training Improves Prompt-based Learning for Large Language Models
Variational Feature Pyramid Networks
Probabilistic Bilevel Coreset Selection
Efficient Distributionally Robust Bayesian Optimization with Worst-case Sensitivity
Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning
XAI for Transformers: Better Explanations through Conservative Propagation
General-purpose, long-context autoregressive modeling with Perceiver AR
Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent
TURF: Two-Factor, Universal, Robust, Fast Distribution Learning Algorithm
Flashlight: Enabling Innovation in Tools for Machine Learning
Model Agnostic Sample Reweighting for Out-of-Distribution Learning
Nonparametric Involutive Markov Chain Monte Carlo
Identity-Disentangled Adversarial Augmentation for Self-supervised Learning
Fast and Provable Nonconvex Tensor RPCA
Certified Robustness Against Natural Language Attacks by Causal Intervention
A Simple Unified Framework for High Dimensional Bandit Problems
Interventional Contrastive Learning with Meta Semantic Regularizer
Parametric Visual Program Induction with Function Modularization
Generalized Beliefs for Cooperative AI
Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets
Unaligned Supervision for Automatic Music Transcription in The Wild
Fisher SAM: Information Geometry and Sharpness Aware Minimisation
The Algebraic Path Problem for Graph Metrics
Overcoming Oscillations in Quantization-Aware Training
Individual Reward Assisted Multi-Agent Reinforcement Learning
Causal Inference Through the Structural Causal Marginal Problem
Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation
Sparse Mixed Linear Regression with Guarantees: Taming an Intractable Problem with Invex Relaxation
SCHA-VAE: Hierarchical Context Aggregation for Few-Shot Generation
Permutation Search of Tensor Network Structures via Local Sampling
Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups
AnyMorph: Learning Transferable Polices By Inferring Agent Morphology
Adaptive Conformal Predictions for Time Series
Unified Scaling Laws for Routed Language Models
Bayesian Optimization for Distributionally Robust Chance-constrained Problem
Provable Domain Generalization via Invariant-Feature Subspace Recovery
Revisiting Consistency Regularization for Deep Partial Label Learning
PoF: Post-Training of Feature Extractor for Improving Generalization
A Closer Look at Smoothness in Domain Adversarial Training
SpeqNets: Sparsity-aware permutation-equivariant graph networks
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
Fast Lossless Neural Compression with Integer-Only Discrete Flows
How Powerful are Spectral Graph Neural Networks
A Deep Learning Approach for the Segmentation of Electroencephalography Data in Eye Tracking Applications
Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
Proximal Exploration for Model-guided Protein Sequence Design
Active Learning on a Budget: Opposite Strategies Suit High and Low Budgets
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
Data Augmentation as Feature Manipulation
Efficient Low Rank Convex Bounds for Pairwise Discrete Graphical Models
UniRank: Unimodal Bandit Algorithms for Online Ranking
VLUE: A Multi-Task Multi-Dimension Benchmark for Evaluating Vision-Language Pre-training
Label-Descriptive Patterns and Their Application to Characterizing Classification Errors
Online and Consistent Correlation Clustering
FedNest: Federated Bilevel, Minimax, and Compositional Optimization
Near-optimal rate of consistency for linear models with missing values
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
Bayesian Optimization under Stochastic Delayed Feedback
Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback
Learning of Cluster-based Feature Importance for Electronic Health Record Time-series
Global Optimization of K-Center Clustering
EDEN: Communication-Efficient and Robust Distributed Mean Estimation for Federated Learning
Variational nearest neighbor Gaussian process
A Completely Tuning-Free and Robust Approach to Sparse Precision Matrix Estimation
Generalized Data Distribution Iteration
Anytime Information Cascade Popularity Prediction via Self-Exciting Processes
Secure Quantized Training for Deep Learning
Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime
Improving Adversarial Robustness via Mutual Information Estimation
Black-Box Tuning for Language-Model-as-a-Service
A Context-Integrated Transformer-Based Neural Network for Auction Design
Statistical inference with implicit SGD: proximal Robbins-Monro vs. Polyak-Ruppert
Neurocoder: General-Purpose Computation Using Stored Neural Programs
Learning Infinite-horizon Average-reward Markov Decision Process with Constraints
Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models
Sparse Invariant Risk Minimization
Winning the Lottery Ahead of Time: Efficient Early Network Pruning
Implicit Bias of the Step Size in Linear Diagonal Neural Networks
Intriguing Properties of Input-Dependent Randomized Smoothing
Latent Diffusion Energy-Based Model for Interpretable Text Modelling
Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes
Bregman Neural Networks
ModLaNets: Learning Generalisable Dynamics via Modularity and Physical Inductive Bias
Graph Neural Architecture Search Under Distribution Shifts
Quantifying and Learning Linear Symmetry-Based Disentanglement
Certifying Out-of-Domain Generalization for Blackbox Functions
Constants Matter: The Performance Gains of Active Learning
AutoSNN: Towards Energy-Efficient Spiking Neural Networks
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
The Complexity of k-Means Clustering when Little is Known
Gradient Based Clustering
Modeling Adversarial Noise for Adversarial Training
Balancing Discriminability and Transferability for Source-Free Domain Adaptation
Closed-Form Diffeomorphic Transformations for Time Series Alignment
Dynamic Regret of Online Markov Decision Processes
A Multi-objective / Multi-task Learning Framework Induced by Pareto Stationarity
Spectral Representation of Robustness Measures for Optimization Under Input Uncertainty
Individual Preference Stability for Clustering
Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics
Importance Weighted Kernel Bayes' Rule
Instrumental Variable Regression with Confounder Balancing
Optimal Clustering with Noisy Queries via Multi-Armed Bandit
Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training
It’s Raw! Audio Generation with State-Space Models
Nyström Kernel Mean Embeddings
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
An Asymptotic Test for Conditional Independence using Analytic Kernel Embeddings
Causal Transformer for Estimating Counterfactual Outcomes
A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization
Convolutional and Residual Networks Provably Contain Lottery Tickets
Interpretable Off-Policy Learning via Hyperbox Search
Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation
UNIREX: A Unified Learning Framework for Language Model Rationale Extraction
Self-Organized Polynomial-Time Coordination Graphs
Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search
HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
Predicting Out-of-Distribution Error with the Projection Norm
Solving Stackelberg Prediction Game with Least Squares Loss via Spherically Constrained Least Squares Reformulation
Generative Modeling for Multi-task Visual Learning
An Intriguing Property of Geophysics Inversion
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning
A Marriage between Adversarial Team Games and 2-player Games: Enabling Abstractions, No-regret Learning, and Subgame Solving
Let Invariant Rationale Discovery Inspire Graph Contrastive Learning
Fast rates for noisy interpolation require rethinking the effect of inductive bias
Provably Adversarially Robust Nearest Prototype Classifiers
Continual Learning with Guarantees via Weight Interval Constraints
You Only Cut Once: Boosting Data Augmentation with a Single Cut
A Dynamical System Perspective for Lipschitz Neural Networks
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
Strategic Representation
A Convergence Theory for SVGD in the Population Limit under Talagrand's Inequality T1
On the Learning of Non-Autoregressive Transformers
Actor-Critic based Improper Reinforcement Learning
Byzantine Machine Learning Made Easy By Resilient Averaging of Momentums
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
Understanding and Improving Knowledge Graph Embedding for Entity Alignment
Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification
A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
Dialog Inpainting: Turning Documents into Dialogs
How to Fill the Optimum Set? Population Gradient Descent with Harmless Diversity
Measuring dissimilarity with diffeomorphism invariance
Learning Pseudometric-based Action Representations for Offline Reinforcement Learning
Fast Composite Optimization and Statistical Recovery in Federated Learning
Efficiently Learning the Topology and Behavior of a Networked Dynamical System Via Active Queries
MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
Fair and Fast k-Center Clustering for Data Summarization
Deep Reference Priors: What is the best way to pretrain a model?
Improving Task-free Continual Learning by Distributionally Robust Memory Evolution
ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning
Continual Repeated Annealed Flow Transport Monte Carlo
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection
Revisiting Online Submodular Minimization: Gap-Dependent Regret Bounds, Best of Both Worlds and Adversarial Robustness
Data Scaling Laws in NMT: The Effect of Noise and Architecture
Improving Mini-batch Optimal Transport via Partial Transportation
DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks
Learning Iterative Reasoning through Energy Minimization
Efficient Computation of Higher-Order Subgraph Attribution via Message Passing
Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness
Robust Imitation Learning against Variations in Environment Dynamics
Accelerated Federated Learning with Decoupled Adaptive Optimization
A Consistent and Efficient Evaluation Strategy for Attribution Methods
Private Adaptive Optimization with Side information
A Difference Standardization Method for Mutual Transfer Learning
Memory-Based Model Editing at Scale
Revisiting End-to-End Speech-to-Text Translation From Scratch
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning
Causal structure-based root cause analysis of outliers
Learning to Estimate and Refine Fluid Motion with Physical Dynamics
Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning
Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters
Generalised Policy Improvement with Geometric Policy Composition
Popular decision tree algorithms are provably noise tolerant
Only tails matter: Average-Case Universality and Robustness in the Convex Regime
Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning
DAVINZ: Data Valuation using Deep Neural Networks at Initialization
What Dense Graph Do You Need for Self-Attention?
Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications
Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization
Inductive Matrix Completion: No Bad Local Minima and a Fast Algorithm
A Unified View on PAC-Bayes Bounds for Meta-Learning
Online Continual Learning through Mutual Information Maximization
Dimension-free Complexity Bounds for High-order Nonconvex Finite-sum Optimization
Learning Dynamics and Generalization in Deep Reinforcement Learning
Certified Adversarial Robustness Under the Bounded Support Set
ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD
Breaking Down Out-of-Distribution Detection: Many Methods Based on OOD Training Data Estimate a Combination of the Same Core Quantities
Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks
Near-Optimal Learning of Extensive-Form Games with Imperfect Information
Contrastive Learning with Boosted Memorization
Improving Language Models by Retrieving from Trillions of Tokens
UAST: Uncertainty-Aware Siamese Tracking
Blurs Behave Like Ensembles: Spatial Smoothings to Improve Accuracy, Uncertainty, and Robustness
Online Learning for Min Sum Set Cover and Pandora’s Box
Faster Privacy Accounting via Evolving Discretization
Gradient-Free Method for Heavily Constrained Nonconvex Optimization
An Equivalence Between Data Poisoning and Byzantine Gradient Attacks
Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning
A Temporal-Difference Approach to Policy Gradient Estimation
Framework for Evaluating Faithfulness of Local Explanations
Learning Multiscale Transformer Models for Sequence Generation
Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs
Mirror Learning: A Unifying Framework of Policy Optimisation
Denoised MDPs: Learning World Models Better Than the World Itself
Identifiability Conditions for Domain Adaptation
Nested Bandits
Federated Minimax Optimization: Improved Convergence Analyses and Algorithms
A Simple yet Universal Strategy for Online Convex Optimization
SPDY: Accurate Pruning with Speedup Guarantees
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism
Functional Output Regression with Infimal Convolution: Exploring the Huber and $\epsilon$-insensitive Losses
Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems
Smoothed Adversarial Linear Contextual Bandits with Knapsacks
A Hierarchical Bayesian Approach to Inverse Reinforcement Learning with Symbolic Reward Machines
On Collective Robustness of Bagging Against Data Poisoning
CITRIS: Causal Identifiability from Temporal Intervened Sequences
Mitigating Gender Bias in Face Recognition using the von Mises-Fisher Mixture Model
Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers
Reinforcement Learning with Action-Free Pre-Training from Videos
A new similarity measure for covariate shift with applications to nonparametric regression
State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks
Self-supervised learning with random-projection quantizer for speech recognition
Learning Symmetric Embeddings for Equivariant World Models
How Tempering Fixes Data Augmentation in Bayesian Neural Networks
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes
Fighting Fire with Fire: Avoiding DNN Shortcuts through Priming
HousE: Knowledge Graph Embedding with Householder Parameterization
Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation
Investigating Why Contrastive Learning Benefits Robustness against Label Noise
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Frustratingly Easy Transferability Estimation
Minimum Cost Intervention Design for Causal Effect Identification
Action-Sufficient State Representation Learning for Control with Structural Constraints
Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems
GSmooth: Certified Robustness against Semantic Transformations via Generalized Randomized Smoothing
Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning
Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity
Tackling covariate shift with node-based Bayesian neural networks
DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning
Transfer and Marginalize: Explaining Away Label Noise with Privileged Information
Staged Training for Transformer Language Models
Deletion Robust Submodular Maximization over Matroids
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Fictitious Play and Best-Response Dynamics in Identical Interest and Zero-Sum Stochastic Games
Deciphering Lasso-based Classification Through a Large Dimensional Analysis of the Iterative Soft-Thresholding Algorithm
Estimating and Penalizing Induced Preference Shifts in Recommender Systems
The Unsurprising Effectiveness of Pre-Trained Vision Models for Control
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts
Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation
Efficient Learning of CNNs using Patch Based Features
Biological Sequence Design with GFlowNets
One-Pass Algorithms for MAP Inference of Nonsymmetric Determinantal Point Processes
Private frequency estimation via projective geometry
Counterfactual Prediction for Outcome-Oriented Treatments
3D Infomax improves GNNs for Molecular Property Prediction
Generating Distributional Adversarial Examples to Evade Statistical Detectors
How to Leverage Unlabeled Data in Offline Reinforcement Learning
Linear Adversarial Concept Erasure
Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer
MetAug: Contrastive Learning via Meta Feature Augmentation
Lie Point Symmetry Data Augmentation for Neural PDE Solvers
Universal Hopfield Networks: A General Framework for Single-Shot Associative Memory Models
Stochastic Continuous Submodular Maximization: Boosting via Non-oblivious Function
Conformal Prediction Sets with Limited False Positives
Personalized Federated Learning via Variational Bayesian Inference
Path-Gradient Estimators for Continuous Normalizing Flows
Consistent Polyhedral Surrogates for Top-k Classification and Variants
Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation
Interactive Inverse Reinforcement Learning for Cooperative Games
Multiple-Play Stochastic Bandits with Shareable Finite-Capacity Arms
Be Like Water: Adaptive Floating Point for Machine Learning
Diffusion bridges vector quantized variational autoencoders
Deduplicating Training Data Mitigates Privacy Risks in Language Models
Lightweight Projective Derivative Codes for Compressed Asynchronous Gradient Descent
Learning Efficient and Robust Ordinary Differential Equations via Invertible Neural Networks
Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization
Scalable Deep Gaussian Markov Random Fields for General Graphs
Transformer Quality in Linear Time
Visual Attention Emerges from Recurrent Sparse Reconstruction
Confidence Score for Source-Free Unsupervised Domain Adaptation
Unsupervised Image Representation Learning with Deep Latent Particles
Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization
A Rigorous Study of Integrated Gradients Method and Extensions to Internal Neuron Attributions
IDYNO: Learning Nonparametric DAGs from Interventional Dynamic Data
On the Surrogate Gap between Contrastive and Supervised Losses
Understanding Instance-Level Impact of Fairness Constraints
Robust Group Synchronization via Quadratic Programming
Deep Causal Metric Learning
Flowformer: Linearizing Transformers with Conservation Flows
Graph-Coupled Oscillator Networks
Variational Inference for Infinitely Deep Neural Networks
Reverse Engineering the Neural Tangent Kernel
Low-Precision Stochastic Gradient Langevin Dynamics
Matching Normalizing Flows and Probability Paths on Manifolds
Asymptotically-Optimal Gaussian Bandits with Side Observations
Sparse Double Descent: Where Network Pruning Aggravates Overfitting
Contrastive Mixture of Posteriors for Counterfactual Inference, Data Integration and Fairness
Distributionally-Aware Kernelized Bandit Problems for Risk Aversion
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems
Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games
Principled Knowledge Extrapolation with GANs
Not All Poisons are Created Equal: Robust Training against Data Poisoning
DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training
DynaMixer: A Vision MLP Architecture with Dynamic Mixing
Fenrir: Physics-Enhanced Regression for Initial Value Problems
Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces
LCANets: Lateral Competition Improves Robustness Against Corruption and Attack
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
Prototype-Anchored Learning for Learning with Imperfect Annotations
Improved StyleGAN-v2 based Inversion for Out-of-Distribution Images
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers
Adaptive Gaussian Process Change Point Detection
Hardness and Algorithms for Robust and Sparse Optimization
Multi Resolution Analysis (MRA) for Approximate Self-Attention
A Functional Information Perspective on Model Interpretation
PAC-Net: A Model Pruning Approach to Inductive Transfer Learning
Robust Policy Learning over Multiple Uncertainty Sets
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Fast Finite Width Neural Tangent Kernel
Robustness in Multi-Objective Submodular Optimization: a Quantile Approach
Optimal Estimation of Policy Gradient via Double Fitted Iteration
Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation
Rethinking Fano’s Inequality in Ensemble Learning
Convergence of Uncertainty Sampling for Active Learning
The Role of Deconfounding in Meta-learning
Active fairness auditing
FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers
Deep Hierarchy in Bandits
Adversarially Trained Actor Critic for Offline Reinforcement Learning
Pairwise Conditional Gradients without Swap Steps and Sparser Kernel Herding
Reconstructing Nonlinear Dynamical Systems from Multi-Modal Time Series
GACT: Activation Compressed Training for Generic Network Architectures
Disentangling Disease-related Representation from Obscure for Disease Prediction
Provable Reinforcement Learning with a Short-Term Memory
Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense
Secure Distributed Training at Scale
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Neuro-Symbolic Hierarchical Rule Induction
pathGCN: Learning General Graph Spatial Operators from Paths
Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization
Conditional GANs with Auxiliary Discriminative Classifier
Retrieval-Augmented Reinforcement Learning
Generative Flow Networks for Discrete Probabilistic Modeling
Molecular Representation Learning via Heterogeneous Motif Graph Neural Networks
Adversarially trained neural representations are already as robust as biological neural representations
A Parametric Class of Approximate Gradient Updates for Policy Optimization
Interactively Learning Preference Constraints in Linear Bandits
Robust Training under Label Noise by Over-parameterization
Causal Conceptions of Fairness and their Consequences
Flow-based Recurrent Belief State Learning for POMDPs
Approximate Bayesian Computation with Domain Expert in the Loop
Multi-Task Learning as a Bargaining Game
Investigating Generalization by Controlling Normalized Margin
Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations
Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits
MAML and ANIL Provably Learn Representations
Model-Free Opponent Shaping
Optimizing Sequential Experimental Design with Deep Reinforcement Learning
Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features
The Multivariate Community Hawkes Model for Dependent Relational Events in Continuous-time Networks
Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
Variational Inference with Locally Enhanced Bounds for Hierarchical Models
Branching Reinforcement Learning
On the Convergence of the Shapley Value in Parametric Bayesian Learning Games
Metric-Fair Classifier Derandomization
Partial Counterfactual Identification from Observational and Experimental Data
Correlated Quantization for Distributed Mean Estimation and Optimization
Transformers are Meta-Reinforcement Learners
Temporal Difference Learning for Model Predictive Control
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
Fair Generalized Linear Models with a Convex Penalty
A Simple Guard for Learned Optimizers
Federated Learning with Positive and Unlabeled Data
A Langevin-like Sampler for Discrete Distributions
Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile
Saute RL: Almost Surely Safe Reinforcement Learning Using State Augmentation
Data-SUITE: Data-centric identification of in-distribution incongruous examples
LyaNet: A Lyapunov Framework for Training Neural ODEs
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Bit Prioritization in Variational Autoencoders via Progressive Coding
Wide Neural Networks Forget Less Catastrophically
Model Selection in Batch Policy Optimization
On the Difficulty of Defending Self-Supervised Learning against Model Extraction
To Smooth or Not? When Label Smoothing Meets Noisy Labels
Improving Screening Processes via Calibrated Subset Selection
Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum
A Regret Minimization Approach to Multi-Agent Control
Asking for Knowledge (AFK): Training RL Agents to Query External Knowledge Using Language
Streaming Inference for Infinite Feature Models
Examining Scaling and Transfer of Language Model Architectures for Machine Translation
On the Impossibility of Learning to Cooperate with Adaptive Partner Strategies in Repeated Games
FedNL: Making Newton-Type Methods Applicable to Federated Learning
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification
Exact Optimal Accelerated Complexity for Fixed-Point Iterations
Fairness with Adaptive Weights
Efficient Representation Learning via Adaptive Context Pooling
Anarchic Federated Learning
A Branch and Bound Framework for Stronger Adversarial Attacks of ReLU Networks
On Measuring Causal Contributions via do-interventions
Making Linear MDPs Practical via Contrastive Representation Learning
Building Robust Ensembles via Margin Boosting
On Last-Iterate Convergence Beyond Zero-Sum Games
Learning fair representation with a parametric integral probability metric
Extracting Latent State Representations with Linear Dynamics from Rich Observations
Forward Operator Estimation in Generative Models with Kernel Transfer Operators
Decentralized Online Convex Optimization in Networked Systems
Burst-Dependent Plasticity and Dendritic Amplification Support Target-Based Learning and Hierarchical Imitation Learning
Contextual Bandits with Large Action Spaces: Made Practical
Learning-based Optimisation of Particle Accelerators Under Partial Observability Without Real-World Training
On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Feature Space Particle Inference for Neural Network Ensembles
Linear Complexity Randomized Self-attention Mechanism
Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP
Rethinking Image-Scaling Attacks: The Interplay Between Vulnerabilities in Machine Learning Systems
Robust Counterfactual Explanations for Tree-Based Ensembles
A New Perspective on the Effects of Spectrum in Graph Neural Networks
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces
Datamodels: Understanding Predictions with Data and Data with Predictions
Approximately Equivariant Networks for Imperfectly Symmetric Dynamics
ActiveHedge: Hedge meets Active Learning
Principal Component Flows
Maslow's Hammer in Catastrophic Forgetting: Node Re-Use vs. Node Activation
Learning to Separate Voices by Spatial Regions
On Numerical Integration in Neural Ordinary Differential Equations
Cliff Diving: Exploring Reward Surfaces in Reinforcement Learning Environments
On the Convergence of Local Stochastic Compositional Gradient Descent with Momentum
Domain Adaptation for Time Series Forecasting via Attention Sharing
DNA: Domain Generalization with Diversified Neural Averaging
Architecture Agnostic Federated Learning for Neural Networks
Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time
Addressing Optimism Bias in Sequence Modeling for Reinforcement Learning
Understanding Robust Generalization in Learning Regular Languages
Minimax M-estimation under Adversarial Contamination
Disentangled Federated Learning for Tackling Attributes Skew via Invariant Aggregation and Diversity Transferring
A Stochastic Multi-Rate Control Framework For Modeling Distributed Optimization Algorithms
Easy Variational Inference for Categorical Models via an Independent Binary Approximation
Training Your Sparse Neural Network Better with Any Mask
How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection
Learning to Cut by Looking Ahead: Cutting Plane Selection via Imitation Learning
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms
On Learning Mixture of Linear Regressions in the Non-Realizable Setting
Sparsity in Partially Controllable Linear Systems
On the Equivalence Between Temporal and Static Equivariant Graph Representations
ButterflyFlow: Building Invertible Layers with Butterfly Matrices
Structured Stochastic Gradient MCMC
Scaling Out-of-Distribution Detection for Real-World Settings
Towards Uniformly Superhuman Autonomy via Subdominance Minimization
RUMs from Head-to-Head Contests
Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
Learning inverse folding from millions of predicted structures
When Are Linear Stochastic Bandits Attackable?
VarScene: A Deep Generative Model for Realistic Scene Graph Synthesis
Streaming Algorithms for High-Dimensional Robust Statistics
Practical Almost-Linear-Time Approximation Algorithms for Hybrid and Overlapping Graph Clustering
Distributionally Robust $Q$-Learning
Optimally Controllable Perceptual Lossy Compression
On Finite-Sample Identifiability of Contrastive Learning-Based Nonlinear Independent Component Analysis
Volatility Based Kernels and Moving Average Means for Accurate Forecasting with Gaussian Processes
Learning Stochastic Shortest Path with Linear Function Approximation
Gaussian Mixture Variational Autoencoder with Contrastive Learning for Multi-Label Classification
Exploiting Independent Instruments: Identification and Distribution Generalization
Thompson Sampling for Robust Transfer in Multi-Task Bandits
Finding Global Homophily in Graph Neural Networks When Meeting Heterophily
A Model-Agnostic Randomized Learning Framework based on Random Hypothesis Subspace Sampling
Cycle Representation Learning for Inductive Relation Prediction
A Simple Reward-free Approach to Constrained Reinforcement Learning
Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition
Refined Convergence Rates for Maximum Likelihood Estimation under Finite Mixture Models
Coordinated Double Machine Learning
Causal Dynamics Learning for Task-Independent State Abstraction
Identification of Linear Non-Gaussian Latent Hierarchical Structure
Robust Multi-Objective Bayesian Optimization Under Input Noise
FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning
EqR: Equivariant Representations for Data-Efficient Reinforcement Learning
Robustness Verification for Contrastive Learning
Does the Data Induce Capacity Control in Deep Learning?
Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Regret Minimization with Performative Feedback
Hierarchical Shrinkage: Improving the accuracy and interpretability of tree-based models
Variational Wasserstein gradient flow
Hermite Polynomial Features for Private Data Generation
N-Penetrate: Active Learning of Neural Collision Handler for Complex 3D Mesh Deformations
Locally Sparse Neural Networks for Tabular Biomedical Data
Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments
On Non-local Convergence Analysis of Deep Linear Networks
No-Regret Learning in Partially-Informed Auctions
Universal and data-adaptive algorithms for model selection in linear contextual bandits
Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy
Improve Single-Point Zeroth-Order Optimization Using High-Pass and Low-Pass Filters
De novo mass spectrometry peptide sequencing with a transformer model
Learning Augmented Binary Search Trees
Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
Probabilistically Robust Learning: Balancing Average- and Worst-case Performance
Differentially Private Community Detection for Stochastic Block Models
A Study on the Ramanujan Graph Property of Winning Lottery Tickets
Local Augmentation for Graph Neural Networks
Learning from Counterfactual Links for Link Prediction
H-Consistency Bounds for Surrogate Loss Minimizers
VariGrow: Variational Architecture Growing for Task-Agnostic Continual Learning based on Bayesian Novelty
Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning
A deep convolutional neural network that is invariant to time rescaling
Improved Regret for Differentially Private Exploration in Linear MDP
Supervised Learning with General Risk Functionals
Combining Diverse Feature Priors
Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum
Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling
Do Differentiable Simulators Give Better Policy Gradients?
Entropic Causal Inference: Graph Identifiability
Counterfactual Transportability: A Formal Approach
BAMDT: Bayesian Additive Semi-Multivariate Decision Trees for Nonparametric Regression
Feature selection using e-values
Augment with Care: Contrastive Learning for Combinatorial Problems
LIMO: Latent Inceptionism for Targeted Molecule Generation
Object Permanence Emerges in a Random Walk along Memory
Imitation Learning by Estimating Expertise of Demonstrators
Neural Laplace: Learning diverse classes of differential equations in the Laplace domain
Stabilizing Q-learning with Linear Architectures for Provable Efficient Learning
Random Gegenbauer Features for Scalable Kernel Methods
Proximal and Federated Random Reshuffling
Neural Tangent Kernel Empowered Federated Learning
Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood
Sample-Efficient Reinforcement Learning with $\log\log(T)$ Switching Cost
An Analytical Update Rule for General Policy Optimization
Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks
Meaningfully debugging model mistakes using conceptual counterfactual explanations
The State of Sparse Training in Deep Reinforcement Learning
Diffusion Models for Adversarial Purification
Multirate Training of Neural Networks
A Natural Actor-Critic Framework for Zero-Sum Markov Games
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Characterizing and Overcoming the Greedy Nature of Learning in Multi-modal Deep Neural Networks
Resilient and Communication Efficient Learning for Heterogeneous Federated Systems
DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks
Bregman Proximal Langevin Monte Carlo via Bregman–Moreau Envelopes
Efficient Online ML API Selection for Multi-Label Classification Tasks
Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra
Variational Mixtures of ODEs for Inferring Cellular Gene Expression Dynamics
Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
Discrete Tree Flows via Tree-Structured Permutations
On Transportation of Mini-batches: A Hierarchical Approach
Understanding the unstable convergence of gradient descent
Modeling Strong and Human-Like Gameplay with KL-Regularized Search
Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling
Going Deeper into Permutation-Sensitive Graph Neural Networks
Online Decision Transformer
Streaming Algorithm for Monotone k-Submodular Maximization with Cardinality Constraints
LSB: Local Self-Balancing MCMC in Discrete Spaces
Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning
Sketching Algorithms and Lower Bounds for Ridge Regression
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Differentially Private Maximal Information Coefficients
Accelerating Shapley Explanation via Contributive Cooperator Selection
More Efficient Sampling for Tensor Decomposition With Worst-Case Guarantees
Antibody-Antigen Docking and Design via Hierarchical Structure Refinement
When and How Mixup Improves Calibration
NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework
Hessian-Free High-Resolution Nesterov Acceleration For Sampling
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Neural Inverse Transform Sampler
Reverse Engineering $\ell_p$ attacks: A block-sparse optimization approach with recovery guarantees
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach
On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
Symmetric Machine Theory of Mind
COAT: Measuring Object Compositionality in Emergent Representations
Discrete Probabilistic Inverse Optimal Transport
Stability Based Generalization Bounds for Exponential Family Langevin Dynamics
Role-based Multiplex Network Embedding
Bayesian Imitation Learning for End-to-End Mobile Manipulation
Constrained Variational Policy Optimization for Safe Reinforcement Learning
POEM: Out-of-Distribution Detection with Posterior Sampling
Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits
Federated Learning with Partial Model Personalization
Entropic Gromov-Wasserstein between Gaussian Distributions
Debiaser Beware: Pitfalls of Centering Regularized Transport Maps
A Tree-based Model Averaging Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources
G-Mixup: Graph Data Augmentation for Graph Classification
Selective Network Linearization for Efficient Private Inference
Dual Perspective of Label-Specific Feature Learning for Multi-Label Classification
Measure Estimation in the Barycentric Coding Model
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks
Distinguishing rule- and exemplar-based generalization in learning systems
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching
Rich Feature Construction for the Optimization-Generalization Dilemma
Task-aware Privacy Preservation for Multi-dimensional Data
TACTiS: Transformer-Attentional Copulas for Time Series
Risk-Averse No-Regret Learning in Online Convex Games
Fourier Learning with Cyclical Data
Adaptive Second Order Coresets for Data-efficient Machine Learning
Describing Differences between Text Distributions with Natural Language
Uncertainty Modeling in Generative Compressed Sensing
Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance
Scaling-up Diverse Orthogonal Convolutional Networks by a Paraunitary Framework
Parsimonious Learning-Augmented Caching
On the Statistical Benefits of Curriculum Learning
GraphFM: Improving Large-Scale GNN Training via Feature Momentum
NysADMM: faster composite convex optimization via low-rank approximation
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
Constrained Offline Policy Optimization
End-to-End Balancing for Causal Continuous Treatment-Effect Estimation
Constrained Discrete Black-Box Optimization using Mixed-Integer Programming
Learning to Solve PDE-constrained Inverse Problems with Graph Networks
Feature and Parameter Selection in Stochastic Linear Bandits
Faster Algorithms for Learning Convex Functions
Optimal Algorithms for Mean Estimation under Local Differential Privacy
Removing Batch Normalization Boosts Adversarial Training
Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
GALAXY: Graph-based Active Learning at the Extreme
Preconditioning for Scalable Gaussian Process Hyperparameter Optimization
Three-stage Evolution and Fast Equilibrium for SGD with Non-degenerate Critical Points
Implicit Bias of Linear Equivariant Networks
Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder
The CLRS Algorithmic Reasoning Benchmark
Adaptive Accelerated (Extra-)Gradient Methods with Variance Reduction
Training OOD Detectors in their Natural Habitats
Power-Law Escape Rate of SGD
Why the Rich Get Richer? On the Balancedness of Random Partition Models
Validating Causal Inference Methods
Adversarial Vulnerability of Randomized Ensembles
Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Lojasiewicz Functions when the Non-Convexity is Averaged-Out
Achieving Minimax Rates in Pool-Based Batch Active Learning
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Detached Error Feedback for Distributed SGD with Random Sparsification
RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression
Cascaded Gaps: Towards Logarithmic Regret for Risk-Sensitive Reinforcement Learning
Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits
Weisfeiler-Lehman Meets Gromov-Wasserstein
HyperPrompt: Prompt-based Task-Conditioning of Transformers
Out-of-Distribution Detection with Deep Nearest Neighbors
Bregman Power k-Means for Clustering Exponential Family Data
Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis
Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization
Information Discrepancy in Strategic Learning
Robust alignment of cross-session recordings of neural population activity by behaviour via unsupervised domain adaptation
Simultaneous Graph Signal Clustering and Graph Learning
Fair Representation Learning through Implicit Path Alignment
A General Recipe for Likelihood-free Bayesian Optimization
ROCK: Causal Inference Principles for Reasoning about Commonsense Causality
Interactive Correlation Clustering with Existential Cluster Constraints
Adaptive Random Walk Gradient Descent for Decentralized Optimization
Differentiable Top-k Classification Learning
Linear Bandit Algorithms with Sublinear Time Complexity
A query-optimal algorithm for finding counterfactuals
Label Ranking through Nonparametric Regression
Agnostic Learnability of Halfspaces via Logistic Loss
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Deploying Convolutional Networks on Untrusted Platforms Using 2D Holographic Reduced Representations
Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes
Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training
DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck
Generalizing Gaussian Smoothing for Random Search
A Study of Face Obfuscation in ImageNet
Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
Sanity Simulations for Saliency Methods
Universal Joint Approximation of Manifolds and Densities by Simple Injective Flows
Variational Sparse Coding with Learned Thresholding
Local Linear Convergence of Douglas-Rachford for Linear Programming: a Probabilistic Analysis
Equivariant Diffusion for Molecule Generation in 3D
Loss Function Learning for Domain Generalization by Implicit Gradient
Pure Noise to the Rescue of Insufficient Data: Improving Imbalanced Classification by Training on Random Noise Images
Contextual Information-Directed Sampling
COLA: Consistent Learning with Opponent-Learning Awareness
Adapting to Mixing Time in Stochastic Optimization with Markovian Data
From data to functa: Your data point is a function and you can treat it like one
Scaling Structured Inference with Randomization
Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets
How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective
Learning to Infer Structures of Network Games
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Faster Fundamental Graph Algorithms via Learned Predictions
Fat–Tailed Variational Inference with Anisotropic Tail Adaptive Flows
In defense of dual-encoders for neural ranking
MAE-DET: Revisiting Maximum Entropy Principle in Zero-Shot NAS for Efficient Object Detection
Compressed-VFL: Communication-Efficient Learning with Vertically Partitioned Data
Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
What Can Linear Interpolation of Neural Network Loss Landscapes Tell Us?
Algorithms for the Communication of Samples
Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Analysis of Stochastic Processes through Replay Buffers
Generalized Strategic Classification and the Case of Aligned Incentives
Policy Gradient Method For Robust Reinforcement Learning
Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
Structure-preserving GANs
Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation
Tell me why! Explanations support learning relational and causal structure
A Random Matrix Analysis of Data Stream Clustering: Coping With Limited Memory Resources
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning
Towards Coherent and Consistent Use of Entities in Narrative Generation
Certified Neural Network Watermarks with Randomized Smoothing
On the Practicality of Deterministic Epistemic Uncertainty
Robust SDE-Based Variational Formulations for Solving Linear PDEs via Deep Learning
On the Finite-Time Complexity and Practical Computation of Approximate Stationarity Concepts of Lipschitz Functions
Correlation Clustering via Strong Triadic Closure Labeling: Fast Approximation Algorithms and Practical Lower Bounds
Modeling Structure with Undirected Neural Networks
Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime
Inducing Causal Structure for Interpretable Neural Networks
Controlling Conditional Language Models without Catastrophic Forgetting
ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
A Differential Entropy Estimator for Training Neural Networks
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Scalable Computation of Causal Bounds
Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them
Tight and Robust Private Mean Estimation with Few Users
Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk
Modeling Irregular Time Series with Continuous Recurrent Units
No-Regret Learning in Time-Varying Zero-Sum Games
Curriculum Reinforcement Learning via Constrained Optimal Transport
Score Matching Enables Causal Discovery of Nonlinear Additive Noise Models
PAC-Bayesian Bounds on Rate-Efficient Classifiers
Generalized Results for the Existence and Consistency of the MLE in the Bradley-Terry-Luce Model
FedScale: Benchmarking Model and System Performance of Federated Learning at Scale
Probabilistic ODE Solutions in Millions of Dimensions
Sequential Covariate Shift Detection Using Classifier Two-Sample Tests
Accelerated, Optimal and Parallel: Some results on model-based stochastic optimization
On the Convergence of Inexact Predictor-Corrector Methods for Linear Programming
FriendlyCore: Practical Differentially Private Aggregation
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
RieszNet and ForestRiesz: Automatic Debiased Machine Learning with Neural Nets and Random Forests
Robust Kernel Density Estimation with Median-of-Means principle
Evolving Curricula with Regret-Based Environment Design
The Importance of Non-Markovianity in Maximum State Entropy Exploration
Adversarial Masking for Self-Supervised Learning
Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension
Inverse Contextual Bandits: Learning How Behavior Evolves over Time
Adversarial Attacks on Gaussian Process Bandits
Bayesian Nonparametrics for Offline Skill Discovery
PDE-Based Optimal Strategy for Unconstrained Online Learning
Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization
Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data
Shuffle Private Linear Contextual Bandits
Universality of Winning Tickets: A Renormalization Group Perspective
Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks
SDQ: Stochastic Differentiable Quantization with Mixed Precision
Direct Behavior Specification via Constrained Reinforcement Learning
Translating Robot Skills: Learning Unsupervised Skill Correspondences Across Robots
Error-driven Input Modulation: Solving the Credit Assignment Problem without a Backward Pass
Invariant Ancestry Search
Least Squares Estimation using Sketched Data with Heteroskedastic Errors
Robust Task Representations for Offline Meta-Reinforcement Learning via Contrastive Learning
Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering
A Hierarchical Transitive-Aligned Graph Kernel for Un-attributed Graphs
A Psychological Theory of Explainability
Composing Partial Differential Equations with Physics-Aware Neural Networks
Stochastic Rising Bandits
How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models
Failure and success of the spectral bias prediction for Laplace Kernel Ridge Regression: the case of low-dimensional data
Geometric Multimodal Contrastive Representation Learning
NOMU: Neural Optimization-based Model Uncertainty
Neural-Symbolic Models for Logical Queries on Knowledge Graphs
The Infinite Contextual Graph Markov Model
Federated Learning with Label Distribution Skew via Logits Calibration
RetrievalGuard: Provably Robust 1-Nearest Neighbor Image Retrieval
Exploring and Exploiting Hubness Priors for High-Quality GAN Latent Sampling
Deep symbolic regression for recurrence prediction
Kernel Methods for Radial Transformed Compositional Data with Many Zeros
Equivariance versus Augmentation for Spherical Images
Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers
Decomposing Temporal High-Order Interactions via Latent ODEs
Nearly Optimal Catoni’s M-estimator for Infinite Variance
Label-Free Explainability for Unsupervised Models
AutoIP: A United Framework to Integrate Physics into Gaussian Processes
PINs: Progressive Implicit Networks for Multi-Scale Neural Representations
Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
Nonlinear Feature Diffusion on Hypergraphs
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings
Towards Theoretical Analysis of Transformation Complexity of ReLU DNNs
Self-supervised Models are Good Teaching Assistants for Vision Transformers
ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases
Delay-Adaptive Step-sizes for Asynchronous Learning
Transfer Learning In Differential Privacy's Hybrid-Model
On the Finite-Time Performance of the Knowledge Gradient Algorithm
Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control
Dynamic Topic Models for Temporal Document Networks
Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits
Variational On-the-Fly Personalization
Re-evaluating Word Mover's Distance
Revisiting the Effects of Stochasticity for Hamiltonian Samplers
Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration
Difference Advantage Estimation for Multi-Agent Policy Gradients
A Joint Exponential Mechanism For Differentially Private Top-$k$
Being Properly Improper
Global Optimization Networks
On the Effects of Artificial Data Modification
Near-Exact Recovery for Tomographic Inverse Problems via Deep Learning
Robustness Implies Generalization via Data-Dependent Generalization Bounds
Order Constraints in Optimal Transport
Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing
Safe Exploration for Efficient Policy Evaluation and Comparison
Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning
Residual-Based Sampling for Online Outlier-Robust PCA
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
Efficient PAC Learning from the Crowd with Pairwise Comparisons
Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more
Generalized Federated Learning via Sharpness Aware Minimization
NISPA: Neuro-Inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks
Off-Policy Reinforcement Learning with Delayed Rewards
EAT-C: Environment-Adversarial sub-Task Curriculum for Efficient Reinforcement Learning
Towards understanding how momentum improves generalization in deep learning
One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams
Stochastic Reweighted Gradient Descent
Cooperative Online Learning in Stochastic and Adversarial MDPs
Analyzing and Mitigating Interference in Neural Architecture Search
Convergence of Invariant Graph Networks
Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees
Partial Label Learning via Label Influence Function
Instance Dependent Regret Analysis of Kernelized Bandits
A Theoretical Comparison of Graph Neural Network Extensions
Ripple Attention for Visual Perception with Sub-quadratic Complexity
Offline Meta-Reinforcement Learning with Online Self-Supervision
Prototype Based Classification from Hierarchy to Fairness
Understanding Robust Overfitting of Adversarial Training and Beyond
Fast Provably Robust Decision Trees and Boosting
MemSR: Training Memory-efficient Lightweight Model for Image Super-Resolution
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone
Deep Squared Euclidean Approximation to the Levenshtein Distance for DNA Storage
Simple and near-optimal algorithms for hidden stratification and multi-group learning
Function-space Inference with Sparse Implicit Processes
C*-algebra Net: A New Approach Generalizing Neural Network Parameters to C*-algebra
Equivariant Quantum Graph Circuits
Attentional Meta-learners for Few-shot Polythetic Classification
Neural Network Pruning Denoises the Features and Makes Local Connectivity Emerge in Visual Tasks
Nonparametric Embeddings of Sparse High-Order Interaction Events
Robust Models Are More Interpretable Because Attributions Look Normal
Divergence-Regularized Multi-Agent Actor-Critic
Class-Imbalanced Semi-Supervised Learning with Adaptive Thresholding
DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning
Generative Trees: Adversarial and Copycat
Nonparametric Sparse Tensor Factorization with Hierarchical Gamma Processes
Learning Domain Adaptive Object Detection with Probabilistic Teacher
Matching Structure for Dual Learning
Large Batch Experience Replay
Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets
Input Dependent Sparse Gaussian Processes
Sublinear-Time Clustering Oracle for Signed Graphs
Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications
Continuous Control with Action Quantization from Demonstrations
An iterative clustering algorithm for the Contextual Stochastic Block Model with optimality guarantees
Estimating Instance-dependent Bayes-label Transition Matrix using a Deep Neural Network
A Neural Tangent Kernel Perspective of GANs
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
Understanding Policy Gradient Algorithms: A Sensitivity-Based Approach
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Differentially Private Approximate Quantiles
Nonparametric Factor Trajectory Learning for Dynamic Tensor Decomposition
Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization
Centroid Approximation for Bootstrap: Improving Particle Quality at Inference
Learning Mixtures of Linear Dynamical Systems
Data-Efficient Double-Win Lottery Tickets from Robust Pre-training
Adaptive Data Analysis with Correlated Observations
$p$-Laplacian Based Graph Neural Networks
Adapting k-means Algorithms for Outliers
Smoothed Adaptive Weighting for Imbalanced Semi-Supervised Learning: Improve Reliability Against Unknown Distribution Data
Surrogate Likelihoods for Variational Annealed Importance Sampling
Learning to Hash Robustly, Guaranteed
Bounding Training Data Reconstruction in Private (Deep) Learning
Massively Parallel $k$-Means Clustering for Perturbation Resilient Instances
Bayesian Continuous-Time Tucker Decomposition
Consensus Multiplicative Weights Update: Learning to Learn using Projector-based Game Signatures
Fairness Interventions as (Dis)Incentives for Strategic Manipulation
Unsupervised Time-Series Representation Learning with Iterative Bilinear Temporal-Spectral Fusion
DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural Network for Traffic Flow Forecasting
Learning Stable Classifiers by Transferring Unstable Features
Greedy based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning
Fast-Rate PAC-Bayesian Generalization Bounds for Meta-Learning
Congested Bandits: Optimal Routing via Short-term Resets
For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria
A Resilient Distributed Boosting Algorithm
Structure Preserving Neural Networks: A Case Study in the Entropy Closure of the Boltzmann Equation
QSFL: A Two-Level Uplink Communication Optimization Framework for Federated Learning
Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification
Online Learning and Pricing with Reusable Resources: Linear Bandits with Sub-Exponential Rewards
SoQal: Selective Oracle Questioning for Consistency Based Active Learning of Cardiac Signals
The dynamics of representation learning in shallow, non-linear autoencoders
Bayesian Nonparametric Learning for Point Processes with Spatial Homogeneity: A Spatial Analysis of NBA Shot Locations
Structural Entropy Guided Graph Hierarchical Pooling
Stochastic Deep Networks with Linear Competing Units for Model-Agnostic Meta-Learning
Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence
Deep Probability Estimation
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization
Beyond Images: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features
Detecting Corrupted Labels Without Training a Model to Predict
Accelerated Gradient Methods for Geodesically Convex Optimization: Tractable Algorithms and Convergence Analysis
Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
Sample Efficient Learning of Predictors that Complement Humans
Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense
From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers
Generating 3D Molecules for Target Protein Binding
Deep and Flexible Graph Neural Architecture Search
Streaming Algorithms for Support-Aware Histograms
Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL
Score-Guided Intermediate Level Optimization: Fast Langevin Mixing for Inverse Problems
Optimal Algorithms for Stochastic Multi-Level Compositional Optimization
Off-Policy Evaluation for Large Action Spaces via Embeddings
ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training
Learning Bellman Complete Representations for Offline Policy Evaluation
Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning
GenLabel: Mixup Relabeling using Generative Models
Diversified Adversarial Attacks based on Conjugate Gradient Method
NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning
Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models
Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching
Efficient Test-Time Model Adaptation without Forgetting
Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity
CerDEQ: Certifiable Deep Equilibrium Model
PDO-s3DCNNs: Partial Differential Operator Based Steerable 3D CNNs
Optimization-Induced Graph Implicit Nonlinear Diffusion
G$^2$CN: Graph Gaussian Convolution Networks with Concentrated Graph Filters
Auxiliary Learning with Joint Task and Data Scheduling
Adversarial Attack and Defense for Non-Parametric Two-Sample Tests
Self-Supervised Representation Learning via Latent Graph Prediction
Mitigating Neural Network Overconfidence with Logit Normalization
Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets
Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
Online Learning with Knapsacks: the Best of Both Worlds
Safe Learning in Tree-Form Sequential Decision Making: Handling Hard and Soft Constraints
UnderGrad: A Universal Black-Box Optimization Method with Almost Dimension-Free Convergence Rate Guarantees
AdaGrad Avoids Saddle Points
PLATINUM: Semi-Supervised Model Agnostic Meta-Learning using Submodular Mutual Information
BabelTower: Learning to Auto-parallelized Program Translation
PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration
Fast Relative Entropy Coding with A* coding
LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood
Tranception: Protein Fitness Prediction with Autoregressive Transformers and Inference-time Retrieval
Deep equilibrium networks are sensitive to initialization statistics
Online Algorithms with Multiple Predictions
Batch Greenkhorn Algorithm for Entropic-Regularized Multimarginal Optimal Transport: Linear Rate of Convergence and Iteration Complexity
Dataset Condensation via Efficient Synthetic-Data Parameterization
Deep Neural Network Fusion via Graph Matching with Applications to Model Ensemble and Federated Learning
Batched Dueling Bandits
Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
High Probability Guarantees for Nonconvex Stochastic Gradient Descent with Heavy Tails
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning
Gradient Descent on Neurons and its Link to Approximate Second-order Optimization
Toward Compositional Generalization in Object-Oriented World Modeling
C-MinHash: Improving Minwise Hashing with Circulant Permutation
FOCUS: Familiar Objects in Common and Uncommon Settings
Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
(Non-)Convergence Results for Predictive Coding Networks
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Forget-free Continual Learning with Winning Subnetworks
Constraint-based graph network simulator
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Online Active Regression
Achieving Fairness at No Utility Cost via Data Reweighing with Influence
Additive Gaussian Processes Revisited
Neural Network Poisson Models for Behavioural and Neural Spike Train Data
Rotting Infinitely Many-Armed Bandits
Low-Complexity Deep Convolutional Neural Networks on Fully Homomorphic Encryption Using Multiplexed Parallel Convolutions
Scalable Spike-and-Slab
Active Nearest Neighbor Regression Through Delaunay Refinement
Active Multi-Task Representation Learning
Utility Theory for Sequential Decision Making
Generative Coarse-Graining of Molecular Conformations
Path-Aware and Structure-Preserving Generation of Synthetically Accessible Molecules
Improving Transformers with Probabilistic Attention Keys
Cross-Space Active Learning on Graph Convolutional Networks
Public Data-Assisted Mirror Descent for Private Model Training
Thompson Sampling for (Combinatorial) Pure Exploration
Greedy when Sure and Conservative when Uncertain about the Opponents
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
Online Balanced Experimental Design
Prioritized Training on Points that are Learnable, Worth Learning, and not yet Learnt
What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?
3PC: Three Point Compressors for Communication-Efficient Distributed Training and a Better Theory for Lazy Aggregation
TPC: Transformation-Specific Smoothing for Point Cloud Models
Large-Scale Graph Neural Architecture Search
Random Forest Density Estimation
Particle Transformer for Jet Tagging
A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds
Knowledge Base Question Answering by Case-based Reasoning over Subgraphs
Communication-efficient Distributed Learning for Large Batch Optimization
Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework
Metric-Fair Active Learning
Thresholded Lasso Bandit
Kill a Bird with Two Stones: Closing the Convergence Gaps in Non-Strongly Convex Optimization by Directly Accelerated SVRG with Double Compensation and Snapshots
Benchmarking and Analyzing Point Cloud Classification under Corruptions
Position Prediction as an Effective Pretraining Strategy
Generalizing to New Physical Systems via Context-Informed Dynamics Model
Multi-Level Branched Regularization for Federated Learning
Efficient Variance Reduction for Meta-learning
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Double Sampling Randomized Smoothing
Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders
History Compression via Language Models in Reinforcement Learning
Non-Vacuous Generalisation Bounds for Shallow Neural Networks