ICML 2020
Papers
Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting
Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints
Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay
The Usual Suspects? Reassessing Blame for VAE Posterior Collapse
A Markov Decision Process Model for Socio-Economic Systems Impacted by Climate Change
Low-Variance and Zero-Variance Baselines for Extensive-Form Games
Multi-Task Learning with User Preferences: Gradient Descent with Controlled Ascent in Pareto Optimization
Sequence Generation with Mixed Representations
Rate-distortion optimization guided autoencoder for isometric embedding in Euclidean latent space
Student Specialization in Deep Rectified Networks With Finite Width and Input Dimension
Scalable Deep Generative Modeling for Sparse Graphs
Efficient nonparametric statistical inference on population feature importance using Shapley values
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors
A Mean Field Analysis Of Deep ResNet And Beyond: Towards Provable Optimization Via Overparameterization From Depth
PowerNorm: Rethinking Batch Normalization in Transformers
It's Not What Machines Can Learn, It's What We Cannot Teach
Improving generalization by controlling label-noise information in neural network weights
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks
Inverse Active Sensing: Modeling and Understanding Timely Decision-Making
Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization
Sample Amplification: Increasing Dataset Size even when Learning is Impossible
Hypernetwork approach to generating point clouds
Convex Calibrated Surrogates for the Multi-Label F-Measure
Graph Optimal Transport for Cross-Domain Alignment
History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms
Adversarial Filters of Dataset Biases
Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels
BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates
Constrained Markov Decision Processes via Backward Value Functions
Hierarchical Generation of Molecular Graphs using Structural Motifs
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods
Informative Dropout for Robust Representation Learning: A Shape-bias Perspective
Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis
R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games
ConQUR: Mitigating Delusional Bias in Deep Q-Learning
Confidence-Aware Learning for Deep Neural Networks
Bandits for BMO Functions
Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent
Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing
Accelerating Large-Scale Inference with Anisotropic Vector Quantization
Differentiating through the Fréchet Mean
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs
Smaller, more accurate regression forests using tree alternating optimization
Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks
All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference
Explainable and Discourse Topic-aware Neural Language Understanding
Stochastic Latent Residual Video Prediction
NetGAN without GAN: From Random Walks to Low-Rank Approximations
Extra-gradient with player sampling for faster convergence in n-player games
Generalization via Derandomization
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning
Leveraging Frequency Analysis for Deep Fake Image Recognition
Linear bandits with Stochastic Delayed Feedback
Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders
WaveFlow: A Compact Flow-based Model for Raw Audio
Self-supervised Label Augmentation via Input Transformations
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
Interference and Generalization in Temporal Difference Learning
Invariant Rationalization
Accelerated Stochastic Gradient-free and Projection-free Methods
Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation
Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems
Robust Learning with the Hilbert-Schmidt Independence Criterion
Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems?
Inexact Tensor Methods with Dynamic Accuracies
Radioactive data: tracing through training
Fast Adaptation to New Environments via Policy-Dynamics Value Functions
Fast and Consistent Learning of Hidden Markov Models by Incorporating Non-Consecutive Correlations
Stochastically Dominant Distributional Reinforcement Learning
Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
Optimal approximation for unconstrained non-submodular minimization
Nonparametric Score Estimators
Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study
Agent57: Outperforming the Atari Human Benchmark
Monte-Carlo Tree Search as Regularized Policy Optimization
On the (In)tractability of Computing Normalizing Constants for the Product of Determinantal Point Processes
On Coresets for Regularized Regression
On the Expressivity of Neural Networks for Deep Reinforcement Learning
T-GD: Transferable GAN-generated Images Detection Framework
Optimally Solving Two-Agent Decentralized POMDPs Under One-Sided Information Sharing
How Good is the Bayes Posterior in Deep Neural Networks Really?
Fast and Private Submodular and $k$-Submodular Functions Maximization with Matroid Constraints
Learning Flat Latent Manifolds with VAEs
Online Dense Subgraph Discovery via Blurred-Graph Feedback
Linear Lower Bounds and Conditioning of Differentiable Games
Refined bounds for algorithm configuration: The knife-edge of dual class approximability
Optimal Randomized First-Order Methods for Least-Squares Problems
Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space
Fiedler Regularization: Learning Neural Networks with Graph Sparsity
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation
Learning What to Defer for Maximum Independent Sets
NGBoost: Natural Gradient Boosting for Probabilistic Prediction
Perceptual Generative Autoencoders
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning
Private Query Release Assisted by Public Data
On Validation and Planning of An Optimal Decision Rule with Application in Healthcare Studies
Randomized Smoothing of All Shapes and Sizes
Dispersed Exponential Family Mixture VAEs for Interpretable Text Generation
Searching to Exploit Memorization Effect in Learning with Noisy Labels
Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data
Deep Graph Random Process for Relational-Thinking-Based Speech Recognition
Asynchronous Coagent Networks
Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data
Single Point Transductive Prediction
Provable Self-Play Algorithms for Competitive Reinforcement Learning
Lookahead-Bounded Q-learning
Kernelized Stein Discrepancy Tests of Goodness-of-fit for Time-to-Event Data
Provable Representation Learning for Imitation Learning via Bi-level Optimization
What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?
Bandits with Adversarial Scaling
On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation
LEEP: A New Measure to Evaluate Transferability of Learned Representations
Designing Optimal Dynamic Treatment Regimes: A Causal Reinforcement Learning Approach
Expert Learning through Generalized Inverse Multiobjective Optimization: Models, Insights, and Algorithms
Low-Rank Bottleneck in Multi-head Attention Models
Reward-Free Exploration for Reinforcement Learning
Upper bounds for Model-Free Row-Sparse Principal Component Analysis
Accelerating the diffusion-based ensemble sampling by non-reversible dynamics
Learning To Stop While Learning To Predict
p-Norm Flow Diffusion for Local Graph Clustering
Latent Bernoulli Autoencoder
Data Valuation using Reinforcement Learning
Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks
Disentangling Trainability and Generalization in Deep Neural Networks
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
Which Tasks Should Be Learned Together in Multi-task Learning?
Adversarial Risk via Optimal Transport and Optimal Couplings
Boosting for Control of Dynamical Systems
Lifted Disjoint Paths with Application in Multiple Object Tracking
Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination
Decoupled Greedy Learning of CNNs
Overfitting in adversarially robust deep learning
Second-Order Provable Defenses against Adversarial Attacks
Parameterized Rate-Distortion Stochastic Encoder
MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time
Learning Robot Skills with Temporal Variational Inference
Counterfactual Cross-Validation: Stable Model Selection Procedure for Causal Inference Models
Revisiting Spatial Invariance with Low-Rank Local Connectivity
Deep Reasoning Networks for Unsupervised Pattern De-mixing with Constraint Reasoning
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Generalized and Scalable Optimal Sparse Decision Trees
SIGUA: Forgetting May Make Learning with Noisy Labels More Robust
Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information
Tensor denoising and completion based on ordinal observations
Video Prediction via Example Guidance
Efficient Continuous Pareto Exploration in Multi-Task Learning
Efficient Policy Learning from Surrogate-Loss Classification Reductions
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
On Efficient Low Distortion Ultrametric Embedding
Data preprocessing to mitigate bias: A maximum entropy based approach
Global Concavity and Optimization in a Class of Dynamic Discrete Choice Models
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
Scalable Gaussian Process Separation for Kernels with a Non-Stationary Phase
Streaming k-Submodular Maximization under Noise subject to Size Constraint
CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information
Small Data, Big Decisions: Model Selection in the Small-Data Regime
Distributionally Robust Policy Evaluation and Learning in Offline Contextual Bandits
An Accelerated DFO Algorithm for Finite-sum Convex Functions
Finding trainable sparse networks through Neural Tangent Transfer
Learning Quadratic Games on Networks
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination
On the Sample Complexity of Adversarial Multi-Source PAC Learning
Super-efficiency of automatic differentiation for functions defined as a minimum
Scalable Identification of Partially Observed Systems with Certainty-Equivalent EM
Learning to Learn Kernels with Variational Random Features
A distributional view on multi-objective policy optimization
Learning Autoencoders with Relational Regularization
Bayesian Sparsification of Deep C-valued Networks
Neural Contextual Bandits with UCB-based Exploration
Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Lorentz Group Equivariant Neural Network for Particle Physics
Interpolation between Residual and Non-Residual Networks
Collaborative Machine Learning with Incentive-Aware Model Rewards
Forecasting Sequential Data Using Consistent Koopman Autoencoders
The Performance Analysis of Generalized Margin Maximizers on Separable Data
Adversarial Attacks on Probabilistic Autoregressive Forecasting Models
From Chaos to Order: Symmetry and Conservation Laws in Game Dynamics
Training Binary Neural Networks using the Bayesian Learning Rule
Learning the piece-wise constant graph structure of a varying Ising model
Efficiently Solving MDPs with Stochastic Mirror Descent
Robust Graph Representation Learning via Neural Sparsification
Handling the Positive-Definite Constraint in the Bayesian Learning Rule
Interpretable, Multidimensional, Multimodal Anomaly Detection with Negative Sampling for Detection of Device Failure
Optimal Bounds between f-Divergences and Integral Probability Metrics
Likelihood-free MCMC with Amortized Approximate Ratio Estimators
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
An Investigation of Why Overparameterization Exacerbates Spurious Correlations
Double-Loop Unadjusted Langevin Algorithm
Provable guarantees for decision tree induction: the agnostic setting
Learning Optimal Tree Models under Beam Search
Attacks Which Do Not Kill Training Make Adversarial Learning Stronger
Towards Adaptive Residual Network Training: A Neural-ODE Perspective
Estimating Model Uncertainty of Neural Networks in Sparse Information Form
Is Local SGD Better than Minibatch SGD?
Estimating the Number and Effect Sizes of Non-null Hypotheses
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks
Scalable and Efficient Comparison-based Search without Features
Identifying Statistical Bias in Dataset Replication
On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm
Learning and Sampling of Atomic Interventions from Observations
Reliable Fidelity and Diversity Metrics for Generative Models
Combining Differentiable PDE Solvers and Graph Neural Networks for Fluid Flow Prediction
Evolutionary Topology Search for Tensor Network Decomposition
Randomization matters. How to defend against strong adversarial attacks
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Bidirectional Model-based Policy Optimization
Learning to Rank Learning Curves
Understanding and Mitigating the Tradeoff between Robustness and Accuracy
Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling
Non-autoregressive Machine Translation with Disentangled Context Transformer
Estimating Generalization under Distribution Shifts via Domain-Invariant Representations
Error-Bounded Correction of Noisy Labels
Population-Based Black-Box Optimization for Biological Sequence Design
Improved Optimistic Algorithms for Logistic Bandits
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning
Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension
Boosted Histogram Transform for Regression
Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors
Knowing The What But Not The Where in Bayesian Optimization
Implicit Euler Skip Connections: Enhancing Adversarial Robustness via Numerical Stability
Black-Box Methods for Restoring Monotonicity
Multi-fidelity Bayesian Optimization with Max-value Entropy Search and its Parallelization
A Flexible Framework for Nonparametric Graphical Modeling that Accommodates Machine Learning
Adversarial Robustness via Runtime Masking and Cleansing
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
Aggregation of Multiple Knockoffs
Multi-objective Bayesian Optimization using Pareto-frontier Entropy
On the Generalization Benefit of Noise in Stochastic Gradient Descent
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers
Does label smoothing mitigate label noise?
Variational Bayesian Quantization
Non-Stationary Delayed Bandits with Intermediate Observations
Evaluating Machine Accuracy on ImageNet
Composable Sketches for Functions of Frequencies: Beyond the Worst Case
The Implicit and Explicit Regularization Effects of Dropout
Decision Trees for Decision-Making under the Predict-then-Optimize Framework
Adaptive Estimator Selection for Off-Policy Evaluation
Two Simple Ways to Learn Individual Fairness Metrics from Data
Kernel Methods for Cooperative Multi-Agent Contextual Bandits
On the Theoretical Properties of the Network Jackknife
Bisection-Based Pricing for Repeated Contextual Auctions against Strategic Buyer
Bayesian Graph Neural Networks with Adaptive Connection Sampling
On Breaking Deep Generative Model-based Defenses and Beyond
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
Representation Learning via Adversarially-Contrastive Optimal Transport
Thompson Sampling via Local Uncertainty
Meta Variance Transfer: Learning to Augment from the Others
Abstraction Mechanisms Predict Generalization in Deep Neural Networks
Coresets for Clustering in Graphs of Bounded Treewidth
Learning the Valuations of a $k$-demand Agent
Graph Homomorphism Convolution
Bounding the fairness and accuracy of classifiers from population statistics
Distribution Augmentation for Generative Modeling
Revisiting Fundamentals of Experience Replay
Haar Graph Pooling
Nested Subspace Arrangement for Representation of Relational Data
Deep Molecular Programming: A Natural Implementation of Binary-Weight ReLU Neural Networks
DINO: Distributed Newton-Type Optimization Method
FedBoost: A Communication-Efficient Algorithm for Federated Learning
Healing Products of Gaussian Process Experts
PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions
Online Bayesian Moment Matching based SAT Solver Heuristics
Robust and Stable Black Box Explanations
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
Implicit Geometric Regularization for Learning Shapes
On conditional versus marginal bias in multi-armed bandits
Influence Diagram Bandits: Variational Thompson Sampling for Structured Bandit Problems
Loss Function Search for Face Recognition
Circuit-Based Intrinsic Methods to Detect Overfitting
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback
Implicit competitive regularization in GANs
Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model
Inter-domain Deep Gaussian Processes
Mapping natural-language problems to formal-language solutions using structured neural representations
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
Source Separation with Deep Generative Priors
Non-Autoregressive Neural Text-to-Speech
DropNet: Reducing Neural Network Complexity via Iterative Pruning
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
Transformation of ReLU-based recurrent neural networks from discrete-time to continuous-time
Black-box Certification and Learning under Adversarial Perturbations
A Chance-Constrained Generative Framework for Sequence Optimization
On the Number of Linear Regions of Convolutional Neural Networks
Detecting Out-of-Distribution Examples with Gram Matrices
When deep denoising meets iterative phase retrieval
Predicting deliberative outcomes
Contrastive Multi-View Representation Learning on Graphs
On Variational Learning of Controllable Representations for Text without Supervision
Optimistic Bounds for Multi-output Learning
Multi-step Greedy Reinforcement Learning Algorithms
Amortised Learning by Wake-Sleep
Near-optimal sample complexity bounds for learning Latent $k$-polytopes and applications to Ad-Mixtures
Online Learning for Active Cache Synchronization
Time-aware Large Kernel Convolutions
Strength from Weakness: Fast Learning Using Weak Supervision
Gradient Temporal-Difference Learning with Regularized Corrections
Deep Streaming Label Learning
Learning to Branch for Multi-Task Learning
On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems
NADS: Neural Architecture Distribution Search for Uncertainty Awareness
Improving Transformer Optimization Through Better Initialization
Learning and Evaluating Contextual Embedding of Source Code
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions
Accountable Off-Policy Evaluation With Kernel Bellman Statistics
Learning Mixtures of Graphs from Epidemic Cascades
Do GANs always have Nash equilibria?
The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent
Improving the Gating Mechanism of Recurrent Neural Networks
Parameter-free, Dynamic, and Strongly-Adaptive Online Learning
Hierarchical Verification for Adversarial Robustness
From Sets to Multisets: Provable Variational Inference for Probabilistic Integer Submodular Models
BINOCULARS for efficient, nonmyopic sequential experimental design
Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization
Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models
Near-linear time Gaussian process optimization with adaptive batching and resparsification
Learning with Bounded Instance- and Label-dependent Label Noise
Temporal Phenotyping using Deep Predictive Clustering of Disease Progression
How recurrent networks implement contextual processing in sentiment analysis
Selective Dyna-style Planning Under Limited Model Capacity
Zeno++: Robust Fully Asynchronous SGD
Time-Consistent Self-Supervision for Semi-Supervised Learning
Spectral Graph Matching and Regularized Quadratic Relaxations: Algorithm and Theory
Training Deep Energy-Based Models with f-Divergence Minimization
On the consistency of top-k surrogate losses
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
Representations for Stable Off-Policy Reinforcement Learning
Transparency Promotion with Model-Agnostic Linear Competitors
StochasticRank: Global Optimization of Scale-Free Discrete Functions
Provable Smoothness Guarantees for Black-Box Variational Inference
Boosting Deep Neural Network Efficiency with Dual-Module Inference
Adversarial Attacks on Copyright Detection Systems
Countering Language Drift with Seeded Iterated Learning
Compressive sensing with un-trained neural networks: Gradient descent finds a smooth approximation
Latent Variable Modelling with Hyperbolic Normalizing Flows
Mutual Transfer Learning for Massive Data
A general recurrent state space framework for modeling neural dynamics during decision-making
PackIt: A Virtual Environment for Geometric Planning
Representing Unordered Data Using Complex-Weighted Multiset Automata
The Differentiable Cross-Entropy Method
Domain Adaptive Imitation Learning
Generalization to New Actions in Reinforcement Learning
Better depth-width trade-offs for neural networks through the lens of dynamical systems
Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization
Convolutional Kernel Networks for Graph-Structured Data
Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling
Bridging the Gap Between f-GANs and Wasserstein GANs
Learning with Good Feature Representations in Bandits and in RL with a Generative Model
Correlation Clustering with Asymmetric Classification Errors
Learning Similarity Metrics for Numerical Simulations
AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation
Sparsified Linear Programming for Zero-Sum Equilibrium Finding
Aligned Cross Entropy for Non-Autoregressive Machine Translation
Supervised Quantile Normalization for Low Rank Matrix Factorization
Adversarial Nonnegative Matrix Factorization
Multigrid Neural Memory
Adaptive Sampling for Estimating Probability Distributions
Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings
Tails of Lipschitz Triangular Flows
Inductive Relation Prediction by Subgraph Reasoning
Thompson Sampling Algorithms for Mean-Variance Bandits
Operation-Aware Soft Channel Pruning using Differentiable Masks
Boosting Frank-Wolfe by Chasing Gradients
Stochastic Regret Minimization in Extensive-Form Games
On hyperparameter tuning in general clustering problems
Simultaneous Inference for Massive Data: Distributed Bootstrap
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
Continuous Time Bayesian Networks with Clocks
T-Basis: a Compact Representation for Neural Networks
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks
Evaluating Lossy Compression Rates of Deep Generative Models
How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization
Extreme Multi-label Classification from Aggregated Labels
Familywise Error Rate Control by Interactive Unmasking
Optimizer Benchmarking Needs to Account for Hyperparameter Tuning
Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks
CoMic: Complementary Task Learning & Mimicry for Reusable Skills
Implicit differentiation of Lasso-type models for hyperparameter optimization
Revisiting Training Strategies and Generalization Performance in Deep Metric Learning
On Efficient Constructions of Checkpoints
Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization
Self-Modulating Nonparametric Event-Tensor Factorization
Bio-Inspired Hashing for Unsupervised Similarity Search
Full Law Identification in Graphical Models of Missing Data: Completeness Results
Self-Attentive Associative Memory
Hallucinative Topological Memory for Zero-Shot Visual Planning
MetaFun: Meta-Learning with Iterative Functional Updates
Improving Generative Imagination in Object-Centric World Models
Sequential Transfer in Reinforcement Learning with a Generative Model
VideoOneNet: Bidirectional Convolutional Recurrent OneNet with Trainable Data Steps for Video Processing
Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
Feature Quantization Improves GAN Training
Amortized Finite Element Analysis for Fast PDE-Constrained Optimization
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates
Temporal Logic Point Processes
Rank Aggregation from Pairwise Comparisons in the Presence of Adversarial Corruptions
Optimizing Data Usage via Differentiable Rewards
Finite-Time Convergence in Continuous-Time Optimization
Estimation of Bounds on Potential Outcomes For Decision Making
Undirected Graphical Models as Approximate Posteriors
Deep Gaussian Markov Random Fields
Adaptive Reward-Poisoning Attacks against Reinforcement Learning
Adversarial Neural Pruning with Latent Vulnerability Suppression
Online Control of the False Coverage Rate and False Sign Rate
Stronger and Faster Wasserstein Adversarial Attacks
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy
Planning to Explore via Self-Supervised World Models
Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics
Multinomial Logit Bandit with Low Switching Cost
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
Task-Oriented Active Perception and Planning in Environments with Partially Known Semantics
Defense Through Diverse Directions
Min-Max Optimization without Gradients: Convergence and Applications to Black-Box Evasion and Poisoning Attacks
Neural Architecture Search in A Proxy Validation Loss Landscape
AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks
Multilinear Latent Conditioning for Generating Unseen Attribute Combinations
One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control
Sparse Sinkhorn Attention
Feature Noise Induces Loss Discrepancy Across Groups
Oracle Efficient Private Non-Convex Optimization
Rigging the Lottery: Making All Tickets Winners
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks
Improving Molecular Design by Stochastic Iterative Target Augmentation
FetchSGD: Communication-Efficient Federated Learning with Sketching
Two Routes to Scalable Credit Assignment without Weight Symmetry
A Pairwise Fair and Community-preserving Approach to k-Center Clustering
Identifying the Reward Function by Anchor Actions
Probing Emergent Semantics in Predictive Agents via Question Answering
Conditional gradient methods for stochastically constrained convex minimization
Infinite attention: NNGP and NTK for deep attention networks
LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction
ControlVAE: Controllable Variational Autoencoder
Accelerated Message Passing for Entropy-Regularized MAP Inference
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
Leveraging Procedural Generation to Benchmark Reinforcement Learning
Up or Down? Adaptive Rounding for Post-Training Quantization
Meta-learning for Mixed Linear Regression
A Sample Complexity Separation between Non-Convex and Convex Meta-Learning
Variance Reduction and Quasi-Newton for Particle-Based Variational Inference
Individual Fairness for k-Clustering
Predictive Multiplicity in Classification
A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"
Supervised learning: no loss no cry
Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks
From Local SGD to Local Fixed-Point Methods for Federated Learning
A Simple Framework for Contrastive Learning of Visual Representations
Spectral Subsampling MCMC for Stationary Time Series
Efficient Proximal Mapping of the 1-path-norm of Shallow Networks
Learning with Feature and Distribution Evolvable Streams
State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes
Angular Visual Hardness
Black-Box Variational Inference as a Parametric Approximation to Langevin Dynamics
Reducing Sampling Error in Batch Temporal Difference Learning
“Other-Play” for Zero-Shot Coordination
Efficiently Learning Adversarially Robust Halfspaces with Noise
Progressive Identification of True Labels for Partial-Label Learning
Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors
An Optimistic Perspective on Offline Deep Reinforcement Learning
Word-Level Speech Recognition With a Letter to Word Encoder
Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence
Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion
Multi-Agent Routing Value Iteration Network
Inferring DQN structure for high-dimensional continuous control
Distributed Online Optimization over a Heterogeneous Network
Fair Learning with Private Demographic Data
Predictive Sampling with Forecasting Autoregressive Models
On Learning Sets of Symmetric Elements
DessiLBI: Exploring Structural Sparsity of Deep Networks via Differential Inclusion Paths
Differentiable Product Quantization for End-to-End Embedding Compression
Doubly robust off-policy evaluation with shrinkage
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
Description Based Text Classification with Reinforcement Learning
Meta-Learning with Shared Amortized Variational Inference
Neural Clustering Processes
Growing Action Spaces
On a projective ensemble approach to two sample test for equality of distributions
Estimating the Error of Randomized Newton Methods: A Bootstrap Approach
Adding seemingly uninformative labels helps in low data regimes
Low-loss connection of weight vectors: distribution-based approaches
Optimal Differential Privacy Composition for Exponential Mechanisms
A Flexible Latent Space Model for Multilayer Networks
Gradient-free Online Learning in Continuous Games with Delayed Rewards
Working Memory Graphs
Bayesian Optimisation over Multiple Continuous and Categorical Inputs
A Generative Model for Molecular Distance Geometry
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources
Interpreting Robust Optimization via Adversarial Influence Functions
Lower Complexity Bounds for Finite-Sum Convex-Concave Minimax Optimization Problems
Towards Understanding the Dynamics of the First-Order Adversaries
An Imitation Learning Approach for Cache Replacement
Option Discovery in the Absence of Rewards with Manifold Analysis
Learning Selection Strategies in Buchberger’s Algorithm
Adversarial Robustness Against the Union of Multiple Perturbation Models
Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate
(Locally) Differentially Private Combinatorial Semi-Bandits
A Game Theoretic Framework for Model Based Reinforcement Learning
Streaming Coresets for Symmetric Tensor Factorization
Batch Stationary Distribution Estimation
The FAST Algorithm for Submodular Maximization
Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning
Stochastic Optimization for Non-convex Inf-Projection Problems
Retrieval Augmented Language Model Pre-Training
When Does Self-Supervision Help Graph Convolutional Networks?
A Tree-Structured Decoder for Image-to-Markup Generation
Neural Network Control Policy Verification With Persistent Adversarial Perturbation
Normalized Loss Functions for Deep Learning with Noisy Labels
k-means++: few more steps yield constant approximation
Polynomial Tensor Sketch for Element-wise Function of Low-Rank Matrix
Let's Agree to Agree: Neural Networks Share Classification Order on Real Datasets
Stochastic Subspace Cubic Newton Method
Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Variational Inference for Sequential Data with Future Likelihood Estimates
On the Noisy Gradient Descent that Generalizes as SGD
Sub-Goal Trees – a Framework for Goal-Based Reinforcement Learning
On Contrastive Learning for Likelihood-free Inference
Variational Autoencoders with Riemannian Brownian Motion Priors
Distinguishing Cause from Effect Using Quantiles: Bivariate Quantile Causal Discovery
How to Solve Fair k-Center in Massive Data Models
Batch Reinforcement Learning with Hyperparameter Gradients
A Geometric Approach to Archetypal Analysis via Sparse Projections
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
Sets Clustering
Data-Efficient Image Recognition with Contrastive Predictive Coding
Online Convex Optimization in the Random Order Model
Normalizing Flows on Tori and Spheres
Causal Effect Estimation and Optimal Dose Suggestions in Mobile Health
Dual-Path Distillation: A Unified Framework to Improve Black-Box Attacks
Fairwashing explanations with off-manifold detergent
More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models
Implicit Generative Modeling for Efficient Exploration
Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation
Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows
Streaming Submodular Maximization under a k-Set System Constraint
Naive Exploration is Optimal for Online LQR
Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks
The Cost-free Nature of Optimally Tuning Tikhonov Regularizers and Other Ordered Smoothers
Inductive-bias-driven Reinforcement Learning For Efficient Schedules in Heterogeneous Clusters
The Boomerang Sampler
Weakly-Supervised Disentanglement Without Compromises
LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments
Goodness-of-Fit Tests for Inhomogeneous Random Graphs
Towards Understanding the Regularization of Adversarial Robustness on Neural Networks
Neural Datalog Through Time: Informed Temporal Modeling via Logical Specification
Safe Reinforcement Learning in Constrained Markov Decision Processes
Reinforcement Learning for Integer Programming: Learning to Cut
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
Online Multi-Kernel Learning with Graph-Structured Feedback
A Swiss Army Knife for Minimax Optimal Transport
Deep Reinforcement Learning with Smooth Policy
Tightening Exploration in Upper Confidence Reinforcement Learning
Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits
Fast computation of Nash Equilibria in Imperfect Information Games
The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
Learning disconnected manifolds: a no GAN's land
Low Bias Low Variance Gradient Estimates for Hierarchical Boolean Stochastic Networks
Optimizing for the Future in Non-Stationary MDPs
Explainable k-Means and k-Medians Clustering
Goal-Aware Prediction: Learning to Model What Matters
Laplacian Regularized Few-Shot Learning
Learning Compound Tasks without Task-specific Knowledge via Imitation and Self-supervised Learning
What can I do here? A Theory of Affordances in Reinforcement Learning
Automatic Shortcut Removal for Self-Supervised Representation Learning
Simple and sharp analysis of k-means||
Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification
Characterizing Distribution Equivalence and Structure Learning for Cyclic and Acyclic Directed Graphs
Discriminative Adversarial Search for Abstractive Summarization
Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness
On the Power of Compressed Sensing with Generative Models
Optimal Sequential Maximization: One Interview is Enough!
Invariant Causal Prediction for Block MDPs
Efficiently sampling functions from Gaussian process posteriors
On Second-Order Group Influence Functions for Black-Box Predictions
Randomly Projected Additive Gaussian Processes for Regression
Involutive MCMC: a Unifying Framework
Fair k-Centers via Maximum Matching
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
Missing Data Imputation using Optimal Transport
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case
Online Continual Learning from Imbalanced Data
AdaScale SGD: A User-Friendly Algorithm for Distributed Training
Automated Synthetic-to-Real Generalization
Structure Adaptive Algorithms for Stochastic Bandits
Uncertainty Estimation Using a Single Deep Deterministic Neural Network
Partial Trace Regression and Low-Rank Kraus Decomposition
Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation
Extrapolation for Large-batch Training in Deep Learning
Preselection Bandits
Provably Efficient Model-based Policy Adaptation
Causal Inference using Gaussian Processes with Structured Latent Confounders
Robustifying Sequential Neural Processes
Optimizing Dynamic Structures with Bayesian Generative Search
Problems with Shapley-value-based explanations as feature importance measures
Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning
No-Regret and Incentive-Compatible Online Learning
Gamification of Pure Exploration for Linear Bandits
Sparse Gaussian Processes with Spherical Harmonic Features
Stochastic Optimization for Regularized Wasserstein Estimators
Differentially Private Set Union
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Poisson Learning: Graph Based Semi-Supervised Learning At Very Low Label Rates
Non-separable Non-stationary random fields
Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
Ordinal Non-negative Matrix Factorization for Recommendation
Generalization Error of Generalized Linear Models in High Dimensions
Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks
Logarithmic Regret for Adversarial Online Control
Causal Modeling for Fairness In Dynamical Systems
Quantum Boosting
Skew-Fit: State-Covering Self-Supervised Reinforcement Learning
The Tree Ensemble Layer: Differentiability meets Conditional Computation
ACFlow: Flow Models for Arbitrary Conditional Likelihoods
Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models
Combinatorial Pure Exploration for Dueling Bandit
Error Estimation for Sketched SVD via the Bootstrap
Generalization and Representational Limits of Graph Neural Networks
Enhanced POET: Open-ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Deep Isometric Learning for Visual Recognition
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift
Dual Mirror Descent for Online Allocation Problems
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Preference Modeling with Context-Dependent Salient Features
Efficient Non-conjugate Gaussian Process Factor Models for Spike Count Data using Polynomial Approximations
Growing Adaptive Multi-hyperplane Machines
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
On Semi-parametric Inference for BART
RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr
Optimization from Structured Samples for Coverage Functions
Online Learning with Imperfect Hints
Alleviating Privacy Attacks via Causal Learning
Efficient Identification in Linear Structural Causal Models with Auxiliary Cutsets
The Effect of Natural Distribution Shift on Question Answering Models
Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis
Feature Selection using Stochastic Gates
Information-Theoretic Local Minima Characterization and Regularization
Online Learned Continual Compression with Adaptive Quantization Modules
Semi-Supervised Learning with Normalizing Flows
Fractal Gaussian Networks: A sparse random graph model based on Gaussian Multiplicative Chaos
Educating Text Autoencoders: Latent Representation Guidance via Denoising
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization
Constant Curvature Graph Convolutional Networks
Kernel interpolation with continuous volume sampling
IPBoost – Non-Convex Boosting via Integer Programming
Adaptive Gradient Descent without Descent
Concise Explanations of Neural Networks using Adversarial Training
Federated Learning with Only Positive Labels
The continuous categorical: a novel simplex-valued exponential family
PENNI: Pruned Kernel Sharing for Efficient CNN Inference
A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
Efficient Intervention Design for Causal Discovery with Latents
Online Learning with Dependent Stochastic Feedback Graphs
Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition
Maximum-and-Concatenation Networks
Optimistic Policy Optimization with Bandit Feedback
Moniqua: Modulo Quantized Communication in Decentralized SGD
Why Are Learned Indexes So Effective?
Learning Human Objectives by Evaluating Hypothetical Behavior
The Shapley Taylor Interaction Index
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
SGD Learns One-Layer Networks in WGANs
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Faster Graph Embeddings via Coarsening
CURL: Contrastive Unsupervised Representations for Reinforcement Learning
Generative Pretraining From Pixels
Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search
Amortized Population Gibbs Samplers with Neural Sufficient Statistics
What Can Learned Intrinsic Rewards Capture?
Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards
Set Functions for Time Series
Variational Imitation Learning with Diverse-quality Demonstrations
The Role of Regularization in Classification of High-dimensional Noisy Gaussian Mixture
A quantile-based approach for hyperparameter transfer learning
DeepCoDA: personalized interpretability for compositional health data
Optimal Non-parametric Learning in Repeated Contextual Auctions with Strategic Buyer
No-Regret Exploration in Goal-Oriented Reinforcement Learning
Unsupervised Speech Decomposition via Triple Information Bottleneck
Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation
Fast Differentiable Sorting and Ranking
Implicit Regularization of Random Feature Models
Continuous Graph Neural Networks
Consistent Estimators for Learning to Defer to an Expert
A Graph to Graphs Framework for Retrosynthesis Prediction
Learning Calibratable Policies using Programmatic Style-Consistency
Distance Metric Learning with Joint Representation Diversification
Near-optimal Regret Bounds for Stochastic Shortest Path
ECLIPSE: An Extreme-Scale Linear Program Solver for Web-Applications
Model-Based Reinforcement Learning with Value-Targeted Regression
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition
On the Global Convergence Rates of Softmax Policy Gradient Methods
Learnable Group Transform For Time-Series
Fair Generative Modeling via Weak Supervision
Obtaining Adjustable Regularization for Free via Iterate Averaging
Invariant Risk Minimization Games
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Input-Sparsity Low Rank Approximation in Schatten Norm
Convolutional dictionary learning based auto-encoders for natural exponential-family distributions
Optimal transport mapping via input convex neural networks
Visual Grounding of Learned Physical Models
GraphOpt: Learning Optimization Models of Graph Formation
An EM Approach to Non-autoregressive Conditional Sequence Generation
Training Neural Networks for and by Interpolation
On the Iteration Complexity of Hypergradient Computation
Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning
InstaHide: Instance-hiding Schemes for Private Distributed Learning
Variance Reduction in Stochastic Particle-Optimization Sampling
The Sample Complexity of Best-$k$ Items Selection from Pairwise Comparisons
Efficient Domain Generalization via Common-Specific Low-Rank Decomposition
Approximating Stacked and Bidirectional Recurrent Architectures with the Delayed Recurrent Neural Network
Born-again Tree Ensembles
Learning to Simulate Complex Physics with Graph Networks
Learning to Simulate and Design for Structural Engineering
Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere
Efficient Robustness Certificates for Discrete Data: Sparsity-Aware Randomized Smoothing for Graphs, Images and More
Towards a General Theory of Infinite-Width Limits of Neural Classifiers
PolyGen: An Autoregressive Generative Model of 3D Meshes
Multiclass Neural Network Minimization via Tropical Newton Polytope Approximation
Structured Prediction with Partial Labelling through the Infimum Loss
XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Constructive Universal High-Dimensional Distribution Generation through Deep ReLU Networks
Regularized Optimal Transport is Ground Cost Adversarial
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Hierarchically Decoupled Imitation For Morphological Transfer
Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack
Equivariant Flows: Exact Likelihood Generative Learning for Symmetric Densities
The Buckley-Osthus model and the block preferential attachment model: statistical analysis and application
Causal Strategic Linear Regression
Online metric algorithms with untrusted predictions
Piecewise Linear Regression via a Difference of Convex Functions
Robustness to Spurious Correlations via Human Annotations
On Learning Language-Invariant Representations for Universal Machine Translation
A simpler approach to accelerated optimization: iterative averaging meets optimism
Sub-linear Memory Sketches for Near Neighbor Search on Streaming Data
Neural Topic Modeling with Continual Lifelong Learning
On Lp-norm Robustness of Ensemble Decision Stumps and Trees
Recht-Ré Noncommutative Arithmetic-Geometric Mean Conjecture is False
High-dimensional Robust Mean Estimation via Gradient Descent
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
On Implicit Regularization in $\beta$-VAEs
Uncertainty quantification for nonconvex tensor completion: Confidence intervals, heteroscedasticity and optimality
A Nearly-Linear Time Algorithm for Exact Community Recovery in Stochastic Block Model
Concept Bottleneck Models
FACT: A Diagnostic for Group Fairness Trade-offs
Data Amplification: Instance-Optimal Property Estimation
Stochastic Hamiltonian Gradient Methods for Smooth Games
DROCC: Deep Robust One-Class Classification
Predictive Coding for Locally-Linear Control
Understanding Self-Training for Gradual Domain Adaptation
Improved Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance
Do We Need Zero Training Loss After Achieving Zero Training Error?
On Thompson Sampling with Langevin Algorithms
Strategic Classification is Causal Modeling in Disguise
Decentralised Learning with Random Features and Distributed Gradient Descent
Topic Modeling via Full Dependence Mixtures
Transformer Hawkes Process
Adversarial Mutual Information for Text Generation
Optimal Estimator for Unlabeled Linear Regression
Learning Near Optimal Policies with Low Inherent Bellman Error
Margin-aware Adversarial Domain Adaptation with Optimal Transport
Message Passing Least Squares Framework and its Application to Rotation Synchronization
Improving Robustness of Deep-Learning-Based Image Reconstruction
Domain Aggregation Networks for Multi-Source Domain Adaptation
Recurrent Hierarchical Topic-Guided RNN for Language Generation
Closing the convergence gap of SGD without replacement
Learning to Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning
Emergence of Separable Manifolds in Deep Language Representations
Recovery of Sparse Signals from a Mixture of Linear Samples
Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript
Flexible and Efficient Long-Range Planning Through Curious Exploration
Private Outsourced Bayesian Optimization
Sparse Convex Optimization via Adaptively Regularized Hard Thresholding
Orthogonalized SGD and Nested Architectures for Anytime Neural Networks
Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions
Improved Communication Cost in Distributed PageRank Computation – A Theoretical Study
Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems
On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
Causal Structure Discovery from Distributions Arising from Mixtures of DAGs
A Distributional Framework For Data Valuation
Customizing ML Predictions for Online Algorithms
Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization
Individual Calibration with Randomized Forecasting
Bayesian Differential Privacy for Machine Learning
Logistic Regression for Massive Data with Rare Events
Meta-learning with Stochastic Linear Bandits
Parallel Algorithm for Non-Monotone DR-Submodular Maximization
Deep Divergence Learning
TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics
A new regret analysis for Adam-type algorithms
InfoGAN-CR and ModelCentrality: Self-supervised Model Training and Selection for Disentangling GANs
Adversarial Robustness for Code
Curvature-corrected learning dynamics in deep neural networks
Attentive Group Equivariant Convolutional Networks
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation
Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization
Causal Effect Identifiability under Partial-Observability
Scalable Exact Inference in Multi-Output Gaussian Processes
Topologically Densified Distributions
Graph Filtration Learning
Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location
Scalable Differential Privacy with Certified Robustness in Adversarial Learning
A Free-Energy Principle for Representation Learning
Generalisation error in learning with random features and the hidden manifold model
Why bigger is not always better: on finite and infinite neural networks
Budgeted Online Influence Maximization
Being Bayesian about Categorical Probability
Structured Policy Iteration for Linear Quadratic Regulator
Intrinsic Reward Driven Imitation Learning via Generative Model
Manifold Identification for Ultimately Communication-Efficient Distributed Optimization
When Demands Evolve Larger and Noisier: Learning and Earning in a Growing Environment
Communication-Efficient Distributed PCA by Riemannian Optimization
Learning Reasoning Strategies in End-to-End Differentiable Proving
Learning Algebraic Multigrid Using Graph Neural Networks
Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks Using PAC-Bayesian Analysis
Neural Kernels Without Tangents
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning
Quantized Decentralized Stochastic Learning over Directed Graphs
Continuous-time Lower Bounds for Gradient-based Algorithms
Adaptive Droplet Routing in Digital Microfluidic Biochips Using Deep Reinforcement Learning
Coresets for Data-efficient Training of Machine Learning Models
Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning
Learning Task-Agnostic Embedding of Multiple Black-Box Experts for Multi-Task Model Fusion
Safe screening rules for L0-regression from Perspective Relaxations
Semi-Supervised StyleGAN for Disentanglement Learning
Variational Label Enhancement
Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach
The Non-IID Data Quagmire of Decentralized Machine Learning
Learning from Irregularly-Sampled Time Series: A Missing Data Perspective
Parametric Gaussian Process Regressors
Evaluating the Performance of Reinforcement Learning Algorithms
Eliminating the Invariance on the Loss Landscape of Linear Autoencoders
FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis
An end-to-end approach for the verification problem: learning the right distance
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions
Near-Tight Margin-Based Generalization Bounds for Support Vector Machines
Frustratingly Simple Few-Shot Object Detection
DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training
Rethinking Bias-Variance Trade-off for Generalization of Neural Networks
Universal Equivariant Multilayer Perceptrons
Optimal Robust Learning of Discrete Distributions from Batches
Multi-Precision Policy Enforced Training (MuPPET): A Precision-Switching Strategy for Quantised Fixed-Point Training of CNNs
Energy-Based Processes for Exchangeable Data
An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm
LowFER: Low-rank Bilinear Pooling for Link Prediction
SimGANs: Simulator-Based Generative Adversarial Networks for ECG Synthesis to Improve Deep ECG Classification
Momentum Improves Normalized SGD
When are Non-Parametric Methods Robust?
Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing
Multi-Objective Molecule Generation using Interpretable Substructures
The Implicit Regularization of Stochastic Gradient Flow for Least Squares
Learning Representations that Support Extrapolation
Frequency Bias in Neural Networks for Input of Non-Uniform Density
Incremental Sampling Without Replacement for Sequence Models
Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks
Variable Skipping for Autoregressive Range Density Estimation
TaskNorm: Rethinking Batch Normalization for Meta-Learning
Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
Learning to Score Behaviors for Guided Policy Optimization
Invertible generative models for inverse problems: mitigating representation error and dataset bias
Anderson Acceleration of Proximal Gradient Methods
Harmonic Decompositions of Convolutional Networks
Deep k-NN for Noisy Labels
An Explicitly Relational Neural Network Architecture
Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead
Learning with Multiple Complementary Labels
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle
LTF: A Label Transformation Framework for Correcting Label Shift
Quadratically Regularized Subgradient Methods for Weakly Convex Optimization with Weakly Convex Constraints
Learning Portable Representations for High-Level Planning
Choice Set Optimization Under Discrete Choice Models of Group Decisions
Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking
Unique Properties of Flat Minima in Deep Networks
Stochastic Differential Equations with Variational Wishart Diffusions
Explaining Groups of Points in Low-Dimensional Representations
Understanding and Stabilizing GANs' Training Dynamics Using Control Theory
Calibration, Entropy Rates, and Memory in Language Models
Entropy Minimization In Emergent Languages
Model Fusion with Kullback-Leibler Divergence
Momentum-Based Policy Gradient Methods
Scalable Nearest Neighbor Search for Optimal Transport
Discount Factor as a Regularizer in Reinforcement Learning
Small-GAN: Speeding up GAN Training using Core-Sets
Median Matrix Completion: from Embarrassment to Optimality
Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits
Online mirror descent and dual averaging: keeping pace in the dynamic case
Differentiable Likelihoods for Fast Inversion of 'Likelihood-Free' Dynamical Systems
Encoding Musical Style with Transformer Autoencoders
Interferometric Graph Transform: a Deep Unsupervised Graph Representation
Robust Pricing in Dynamic Mechanism Design
SDE-Net: Equipping Deep Neural Networks with Uncertainty Estimates
Spectral Clustering with Graph Neural Networks for Graph Pooling
Fully Parallel Hyperparameter Search: Reshaped Space-Filling
On Relativistic f-Divergences
Projection-free Distributed Online Convex Optimization with $O(\sqrt{T})$ Communication Complexity
Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection
FR-Train: A Mutual Information-Based Approach to Fair and Robust Training
Robust Outlier Arm Identification
Stochastic bandits with arm-dependent delays
Consistent Structured Prediction with Max-Min Margin Markov Networks
Latent Space Factorisation and Manipulation via Matrix Subspace Projection
Modulating Surrogates for Bayesian Optimization
Fast Deterministic CUR Matrix Decomposition with Accuracy Assurance
Random extrapolation for primal-dual coordinate descent
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions
Active World Model Learning in Agent-rich Environments with Progress Curiosity
Predicting Choice with Set-Dependent Aggregation
Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
Projective Preferential Bayesian Optimization
Simple and Deep Graph Convolutional Networks
Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization
Real-Time Optimisation for Online Learning in Auctions
Multidimensional Shape Constraints
Robust Bayesian Classification Using An Optimistic Score Ratio
Convergence Rates of Variational Inference in Sparse Deep Learning
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
Deep Coordination Graphs
Reinforcement Learning for Molecular Design Guided by Quantum Mechanics
A Natural Lottery Ticket Winner: Reinforcement Learning with Ordinary Neural Circuits
Unsupervised Transfer Learning for Spatiotemporal Predictive Networks
Strategyproof Mean Estimation from Multiple-Choice Questions
Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime
Imputer: Sequence Modelling via Imputation and Dynamic Programming
Towards non-parametric drift detection via Dynamic Adapting Window Independence Drift Detection (DAWIDD)
Information Particle Filter Tree: An Online Algorithm for POMDPs with Belief-Based Rewards on Continuous Domains
On the Global Optimality of Model-Agnostic Meta-Learning
Non-convex Learning via Replica Exchange Stochastic Gradient MCMC
Continuously Indexed Domain Adaptation
Minimax Rate for Learning From Pairwise Comparisons in the BTL Model
Cost-effectively Identifying Causal Effects When Only Response Variable is Observable
Sequential Cooperative Bayesian Inference
Converging to Team-Maxmin Equilibria in Zero-Sum Multiplayer Games
Adaptive Region-Based Active Learning
Private Reinforcement Learning with PAC and Regret Guarantees
Sparse Subspace Clustering with Entropy-Norm
Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features
Debiased Sinkhorn barycenters
Principled learning method for Wasserstein distributionally robust optimization with local perturbations
Generative Flows with Matrix Exponential
Equivariant Neural Rendering
Optimizing Black-box Metrics with Adaptive Surrogates
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Self-Attentive Hawkes Process
Negative Sampling in Semi-Supervised Learning
Sparse Shrunk Additive Models
Spread Divergence
VFlow: More Expressive Generative Flows with Variational Data Augmentation
Adaptive Sketching for Fast and Convergent Canonical Polyadic Decomposition
Label-Noise Robust Domain Adaptation
Learning Opinions in Social Networks
Voice Separation with an Unknown Number of Multiple Speakers
Off-Policy Actor-Critic with Shared Experience Replay
Self-Concordant Analysis of Frank-Wolfe Algorithms
On the Generalization Effects of Linear Transformations in Data Augmentation
Graph Random Neural Features for Distance-Preserving Graph Representations
Provably Efficient Exploration in Policy Optimization
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables
Quantum Expectation-Maximization for Gaussian mixture models
Proper Network Interpretability Helps Adversarial Robustness in Classification
DeBayes: a Bayesian Method for Debiasing Network Embeddings
Learning Deep Kernels for Non-Parametric Two-Sample Tests
Stabilizing Transformers for Reinforcement Learning
Duality in RKHSs with Infinite Dimensional Outputs: Application to Robust Losses
OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning
Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support
Influenza Forecasting Framework based on Gaussian Processes
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
Ready Policy One: World Building Through Active Learning
Semismooth Newton Algorithm for Efficient Projections onto $\ell_{1, \infty}$-norm Ball
Graph-based Nearest Neighbor Search: From Practice to Theory
Structural Language Models of Code
Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data
Statistically Efficient Off-Policy Policy Gradients
Nearly Linear Row Sampling Algorithm for Quantile Regression
A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton
On Leveraging Pretrained GANs for Generation with Limited Data
Few-shot Domain Adaptation by Causal Mechanism Transfer
Adaptive Adversarial Multi-task Representation Learning
Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems
Learning Structured Latent Factors from Dependent Data: A Generative Model Framework from Information-Theoretic Perspective
Generating Programmatic Referring Expressions via Program Synthesis
Explicit Gradient Learning for Black-Box Optimization
Optimization and Analysis of the pAp@k Metric for Recommender Systems
Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training
Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control
Interpretations are Useful: Penalizing Explanations to Align Neural Networks with Prior Knowledge
Training Linear Neural Networks: Non-Local Convergence and Complexity Results
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles
Online Pricing with Offline Data: Phase Transition and Inverse Square Law
Stochastic Gradient and Langevin Processes
Minimax Pareto Fairness: A Multi Objective Perspective
When Explanations Lie: Why Many Modified BP Attributions Fail
Approximation Guarantees of Local Search Algorithms via Localizability of Set Functions
DeltaGrad: Rapid retraining of machine learning models
Teaching with Limited Information on the Learner's Behaviour
Do RNN and LSTM have Long Memory?
On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness
Taylor Expansion Policy Optimization
Layered Sampling for Robust Optimization Problems
Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE
One Size Fits All: Can We Train One Denoiser for All Noise Levels?
Learning to Encode Position for Transformer with Continuous Dynamical Model
Certified Data Removal from Machine Learning Models
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
Approximation Capabilities of Neural ODEs and Invertible Residual Networks
Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation
Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
Pretrained Generalized Autoregressive Model with Adaptive Probabilistic Label Clusters for Extreme Multi-label Text Classification
Fast OSCAR and OWL Regression via Safe Screening Rules
Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization
Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization
Uncertainty-Aware Lookahead Factor Models for Quantitative Investing
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach
Multi-Agent Determinantal Q-Learning
Task Understanding from Confusing Multi-task Data
Privately detecting changes in unknown distributions
Linear Convergence of Randomized Primal-Dual Coordinate Method for Large-scale Linear Constrained Convex Programming
Symbolic Network: Generalized Neural Policies for Relational MDPs
Stochastic Flows and Geometric Optimization on the Orthogonal Group
Topological Autoencoders
On Layer Normalization in the Transformer Architecture
Fiduciary Bandits
One-shot Distributed Ridge Regression in High Dimensions
Robustness to Programmable String Transformations via Augmented Abstract Training
Towards Accurate Post-training Network Quantization via Bit-Split and Stitching
Channel Equilibrium Networks for Learning Deep Representation
Universal Asymptotic Optimality of Polyak Momentum
Progressive Graph Learning for Open-Set Domain Adaptation
SoftSort: A Continuous Relaxation for the argsort Operator
Cooperative Multi-Agent Bandits with Heavy Tails
Enhancing Simple Models by Exploiting What They Already Know
Cost-Effective Interactive Attention Learning with Neural Attention Processes
Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation
Learning De-biased Representations with Biased Representations
Too Relaxed to Be Fair
From Importance Sampling to Doubly Robust Policy Gradient
Context Aware Local Differential Privacy
From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model
Graph Structure of Neural Networks
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise
Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters
Automatic Reparameterisation of Probabilistic Programs
Privately Learning Markov Random Fields
Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards
Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies
Learning Factorized Weight Matrix for Joint Filtering
Optimal Continual Learning has Perfect Memory and is NP-hard
More Information Supervised Probabilistic Deep Face Embedding Learning
Acceleration through spectral density estimation
Reverse-engineering deep ReLU networks
Almost Tune-Free Variance Reduction
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search
Randomized Block-Diagonal Preconditioning for Parallel Learning
Class-Weighted Classification: Trade-offs and Robust Approaches
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization
New Oracle-Efficient Algorithms for Private Synthetic Data Release
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation
Scalable Differentiable Physics for Learning and Control
Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting
Training Binary Neural Networks through Learning with Noisy Supervision
My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits
The Complexity of Finding Stationary Points with Stochastic Gradient Descent
Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders
Uniform Convergence of Rank-weighted Learning
DRWR: A Differentiable Renderer without Rendering for Unsupervised 3D Structure Learning from Silhouette Images
The Many Shapley Values for Model Explanation
Feature-map-level Online Adversarial Knowledge Distillation
Soft Threshold Weight Reparameterization for Learnable Sparsity
Performative Prediction
CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods