Papers
Unsupervised Skill Discovery for Learning Shared Structures across Changing Environments
Personalized Federated Learning under Mixture of Distributions
Multi-Layer Neural Networks as Trainable Ladders of Hilbert Spaces
Improving Fair Training under Correlation Shifts
How Powerful are Shallow Neural Networks with Bandlimited Random Weights?
Parallel Online Clustering of Bandits via Hedonic Game
High-dimensional Clustering onto Hamiltonian Cycle
A Generalization of ViT/MLP-Mixer to Graphs
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression
Towards Constituting Mathematical Structures for Learning to Optimize
Controlled Differential Equations on Long Sequences via Non-standard Wavelets
On User-Level Private Convex Optimization
Improved Online Learning Algorithms for CTR Prediction in Ad Auctions
Robust and private stochastic linear bandits
Differentiable Tree Operations Promote Compositional Generalization
Differentiable Simulations for Enhanced Sampling of Rare Events
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding
NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning
The Impact of Exploration on Convergence and Performance of Multi-Agent Q-Learning Dynamics
Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models
Moderately Distributional Exploration for Domain Generalization
Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat
On Coresets for Clustering in Small Dimensional Euclidean spaces
Estimating Joint Treatment Effects by Combining Multiple Experiments
Minimal Width for Universal Property of Deep RNN
Weakly Supervised Regression with Interval Targets
Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
Deep Graph Representation Learning and Optimization for Influence Maximization
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
Towards a better understanding of representation dynamics under TD-learning
Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity
The Edge of Orthogonality: A Simple View of What Makes BYOL Tick
Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators
Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization
Confidence and Dispersity Speak: Characterizing Prediction Matrix for Unsupervised Accuracy Estimation
Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains
HOPE: High-order Graph ODE For Modeling Interacting Dynamics
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
The Unreasonable Effectiveness of Few-shot Learning for Machine Translation
On Over-Squashing in Message Passing Neural Networks: The Impact of Width, Depth, and Topology
DRew: Dynamically Rewired Message Passing with Delay
Theoretical Bounds on the Network Community Profile from Low-rank Semi-definite Programming
MEWL: Few-shot multimodal word learning with referential uncertainty
Men Also Do Laundry: Multi-Attribute Bias Amplification
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
Rethinking Weak Supervision in Helping Contrastive Representation Learning
Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation
Alternately Optimized Graph Neural Networks
LazyGNN: Large-Scale Graph Neural Networks via Lazy Propagation
PAC Generalization via Invariant Representations
Interpretable Neural-Symbolic Concept Reasoning
Effective Structured Prompting by Meta-Learning and Representative Verbalizer
Uncertainty Estimation by Fisher Information-based Evidential Deep Learning
A Likelihood Approach to Nonparametric Estimation of a Singular Distribution Using Deep Generative Models
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
TIDE: Time Derivative Diffusion for Deep Learning on Graphs
Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference
Learning to Boost Training by Periodic Nowcasting Near Future Weights
Multi-Agent Best Arm Identification with Private Communications
Generative Decoding of Visual Stimuli
Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames
Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry
Scaling Vision Transformers to 22 Billion Parameters
Weakly Supervised Disentangled Generative Causal Representation Learning
On the Relationship Between Explanation and Prediction: A Causal View
Unifying Molecular and Textual Representations via Multi-task Language Modelling
Recovery Bounds on Class-Based Optimal Transport: A Sum-of-Norms Regularization Framework
Quantitative Universal Approximation Bounds for Deep Belief Networks
Neural signature kernels as infinite-width-depth-limits of controlled ResNets
Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature
Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks
Why do Nearest Neighbor Language Models Work?
Learning Hidden Markov Models When the Locations of Missing Observations are Unknown
Topologically Faithful Image Segmentation via Induced Matching of Persistence Barcodes
Harmonic Neural Networks
On the Effectiveness of Offline RL for Dialogue Response Generation
Modality-Agnostic Variational Compression of Implicit Neural Representations
IncDSI: Incrementally Updatable Document Retrieval
Sampling-Based Accuracy Testing of Posterior Estimators for General Inference
Do Perceptually Aligned Gradients Imply Robustness?
Towards Understanding Ensemble Distillation in Federated Learning
Fast Rates for Maximum Entropy Exploration
Quantile Credit Assignment
LegendreTron: Uprising Proper Multiclass Loss Learning
Nearly-tight Bounds for Deep Kernel Learning
PAL: Program-aided Language Models
Directed Chain Generative Adversarial Networks
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Motion Question Answering via Modular Motion Programs
Modeling Dynamic Environments with Scene Graph Memory
Adversarial Classification: Necessary Conditions and Geometric Flows
Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits
Proper Losses for Discrete Generative Models
Meta Learning of Interface Conditions for Multi-Domain Physics-Informed Neural Networks
Enforcing Hard Constraints with Soft Barriers: Safe-driven Reinforcement Learning in Unknown Stochastic Environments
Learning Mixtures of Gaussians with Censored Data
VIMA: Robot Manipulation with Multimodal Prompts
The multimarginal optimal transport formulation of adversarial multiclass classification
Settling the Reward Hypothesis
On Regularization and Inference with Label Constraints
Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching
Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data
Continuous Spatiotemporal Transformer
Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning
Simple and Fast Group Robustness by Automatic Feature Reweighting
Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?
Are Large Kernels Better Teachers than Transformers for ConvNets?
Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model
A/B Testing in Network Data with Covariate-Adaptive Randomization
Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions
Function-Space Regularization in Neural Networks: A Probabilistic Perspective
CogQA: Answering Advanced Questions on Scientific Articles
Toward Large Kernel Models
Stabilizing Transformer Training by Preventing Attention Entropy Collapse
Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models
Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise
Von Mises Mixture Distributions for Molecular Conformation Generation
Who Needs to Know? Minimal Knowledge for Optimal Coordination
A General Representation Learning Framework with Generalization Performance Guarantees
MALTS: Matching After Learning to Stretch
Gradient Descent Converges Linearly for Logistic Regression on Separable Data
Towards Stable and Efficient Adversarial Training against $l_1$ Bounded Adversarial Attacks
Model-based Offline Reinforcement Learning with Count-based Conservatism
Combinatorial Neural Bandits
Semi-Parametric Contextual Pricing Algorithm using Cox Proportional Hazards Model
Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation
MAGANet: Achieving Combinatorial Generalization by Modeling a Group Action
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
Fascinating Supervisory Signals and Where to Find Them: Deep Anomaly Detection with Scale Learning
Understanding the Role of Feedback in Online Learning with Switching Costs
CircuitNet: A Generic Neural Network to Realize Universal Circuit Motif Modeling
Transcendental Idealism of Planner: Evaluating Perception from Planning Perspective for Autonomous Driving
Adaptive Whitening in Neural Populations with Gain-modulating Interneurons
GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency
Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions
A Distributional Optimization-based Framework for Confidence Bounds of Risk Measures
Offline Meta Reinforcement Learning with In-Distribution Online Adaptation
Improved Learning-Augmented Algorithms for the Multi-Option Ski Rental Problem via Best-Possible Competitive Analysis
Neural Network Accelerated Implicit Filtering: Integrating Neural Network Surrogates With Provably Convergent Derivative Free Optimization Methods
Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks
Generative Graph Dictionary Learning
Vector Quantized Wasserstein Auto-Encoder
Rethinking Warm-Starts with Predictions: Learning Predictions Close to Sets of Optimal Solutions for Faster $\text{L}$-/$\text{L}^\natural$-Convex Function Minimization
Nearly Optimal Algorithms with Sublinear Computational Complexity for Online Kernel Regression
Nonlinear Causal Discovery with Latent Confounders
Traversing Between Modes in Function Space for Fast Ensembling
On the Impact of Knowledge Distillation for Model Interpretability
Towards Trustworthy Explanation: On Causal Rationalization
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
Federated Hypergradient Computation via Aggregated Iterative Differentiation
Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models
Constrained Efficient Global Optimization of Expensive Black-box Functions
Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
Image generation with shortest path diffusion
Attention-Based Recurrence for Multi-Agent Reinforcement Learning under Stochastic Partial Observability
Exponential Smoothing for Off-Policy Learning
Memory-Based Meta-Learning on Non-Stationary Distributions
Equivariant Architectures for Learning in Deep Weight Spaces
Auxiliary Learning as an Asymmetric Bargaining Game
Extending Kernel PCA through Dualization: Sparsity, Robustness and Fast Algorithms
Towards Unbiased Training in Federated Open-world Semi-supervised Learning
Double-Weighting for Covariate Shift Adaptation
Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction
A theory of representation learning gives a deep generalisation of kernel methods
Entropy-driven Unsupervised Keypoint Representation Learning in Videos
Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes
Reward-Mixing MDPs with Few Contexts are Learnable
Conformalization of Sparse Generalized Linear Models
From Hypergraph Energy Functions to Hypergraph Neural Networks
A Fully First-Order Method for Stochastic Bilevel Optimization
Learning the Dynamics of Sparsely Observed Interacting Systems
Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection
Making Transformers Compute-lite for CPU inference
Block subsampled randomized Hadamard transform for Nyström approximation on distributed architectures
Towards Quantum Machine Learning for Constrained Combinatorial Optimization: a Quantum QAP Solver
On the Robustness of Randomized Ensembles to Adversarial Perturbations
Policy Gradient in Robust MDPs with Global Convergence Guarantee
Robust One-Class Classification with Signed Distance Function using 1-Lipschitz Neural Networks
Provably Learning Object-Centric Representations
Learning Neural Constitutive Laws from Motion Observations for Generalizable PDE Dynamics
Learning Preconditioner for Conjugate Gradient PDE Solvers
Implicit Neural Spatial Representations for Time-dependent PDEs
Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process
Prometheus: Taming Sample and Communication Complexities in Constrained Decentralized Stochastic Bilevel Learning
Linearly Constrained Bilevel Optimization: A Smoothed Implicit Gradient Approach
Semi-Offline Reinforcement Learning for Optimized Text Generation
Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees
Hierarchical Neural Coding for Controllable CAD Model Generation
A General Theory for Federated Optimization with Asynchronous and Heterogeneous Clients Updates
CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks
Discovering Object-Centric Generalized Value Functions From Pixels
FlexGen: High-throughput Generative Inference of Large Language Models with a Single GPU
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Optimizing DDPM Sampling with Shortcut Fine-Tuning
A Universal Unbiased Method for Classification from Aggregate Observations
On Pitfalls of Test-Time Adaptation
Learning Control-Oriented Dynamical Structure from Data
Rotation and Translation Invariant Representation Learning with Implicit Neural Representations
Generalized Reductions: Making any Hierarchical Clustering Fair and Balanced with Low Cost
Expectation-Complete Graph Representations with Homomorphisms
Coupled Variational Autoencoder
Multi-Objective Population Based Training
Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer
Federated Adversarial Learning: A Framework with Convergence Analysis
Federated Online and Bandit Convex Optimization
Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases
High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors
Towards Robust and Safe Reinforcement Learning with Benign Off-policy Data
DualHSIC: HSIC-Bottleneck and Alignment for Continual Learning
Adapting to game trees in zero-sum imperfect information games
Unsupervised Out-of-Distribution Detection with Diffusion Inpainting
Internally Rewarded Reinforcement Learning
Online Platt Scaling with Calibeating
A Kernel-Based View of Language Model Fine-Tuning
Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning
CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis
Shape-Guided Dual-Memory Learning for 3D Anomaly Detection
Towards Theoretical Understanding of Inverse Reinforcement Learning
Refined Regret for Adversarial MDPs with Linear Function Approximation
Stable Estimation of Heterogeneous Treatment Effects
On Generalizations of Some Distance Based Classifiers for HDLSS Data
MixFlows: principled variational inference via mixed flows
Tied-Augment: Controlling Representation Similarity Improves Data Augmentation
Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation
Accelerated Infeasibility Detection of Constrained Optimization and Fixed-Point Iterations
Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Behavior Contrastive Learning for Unsupervised Skill Discovery
Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias
Subequivariant Graph Reinforcement Learning in 3D Environments
In Search for a Generalizable Method for Source Free Domain Adaptation
Multi-Task Off-Policy Learning from Bandit Feedback
Sampling random graph homomorphisms and applications to network data analysis
A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel
Meta-learning Parameterized Skills
Sequential Strategic Screening
Global Context Vision Transformers
On Bridging the Gap between Mean Field and Finite Width in Deep Random Multilayer Perceptron with Batch Normalization
Estimation Beyond Data Reweighting: Kernel Method of Moments
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Importance Weighted Variational Bayes for Protein Sequence Design
Contextual Combinatorial Bandits with Probabilistically Triggered Arms
Propensity Matters: Measuring and Enhancing Balancing for Recommendation
All in a Row: Compressed Convolution Networks for Graphs
Reinforcement Learning in Low-rank MDPs with Density Features
InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models
Graph Generative Model for Benchmarking Graph Neural Networks
Causal Isotonic Calibration for Heterogeneous Treatment Effects
The photo-sketch correspondence problem: a new benchmark and a self-supervised approach
Path Neural Networks: Expressive and Accurate Graph Neural Networks
The Catalog Problem: Clustering and Ordering Variable-Sized Sets
A Robust Test for the Stationarity Assumption in Sequential Decision Making
Moccasin: Efficient Tensor Rematerialization for Neural Networks
Variational Sparse Inverse Cholesky Approximation for Latent Gaussian Processes via Double Kullback-Leibler Minimization
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
Deep Regression Unlearning
Understanding and Generalizing Contrastive Learning from the Inverse Optimal Transport Perspective
ClimaX: A foundation model for weather and climate
What Can Be Learnt With Wide Convolutional Neural Networks?
SinDDM: A Single Image Denoising Diffusion Model
An Effective Meaningful Way to Evaluate Survival Models
Conditional Graph Information Bottleneck for Molecular Relational Learning
Hypervolume Knowledge Gradient: A Lookahead Approach for Multi-Objective Bayesian Optimization with Partial Information
Learning Optimal Group-structured Individualized Treatment Rules with Many Treatments
Global optimality for Euclidean CCCP under Riemannian convexity
DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm
Which Invariance Should We Transfer? A Causal Minimax Learning Approach
MyoDex: A Generalizable Prior for Dexterous Manipulation
On the Training Instability of Shuffling SGD with Batch Normalization
Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
CrossSplit: Mitigating Label Noise Memorization through Data Splitting
FeDXL: Provable Federated Learning for Deep X-Risk Optimization
Effectively Using Public Data in Privacy Preserving Machine Learning
A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee
Multi-channel Autobidding with Budget and ROI Constraints
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion
Text-To-4D Dynamic Scene Generation
Conditions and Assumptions for Constraint-based Causal Structure Learning
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Semi-Dual Unbalanced Quadratic Optimal Transport: fast statistical rates and convergent algorithm.
Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes
Can Neural Network Memorization Be Localized?
Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization
Conformal Prediction Sets for Graph Neural Networks
Deep Perturbation Learning: Enhancing the Network Performance via Image Perturbations
Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal
SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models
On Strengthening and Defending Graph Reconstruction Attack with Markov Chain Approximation
When does Privileged information Explain Away Label Noise?
Revisiting Bellman Errors for Offline Model Selection
Partial Optimality in Cubic Correlation Clustering
Revisiting Data-Free Knowledge Distillation with Poisoned Teachers
Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains
TabLeak: Tabular Data Leakage in Federated Learning
Online Learning with Feedback Graphs: The True Shape of Regret
A Scalable Frank-Wolfe-Based Algorithm for the Max-Cut SDP
Anchor Sampling for Federated Learning with Partial Client Participation
Markovian Gaussian Process Variational Autoencoders
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards
General Covariance Data Augmentation for Neural PDE Solvers
From Robustness to Privacy and Back
Eliminating Adversarial Noise via Information Discard and Robust Representation Restoration
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation
SinFusion: Training Diffusion Models on a Single Image or Video
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
Oracles and Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning
Restoration based Generative Models
Learning Unforeseen Robustness from Out-of-distribution Data Using Equivariant Domain Translator
The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning
Cut your Losses with Squentropy
Self-supervised learning of Split Invariant Equivariant representations
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
IRNeXt: Rethinking Convolutional Network Design for Image Restoration
Meta-learning based Adaptive Stability Certificates in Dynamical Systems
Certifying Ensembles: A General Certification Theory with S-Lipschitzness
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
Simplified Temporal Consistency Reinforcement Learning
Extending Conformal Prediction to Hidden Markov Models with Exact Validity via de Finetti's Theorem for Markov Chains
PASTA: Pessimistic Assortment Optimization
Functional Neural Networks: Shift invariant models for functional data with applications to EEG classification
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
A theory of continuous generative flow networks
Efficient preconditioned stochastic gradient descent for estimation in latent variable models
Set-membership Belief State-based Reinforcement Learning for POMDPs
Efficient Distribution-Free Predictive Inference for Standard and Feedback Covariate Shift
Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
Input uncertainty propagation through trained neural networks
In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation
2D-Shapley: A Framework for Fragmented Data Valuation
DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning
Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime
RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank
RGE: A Repulsive Graph Rectification for Node Classification via Influence
Adaptive Computation with Elastic Input Sequence
A Study on Transformer Configuration and Training Objective
Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA
Correcting discount-factor mismatch in on-policy policy gradient methods
Data Structures for Density Estimation
The Computational Complexity of Concise Hypersphere Classification
Reinforcement Learning from Passive Data via Latent Intentions
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Leveraging Proxy of Training Data for Test-Time Adaptation
Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems
Fine-Tuning Language Models via Epistemic Neural Networks
End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization
Dimension-independent Certified Neural Network Watermarks via Mollifier Smoothing
FARE: Provably Fair Representation Learning with Practical Certificates
Tighter Information-Theoretic Generalization Bounds from Supersamples
On the Convergence Rate of Gaussianization with Random Rotations
Forget Unlearning: Towards True Data-Deletion in Machine Learning
CodeIPPrompt: Intellectual Property Infringement Assessment of Code Language Models
Universal Physics-Informed Neural Networks: Symbolic Differential Operator Discovery with Sparse Data
Feature learning in deep classifiers through Intermediate Neural Collapse
Causal Proxy Models for Concept-based Model Explanations
TGRL: Teacher Guided Reinforcement Learning Algorithm
Identification of the Adversary from a Single Adversarial Example
Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods
Dimensionality Reduction for General KDE Mode Finding
Integrating Prior Knowledge in Contrastive Learning with Kernel
MonoFlow: Rethinking Divergence GANs via the Perspective of Differential Equations
Doubly Optimal No-Regret Learning in Monotone Games
Explaining Reinforcement Learning with Shapley Values
A new near-linear time algorithm for k-nearest neighbor search using a compressed cover tree
Rigid body flows for sampling molecular crystal structures
Multi-Task Differential Privacy Under Distribution Skew
Gibbsian Polar Slice Sampling
Towards Learning Geometric Eigen-Lengths Crucial for Fitting Tasks
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
Representer Point Selection for Explaining Regularized High-dimensional Models
NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations
Efficient displacement convex optimization with particle gradient descent
The Test of Tests: A Framework for Differentially Private Hypothesis Testing
Spurious Valleys and Clustering Behavior of Neural Networks
Efficient Graph Field Integrators Meet Point Clouds
Fisher Information Embedding for Node and Graph Learning
ELSA: Efficient Label Shift Adaptation through the Lens of Semiparametric Models
Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models
Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least
Bayesian Estimation of Differential Privacy
Adaptive Estimation of Graphical Models under Total Positivity
Disentangled Multi-Fidelity Deep Bayesian Active Learning
Understand and Modularize Generator Optimization in ELECTRA-style Pretraining
Optimistic Planning by Regularized Dynamic Programming
Bayesian Design Principles for Frequentist Sequential Learning
Domain Adaptation for Time Series Under Feature and Label Shifts
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU Network
On Data Manifolds Entailed by Structural Causal Models
The Hessian perspective into the Nature of Convolutional Neural Networks
Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modularized Learning
Learning Belief Representations for Partially Observable Deep RL
MultiAdam: Parameter-wise Scale-invariant Optimizer for Physics-informed Neural Network
Polynomial Preconditioning for Gradient Methods
Learning to Learn from APIs: Black-Box Data-Free Meta-Learning
Slot-VAE: Object-Centric Scene Generation with Slot Attention
Internet Explorer: Targeted Representation Learning on the Open Web
Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables
Short-lived High-volume Bandits
Active Policy Improvement from Multiple Black-box Oracles
Mixing Predictions for Online Metric Algorithms
Deep Generative Symbolic Regression with Monte-Carlo-Tree-Search
Approximate Causal Effect Identification under Weak Confounding
DRCFS: Doubly Robust Causal Feature Selection
Optimal Rates and Efficient Algorithms for Online Bayesian Persuasion
Tractable Control for Auto-regressive Language Generation
Open-Vocabulary Universal Image Segmentation with MaskCLIP
A Large-Scale Study of Probabilistic Calibration in Neural Network Regression
Global optimality of Elman-type RNNs in the mean-field regime
Semi-Autoregressive Energy Flows: Towards Determinant-Free Training of Normalizing Flows
When do Minimax-fair Learning and Empirical Risk Minimization Coincide?
Probabilistic Imputation for Time-series Classification with Missing Data
Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees
Multi-Agent Learning from Learners
Characterizing Multicalibration via Property Elicitation
Conformal Prediction with Missing Values
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
Subset-Based Instance Optimality in Private Estimation
A Statistical Perspective on Retrieval-Based Models
Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories
On the Importance of Feature Decorrelation for Unsupervised Representation Learning in Reinforcement Learning
Unearthing InSights into Mars: Unsupervised Source Separation with Limited Data
Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Bootstrapped Representations in Reinforcement Learning
New metrics and search algorithms for weighted causal DAGs
Exact Inference in High-order Structured Prediction
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation
Hyperparameters in Reinforcement Learning and How To Tune Them
Identifiability and Generalizability in Constrained Inverse Reinforcement Learning
Optimizing the Collaboration Structure in Cross-Silo Federated Learning
Smart Initial Basis Selection for Linear Programs
Additive Causal Bandits with Unknown Graph
Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning
Resurrecting Recurrent Neural Networks for Long Sequences
Predicting Rare Events by Shrinking Towards Proportional Odds
Efficient Quantum Algorithms for Quantum Optimal Control
SLAMB: Accelerated Large Batch Training with Sparse Communication
High Probability Convergence of Stochastic Gradient Methods
Towards Understanding and Reducing Graph Structural Noise for GNNs
On the Convergence of Gradient Flow on Multi-layer Linear Models
Multi-species multi-task benchmark for learned representations of behavior
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels
Maximal Initial Learning Rates in Deep ReLU Networks
Generalized Implicit Follow-The-Regularized-Leader
Algorithms for bounding contribution for histogram estimation under user-level privacy
Are Gaussian Data All You Need? The Extents and Limits of Universality in High-Dimensional Generalized Linear Estimation
One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill
Structural Re-weighting Improves Graph Domain Adaptation
Machine Learning Force Fields with Data Cost Aware Training
ED-Batch: Efficient Automatic Batching of Dynamic Neural Networks via Learned Finite State Machines
Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions
An SDE for Modeling SAM: Theory and Insights
MG-GNN: Multigrid Graph Neural Networks for Learning Multilevel Domain Decomposition Methods
Applied Online Algorithms with Heterogeneous Predictors
A Model-Based Method for Minimizing CVaR and Beyond
Extrapolative Controlled Sequence Generation via Iterative Refinement
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
Protecting Language Generation Models via Invisible Watermarking
Kernel QuantTree
Rethinking Backdoor Attacks
Adversarially Robust PAC Learnability of Real-Valued Functions
Task-specific experimental design for treatment effect estimation
Statistical Indistinguishability of Learning Algorithms
$H$-Consistency Bounds for Pairwise Misranking Loss Surrogates
Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling
Neural Markov Jump Processes
PFGM++: Unlocking the Potential of Physics-Inspired Generative Models
Parallel neurosymbolic integration with Concordia
Prototype-oriented unsupervised anomaly detection for multivariate time series
Conditional Tree Matching for Inference-Time Adaptation of Tree Prediction Models
Uncertainty Estimation for Molecules: Desiderata and Methods
Hierarchical Clustering: A Nearly-Optimal Construction for Well-Clustered Graphs
Graphically Structured Diffusion Models
Beam Tree Recursive Cells
User-level Private Stochastic Convex Optimization with Optimal Rates
SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process
Counterfactual Identifiability of Bijective Causal Models
Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems
E$(n)$ Equivariant Message Passing Simplicial Networks
LEVER: Learning to Verify Language-to-Code Generation with Execution
Optimal No-Regret Learning for One-Sided Lipschitz Functions
Partially Observable Multi-agent RL with Provable (Quasi-)Efficiency: Information-Sharing to the Rescue
Neural FIM for learning Fisher information metrics from point cloud data
Variational Mixture of HyperGenerators for Learning Distributions over Functions
Robust Subtask Learning for Compositional Generalization
Strategic Classification with Unknown User Manipulations
Robust and Scalable Bayesian Online Changepoint Detection
Robustly Learning a Single Neuron via Sharpness
Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value
From Adaptive Query Release to Machine Unlearning
A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
Iterative Approximate Cross-Validation
DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design
Inflow, Outflow, and Reciprocity in Machine Learning
Predictive Flows for Faster Ford-Fulkerson
Multisample Flow Matching: Straightening Flows with Minibatch Couplings
Active causal structure learning with advice
Graph Inductive Biases in Transformers without Message Passing
Private Federated Learning with Autotuned Compression
Principled Acceleration of Iterative Numerical Methods Using Machine Learning
Complexity of block coordinate descent with proximal regularization and applications to Wasserstein CP-dictionary learning
Best of Both Worlds Policy Optimization
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten
Deep Anomaly Detection under Labeling Budget Constraints
On Kinetic Optimal Probability Paths for Generative Models
Off-Policy Average Reward Actor-Critic with Deterministic Policy Search
Coarse-to-Fine: a Hierarchical Diffusion Model for Molecule Generation in 3D
Dirichlet Diffusion Score Model for Biological Sequence Generation
NUNO: A General Framework for Learning Parametric PDEs with Non-Uniform Data
Unit Scaling: Out-of-the-Box Low-Precision Training
Training-Free Neural Active Learning with Initialization-Robustness Guarantees
Learning Rate Schedules in the Presence of Distribution Shift
Approximate Stein Classes for Truncated Density Estimation
Joint Implicit Neural Representations for Global-Scale Species Mapping
Adaptive Coordination in Social Embodied Rearrangement
OMS-DPM: Optimizing the Model Schedule for Diffusion Probabilistic Model
Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning
On Preemption and Learning in Stochastic Scheduling
Linear CNNs Discover the Statistical Structure of the Dataset Using only the Most Dominant Frequencies
Properties of the Mallows Model Depending on the Number of Alternatives: A Warning for an Experimentalist
A Model-free Closeness-of-influence Test for Features in Supervised Learning
Can Large Language Models Reason about Program Invariants?
Bayesian Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation of Latent Gaussian Models
Accelerated Primal-Dual Methods for Convex-Strongly-Concave Saddle Point Problems
Are Random Decompositions all we need in High Dimensional Bayesian Optimisation?
Exphormer: Sparse Transformers for Graphs
Supervised Metric Learning to Rank for Retrieval via Contextual Similarity Optimization
Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion
Federated Heavy Hitter Recovery under Linear Sketching
Inferring Relational Potentials in Interacting Systems
Conditionally Strongly Log-Concave Generative Models
Toward Efficient Gradient-Based Value Estimation
Fast as CHITA: Neural Network Pruning with Combinatorial Optimization
Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup
Dual Propagation: Accelerating Contrastive Hebbian Learning with Dyadic Neurons
Identifying Useful Learnwares for Heterogeneous Label Spaces
Fair Neighbor Embedding
Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces
Bandit Online Linear Optimization with Hints and Queries
Improving $\ell_1$-Certified Robustness via Randomized Smoothing by Leveraging Box Constraints
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Test-time Adaptation with Slot-Centric Models
Generative Pretraining for Offline Model-based Optimization
Provable Multi-instance Deep AUC Maximization with Stochastic Pooling
Generalized-Smooth Nonconvex Optimization is As Efficient As Smooth Nonconvex Optimization
Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs
Weighted flow diffusion for local graph clustering with node attributes: an algorithm and statistical guarantees
Compressing Tabular Data via Latent Variable Estimation
A Conditional Normalizing Flow for Accelerated Multi-Coil MR Imaging
Learnability and Algorithm for Continual Learning
Online Local Differential Private Quantile Inference via Self-normalization
H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features
Transformers Learn In-Context by Gradient Descent
The Persistent Laplacian for Data Science: Evaluating Higher-Order Persistent Spectral Representations of Data
DUET: 2D Structured and Approximately Equivariant Representations
Predicting Ordinary Differential Equations with Transformers
BNN-DP: Robustness Certification of Bayesian Neural Networks via Dynamic Programming
Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
Simple Hardware-Efficient Long Convolutions for Sequence Modeling
Specializing Smaller Language Models towards Multi-Step Reasoning
Coin Sampling: Gradient-Based Bayesian Inference without Learning Rates
Feature Programming for Multivariate Time Series Prediction
A Critical View of Vision-Based Long-Term Dynamics Prediction Under Environment Misalignment
PixelAsParam: A Gradient View on Diffusion Sampling with Guidance
Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Emergence of Adaptive Circadian Rhythms in Deep Reinforcement Learning
QAS-Bench: Rethinking Quantum Architecture Search and A Benchmark
Half-Hop: A graph upsampling approach for slowing down message passing
Group Equivariant Fourier Neural Operators for Partial Differential Equations
Near-Optimal Quantum Coreset Construction Algorithms for Clustering
Bilevel Optimization with Coupled Decision-dependent Distributions
Differentially Private Distributed Bayesian Linear Regression with MCMC
InGram: Inductive Knowledge Graph Embedding via Relation Graphs
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
On the Initialization of Graph Neural Networks
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
A Category-theoretical Meta-analysis of Definitions of Disentanglement
Quantifying the Variability Collapse of Neural Networks
Reinforcement Learning Can Be More Efficient with Multiple Rewards
Robust Explanation for Free or At the Cost of Faithfulness
Jump-Start Reinforcement Learning
Towards Reliable Neural Specifications
Universal Morphology Control via Contextual Modulation
Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs
Provable Benefit of Mixup for Finding Optimal Decision Boundaries
Covariate balancing using the integral probability metric for causal inference
Network Effects in Performative Prediction Games
SpENCNN: Orchestrating Encoding and Sparsity for Fast Homomorphically Encrypted Neural Network Inference
The case for 4-bit precision: k-bit Inference Scaling Laws
Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization
Regions of Reliability in the Evaluation of Multivariate Probabilistic Forecasts
Provable Copyright Protection for Generative Models
Online Restless Bandits with Unobserved States
Analyzing Diffusion as Serial Reproduction
Posterior Sampling for Deep Reinforcement Learning
Probably Anytime-Safe Stochastic Combinatorial Semi-Bandits
Multi-Objective GFlowNets
Linear Causal Disentanglement via Interventions
Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation
Cramming: Training a Language Model on a single GPU in one day.
On Computing Optimal Tree Ensembles
Kernel Logistic Regression Approximation of an Understandable ReLU Neural Network
A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems
Learning Compiler Pass Orders using Coreset and Normalized Value Prediction
SGD with large step sizes learns sparse features
Learning Prescriptive ReLU Networks
Achieving Linear Speedup in Non-IID Federated Bilevel Learning
Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning
Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Do Machine Learning Models Learn Statistical Rules Inferred from Data?
Blossom: an Anytime Algorithm for Computing Optimal Decision Trees
Generalized Disparate Impact for Configurable Fairness Solutions in ML
Constant Matters: Fine-grained Error Bound on Differentially Private Continual Observation
Computational Asymmetries in Robust Classification
Probabilistic Attention-to-Influence Neural Models for Event Sequences
Flash: Concept Drift Adaptation in Federated Learning
Input Perturbation Reduces Exposure Bias in Diffusion Models
NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition
Differentially Private Sharpness-Aware Training
Monotonic Location Attention for Length Generalization
Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals
PaLM-E: An Embodied Multimodal Language Model
Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning
Self-Interpretable Time Series Prediction with Counterfactual Explanations
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP
Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Count
DiscoBAX - Discovery of optimal intervention sets in genomic experiment design
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Nonparametric Extensions of Randomized Response for Private Confidence Sets
Optimization for Amortized Inverse Problems
Second-Order Optimization with Lazy Hessians
A Kernelized Stein Discrepancy for Biological Sequences
Streaming Submodular Maximization with Differential Privacy
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective
Retrosynthetic Planning with Dual Value Networks
Diffusion Models are Minimax Optimal Distribution Estimators
MultiresNet: Sequence Modeling with Multiresolution Convolutional Memory
Collaborative Causal Inference with Fair Incentives
Performative Reinforcement Learning
Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input
GuardHFL: Privacy Guardian for Heterogeneous Federated Learning
Overcoming Simplicity Bias in Deep Networks using a Feature Sieve
Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks
The Regret of Exploration and the Control of Bad Episodes in Reinforcement Learning
The Wisdom of Hindsight Makes Language Models Better Instruction Followers
STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition
Speeding Up Bellman Ford via Minimum Violation Permutations
SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
Efficient RL via Disentangled Environment and Agent Representations
Reflected Diffusion Models
How much does Initialization Affect Generalization?
Context Consistency Regularization for Label Sparsity in Time Series
Constrained Monotonic Neural Networks
Attributing Image Generative Models using Latent Fingerprints
Principled Offline RL in the Presence of Rich Exogenous Information
Multicalibration as Boosting for Regression
Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models
Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation
Statistical Foundations of Prior-Data Fitted Networks
Anti-Exploration by Random Network Distillation
Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
Certified Robust Neural Networks: Generalization and Corruption Resistance
NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation
Data Poisoning Attacks Against Multimodal Encoders
Kernel Sufficient Dimension Reduction and Variable Selection for Compositional Data via Amalgamation
Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization
MANSA: Learning Fast and Slow in Multi-Agent Systems
Causal Discovery with Latent Confounders Based on Higher-Order Cumulants
Data Representations' Study of Latent Image Manifolds
Contextual Conservative Interleaving Bandits
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
FAENet: Frame Averaging Equivariant GNNs for Materials Modeling
Active Learning based Structural Inference
SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
Effective and Efficient Structural Inference with Reservoir Computing
Fractional Denoising for 3D Molecular Pre-training
A Reinforcement Learning Framework for Dynamic Mediation Analysis
On the Functional Similarity of Robust and Non-Robust Neural Representations
Generalization Bounds using Data-Dependent Fractal Dimensions
Learn to Accumulate Evidence from All Training Samples: Theory and Practice
Multi-agent Online Scheduling: MMS Allocations for Indivisible Items
High Fidelity Image Counterfactuals with Probabilistic Causal Models
Prototype-Sample Relation Distillation: Towards Replay-Free Continual Learning
Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data
Continual Task Allocation in Meta-Policy Network via Sparse Prompting
Causal Modeling of Policy Interventions From Sequences of Treatments and Outcomes using Gaussian Processes
Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic
UMD: Unsupervised Model Detection for X2X Backdoor Attacks
Extrapolated Random Tree for Regression
Hierarchies of Reward Machines
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning
Out-of-Distribution Generalization of Federated Learning via Implicit Invariant Relationships
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
Federated Linear Contextual Bandits with User-level Differential Privacy
Geometric Clifford Algebra Networks
Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows
Accelerated Stochastic Optimization Methods under Quasar-convexity
N$\text{A}^\text{2}$Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning
State and parameter learning with PARIS particle Gibbs
Bayes-optimal Learning of Deep Random Networks of Extensive-width
Tuning Computer Vision Models With Task Rewards
Under-Counted Tensor Completion with Neural Incorporation of Attributes
Weighted Sampling without Replacement for Deep Top-$k$ Classification
Online Mechanism Design for Information Acquisition
LM-Design: Structure-informed Language Models Are Protein Designers
Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models
Go Beyond Imagination: Maximizing Episodic Reachability with World Models
Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments
Sequential Multi-Dimensional Self-Supervised Learning for Clinical Time Series
Efficient Algorithms for Exact Graph Matching on Correlated Stochastic Block Models with Constant Correlation
A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining
Causal Bounds in Quasi-Markovian Graphs
Delay-agnostic Asynchronous Coordinate Update Algorithm
FedCR: Personalized Federated Learning Based on Across-Client Common Representation with Conditional Mutual Information Regularization
Lifelong Language Pretraining with Distribution-Specialized Experts
A Picture of the Space of Typical Learnable Tasks
Hyena Hierarchy: Towards Larger Convolutional Language Models
Transformers as Algorithms: Generalization and Stability in In-context Learning
GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration
Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning
Neural Status Registers
Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction
Improving Adversarial Robustness Through the Contrastive-Guided Diffusion Process
Linkless Link Prediction via Relational Distillation
FedAvg Converges to Zero Training Loss Linearly for Overparameterized Multi-Layer Neural Networks
Subset Selection Based On Multiple Rankings in the Presence of Bias: Effectiveness of Fairness Constraints for Multiwinner Voting Score Functions
Expertise Trees Resolve Knowledge Limitations in Collective Decision-Making
Efficient Learning of Mesh-Based Physical Simulation with Bi-Stride Multi-Scale Graph Neural Network
Bag of Tricks for Training Data Extraction from Language Models
Comparison of meta-learners for estimating multi-valued treatment heterogeneous effects
One-Step Estimator for Permuted Sparse Recovery
Structured Cooperative Learning with Graphical Model Priors
A Mathematical Model for Curriculum Learning for Parities
Guiding Pretraining in Reinforcement Learning with Large Language Models
Learning Perturbations to Explain Time Series Predictions
Learning to Suggest Breaks: Sustainable Optimization of Long-Term User Engagement
Training Normalizing Flows from Dependent Data
The Power of Uniform Sampling for k-Median
Towards Explaining Distribution Shifts
Direct Parameterization of Lipschitz-Bounded Deep Networks
Generated Graph Detection
Adaptive Smoothing Gradient Learning for Spiking Neural Networks
Pre-training for Speech Translation: CTC Meets Optimal Transport
Out-of-Domain Robustness via Targeted Augmentations
Constrained Causal Bayesian Optimization
Dynamic Constrained Submodular Optimization with Polylogarithmic Update Time
A Theoretical Analysis of the Learning Dynamics under Class Imbalance
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
Improving Adversarial Robustness of Deep Equilibrium Models with Explicit Regulations Along the Neural Dynamics
Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes
What can online reinforcement learning with function approximation benefit from general coverage conditions?
End-to-End Learning for Stochastic Optimization: A Bayesian Perspective
Revisiting Domain Randomization via Relaxed State-Adversarial Policy Optimization
Emergence of Sparse Representations from Noise
Non-stationary Reinforcement Learning under General Function Approximation
MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL
SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Convergence of first-order methods for nonconvex constrained optimization with dependent data
Hyperbolic Diffusion Embedding and Distance for Hierarchical Representation Learning
Graph Neural Networks with Learnable and Optimal Polynomial Bases
Efficient and Equivariant Graph Networks for Predicting Quantum Hamiltonian
Learning to Maximize Mutual Information for Dynamic Feature Selection
Emergent Agentic Transformer from Chain of Hindsight Experience
Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization
The Saddle-Point Method in Differential Privacy
From Temporal to Contemporaneous Iterative Causal Discovery in the Presence of Latent Confounders
GOAT: A Global Transformer on Large-scale Graphs
Learning to Initiate and Reason in Event-Driven Cascading Processes
Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.
Scaling Laws for Generative Mixed-Modal Language Models
Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback
Temporally Consistent Transformers for Video Generation
Towards Better Graph Representation Learning with Parameterized Decomposition & Filtering
Large Language Models Can Be Easily Distracted by Irrelevant Context
Identifiability of Label Noise Transition Matrix
Proper Scoring Rules for Survival Analysis
Efficient List-Decodable Regression using Batches
Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation
Instrumental Variable Estimation of Average Partial Causal Effects
One-sided Matrix Completion from Two Observations Per Row
How Bad is Top-$K$ Recommendation under Competing Content Creators?
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
Stabilizing GANs' Training with Brownian Motion Controller
Featured Graph Coarsening with Similarity Guarantees
Biases in Evaluation of Molecular Optimization Methods and Bias Reduction Strategies
Policy Regularization with Dataset Constraint for Offline Reinforcement Learning
R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
Lower Bounds for Learning in Revealing POMDPs
Width and Depth Limits Commute in Residual Networks
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path
Hiding Data Helps: On the Benefits of Masking for Sparse Coding
Streaming Active Learning with Deep Neural Networks
Multiply Robust Off-policy Evaluation and Learning under Truncation by Death
Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation
Data Feedback Loops: Model-driven Amplification of Dataset Biases
Meta-Learning the Inductive Bias of Simple Neural Circuits
Neural Latent Aligner: Cross-trial Alignment for Learning Representations of Complex, Naturalistic Neural Data
Provable Reset-free Reinforcement Learning by No-Regret Reduction
Mimetic Initialization of Self-Attention Layers
Individually Fair Learning with One-Sided Feedback
Uncertain Evidence in Probabilistic Models and Stochastic Simulators
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability
On the Impact of Algorithmic Recourse on Social Segregation
Beyond Uniform Lipschitz Condition in Differentially Private Optimization
SurCo: Learning SURrogate costs for COmbinatorial Nonlinear Optimization Problems
Towards a Persistence Diagram that is Robust to Noise and Varied Densities
Stochastic Gradient Succeeds for Bandits
Whose Opinions Do Language Models Reflect?
Optimal Sets and Solution Paths of ReLU Networks
Transformed Distribution Matching for Missing Value Imputation
Understanding Oversquashing in GNNs through the Lens of Effective Resistance
Fast Rates in Time-Varying Strongly Monotone Games
Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for Multi-Block Bilevel Optimization
Tighter Analysis for ProxSkip
On Distribution Dependent Sub-Logarithmic Query Time of Learned Indexing
Atari-5: Distilling the Arcade Learning Environment down to Five Games
Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning
Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits
Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition
Graph Reinforcement Learning for Network Control via Bi-Level Optimization
General Sequential Episodic Memory Model
What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
System identification of neural systems: If we got it right, would we know?
Cross-Modal Fine-Tuning: Align then Refine
Gradient-based Wang--Landau Algorithm: A Novel Sampler for Output Distribution of Neural Networks over the Input Space
Spatial-Temporal Graph Learning with Adversarial Contrastive Adaptation
Brainformers: Trading Simplicity for Efficiency
Learning to Bid in Repeated First-Price Auctions with Budgets
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
Infinite Action Contextual Bandits with Reusable Data Exhaust
Dropout Reduces Underfitting
D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching
Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling
Target-based Surrogates for Stochastic Optimization
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
Investigating the Role of Model-Based Learning in Exploration and Transfer
MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation
On the Convergence of Federated Averaging with Cyclic Client Participation
Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory
Sketched Ridgeless Linear Regression: The Role of Downsampling
Does Sparsity Help in Learning Misspecified Linear Bandits?
Randomized Schur Complement Views for Graph Contrastive Learning
Improved Algorithms for White-Box Adversarial Streams
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Trompt: Towards a Better Deep Neural Network for Tabular Data
GREAD: Graph Neural Reaction-Diffusion Networks
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
Active Ranking of Experts Based on their Performances in Many Tasks
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Distributional Offline Policy Evaluation with Predictive Error Guarantees
MODeL: Memory Optimizations for Deep Learning
Test-Time Style Shifting: Handling Arbitrary Styles in Domain Generalization
Learning Mixtures of Markov Chains and MDPs
Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning
Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization
Mixture Proportion Estimation Beyond Irreducibility
Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute
Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models
Autoregressive Diffusion Model for Graph Generation
Poisoning Generative Replay in Continual Learning to Promote Forgetting
Learning Globally Smooth Functions on Manifolds
Tight Data Access Bounds for Private Top-$k$ Selection
Understanding Plasticity in Neural Networks
When Sparsity Meets Contrastive Models: Less Graph Data Can Bring Better Class-Balanced Representations
Magneto: A Foundation Transformer
GNOT: A General Neural Operator Transformer for Operator Learning
A Three-regime Model of Network Pruning
Last Switch Dependent Bandits with Monotone Payoff Functions
Inverse Reinforcement Learning without Reinforcement Learning
Topological Point Cloud Clustering
An Information-Theoretic Analysis of Nonstationary Bandit Learning
On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures
Addressing Budget Allocation and Revenue Allocation in Data Market Environment Using an Adaptive Sampling Algorithm
Wrapped Cauchy Distributed Angular Softmax for Long-Tailed Visual Recognition
Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints
On Penalty-based Bilevel Gradient Descent Method
Cooperation in the Latent Space: The Benefits of Adding Mixture Components in Variational Autoencoders
GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks
Matrix Estimation for Individual Fairness
Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes
Eventual Discounting Temporal Logic Counterfactual Experience Replay
Competitive Gradient Optimization
Better Diffusion Models Further Improve Adversarial Training
DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models
Automatically Auditing Large Language Models via Discrete Optimization
Data-Copying in Generative Models: A Formal Framework
Graph Contrastive Backdoor Attacks
Provable Data Subset Selection For Efficient Neural Networks Training
On the Stepwise Nature of Self-Supervised Learning
Estimating Causal Effects using a Multi-task Deep Ensemble
Coordinate Descent Methods for Fractional Minimization
Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
Incentivizing Exploration with Linear Contexts and Combinatorial Actions
Proximal Causal Learning of Conditional Average Treatment Effects
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Approximation Algorithms for Fair Range Clustering
Towards Understanding and Improving GFlowNet Training
Gradient-Free Structured Pruning with Unlabeled Data
Coder Reviewer Reranking for Code Generation
A Hybrid Quantum-Classical Approach based on the Hadamard Transform for the Convolutional Layer
Benign Overfitting in Deep Neural Networks under Lazy Training
Dataset Distillation with Convexified Implicit Gradients
Omnipredictors for Constrained Optimization
Recasting Self-Attention with Holographic Reduced Representations
Gaussian processes at the Helm(holtz): A more fluid model for ocean currents
Pairwise Ranking Losses of Click-Through Rates Prediction for Welfare Maximization in Ad Auctions
Hindsight Learning for MDPs with Exogenous Inputs
Model Transferability with Responsive Decision Subjects
Delving into Noisy Label Detection with Clean Data
GC-Flow: A Graph-Based Flow Network for Effective Clustering
Nonparametric Density Estimation under Distribution Drift
Global Selection of Contrastive Batches via Optimization on Sample Permutations
Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
LESS-VFL: Communication-Efficient Feature Selection for Vertical Federated Learning
On Balancing Bias and Variance in Unsupervised Multi-Source-Free Domain Adaptation
Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Disentangled Generative Models for Robust Prediction of System Dynamics
Curious Replay for Model-based Adaptation
Deep Latent State Space Models for Time-Series Generation
Pricing Experimental Design: Causal Effect, Expected Revenue and Tail Risk
FedVS: Straggler-Resilient and Privacy-Preserving Vertical Federated Learning for Split Models
Improving Bi-level Optimization Based Methods with Inspiration from Humans' Classroom Study Techniques
Linear optimal partial transport embedding
Quantum Speedups for Zero-Sum Games via Improved Dynamic Gibbs Sampling
Probabilistic Categorical Adversarial Attack and Adversarial Training
Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs
Neural Algorithmic Reasoning with Causal Regularisation
Adversarial Learning of Distributional Reinforcement Learning
Robust Speech Recognition via Large-Scale Weak Supervision
Sequential Monte Carlo Learning for Time Series Structure Discovery
Sample Complexity of Probability Divergences under Group Symmetry
Adversarial Policies Beat Superhuman Go AIs
Sequential Counterfactual Risk Minimization
Automatic Data Augmentation via Invariance-Constrained Learning
Concurrent Shuffle Differential Privacy Under Continual Observation
On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization
Decoding Layer Saliency in Transformers
CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations
Fully Dynamic Submodular Maximization over Matroids
Beyond In-Domain Scenarios: Robust Density-Aware Calibration
A Watermark for Large Language Models
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs
The Benefits of Model-Based Generalization in Reinforcement Learning
Provably and Practically Efficient Neural Contextual Bandits
Training Deep Surrogate Models with Large Scale Online Learning
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts
Learning to Design Analog Circuits to Meet Threshold Specifications
Distributed Linear Bandits under Communication Constraints
Automated Search for Conjectures on Mathematical Constants using Analysis of Integer Sequences
On the Expressive Power of Geometric Graph Neural Networks
Geometric Latent Diffusion Models for 3D Molecule Generation
Private Statistical Estimation of Many Quantiles
Improving Graph Generation by Restricting Graph Bandwidth
On the Estimation of Gaussian Mixture Copula Models
Solving Linear Program with Fast Online Learning Algorithms
Paging with Succinct Predictions
Variational Open-Domain Question Answering
Phase Transitions in the Detection of Correlated Databases
Algorithmic Collective Action in Machine Learning
Adversarial Cheap Talk
RLang: A Declarative Language for Describing Partial World Knowledge to Reinforcement Learning Agents
Scalable Safe Policy Improvement via Monte Carlo Tree Search
Regularization-free Diffeomorphic Temporal Alignment Nets
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
POUF: Prompt-Oriented Unsupervised Fine-tuning for Large Pre-trained Models
Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models
PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation
Constrained Decision Transformer for Offline Safe Reinforcement Learning
A Two-Stage Active Learning Algorithm for k-Nearest Neighbors
Delayed Bandits: When Do Intermediate Observations Help?
Consistency Models
Dynamic IMLE for Few-shot Pretraining-free Generative Modelling
Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation
On the Convergence of SARSA with Linear Function Approximation
PAC-Bayesian Generalization Bounds for Adversarial Generative Models
Diffusion Based Representation Learning
Pretraining Language Models with Human Preferences
Optimal randomized multilevel Monte Carlo for repeatedly nested expectations
PAC Prediction Sets for Large Language Models of Code
Scalable Adaptive Computation for Iterative Generation
PAC-Bayesian Offline Contextual Bandits With Guarantees
Special Properties of Gradient Descent with Large Learning Rates
The Power of Learned Locally Linear Models for Nonlinear Policy Optimization
Statistical Learning under Heterogeneous Distribution Shift
Intrinsic Sliced Wasserstein Distances for Comparing Collections of Probability Distributions on Manifolds and Graphs
Compositional Score Modeling for Simulation-Based Inference
Surrogate Module Learning: Reduce the Gradient Error Accumulation in Training Spiking Neural Networks
Human-Timescale Adaptation in an Open-Ended Task Space
Sequential Predictive Conformal Inference for Time Series
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Revisiting Structured Variational Autoencoders
Can Forward Gradient Match Backpropagation?
TRAK: Understanding Model Predictions at Scale
Differentially Private Optimization on Large Model at Small Cost
OCD: Learning to Overfit with Conditional Diffusion Models
Cocktail Party Attack: Breaking Aggregation-Based Privacy in Federated Learning Using Independent Component Analysis
The Numerical Stability of Hyperbolic Representation Learning
$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Improved Active Multi-Task Representation Learning via Lasso
Multi-task Representation Learning for Pure Exploration in Linear Bandits
Implicit Graph Neural Networks: A Monotone Operator Viewpoint
Reasons for the Superiority of Stochastic Estimators over Deterministic Ones: Robustness, Consistency and Perceptual Quality
Regularizing Towards Soft Equivariance Under Mixed Symmetries
RSC: Accelerate Graph Neural Networks Training via Randomized Sparse Computations
Image Restoration with Mean-Reverting Stochastic Differential Equations
Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models
Sequential Kernelized Independence Testing
Differentiable and Transportable Structure Learning
ILLUME: Rationalizing Vision-Language Models through Human Interactions
Flexible Phase Dynamics for Bio-plausible Contrastive Learning
Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism
Demystifying Disagreement-on-the-Line in High Dimensions
Trustworthy Policy Learning under the Counterfactual No-Harm Criterion
Unconstrained Online Learning with Unbounded Losses
Thompson Sampling with Diffusion Generative Prior
Distilling Internet-Scale Vision-Language Models into Embodied Agents
Explainable Data-Driven Optimization: From Context to Decision and Back Again
Fast Online Node Labeling for Very Large Graphs
ContraBAR: Contrastive Bayes-Adaptive Deep RL
Learning useful representations for shifting tasks and distributions
Computational Doob h-transforms for Online Filtering of Discretely Observed Diffusions
XTab: Cross-table Pretraining for Tabular Transformers
Beyond Reward: Offline Preference-guided Policy Optimization
On Many-Actions Policy Gradient
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
Shapley Based Residual Decomposition for Instance Analysis
Relevant Walk Search for Explaining Graph Neural Networks
Bandit Multi-linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits
Temporal Label Smoothing for Early Event Prediction
VectorMapNet: End-to-end Vectorized HD Map Learning
Estimating the Contamination Factor's Distribution in Unsupervised Anomaly Detection
When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints
Ewald-based Long-Range Message Passing for Molecular Graphs
An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning
Cell-Free Latent Go-Explore
Unlocking Slot Attention by Changing Optimal Transport Costs
Alternating Local Enumeration (TnALE): Solving Tensor Network Structure Search with Fewer Evaluations
Generalized Polyak Step Size for First Order Optimization with Momentum
Towards Understanding the Generalization of Graph Neural Networks
Projected Tensor Power Method for Hypergraph Community Recovery
Graph Neural Tangent Kernel: Convergence on Large Graphs
Aligning Language Models with Preferences through $f$-divergence Minimization
Speed-Oblivious Online Scheduling: Knowing (Precise) Speeds is not Necessary
Stable and Consistent Prediction of 3D Characteristic Orientation via Invariant Residual Learning
Node Embedding from Neural Hamiltonian Orbits in Graph Neural Networks
Towards Understanding Generalization of Macro-AUC in Multi-label Learning
On the Within-Group Fairness of Screening Classifiers
Causal Structure Learning for Latent Intervened Non-stationary Data
Neural Inverse Operators for Solving PDE Inverse Problems
From Relational Pooling to Subgraph GNNs: A Universal Framework for More Expressive Graph Neural Networks
How to address monotonicity for model risk management?
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
On the Complexity of Bayesian Generalization
Vertical Federated Graph Neural Network for Recommender System
What Makes Entities Similar? A Similarity Flooding Perspective for Multi-sourced Knowledge Graph Embeddings
SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance
Finding Generalization Measures by Contrasting Signal and Noise
Regression with Sensor Data Containing Incomplete Observations
Detecting Out-of-distribution Data through In-distribution Class Prior
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Generalizing Neural Wave Functions
Information-Theoretic State Space Model for Multi-View Reinforcement Learning
Continuation Path Learning for Homotopy Optimization
Fourmer: An Efficient Global Modeling Paradigm for Image Restoration
Multi-Modal Classifiers for Open-Vocabulary Object Detection
Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal
Adversarial Collaborative Learning on Non-IID Features
Provably Invariant Learning without Domain Information
GRAFENNE: Learning on Graphs with Heterogeneous and Dynamic Feature Sets
Denoising MCMC for Accelerating Diffusion-Based Generative Models
Continual Learning in Linear Classification on Separable Data
Total Variation Graph Neural Networks
Learning Control by Iterative Inversion
Calibrating Multimodal Learning
Topological Singularity Detection at Multiple Scales
Finding the Missing-half: Graph Complementary Learning for Homophily-prone and Heterophily-prone Graphs
Homomorphism AutoEncoder - Learning Group Structured Representations from Observed Transitions
Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning
Dynamical Linear Bandits
Symmetry-Aware Robot Design with Structured Subgroups
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
A Robust Optimisation Perspective on Counterexample-Guided Repair of Neural Networks
One-Shot Federated Conformal Prediction
Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Controllability-Aware Unsupervised Skill Discovery
Multiplier Bootstrap-based Exploration
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Towards credible visual model interpretation with path attribution
Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization
Diffusion Models for Offline Black-Box Optimization
Interactive Object Placement with Reinforcement Learning
Mechanistic Mode Connectivity
Contrastive Learning Meets Homophily: Two Birds with One Stone
Adversarial Parameter Attack on Deep Neural Networks
Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Leveraging Offline Data in Online Reinforcement Learning
Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
Naive imputation implicitly regularizes high-dimensional linear models
Meta Optimal Transport
Hyperbolic Image-text Representations
Which is Better for Learning with Noisy Labels: The Semi-supervised Method or Modeling Label Noise?
Surface Snapping Optimization Layer for Single Image Object Shape Reconstruction
NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation
A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification
Regret-Minimizing Double Oracle for Extensive-Form Games
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication
Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Buying Information for Stochastic Optimization
Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits
Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems
Quantum Ridgelet Transform: Winning Lottery Ticket of Neural Networks with Quantum Computation
Federated Conformal Predictors for Distributed Uncertainty Quantification
Learning in POMDPs is Sample-Efficient with Hindsight Observability
MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Pose
Sharper Bounds for $\ell_p$ Sensitivity Sampling
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming
Latent Traversals in Generative Models as Potential Flows
Taxonomy-Structured Domain Adaptation
Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection
A New PHO-rmula for Improved Performance of Semi-Structured Networks
A Flexible Diffusion Model
On Sampling with Approximate Transport Maps
COLA: Orchestrating Error Coding and Learning for Robust Neural Network Inference Against Hardware Defects
On the Occupancy Measure of Non-Markovian Policies in Continuous MDPs
How Jellyfish Characterise Alternating Group Equivariant Neural Networks
ODS: Test-Time Adaptation in the Presence of Open-World Data Shift
Differentiable Multi-Target Causal Bayesian Experimental Design
Fair yet Asymptotically Equal Collaborative Learning
From Noisy Fixed-Point Iterations to Private ADMM for Centralized and Federated Learning
Safe Offline Reinforcement Learning with Real-Time Budget Constraints
Simplex Random Features
Differential Privacy has Bounded Impact on Fairness in Classification
The Role of Entropy and Reconstruction for Multi-View Self-Supervised Learning
SlotGAT: Slot-based Message Passing for Heterogeneous Graphs
Generating Private Synthetic Data with Genetic Algorithms
Hierarchical Diffusion for Offline Decision Making
A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning
LipsNet: A Smooth and Robust Neural Network with Adaptive Lipschitz Constant for High Accuracy Optimal Control
Fast Excess Risk Rates via Offset Rademacher Complexity
Doubly Adversarial Federated Bandits
Architecture-Agnostic Masked Image Modeling - From ViT back to CNN
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Cooperative Open-ended Learning Framework for Zero-Shot Coordination
Stochastic Gradient Descent under Markov-Chain Sampling Schemes
Learning Instance-Specific Augmentations by Capturing Local Invariances
Sampling-based Nyström Approximation and Kernel Quadrature
Achieving High Accuracy with PINNs via Energy Natural Gradient Descent
Graph Switching Dynamical Systems
Boosting Graph Contrastive Learning via Graph Contrastive Saliency
Bigger, Better, Faster: Human-level Atari with human-level efficiency
Nonparametric Iterative Machine Teaching
Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
HarsanyiNet: Computing Accurate Shapley Values in a Single Forward Propagation
Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models
Understanding Self-Predictive Learning for Reinforcement Learning
VA-learning as a more efficient alternative to Q-learning
CRISP: Curriculum based Sequential neural decoders for Polar code family
Sliced-Wasserstein on Symmetric Positive Definite Matrices for M/EEG Signals
On Uni-Modal Feature Learning in Supervised Multi-Modal Learning
Revisiting Weighted Aggregation in Federated Learning with Neural Networks
Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games
SRATTA: Sample Re-ATTribution Attack of Secure Aggregation in Federated Learning
Statistical Inference on Multi-armed Bandits with Delayed Feedback
StriderNet: A Graph Reinforcement Learning Approach to Optimize Atomic Structures on Rough Energy Landscapes
SAAL: Sharpness-Aware Active Learning
Towards Deep Attention in Graph Neural Networks: Problems and Remedies
Existence and Estimation of Critical Batch Size for Training Generative Adversarial Networks with Two Time-Scale Update Rule
Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy
Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks
Approximately Optimal Core Shapes for Tensor Decompositions
Understanding the Impact of Adversarial Robustness on Accuracy Disparity
Improved Online Conformal Prediction via Strongly Adaptive Online Learning
Sample Complexity Bounds for Learning High-dimensional Simplices in Noisy Regimes
Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano
Fast Combinatorial Algorithms for Min Max Correlation Clustering
Grounding Language Models to Images for Multimodal Inputs and Outputs
GNN&GBDT-Guided Fast Optimizing Framework for Large-scale Integer Programming
Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
Scaling Laws for Reward Model Overoptimization
SE(3) diffusion model with application to protein backbone generation
On the Correctness of Automatic Differentiation for Neural Networks with Machine-Representable Parameters
Graph Mixup with Soft Alignments
Transformers Meet Directed Graphs
Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features
The Fast Johnson-Lindenstrauss Transform Is Even Faster
Normalizing Flows for Interventional Density Estimation
BEATs: Audio Pre-Training with Acoustic Tokenizers
Efficient Approximations of Complete Interatomic Potentials for Crystal Property Prediction
GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations
Learning Distributions over Quantum Measurement Outcomes
DDGR: Continual Learning with Deep Diffusion-based Generative Replay
Efficiently predicting high resolution mass spectra with graph neural networks
Model-Aware Contrastive Learning: Towards Escaping the Dilemmas
ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction
Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets
Wasserstein Barycenter Matching for Graph Size Generalization of Message Passing Neural Networks
OpenFE: Automated Feature Generation with Expert-level Performance
Random Grid Neural Processes for Parametric Partial Differential Equations
On the Identifiability and Estimation of Causal Location-Scale Noise Models
ModelDiff: A Framework for Comparing Learning Algorithms
Shedding a PAC-Bayesian Light on Adaptive Sliced-Wasserstein Distances
GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks
Cold Analysis of Rao-Blackwellized Straight-Through Gumbel-Softmax Gradient Estimator
End-to-End Full-Atom Antibody Design
Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting
Personalized Federated Learning with Inferred Collaboration Graphs
Progressive Purification for Instance-Dependent Partial Label Learning
SNeRL: Semantic-aware Neural Radiance Fields for Reinforcement Learning
Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models
FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning
Fast Private Kernel Density Estimation via Locality Sensitive Quantization
FedBR: Improving Federated Learning on Heterogeneous Data via Local Learning Bias Reduction
Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction
Orthogonality-Enforced Latent Space in Autoencoders: An Approach to Learning Disentangled Representations
Building Neural Networks on Matrix Manifolds: A Gyrovector Space Approach
When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis
Bayesian online change point detection with Hilbert space approximate Student-t process
Graph Positional Encoding via Random Feature Propagation
Averaged Method of Multipliers for Bi-Level Optimization without Lower-Level Strong Convexity
Accuracy on the Curve: On the Nonlinear Correlation of ML Performance Between Data Subpopulations
Robust Satisficing MDPs
Learning to Optimize Differentiable Games
A Closer Look at the Intervention Procedure of Concept Bottleneck Models
Regression with Label Permutation in Generalized Linear Model
Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
A Coupled Flow Approach to Imitation Learning
Reliable Measures of Spread in High Dimensional Latent Spaces
Adaptive Annealed Importance Sampling with Constant Rate Progress
Large Language Models Struggle to Learn Long-Tail Knowledge
Neural Diffusion Processes
Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation
Target-Aware Generative Augmentations for Single-Shot Adaptation
On Heterogeneous Treatment Effects in Heterogeneous Causal Graphs
Dynamics-inspired Neuromorphic Visual Representation Learning
On the Power of Foundation Models
Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score
InfoOT: Information Maximizing Optimal Transport
On the Connection Between MPNN and Graph Transformer
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Interval Bound Interpolation for Few-shot Learning with Few Tasks
Are Equivariant Equilibrium Approximators Beneficial?
ChiPFormer: Transferable Chip Placement via Offline Decision Transformer
MetricGAN-OKD: Multi-Metric Optimization of MetricGAN via Online Knowledge Distillation for Speech Enhancement
Complementary Attention for Multi-Agent Reinforcement Learning
Simple MViT: A Hierarchical Vision Transformer without the Bells-and-Whistles
Explaining the effects of non-convergent MCMC in the training of Energy-Based Models
Interpolation for Robust Learning: Data Augmentation on Geodesics
Nearly Optimal Competitive Ratio for Online Allocation Problems with Two-sided Resource Constraints and Finite Requests
Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting
Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning
A Kernel Stein Test of Goodness of Fit for Sequential Models
I$^2$SB: Image-to-Image Schrödinger Bridge
Continual Vision-Language Representation Learning with Off-Diagonal Information
Memory-Based Dual Gaussian Processes for Sequential Learning
Faster Rates of Convergence to Stationary Points in Differentially Private Optimization
AdaBoost is not an Optimal Weak to Strong Learner
Differentially Private Stochastic Convex Optimization under a Quantile Loss Function
Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction
Learning Affinity with Hyperbolic Representation for Spatial Propagation
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Bootstrap in High Dimension with Low Computation
Synthetic data for model selection
Quantum Lower Bounds for Finding Stationary Points of Nonconvex Functions
Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure
For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal
BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
SurProGenes: Survival Risk-Ordered Representation of Cancer Patients and Genes for the Identification of Prognostic Genes
Better Training of GFlowNets with Local Credit and Incomplete Trajectories
Defects of Convolutional Decoder Networks in Frequency Representation
Exploring Chemical Space with Score-based Out-of-distribution Generation
Concept-based Explanations for Out-of-Distribution Detectors
Reparameterized Policy Learning for Multimodal Trajectory Optimization
Optimal Shrinkage for Distributed Second-Order Optimization
Towards Robust Graph Incremental Learning on Evolving Graphs
Not all Strongly Rayleigh Distributions Have Small Probabilistic Generating Circuits
Nugget: Neural Agglomerative Embeddings of Text
Automatically marginalized MCMC in probabilistic programming
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
Neural networks trained with SGD learn distributions of increasing complexity
Regret Minimization and Convergence to Equilibria in General-sum Markov Games
Learning to Decouple Complex Systems
Prompting Large Language Model for Machine Translation: A Case Study
Non-autoregressive Conditional Diffusion Models for Time Series Prediction
FedDisco: Federated Learning with Discrepancy-Aware Collaboration
Enabling First-Order Gradient-Based Learning for Equilibrium Computation in Markets
SpeedDETR: Speed-aware Transformers for End-to-end Object Detection
Local Vertex Colouring Graph Neural Networks
CLIPood: Generalizing CLIP to Out-of-Distributions
Metagenomic Binning using Connectivity-constrained Variational Autoencoders
Randomized Gaussian Process Upper Confidence Bound with Tighter Bayesian Regret Bounds
Scaling of Class-wise Training Losses for Post-hoc Calibration
Two-Scale Gradient Descent Ascent Dynamics Finds Mixed Nash Equilibria of Continuous Games: A Mean-Field Perspective
A Law of Robustness beyond Isoperimetry
Tight and fast generalization error bound of graph embedding in metric space
ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation
No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs
Generalization Analysis for Contrastive Representation Learning
Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients
Understanding Backdoor Attacks through the Adaptability Hypothesis
Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice
Continual Learners are Incremental Model Generalizers
LinSATNet: The Positive Linear Satisfiability Neural Networks
Optimal Arms Identification with Knapsacks
DIVISION: Memory Efficient Training via Dual Activation Precision
Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization
Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling
Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation
The Ideal Continual Learner: An Agent That Never Forgets
TIPS: Topologically Important Path Sampling for Anytime Neural Networks
Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability
Exploring Model Dynamics for Accumulative Poisoning Discovery
Muse: Text-To-Image Generation via Masked Generative Transformers
Delayed Feedback in Kernel Bandits
FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels
Is Learning Summary Statistics Necessary for Likelihood-free Inference?
FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
ClusterFuG: Clustering Fully connected Graphs by Multicut
A unified optimization framework of ANN-SNN Conversion: towards optimal mapping from activation values to firing rates
PFNs4BO: Meta-Learning the surrogate model for Bayesian optimization from scratch using Transformers
Improving the Model Consistency of Decentralized Federated Learning
Implicit Jacobian regularization weighted with impurity of probability output
Dink-Net: Neural Clustering on Large Graphs
Weak Proxies are Sufficient and Preferable for Fairness with Missing Sensitive Attributes
Improving Visual Prompt Tuning for Self-supervised Vision Transformers
Online Prototype Alignment for Few-shot Policy Transfer
Probabilistic Concept Bottleneck Models
Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation
Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning
Geometric Autoencoders - What You See is What You Decode
Robust Camera Pose Refinement for Multi-Resolution Hash Encoding
Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization
Competing for Shareable Arms in Multi-Player Multi-Armed Bandits
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
Estimating Possible Causal Effects with Latent Variables via Adjustment
Bit Allocation using Optimization
Solving High-Dimensional PDEs with Latent Spectral Models
Oscillation-free Quantization for Low-bit Vision Transformers
FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization
Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance
Robust Perception through Equivariance
Mitigating Propagation Failures in Physics-informed Neural Networks using Retain-Resample-Release (R3) Sampling
On the Optimality of Misspecified Kernel Ridge Regression
Multi-View Masked World Models for Visual Robotic Manipulation
Performative Recommendation: Diversifying Content via Strategic Incentives
How Does Information Bottleneck Help Deep Learning?
Momentum Ensures Convergence of SIGNSGD under Weaker Assumptions
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks
Attribute-Efficient PAC Learning of Low-Degree Polynomial Threshold Functions with Nasty Noise
Towards Controlled Data Augmentations for Active Learning
Bi-directional Masks for Efficient N:M Sparse Training
Margin-based Neural Network Watermarking
Great Models Think Alike: Improving Model Reliability via Inter-Model Latent Agreement
Opponent-Limited Online Search for Imperfect Information Games
Pruning via Sparsity-indexed ODE: a Continuous Sparsity Viewpoint
Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization
Revisiting Discriminative vs. Generative Classifiers: Theory and Implications
Lazy Agents: A New Perspective on Solving Sparse Reward Problem in Multi-agent Reinforcement Learning
Long Horizon Temperature Scaling
Robust Weak Supervision with Variational Auto-Encoders
Using Perturbation to Improve Goodness-of-Fit Tests based on Kernelized Stein Discrepancy
Task-Specific Skill Localization in Fine-tuned Language Models
Near-Optimal $\Phi$-Regret Learning in Extensive-Form Games
Which Tricks are Important for Learning to Rank?
The SSL Interplay: Augmentations, Inductive Bias, and Generalization
Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games
On Investigating the Conservative Property of Score-Based Generative Models
Curriculum Co-disentangled Representation Learning across Multiple Environments for Social Recommendation
Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching
UPSCALE: Unconstrained Channel Pruning
Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
Fair and Optimal Multi-Class Classification via Post-Processing
Do Not Train It: A Linear Neural Architecture Search of Graph Neural Networks
Optimizing Mode Connectivity for Class Incremental Learning
Never mind the metrics---what about the uncertainty? Visualising confusion matrix metric distributions
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Provable Dynamic Fusion for Low-Quality Multimodal Data
CLUSTSEG: Clustering for Universal Segmentation
GFlowNet-EM for Learning Compositional Latent Variable Models
SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation
CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification
FAIRER: Fairness as Decision Rationale Alignment
Social learning spontaneously emerges by searching optimal heuristics with deep reinforcement learning
Crafting Training Degradation Distribution for the Accuracy-Generalization Trade-off in Real-World Super-Resolution
Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere
SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning
Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning
Composer: Creative and Controllable Image Synthesis with Composable Conditions
Evolving Semantic Prototype Improves Generative Zero-Shot Learning
ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
Retrieval-Augmented Multimodal Language Modeling
A Closer Look at Few-shot Classification Again
Boosting Offline Reinforcement Learning with Action Preference Query
Patch-level Contrastive Learning via Positional Query for Visual Pre-training
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
Cones: Concept Neurons in Diffusion Models for Customized Generation
Towards Omni-generalizable Neural Methods for Vehicle Routing Problems
Long-Term Rhythmic Video Soundtracker
XAI Beyond Classification: Interpretable Neural Clustering
Selective Machine Learning of the Average Treatment Effect with an Invalid Instrumental Variable
abess: A Fast Best-Subset Selection Library in Python and R
Exploiting locality in high-dimensional Factorial hidden Markov models
FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation
Are Diffusion Models Vulnerable to Membership Inference Attacks?
Mitigating the Effects of Non-Identifiability on Inference for Bayesian Neural Networks with Latent Variables
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms
Project and Forget: Solving Large-Scale Metric Constrained Problems
Constraint Reasoning Embedded Structured Prediction
Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks
Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence
On the Convergence Rates of Policy Gradient Methods
Cluster-Specific Predictions with Multi-Task Gaussian Processes
Data-Derived Weak Universal Consistency
A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback
Non-asymptotic Properties of Individualized Treatment Rules from Sequentially Rule-Adaptive Trials
Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism
Underspecification Presents Challenges for Credibility in Modern Machine Learning
Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima
Towards Learning to Imitate from a Single Video Demonstration
Faith-Shap: The Faithful Shapley Interaction Index
Knowledge Hypergraph Embedding Meets Relational Algebra
Deep linear networks can benignly overfit when shallow ones do
Flexible Model Aggregation for Quantile Regression
Counterfactual Analysis in Dynamic Latent State Models
Deep Temporal Sets with Evidential Reinforced Attentions for Unique Behavioral Pattern Discovery
Nested Elimination: A Simple Algorithm for Best-Item Identification From Choice-Based Feedback
Neural Wasserstein Gradient Flows for Discrepancies with Riesz Kernels
Lookahead When It Matters: Adaptive Non-causal Transformers for Streaming Neural Transducers
Learning Temporally Abstract World Models without Online Experimentation
Consistency of Multiple Kernel Clustering
Robustness in Multimodal Learning under Train-Test Modality Mismatch
NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion
Lowering the Pre-training Tax for Gradient-based Subset Training: A Lightweight Distributed Pre-Training Toolkit
Shortest Edit Path Crossover: A Theory-driven Solution to the Permutation Problem in Evolutionary Neural Architecture Search
Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective
Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions
Discrete Key-Value Bottleneck
TabDDPM: Modelling Tabular Data with Diffusion Models
GFlowOut: Dropout with Generative Flow Networks
Benign Overfitting in Two-layer ReLU Convolutional Neural Networks
BiBench: Benchmarking and Analyzing Network Binarization
Efficient Personalized Federated Learning via Sparse Model-Adaptation
Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation
Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?
Free-Form Variational Inference for Gaussian Process State-Space Models
Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization
The Benefits of Mixup for Feature Learning
EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression
On the Generalization of Multi-modal Contrastive Learning
Fast Sampling of Diffusion Models via Operator Learning
Superhuman Fairness
Cross-Entropy Loss Functions: Theoretical Analysis and Applications
PPG Reloaded: An Empirical Study on What Matters in Phasic Policy Gradient
Optimizing NOTEARS Objectives via Topological Swaps
dugMatting: Decomposed-Uncertainty-Guided Matting
Is Overfitting Necessary for Implicit Video Representation?
Quantifying the Knowledge in GNNs for Reliable Distillation into MLPs
Statistical Inference and A/B Testing for First-Price Pacing Equilibria
Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels
PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search
Forward-Backward Gaussian Variational Inference via JKO in the Bures-Wasserstein Space
Revisiting Pseudo-Label for Single-Positive Multi-Label Learning
WL meet VC
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning
How to Trust Your Diffusion Model: A Convex Optimization Approach to Conformal Risk Control
Tilted Sparse Additive Models
Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models
Policy Contrastive Imitation Learning
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity
Auxiliary Modality Learning with Generalized Curriculum Distillation
DevFormer: A Symmetric Transformer for Context-Aware Device Placement
HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption
Toward Fair and Robust Estimation of Optimal Treatment Regimes
Distribution Free Prediction Sets for Node Classification
Few-Sample Feature Selection via Feature Manifold Learning
Constrained Phi-Equilibria
End-to-end Differentiable Clustering with Associative Memories
Conformal Prediction for Federated Uncertainty Quantification Under Label Shift
Parameter-Level Soft-Masking for Continual Learning
Adversarial robustness of amortized Bayesian inference
Why Is Public Pretraining Necessary for Private Model Training?
End-to-End Multi-Object Detection with a Regularized Mixture Model
Variance Control for Distributional Reinforcement Learning
Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning
Reachability-Aware Laplacian Representation in Reinforcement Learning
Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds
Model-agnostic Measure of Generalization Difficulty
Quantized Distributed Training of Large Models with Convergence Guarantees
Pareto Regret Analyses in Multi-objective Multi-armed Bandit
Contextual Reliability: When Different Features Matter in Different Contexts
Fast Federated Machine Unlearning with Nonlinear Functional Theory
Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction
Feature Expansion for Graph Neural Networks
Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling
On the convergence of the MLE as an estimator of the learning rate in the Exp3 algorithm
Improving Expert Predictions with Conformal Prediction
PLay: Parametrically Conditioned Layout Generation using Latent Diffusion
The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Model-Free Robust Average-Reward Reinforcement Learning
LongCoder: A Long-Range Pre-trained Language Model for Code Completion
Change is Hard: A Closer Look at Subpopulation Shift
Secure Federated Correlation Test and Entropy Estimation
Thompson Sampling with Less Exploration is Fast and Optimal
Efficient Online Reinforcement Learning with Offline Data
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Explainability as statistical inference
"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts
Controlled Text Generation with Natural Language Instructions
Representation-Driven Reinforcement Learning
Demystifying Uneven Vulnerability of Link Stealing Attacks against Graph Neural Networks
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions
The Acquisition of Physical Knowledge in Generative Neural Networks
Text-To-Concept (and Back) via Cross-Model Alignment
Tight Regret Bounds for Single-pass Streaming Multi-armed Bandits
DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian Inference
Communication-Constrained Bandits under Additive Gaussian Noise
A Deep Conjugate Direction Method for Iteratively Solving Linear Systems
Run-off Election: Improved Provable Defense against Data Poisoning Attacks
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition
Random Matrix Analysis to Balance between Supervised and Unsupervised Learning under the Low Density Separation Assumption
Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation
BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning
Submodular Order Functions and Assortment Optimization
AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
Q-Flow: Generative Modeling for Differential Equations of Open Quantum Dynamics with Normalizing Flows
Quantum 3D Graph Learning with Applications to Molecule Embedding
Reinforcement Learning with History Dependent Dynamic Contexts
MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks
Multi-User Reinforcement Learning with Low Rank Rewards
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
Cluster Explanation via Polyhedral Descriptions
Measuring the Impact of Programming Language Distribution
Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
Towards Understanding the Role of Attention in Prompt-tuning
On the Privacy-Robustness-Utility Trilemma in Distributed Learning
Random Teachers are Good Teachers
Fast Online Value-Maximizing Prediction Sets with Conformal Cost Control
Fully-Adaptive Composition in Differential Privacy
Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift
User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems
Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR
Reconstructive Neuron Pruning for Backdoor Defense
RLEG: Vision-Language Representation Learning with Diffusion-based Embedding Generation
Sequential Underspecified Instrument Selection for Cause-Effect Estimation
Poisoning Language Models During Instruction Tuning
Domain Adaptation Under Relaxed Label Shift
Learning Deep Time-index Models for Time Series Forecasting
AutoCoreset: An Automatic Practical Coreset Construction Framework
TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation
Learning Controllable Degradation for Real-World Super-Resolution via Constrained Flows
Synthetic Data, Real Errors: How (Not) to Publish and Use Synthetic Data
Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability
Efficient Training of Language Models using Few-Shot Learning
Fighting Fire with Fire: Contrastive Debiasing without Bias-free Data via Generative Bias-transformation
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
Action Matching: Learning Stochastic Dynamics from Samples
Masked Bayesian Neural Networks: Theoretical Guarantee and its Posterior Inference
Discover-Then-Rank Unlabeled Support Vectors in the Dual Space for Multi-Class Active Learning
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
SOM-CPC: Unsupervised Contrastive Learning with Self-Organizing Maps for Structured Representations of High-Rate Time Series
Hierarchical Learning in Hyperbolic Space: Revisit and Beyond
Evaluating Unsupervised Denoising Requires Unsupervised Metrics
Margin-based sampling in high dimensions: When being active is less efficient than staying passive
Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally Coupled Oscillatory Recurrent Neural Networks
Trading-Off Payments and Accuracy in Online Classification with Paid Stochastic Experts
Fair and Accurate Decision Making through Group-Aware Learning
CO-BED: Information-Theoretic Contextual Optimization via Bayesian Experimental Design
Nonlinear Advantage: Trained Networks Might Not Be As Complex as You Think
Bidirectional Learning for Offline Model-based Biological Sequence Design
SeedGNN: Graph Neural Network for Supervised Seeded Graph Matching
Loss Balancing for Fair Supervised Learning
TAN Without a Burn: Scaling Laws of DP-SGD
Controllable Neural Symbolic Regression
Predictable MDP Abstraction for Unsupervised Model-Based RL
Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills
Online Nonstochastic Control with Adversarial and Static Constraints
EM-Network: Oracle Guided Self-distillation for Sequence Learning
Quantifying Human Priors over Social and Navigation Networks
One-vs-the-Rest Loss to Focus on Important Samples in Adversarial Training
Deep Laplacian-based Options for Temporally-Extended Exploration
Structure Learning of Latent Factors via Clique Search on Correlation Thresholded Graphs
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and MDPs
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
Rethinking Visual Reconstruction: Experience-Based Content Completion Guided by Visual Cues
Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality
Continuously Parameterized Mixture Models
Distribution-dependent McDiarmid-type Inequalities for Functions of Unbounded Interaction
Learning Noisy OR Bayesian Networks with Max-Product Belief Propagation
Estimating Heterogeneous Treatment Effects: Mutual Information Bounds and Learning Algorithms
Best Arm Identification in Multi-Agent Multi-Armed Bandits
Neural Stochastic Differential Games for Time-series Analysis
Brauer's Group Equivariant Neural Networks
Label differential privacy and private training data release
Data Efficient Neural Scaling Law via Model Reusing
Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning
Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-type Samplers
Sample and Predict Your Latent: Modality-free Sequential Disentanglement via Contrastive Estimation
Learning Neural PDE Solvers with Parameter-Guided Channel Attention
Privacy-Aware Compression for Federated Learning Through Numerical Mechanism Design
Truncating Trajectories in Monte Carlo Reinforcement Learning
On the Forward Invariance of Neural ODEs
Global Optimization with Parametric Function Approximation
From Perception to Programs: Regularize, Overparameterize, and Amortize
Identifying Interpretable Subspaces in Image Representations
Online Learning in Stackelberg Games with an Omniscient Follower
Revisiting Sampling for Combinatorial Optimization
The Value of Out-of-Distribution Data
Bridging the Gap between Neural and Classical Approaches in Abstract Geometric Reasoning with Attention-based Lattice Symmetry Priors
Actor-Critic Alignment for Offline-to-Online Reinforcement Learning
Understanding the Distillation Process from Deep Generative Models to Tractable Probabilistic Circuits
LSDS++: Dual Sampling for Accelerated k-means++
Locally Regularized Neural Differential Equations: Some Black Boxes were meant to remain closed!
Masked Trajectory Models for Prediction, Representation, and Control
In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation
ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval
The Monge Gap: A Regularizer to Learn All Transport Maps
Raising the Cost of Malicious AI-Powered Image Editing
Polarity Is All You Need to Learn and Transfer Faster
AbODE: Ab initio antibody design using conjoined ODEs
Learning-augmented private algorithms for multiple quantile release
A modern look at the relationship between sharpness and generalization
Horizon-free Learning for Markov Decision Processes and Games: Stochastically Bounded Rewards and Improved Bounds
RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution
LIV: Language-Image Representations and Rewards for Robotic Control
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
Unveiling the Latent Space Geometry of Push-Forward Generative Models
K-SHAP: Policy Clustering Algorithm for Anonymous Multi-Agent State-Action Pairs
Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming
Optimal Convergence Rates for Agnostic Nystrom Kernel Learning
Variational Autoencoding Neural Operators
Efficient Parametric Approximations of Neural Network Function Space Distance
A Unifying Framework to the Analysis of Interaction Methods using Synergy Functions
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
Distribution Free Domain Generalization
Theory on Forgetting and Generalization of Continual Learning
Robust Situational Reinforcement Learning in Face of Context Disturbances
Trapdoor Normalization with Irreversible Ownership Verification
Discover and Cure: Concept-aware Mitigation of Spurious Correlation
DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization
A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
Optimally-weighted Estimators of the Maximum Mean Discrepancy for Likelihood-Free Inference
KDEformer: Accelerating Transformers via Kernel Density Estimation
DoCoFL: Downlink Compression for Cross-Device Federated Learning
Learning Representations without Compositional Assumptions
Deterministic equivalent and error universality of deep random features learning
Fast Algorithms for Distributed k-Clustering with Outliers
Minimax estimation of discontinuous optimal transport maps: The semi-discrete case
Robust Budget Pacing with a Single Sample
Compositional Exemplars for In-context Learning
Evaluating Self-Supervised Learning via Risk Decomposition
Looped Transformers as Programmable Computers
Generating Language Corrections for Teaching Physical Control Tasks
Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
Does Continual Learning Equally Forget All Parameters?
Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning
Sparsity by Redundancy: Solving $L_1$ with SGD
Beyond Lipschitz Smoothness: A Tighter Analysis for Nonconvex Optimization
Evidential Interactive Learning for Medical Image Captioning
Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy
Sequential Changepoint Detection via Backward Confidence Sequences
Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues
Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples
simple diffusion: End-to-end diffusion for high resolution images
Multi-class Graph Clustering via Approximated Effective $p$-Resistance
Chemically Transferable Generative Backmapping of Coarse-Grained Proteins
Layered State Discovery for Incremental Autonomous Exploration
CataBEEM: Integrating Latent Interaction Categories in Node-wise Community Detection Models for Network Data
QuantumDARTS: Differentiable Quantum Architecture Search for Variational Quantum Algorithms
Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time
High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance
Phase-aware Adversarial Defense for Improving Adversarial Robustness
LeadFL: Client Self-Defense against Model Poisoning in Federated Learning
Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions
Controlling Type Confounding in Ad Hoc Teamwork with Instance-wise Teammate Feedback Rectification
Language Instructed Reinforcement Learning for Human-AI Coordination
A Neural PDE Solver with Temporal Stencil Modeling
On Second-Order Scoring Rules for Epistemic Uncertainty Quantification
Efficient Transformed Gaussian Processes for Non-Stationary Dependent Multi-class Classification
Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach
Distance Weighted Supervised Learning for Offline Interaction Data
PCA-based Multi-Task Learning: a Random Matrix Approach
Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity
Optimizing Hyperparameters with Conformal Quantile Regression
Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Bandits with Knapsacks: Advice on Time-Varying Demands
Adaptive Compositional Continual Meta-Learning
One-Shot Compression of Large Edge-Exchangeable Graphs using Bits-Back Coding
Differentially Private Hierarchical Clustering with Provable Approximation Guarantees
Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees
Equivariant Polynomials for Graph Neural Networks
Learning Intuitive Policies Using Action Features
Long-Tailed Recognition by Mutual Information Maximization between Latent Features and Ground-Truth Labels
Monge, Bregman and Occam: Interpretable Optimal Transport in High-Dimensions with Feature-Sparse Maps
LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling
Improving Graph Neural Networks with Learnable Propagation Operators
Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum
Effective Neural Topic Modeling with Embedding Clustering Regularization
The Price of Differential Privacy under Continual Observation
Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Linear Speedup and Partial Participation
Equivariance with Learned Canonicalization Functions
Future-conditioned Unsupervised Pretraining for Decision Transformer
Low Complexity Homeomorphic Projection to Ensure Neural-Network Solution Feasibility for Optimization over (Non-)Convex Set
Approximate Thompson Sampling with Logarithmic Batches: Bandits and Reinforcement Learning
Reprogramming Pretrained Language Models for Antibody Sequence Infilling
Supported Trust Region Optimization for Offline Reinforcement Learning
Minimizing Trajectory Curvature of ODE-based Generative Models
Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization
Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space
Provably Convergent Schrödinger Bridge with Applications to Probabilistic Time Series Imputation
Difference-in-Differences Meets Tree-based Methods: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding
Generative Adversarial Symmetry Discovery
On Excess Mass Behavior in Gaussian Mixture Models with Orlicz-Wasserstein Distances
Repository-Level Prompt Generation for Large Language Models of Code
Offline Learning in Markov Games with General Function Approximation
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
Fairness in Matching under Uncertainty
Learning Unnormalized Statistical Models via Compositional Optimization
Tensor Gaussian Process with Contraction for Multi-Channel Imaging Analysis
A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints
Scaling Laws for Multilingual Neural Machine Translation
Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network
Likelihood Adjusted Semidefinite Programs for Clustering Heterogeneous Data
Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration
How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy
Understanding Self-Distillation in the Presence of Label Noise
Data-Driven Subgroup Discovery for Linear Regression
Divide and Conquer Dynamic Programming: An Almost Linear Time Change Point Detection Methodology in High Dimensions
Smooth Non-stationary Bandits
PromptBoosting: Black-Box Text Classification with Ten Forward Passes
Formalizing Preferences Over Runtime Distributions
Second-order regression models exhibit progressive sharpening to the edge of stability
Scaling Spherical CNNs
Why does Throwing Away Data Improve Worst-Group Error?
Interventional Causal Representation Learning
On the Statistical Benefits of Temporal Difference Learning
Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing
A Toy Model of Universality: Reverse Engineering how Networks Learn Group Operations
Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes
Fairness in Streaming Submodular Maximization over a Matroid Constraint
Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons
Generalized Teacher Forcing for Learning Chaotic Dynamics
Distortion and Uncertainty Aware Loss for Panoramic Depth Completion
Is Consensus Acceleration Possible in Decentralized Optimization over Slowly Time-Varying Networks?
Why Random Pruning Is All We Need to Start Sparse
FREDIS: A Fusion Framework of Refinement and Disambiguation for Unreliable Partial Label Learning
On the Robustness of Text Vectorizers
SpotEM: Efficient Video Search for Episodic Memory
Multi-task Hierarchical Adversarial Inverse Reinforcement Learning
Hierarchical Imitation Learning with Vector Quantized Models
Minimalistic Predictions to Schedule Jobs with Online Precedence Constraints
Unscented Autoencoder
Vector-Valued Control Variates
Feature Directions Matter: Long-Tailed Learning via Rotated Balanced Representation
Expected Gradients of Maxout Networks and Consequences to Parameter Initialization
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection
Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping
AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation
A Closer Look at Self-Supervised Lightweight Vision Transformers
Uncovering Adversarial Risks of Test-Time Adaptation
Distributed Contextual Linear Bandits with Minimax Optimal Communication Cost
Fast Inference from Transformers via Speculative Decoding
Fair Densities via Boosting the Sufficient Statistics of Exponential Families
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Difference of submodular minimization via DC programming
A Fast Optimistic Method for Monotone Variational Inequalities
Neural Prediction Errors enable Analogical Visual Reasoning in Human Standard Intelligence Tests
Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting
Mitigating Memorization of Noisy Labels by Clipping the Model Prediction
A Gromov–Wasserstein Geometric View of Spectrum-Preserving Graph Coarsening
Revisiting Simple Regret: Fast Rates for Returning a Good Arm
Escaping saddle points in zeroth-order optimization: the power of two-point estimators
Beyond the Edge of Stability via Two-step Gradient Updates
Perturbation Analysis of Neural Collapse
PWSHAP: A Path-Wise Explanation Model for Targeted Variables
Tighter Bounds on the Expressivity of Transformer Encoders
Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language
Dual Focal Loss for Calibration
Everyone's Preference Changes Differently: A Weighted Multi-Interest Model For Retrieval
Disentangled Multiplex Graph Representation Learning
The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent
Conformal Inference is (almost) Free for Neural Networks Trained with Early Stopping
Beyond Homophily: Reconstructing Structure for Graph-agnostic Clustering
Surrogate Model Extension (SME): A Fast and Accurate Weight Update Attack on Federated Learning
Tight Certification of Adversarially Trained Neural Networks via Nonconvex Low-Rank Semidefinite Relaxations
Quantum Policy Gradient Algorithm with Optimized Action Decoding
Learning Functional Distributions with Private Labels
Learning Subpocket Prototypes for Generalizable Structure-based Drug Design
Learning Antidote Data to Individual Unfairness
Stratified Adversarial Robustness with Rejection
Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations
Fundamental Tradeoffs in Learning with Prior Information
Does a Neural Network Really Encode Symbolic Concepts?
Semiparametrically Efficient Off-Policy Evaluation in Linear Markov Decision Processes
Causal Strategic Classification: A Tale of Two Shifts
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning
MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations
Answering Complex Logical Queries on Knowledge Graphs via Query Computation Tree Optimization
A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests
Personalized Subgraph Federated Learning
Rethink DARTS Search Space and Renovate a New Benchmark
Optimal LP Rounding and Linear-Time Approximation Algorithms for Clustering Edge-Colored Hypergraphs
Random Shuffle Transformer for Image Restoration
Learning Physical Models that Can Respect Conservation Laws
Leveraging Demonstrations to Improve Online Learning: Quality Matters
Learning-Rate-Free Learning by D-Adaptation
Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs
Efficient exploration via epistemic-risk-seeking policy gradients
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
Why Target Networks Stabilise Temporal Difference Methods
Taming graph kernels with random features
SGD-induced drift of representation in a two-layer neural network
Polyhedral Complex Extraction from ReLU Networks using 1-skeleton
Trainability, Expressivity and Interpretability in Gated Neural ODEs
Hidden symmetries of ReLU networks
Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
Probing the Deep Neural Manifold of Reinforcement Learning to Expose Volatility
Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function
Accelerated Cyclic Coordinate Dual Averaging with Extrapolation for Composite Convex Optimization
ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging
Sparse Learning of Dynamical Systems in RKHS: An Operator-Theoretic Approach
Robust Collaborative Learning with Linear Gradient Overhead
Simple Disentanglement of Style and Content in Visual Representations