ICML 2023 Papers

Sample Complexity Bounds for Learning High-dimensional Simplices in Noisy Regimes

Grounding Language Models to Images for Multimodal Inputs and Outputs

Training-Free Neural Active Learning with Initialization-Robustness Guarantees

Scaling Vision Transformers to 22 Billion Parameters

ED-Batch: Efficient Automatic Batching of Dynamic Neural Networks via Learned Finite State Machines

Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning

Partial Optimality in Cubic Correlation Clustering

Poisoning Language Models During Instruction Tuning

Learning Distributions over Quantum Measurement Outcomes

Recasting Self-Attention with Holographic Reduced Representations

Wasserstein Barycenter Matching for Graph Size Generalization of Message Passing Neural Networks

Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features

Fast Federated Machine Unlearning with Nonlinear Functional Theory

End-to-End Full-Atom Antibody Design

Dimension-independent Certified Neural Network Watermarks via Mollifier Smoothing

Efficient Personalized Federated Learning via Sparse Model-Adaptation

Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting

Equivariant Polynomials for Graph Neural Networks

Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction

QASA: Advanced Question Answering on Scientific Articles

Set-membership Belief State-based Reinforcement Learning for POMDPs

Towards Unbiased Training in Federated Open-world Semi-supervised Learning

NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation

Temporally Consistent Transformers for Video Generation

Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Tilted Sparse Additive Models

Test-Time Style Shifting: Handling Arbitrary Styles in Domain Generalization

Which is Better for Learning with Noisy Labels: The Semi-supervised Method or Modeling Label Noise?

A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

A Coupled Flow Approach to Imitation Learning

On Strengthening and Defending Graph Reconstruction Attack with Markov Chain Approximation

Adaptive Annealed Importance Sampling with Constant Rate Progress

DoCoFL: Downlink Compression for Cross-Device Federated Learning

Topological Point Cloud Clustering

Constant Matters: Fine-grained Error Bound on Differentially Private Continual Observation

PixelAsParam: A Gradient View on Diffusion Sampling with Guidance

Adaptive Whitening in Neural Populations with Gain-modulating Interneurons

The Hessian perspective into the Nature of Convolutional Neural Networks

Reprogramming Pretrained Language Models for Antibody Sequence Infilling

Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions

Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation

Complementary Attention for Multi-Agent Reinforcement Learning

MultiRobustBench: Benchmarking Robustness Against Multiple Attacks

Scaling Laws for Reward Model Overoptimization

Reflected Diffusion Models

LEVER: Learning to Verify Language-to-Code Generation with Execution

Learning to Bid in Repeated First-Price Auctions with Budgets

Semi-Dual Unbalanced Quadratic Optimal Transport: fast statistical rates and convergent algorithm.

Multi-task Representation Learning for Pure Exploration in Linear Bandits

Online Restless Bandits with Unobserved States

Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

Out-of-Domain Robustness via Targeted Augmentations

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

Test-time Adaptation with Slot-Centric Models

Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization

The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond

Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute

Conditions and Assumptions for Constraint-based Causal Structure Learning

Fast Online Node Labeling for Very Large Graphs

Selective Machine Learning of the Average Treatment Effect with an Invalid Instrumental Variable

What Makes Entities Similar? A Similarity Flooding Perspective for Multi-sourced Knowledge Graph Embeddings

Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions

Raising the Cost of Malicious AI-Powered Image Editing

Competitive Gradient Optimization

Graph Inductive Biases in Transformers without Message Passing

Fast Sampling of Diffusion Models via Operator Learning

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling

On the convergence of the MLE as an estimator of the learning rate in the Exp3 algorithm

Revisiting Discriminative vs. Generative Classifiers: Theory and Implications

Optimizing Hyperparameters with Conformal Quantile Regression

Optimizing DDPM Sampling with Shortcut Fine-Tuning

Federated Conformal Predictors for Distributed Uncertainty Quantification

Future-conditioned Unsupervised Pretraining for Decision Transformer

Learning Subpocket Prototypes for Generalizable Structure-based Drug Design

Synthetic Data, Real Errors: How (Not) to Publish and Use Synthetic Data

Paging with Succinct Predictions

Optimal Shrinkage for Distributed Second-Order Optimization

A new near-linear time algorithm for k-nearest neighbor search using a compressed cover tree

Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels

On Distribution Dependent Sub-Logarithmic Query Time of Learned Indexing

Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons

Beyond Reward: Offline Preference-guided Policy Optimization

Cold Analysis of Rao-Blackwellized Straight-Through Gumbel-Softmax Gradient Estimator

Correcting discount-factor mismatch in on-policy policy gradient methods

The multimarginal optimal transport formulation of adversarial multiclass classification

CLIPood: Generalizing CLIP to Out-of-Distributions

A Law of Robustness beyond Isoperimetry

Differential Privacy has Bounded Impact on Fairness in Classification

Neural Status Registers

Dimensionality Reduction for General KDE Mode Finding

Which Invariance Should We Transfer? A Causal Minimax Learning Approach

Does a Neural Network Really Encode Symbolic Concepts?

AdaBoost is not an Optimal Weak to Strong Learner

The Fast Johnson-Lindenstrauss Transform Is Even Faster

Nonlinear Advantage: Trained Networks Might Not Be As Complex as You Think

Learning the Dynamics of Sparsely Observed Interacting Systems

Constrained Decision Transformer for Offline Safe Reinforcement Learning

Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling

LinSATNet: The Positive Linear Satisfiability Neural Networks

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

Consistency Models

Comparison of meta-learners for estimating multi-valued treatment heterogeneous effects

Self-supervised learning of Split Invariant Equivariant representations

Privacy-Aware Compression for Federated Learning Through Numerical Mechanism Design

Gaussian processes at the Helm(holtz): A more fluid model for ocean currents

A Two-Stage Active Learning Algorithm for k-Nearest Neighbors

Data-Copying in Generative Models: A Formal Framework

Generalized Reductions: Making any Hierarchical Clustering Fair and Balanced with Low Cost

Accelerated Primal-Dual Methods for Convex-Strongly-Concave Saddle Point Problems

From Noisy Fixed-Point Iterations to Private ADMM for Centralized and Federated Learning

Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries

Towards Theoretical Understanding of Inverse Reinforcement Learning

Accelerated Cyclic Coordinate Dual Averaging with Extrapolation for Composite Convex Optimization

Learning Preconditioners for Conjugate Gradient PDE Solvers

Anchor Sampling for Federated Learning with Partial Client Participation

MG-GNN: Multigrid Graph Neural Networks for Learning Multilevel Domain Decomposition Methods

Human-Timescale Adaptation in an Open-Ended Task Space

Improving the Model Consistency of Decentralized Federated Learning

Men Also Do Laundry: Multi-Attribute Bias Amplification

Auxiliary Modality Learning with Generalized Curriculum Distillation

ELSA: Efficient Label Shift Adaptation through the Lens of Semiparametric Models

SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

Prometheus: Taming Sample and Communication Complexities in Constrained Decentralized Stochastic Bilevel Learning

Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy

A General Representation Learning Framework with Generalization Performance Guarantees

SAAL: Sharpness-Aware Active Learning

Brainformers: Trading Simplicity for Efficiency

Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation

Normalizing Flows for Interventional Density Estimation

Is Overfitting Necessary for Implicit Video Representation?

Fast Combinatorial Algorithms for Min Max Correlation Clustering

Scaling Laws for Generative Mixed-Modal Language Models

DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm

A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition

Memory-Based Meta-Learning on Non-Stationary Distributions

LipsNet: A Smooth and Robust Neural Network with Adaptive Lipschitz Constant for High Accuracy Optimal Control

Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation

Margin-based Neural Network Watermarking

Vector Quantized Wasserstein Auto-Encoder

Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability

On the Importance of Feature Decorrelation for Unsupervised Representation Learning in Reinforcement Learning

Geometric Autoencoders - What You See is What You Decode

Estimating Heterogeneous Treatment Effects: Mutual Information Bounds and Learning Algorithms

FAIRER: Fairness as Decision Rationale Alignment

SE(3) diffusion model with application to protein backbone generation

Performative Recommendation: Diversifying Content via Strategic Incentives

How Does Information Bottleneck Help Deep Learning?

Momentum Ensures Convergence of SIGNSGD under Weaker Assumptions

Bidirectional Adaptation for Robust Semi-Supervised Learning with Inconsistent Data Distributions

OpenFE: Automated Feature Generation with Expert-level Performance

Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators

On the Expressive Power of Geometric Graph Neural Networks

When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis

Eliminating Adversarial Noise via Information Discard and Robust Representation Restoration

Optimal No-Regret Learning for One-Sided Lipschitz Functions

Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes

A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee

Simple and Fast Group Robustness by Automatic Feature Reweighting

XTab: Cross-table Pretraining for Tabular Transformers

Controlled Text Generation with Natural Language Instructions

Adversarial robustness of amortized Bayesian inference

Explaining Reinforcement Learning with Shapley Values

Explaining the effects of non-convergent MCMC in the training of Energy-Based Models

Exploring Chemical Space with Score-based Out-of-distribution Generation

Second-Order Optimization with Lazy Hessians

Great Models Think Alike: Improving Model Reliability via Inter-Model Latent Agreement

Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?

Hindsight Learning for MDPs with Exogenous Inputs

Brauer's Group Equivariant Neural Networks

Constrained Efficient Global Optimization of Expensive Black-box Functions

Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning

A Deep Conjugate Direction Method for Iteratively Solving Linear Systems

How Powerful are Shallow Neural Networks with Bandlimited Random Weights?

Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows

The Wisdom of Hindsight Makes Language Models Better Instruction Followers

Exploring Model Dynamics for Accumulative Poisoning Discovery

High-dimensional Clustering onto Hamiltonian Cycle

Bag of Tricks for Training Data Extraction from Language Models

Unearthing InSights into Mars: Unsupervised Source Separation with Limited Data

Faith-Shap: The Faithful Shapley Interaction Index

Curriculum Co-disentangled Representation Learning across Multiple Environments for Social Recommendation

Understand and Modularize Generator Optimization in ELECTRA-style Pretraining

Vertical Federated Graph Neural Network for Recommender System

Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference

Concept-based Explanations for Out-of-Distribution Detectors

Understanding Plasticity in Neural Networks

GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency

Improved Learning-Augmented Algorithms for the Multi-Option Ski Rental Problem via Best-Possible Competitive Analysis

The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation

Gradient-Free Structured Pruning with Unlabeled Data

Interpretable Neural-Symbolic Concept Reasoning

XAI Beyond Classification: Interpretable Neural Clustering

Blossom: an Anytime Algorithm for Computing Optimal Decision Trees

Less is More: Task-aware Layer-wise Distillation for Language Model Compression

Probabilistic Concept Bottleneck Models

Optimizing Mode Connectivity for Class Incremental Learning

Smart Initial Basis Selection for Linear Programs

Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation

Rethink DARTS Search Space and Renovate a New Benchmark

Averaged Method of Multipliers for Bi-Level Optimization without Lower-Level Strong Convexity

Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias

Policy Contrastive Imitation Learning

Statistical Inference and A/B Testing for First-Price Pacing Equilibria

Stratified Adversarial Robustness with Rejection

RLang: A Declarative Language for Describing Partial World Knowledge to Reinforcement Learning Agents

Offline Meta Reinforcement Learning with In-Distribution Online Adaptation

Answering Complex Logical Queries on Knowledge Graphs via Query Computation Tree Optimization

Uncertainty Estimation for Molecules: Desiderata and Methods

Transformers Meet Directed Graphs

Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion

Crafting Training Degradation Distribution for the Accuracy-Generalization Trade-off in Real-World Super-Resolution

SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning

Learning Neural Constitutive Laws from Motion Observations for Generalizable PDE Dynamics

Evolving Semantic Prototype Improves Generative Zero-Shot Learning

Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language

Contextual Combinatorial Bandits with Probabilistically Triggered Arms

Dataset Distillation with Convexified Implicit Gradients

Structure-informed Language Models Are Protein Designers

Cross-Modal Fine-Tuning: Align then Refine

Prompting Large Language Model for Machine Translation: A Case Study

Patch-level Contrastive Learning via Positional Query for Visual Pre-training

Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

Linear optimal partial transport embedding

Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases

Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective

Counterfactual Identifiability of Bijective Causal Models

Subequivariant Graph Reinforcement Learning in 3D Environments

Generalized Polyak Step Size for First Order Optimization with Momentum

On Investigating the Conservative Property of Score-Based Generative Models

Oscillation-free Quantization for Low-bit Vision Transformers

Leveraging Demonstrations to Improve Online Learning: Quality Matters

Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation

Causal Isotonic Calibration for Heterogeneous Treatment Effects

What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?

Towards Robust and Safe Reinforcement Learning with Benign Off-policy Data

Dynamical Linear Bandits

Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels

Function-Space Regularization in Neural Networks: A Probabilistic Perspective

Feature Directions Matter: Long-Tailed Learning via Rotated Balanced Representation

Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics

Streaming Active Learning with Deep Neural Networks

Towards Omni-generalizable Neural Methods for Vehicle Routing Problems

Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples

Escaping saddle points in zeroth-order optimization: the power of two-point estimators

UMD: Unsupervised Model Detection for X2X Backdoor Attacks

Infusing Lattice Symmetry Priors in Attention Mechanisms for Sample-Efficient Abstract Geometric Reasoning

Kernel QuantTree

Accuracy on the Curve: On the Nonlinear Correlation of ML Performance Between Data Subpopulations

Long-Term Rhythmic Video Soundtracker

Rethinking Warm-Starts with Predictions: Learning Predictions Close to Sets of Optimal Solutions for Faster $\text{L}$-/$\text{L}^\natural$-Convex Function Minimization

Memory-Based Dual Gaussian Processes for Sequential Learning

Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows

Multi-Task Differential Privacy Under Distribution Skew

Exploiting locality in high-dimensional Factorial hidden Markov models

Inflow, Outflow, and Reciprocity in Machine Learning

End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

Behavior Contrastive Learning for Unsupervised Skill Discovery

Beyond Homophily: Reconstructing Structure for Graph-agnostic Clustering

Project and Forget: Solving Large-Scale Metric Constrained Problems

Target-based Surrogates for Stochastic Optimization

Transformers as Algorithms: Generalization and Stability in In-context Learning

Sequence Modeling with Multiresolution Convolutional Memory

Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence

Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime

Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs

ILLUME: Rationalizing Vision-Language Models through Human Interactions

On the Convergence Rates of Policy Gradient Methods

Quantifying Human Priors over Social and Navigation Networks

Cluster-Specific Predictions with Multi-Task Gaussian Processes

Disentangled Multiplex Graph Representation Learning

Beyond Lipschitz Smoothness: A Tighter Analysis for Nonconvex Optimization

Graph Generative Model for Benchmarking Graph Neural Networks

A New PHO-rmula for Improved Performance of Semi-Structured Networks

Federated Online and Bandit Convex Optimization

Towards Trustworthy Explanation: On Causal Rationalization

Emergent Agentic Transformer from Chain of Hindsight Experience

Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds

Rigid Body Flows for Sampling Molecular Crystal Structures

Continual Learning in Linear Classification on Separable Data

Hypervolume Knowledge Gradient: A Lookahead Approach for Multi-Objective Bayesian Optimization with Partial Information

Combinatorial Neural Bandits

Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima

Semi-Parametric Contextual Pricing Algorithm using Cox Proportional Hazards Model

When Sparsity Meets Contrastive Models: Less Graph Data Can Bring Better Class-Balanced Representations

Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions

Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching

Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

Regularization-free Diffeomorphic Temporal Alignment Nets

CircuitNet: A Generic Neural Network to Realize Universal Circuit Motif Modeling

Learning Physical Models that Can Respect Conservation Laws

Sampling random graph homomorphisms and applications to network data analysis

Cooperative Open-ended Learning Framework for Zero-Shot Coordination

Lazy Agents: A New Perspective on Solving Sparse Reward Problem in Multi-agent Reinforcement Learning

Multi-Modal Classifiers for Open-Vocabulary Object Detection

Optimizing the Collaboration Structure in Cross-Silo Federated Learning

Effective Neural Topic Modeling with Embedding Clustering Regularization

Probably Anytime-Safe Stochastic Combinatorial Semi-Bandits

Shedding a PAC-Bayesian Light on Adaptive Sliced-Wasserstein Distances

LeadFL: Client Self-Defense against Model Poisoning in Federated Learning

Generative Pretraining for Black-Box Optimization

Diffusion Models for Black-Box Optimization

Robust Budget Pacing with a Single Sample

Mu$^2$SLAM: Multitask, Multilingual Speech and Language Models

Thompson Sampling with Diffusion Generative Prior

Revisiting Pseudo-Label for Single-Positive Multi-Label Learning

Progressive Purification for Instance-Dependent Partial Label Learning

Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization

Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks

A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests

DDGR: Continual Learning with Deep Diffusion-based Generative Replay

Generalized-Smooth Nonconvex Optimization is As Efficient As Smooth Nonconvex Optimization

A General Theory for Federated Optimization with Asynchronous and Heterogeneous Clients Updates

Free-Form Variational Inference for Gaussian Process State-Space Models

Transformed Distribution Matching for Missing Value Imputation

Identifying Useful Learnwares for Heterogeneous Label Spaces

Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure

Disentangled Generative Models for Robust Prediction of System Dynamics

Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning

Linearly Constrained Bilevel Optimization: A Smoothed Implicit Gradient Approach

Dirichlet Diffusion Score Model for Biological Sequence Generation

Lifelong Language Pretraining with Distribution-Specialized Experts

Self-Interpretable Time Series Prediction with Counterfactual Explanations

Robust Perception through Equivariance

Is Learning Summary Statistics Necessary for Likelihood-free Inference?

Simplex Random Features

Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum

Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration

Data Poisoning Attacks Against Multimodal Encoders

Properties of the Mallows Model Depending on the Number of Alternatives: A Warning for an Experimentalist

Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value

Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

Specializing Smaller Language Models towards Multi-Step Reasoning

SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models

Continual Task Allocation in Meta-Policy Network via Sparse Prompting

Structured Cooperative Learning with Graphical Model Priors

Does Continual Learning Equally Forget All Parameters?

InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models

Learning-Rate-Free Learning by D-Adaptation

EM-Network: Oracle Guided Self-distillation for Sequence Learning

Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets

Implicit Graph Neural Networks: A Monotone Operator Viewpoint

SurProGenes: Survival Risk-Ordered Representation of Cancer Patients and Genes for the Identification of Prognostic Genes

Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch

Towards Understanding Ensemble Distillation in Federated Learning

On the Convergence of Federated Averaging with Cyclic Client Participation

One-sided Matrix Completion from Two Observations Per Row

Autoregressive Diffusion Model for Graph Generation

Fast Private Kernel Density Estimation via Locality Sensitive Quantization

Statistical Inference on Multi-armed Bandits with Delayed Feedback

Robust Explanation for Free or At the Cost of Faithfulness

Tight Data Access Bounds for Private Top-$k$ Selection

dugMatting: Decomposed-Uncertainty-Guided Matting

R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents

Learning Controllable Degradation for Real-World Super-Resolution via Constrained Flows

Bandit Multi-linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits

Are Random Decompositions all we need in High Dimensional Bayesian Optimisation?

Computational Doob h-transforms for Online Filtering of Discretely Observed Diffusions

Resurrecting Recurrent Neural Networks for Long Sequences

The Value of Out-of-Distribution Data

Meta Optimal Transport

A Picture of the Space of Typical Learnable Tasks

Discover and Cure: Concept-aware Mitigation of Spurious Correlation

Estimating Joint Treatment Effects by Combining Multiple Experiments

Efficient Graph Field Integrators Meet Point Clouds

Towards Reliable Neural Specifications

On Computing Optimal Tree Ensembles

Learning Mixtures of Gaussians with Censored Data

Nonlinear Causal Discovery with Latent Confounders

Functional Neural Networks: Shift invariant models for functional data with applications to EEG classification

When is Realizability Sufficient for Off-Policy Reinforcement Learning?

Provable Benefit of Mixup for Finding Optimal Decision Boundaries

Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals

Performative Reinforcement Learning

Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal

Learning Control-Oriented Dynamical Structure from Data

Mitigating Propagation Failures in Physics-informed Neural Networks using Retain-Resample-Release (R3) Sampling

Uncertain Evidence in Probabilistic Models and Stochastic Simulators

Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization

POUF: Prompt-Oriented Unsupervised Fine-tuning for Large Pre-trained Models

Shape-Guided Dual-Memory Learning for 3D Anomaly Detection

Temporal Label Smoothing for Early Event Prediction

Identifiability and Generalizability in Constrained Inverse Reinforcement Learning

Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach

Learning Rate Schedules in the Presence of Distribution Shift

GOAT: A Global Transformer on Large-scale Graphs

Local Vertex Colouring Graph Neural Networks

A Hybrid Quantum-Classical Approach based on the Hadamard Transform for the Convolutional Layer

PAC-Bayesian Offline Contextual Bandits With Guarantees

On Balancing Bias and Variance in Unsupervised Multi-Source-Free Domain Adaptation

Multi-class Graph Clustering via Approximated Effective $p$-Resistance

A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback

Accelerated Stochastic Optimization Methods under Quasar-convexity

Personalized Subgraph Federated Learning

A Kernel Stein Test of Goodness of Fit for Sequential Models

MultiAdam: Parameter-wise Scale-invariant Optimizer for Multiscale Training of Physics-informed Neural Networks

VIMA: Robot Manipulation with Multimodal Prompts

Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks

On User-Level Private Convex Optimization

Tensor Gaussian Process with Contraction for Multi-Channel Imaging Analysis

Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling

Are labels informative in semi-supervised learning? Estimating and leveraging the missing-data mechanism.

A Neural PDE Solver with Temporal Stencil Modeling

Understanding Oversquashing in GNNs through the Lens of Effective Resistance

The Numerical Stability of Hyperbolic Representation Learning

Interval Bound Interpolation for Few-shot Learning with Few Tasks

A Model-free Closeness-of-influence Test for Features in Supervised Learning

Generalized Disparate Impact for Configurable Fairness Solutions in ML

Truncating Trajectories in Monte Carlo Reinforcement Learning

Trapdoor Normalization with Irreversible Ownership Verification

For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal

HyperTuning: Toward Adapting Large Language Models without Back-propagation

Underspecification Presents Challenges for Credibility in Modern Machine Learning

Doubly Optimal No-Regret Learning in Monotone Games

Kernel Sufficient Dimension Reduction and Variable Selection for Compositional Data via Amalgamation

Active Learning based Structural Inference

Curious Replay for Model-based Adaptation

From Temporal to Contemporaneous Iterative Causal Discovery in the Presence of Latent Confounders

The Power of Uniform Sampling for k-Median

Towards Understanding and Reducing Graph Structural Noise for GNNs

BEATs: Audio Pre-Training with Acoustic Tokenizers

Optimistic Planning by Regularized Dynamic Programming

FedAvg Converges to Zero Training Loss Linearly for Overparameterized Multi-Layer Neural Networks

A Closer Look at Self-Supervised Lightweight Vision Transformers

Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection

Defects of Convolutional Decoder Networks in Frequency Representation

An Instrumental Variable Approach to Confounded Off-Policy Evaluation

Multi-agent Online Scheduling: MMS Allocations for Indivisible Items

Cooperation in the Latent Space: The Benefits of Adding Mixture Components in Variational Autoencoders

Learning Belief Representations for Partially Observable Deep RL

Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models

Entropy-driven Unsupervised Keypoint Representation Learning in Videos

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents

A Reinforcement Learning Framework for Dynamic Mediation Analysis

Subset-Based Instance Optimality in Private Estimation

Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Automatic Data Augmentation via Invariance-Constrained Learning

Exponential Smoothing for Off-Policy Learning

Not all Strongly Rayleigh Distributions Have Small Probabilistic Generating Circuits

Adversarial Policies Beat Superhuman Go AIs

On the Impact of Knowledge Distillation for Model Interpretability

Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics

On the Correctness of Automatic Differentiation for Neural Networks with Machine-Representable Parameters

Revisiting Simple Regret: Fast Rates for Returning a Good Arm

simple diffusion: End-to-end diffusion for high resolution images

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

A Fully First-Order Method for Stochastic Bilevel Optimization

Stochastic Gradient Succeeds for Bandits

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

Towards Learning Geometric Eigen-Lengths Crucial for Fitting Tasks

Parallel Online Clustering of Bandits via Hedonic Game

Long-Tailed Recognition by Mutual Information Maximization between Latent Features and Ground-Truth Labels

Model-Free Robust Average-Reward Reinforcement Learning

Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat

Special Properties of Gradient Descent with Large Learning Rates

Inverse Reinforcement Learning without Reinforcement Learning

Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling

Regularizing Towards Soft Equivariance Under Mixed Symmetries

Probabilistic Imputation for Time-series Classification with Missing Data

The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms

Multi-User Reinforcement Learning with Low Rank Rewards

Towards Better Graph Representation Learning with Parameterized Decomposition & Filtering

Short-lived High-volume Bandits

Bidirectional Learning for Offline Model-based Biological Sequence Design

DevFormer: A Symmetric Transformer for Context-Aware Device Placement

Generalization on the Unseen, Logic Reasoning and Degree Curriculum

Accounting For Informative Sampling When Learning to Forecast Treatment Outcomes Over Time

Learning Perturbations to Explain Time Series Predictions

Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits

GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks

Robust Satisficing MDPs

Robust One-Class Classification with Signed Distance Function using 1-Lipschitz Neural Networks

Weakly Supervised Disentangled Generative Causal Representation Learning

Buying Information for Stochastic Optimization

Neural FIM for learning Fisher information metrics from point cloud data

Lowering the Pre-training Tax for Gradient-based Subset Training: A Lightweight Distributed Pre-Training Toolkit

Learning Dense Correspondences between Photos and Sketches

Synthetic data for model selection

Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models

Learning to Suggest Breaks: Sustainable Optimization of Long-Term User Engagement

Feature Expansion for Graph Neural Networks

D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching

How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding

Randomized Gaussian Process Upper Confidence Bound with Tighter Bayesian Regret Bounds

Scaling of Class-wise Training Losses for Post-hoc Calibration

Q-Flow: Generative Modeling for Differential Equations of Open Quantum Dynamics with Normalizing Flows

TIPS: Topologically Important Path Sampling for Anytime Neural Networks

On the Effectiveness of Offline RL for Dialogue Response Generation

Learning Antidote Data to Individual Unfairness

On the Stepwise Nature of Self-Supervised Learning

Identification of the Adversary from a Single Adversarial Example

Discover-Then-Rank Unlabeled Support Vectors in the Dual Space for Multi-Class Active Learning

Effectively Using Public Data in Privacy Preserving Machine Learning

Multiplier Bootstrap-based Exploration

Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space

Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models

Uncovering Adversarial Risks of Test-Time Adaptation

Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations

On Coresets for Clustering in Small Dimensional Euclidean spaces

CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

Feature learning in deep classifiers through Intermediate Neural Collapse

Internet Explorer: Targeted Representation Learning on the Open Web

MetricGAN-OKD: Multi-Metric Optimization of MetricGAN via Online Knowledge Distillation for Speech Enhancement

A Category-theoretical Meta-analysis of Definitions of Disentanglement

Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice

Understanding the Role of Feedback in Online Learning with Switching Costs

Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards

DIVISION: Memory Efficient Training via Dual Activation Precision

Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression

Hiding Data Helps: On the Benefits of Masking for Sparse Coding

Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models

Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup

Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples

The Price of Differential Privacy under Continual Observation

Delayed Feedback in Kernel Bandits

Continuous Spatiotemporal Transformer

LSDS++: Dual Sampling for Accelerated k-means++

InfoOT: Information Maximizing Optimal Transport

Neural signature kernels as infinite-width-depth-limits of controlled ResNets

Regression with Label Permutation in Generalized Linear Model

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

On the Within-Group Fairness of Screening Classifiers

Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression: Fast Convergence and Partial Participation

Achieving High Accuracy with PINNs via Energy Natural Gradient Descent

TRAK: Attributing Model Behavior at Scale

Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model

Two-Scale Gradient Descent Ascent Dynamics Finds Mixed Nash Equilibria of Continuous Games: A Mean-Field Perspective

Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models

DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule

Rethinking Backdoor Attacks

Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies

Finding the Missing-half: Graph Complementary Learning for Homophily-prone and Heterophily-prone Graphs

$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation

Compositional Score Modeling for Simulation-Based Inference

Nearly Optimal Algorithms with Sublinear Computational Complexity for Online Kernel Regression

A Watermark for Large Language Models

Online Prototype Alignment for Few-shot Policy Transfer

MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks

Flexible Model Aggregation for Quantile Regression

Looped Transformers as Programmable Computers

Deep Perturbation Learning: Enhancing the Network Performance via Image Perturbations

Improving Adversarial Robustness Through the Contrastive-Guided Diffusion Process

A Gromov–Wasserstein Geometric View of Spectrum-Preserving Graph Coarsening

Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere

Double-Weighting for Covariate Shift Adaptation

NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning

How Bad is Top-$K$ Recommendation under Competing Content Creators?

Continuation Path Learning for Homotopy Optimization

Random Matrix Analysis to Balance between Supervised and Unsupervised Learning under the Low Density Separation Assumption

Neural Wasserstein Gradient Flows for Discrepancies with Riesz Kernels

A Critical View of Vision-Based Long-Term Dynamics Prediction Under Environment Misalignment

Aligning Language Models with Preferences through $f$-divergence Minimization

SpotEM: Efficient Video Search for Episodic Memory

Disentangled Multi-Fidelity Deep Bayesian Active Learning

Reinforcement Learning with History Dependent Dynamic Contexts

ModelDiff: A Framework for Comparing Learning Algorithms

MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL

Randomized Schur Complement Views for Graph Contrastive Learning

AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems

On the Convergence Rate of Gaussianization with Random Rotations

SOM-CPC: Unsupervised Contrastive Learning with Self-Organizing Maps for Structured Representations of High-Rate Time Series

CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets

Revisiting Sampling for Combinatorial Optimization

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

Consistency of Multiple Kernel Clustering

Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits

Non-asymptotic Properties of Individualized Treatment Rules from Sequentially Rule-Adaptive Trials

Long Horizon Temperature Scaling

Improving l1-Certified Robustness via Randomized Smoothing by Leveraging Box Constraints

Using Perturbation to Improve Goodness-of-Fit Tests based on Kernelized Stein Discrepancy

Understanding Self-Predictive Learning for Reinforcement Learning

Online Learning in Stackelberg Games with an Omniscient Follower

Low Complexity Homeomorphic Projection to Ensure Neural-Network Solution Feasibility for Optimization over (Non-)Convex Set

High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors

Fast Inference from Transformers via Speculative Decoding

Semi-Offline Reinforcement Learning for Optimized Text Generation

DRCFS: Doubly Robust Causal Feature Selection

Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling

On the Role of Attention in Prompt-tuning

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

Multi-Task Off-Policy Learning from Bandit Feedback

A Statistical Perspective on Retrieval-Based Models

Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints

Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization

LegendreTron: Uprising Proper Multiclass Loss Learning

Modality-Agnostic Variational Compression of Implicit Neural Representations

The Persistent Laplacian for Data Science: Evaluating Higher-Order Persistent Spectral Representations of Data

Never mind the metrics---what about the uncertainty? Visualising binary confusion matrix metric distributions to put performance in perspective

SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot

Off-Policy Average Reward Actor-Critic with Deterministic Policy Search

Quantum Speedups for Zero-Sum Games via Improved Dynamic Gibbs Sampling

Improving Adversarial Robustness of Deep Equilibrium Models with Explicit Regulations Along the Neural Dynamics

Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback

Near-Optimal $\Phi$-Regret Learning in Extensive-Form Games

Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization

Dink-Net: Neural Clustering on Large Graphs

What Can Be Learnt With Wide Convolutional Neural Networks?

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

SRATTA: Sample Re-ATTribution Attack of Secure Aggregation in Federated Learning

Theory on Forgetting and Generalization of Continual Learning

Internally Rewarded Reinforcement Learning

Generalization Bounds using Data-Dependent Fractal Dimensions

Flash: Concept Drift Adaptation in Federated Learning

DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning

Tight and fast generalization error bound of graph embedding in metric space

Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems

Modeling Dynamic Environments with Scene Graph Memory

A Scalable Frank-Wolfe-Based Algorithm for the Max-Cut SDP

High Fidelity Image Counterfactuals with Probabilistic Causal Models

On the Functional Similarity of Robust and Non-Robust Neural Representations

Provably Learning Object-Centric Representations

Open-Vocabulary Universal Image Segmentation with MaskCLIP

Active Policy Improvement from Multiple Black-box Oracles

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling

Compositional Exemplars for In-context Learning

In Search for a Generalizable Method for Source Free Domain Adaptation

From Relational Pooling to Subgraph GNNs: A Universal Framework for More Expressive Graph Neural Networks

ODS: Test-Time Adaptation in the Presence of Open-World Data Shift

User-level Private Stochastic Convex Optimization with Optimal Rates

NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes

Prototype-oriented unsupervised anomaly detection for multivariate time series

Best of Both Worlds Policy Optimization

Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process

ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation

High Probability Convergence of Stochastic Gradient Methods

abess: A Fast Best-Subset Selection Library in Python and R

Covariate balancing using the integral probability metric for causal inference

Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation

Taxonomy-Structured Domain Adaptation

NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition

Robust Weak Supervision with Variational Auto-Encoders

Delay-agnostic Asynchronous Coordinate Update Algorithm

Task-specific experimental design for treatment effect estimation

Boosting Graph Contrastive Learning via Graph Contrastive Saliency

Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

Quantized Distributed Training of Large Models with Convergence Guarantees

Cramming: Training a Language Model on a single GPU in one day

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

Towards Stable and Efficient Adversarial Training against $l_1$ Bounded Adversarial Attacks

Learning Functional Distributions with Private Labels

What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective

Are Equivariant Equilibrium Approximators Beneficial?

Knowledge Hypergraph Embedding Meets Relational Algebra

When do Minimax-fair Learning and Empirical Risk Minimization Coincide?

Random Grid Neural Processes for Parametric Partial Differential Equations

SNeRL: Semantic-aware Neural Radiance Fields for Reinforcement Learning

Weighted Sampling without Replacement for Deep Top-$k$ Classification

Efficient Learning of Mesh-Based Physical Simulation with Bi-Stride Multi-Scale Graph Neural Network

One-vs-the-Rest Loss to Focus on Important Samples in Adversarial Training

Fundamental Tradeoffs in Learning with Prior Information

Model-agnostic Measure of Generalization Difficulty

Scalable Safe Policy Improvement via Monte Carlo Tree Search

Quantum Policy Gradient Algorithm with Optimized Action Decoding

Boosting Offline Reinforcement Learning with Action Preference Query

Tight Certification of Adversarially Trained Neural Networks via Nonconvex Low-Rank Semidefinite Relaxations

Projected Tensor Power Method for Hypergraph Community Recovery

Retrieval-Augmented Multimodal Language Modeling

Neural Network Accelerated Implicit Filtering: Integrating Neural Network Surrogates With Provably Convergent Derivative Free Optimization Methods

BiBench: Benchmarking and Analyzing Network Binarization

On Data Manifolds Entailed by Structural Causal Models

The Acquisition of Physical Knowledge in Generative Neural Networks

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings

Federated Heavy Hitter Recovery under Linear Sketching

Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

Equivariance with Learned Canonicalization Functions

FedDisco: Federated Learning with Discrepancy-Aware Collaboration

Gradient-based Wang–Landau Algorithm: A Novel Sampler for Output Distribution of Neural Networks over the Input Space

Federated Adversarial Learning: A Framework with Convergence Analysis

Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning

Towards Learning to Imitate from a Single Video Demonstration

Speeding Up Bellman Ford via Minimum Violation Permutations

Fully Dynamic Submodular Maximization over Matroids

On the Forward Invariance of Neural ODEs

Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection

Graph Neural Networks with Learnable and Optimal Polynomial Bases

Masked Bayesian Neural Networks: Theoretical Guarantee and its Posterior Inference

MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation

Multi-View Masked World Models for Visual Robotic Manipulation

Multisample Flow Matching: Straightening Flows with Minibatch Couplings

Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games

Do Not Train It: A Linear Neural Architecture Search of Graph Neural Networks

Towards Controlled Data Augmentations for Active Learning

Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces

Differentially Private Stochastic Convex Optimization under a Quantile Loss Function

SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation

In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation

Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points

Diversity-enhancing Generative Network for Few-shot Hypothesis Adaptation

Retrosynthetic Planning with Dual Value Networks

A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification

Cones: Concept Neurons in Diffusion Models for Customized Generation

Finding Generalization Measures by Contrasting Signal and Noise

On Uni-Modal Feature Learning in Supervised Multi-Modal Learning

Personalized Federated Learning under Mixture of Distributions

Evaluating Unsupervised Denoising Requires Unsupervised Metrics

On the Robustness of Randomized Ensembles to Adversarial Perturbations

A Closer Look at the Intervention Procedure of Concept Bottleneck Models

Offline Learning in Markov Games with General Function Approximation

Decoding Layer Saliency in Language Transformers

Nested Elimination: A Simple Algorithm for Best-Item Identification From Choice-Based Feedback

Provable Dynamic Fusion for Low-Quality Multimodal Data

Block Subsampled Randomized Hadamard Transform for Nyström Approximation on Distributed Architectures

One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill

Conditional Graph Information Bottleneck for Molecular Relational Learning

Domain Adaptation for Time Series Under Feature and Label Shifts

Strategic Classification with Unknown User Manipulations

Simplified Temporal Consistency Reinforcement Learning

Image Restoration with Mean-Reverting Stochastic Differential Equations

Transcendental Idealism of Planner: Evaluating Perception from Planning Perspective for Autonomous Driving

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

Implicit Jacobian regularization weighted with impurity of probability output

Adversarial Cheap Talk

Slot-VAE: Object-Centric Scene Generation with Slot Attention

Pre-training for Speech Translation: CTC Meets Optimal Transport

H-Likelihood Approach to Deep Neural Networks with Temporal-Spatial Random Effects for High-Cardinality Categorical Features

Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization

Robust and private stochastic linear bandits

Algorithms for bounding contribution for histogram estimation under user-level privacy

Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers

A Likelihood Approach to Nonparametric Estimation of a Singular Distribution Using Deep Generative Models

IRNeXt: Rethinking Convolutional Network Design for Image Restoration

TabLeak: Tabular Data Leakage in Federated Learning

Efficient Algorithms for Exact Graph Matching on Correlated Stochastic Block Models with Constant Correlation

Bootstrap in High Dimension with Low Computation

Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting

Simple Disentanglement of Style and Content in Visual Representations

Reparameterized Policy Learning for Multimodal Trajectory Optimization

Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills

Universal Physics-Informed Neural Networks: Symbolic Differential Operator Discovery with Sparse Data

Deep Graph Representation Learning and Optimization for Influence Maximization

A Large-Scale Study of Probabilistic Calibration in Neural Network Regression

Global optimality of Elman-type RNNs in the mean-field regime

Proper Losses for Discrete Generative Models

Adversarial Collaborative Learning on Non-IID Features

Multi-Agent Learning from Learners

On Sampling with Approximate Transport Maps

State and parameter learning with PARIS particle Gibbs

Inferring Relational Potentials in Interacting Systems

BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning

The Computational Complexity of Concise Hypersphere Classification

The Test of Tests: A Framework for Differentially Private Hypothesis Testing

CO-BED: Information-Theoretic Contextual Optimization via Bayesian Experimental Design

A Generalization of ViT/MLP-Mixer to Graphs

A Study on Transformer Configuration and Training Objective

Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-type Samplers

Deep Generative Symbolic Regression with Monte-Carlo-Tree-Search

Hyperparameters in Reinforcement Learning and How To Tune Them

Learning Hidden Markov Models When the Locations of Missing Observations are Unknown

Optimizing NOTEARS Objectives via Topological Swaps

Infinite Action Contextual Bandits with Reusable Data Exhaust

Efficient Quantum Algorithms for Quantum Optimal Control

SLAMB: Accelerated Large Batch Training with Sparse Communication

Random Shuffle Transformer for Image Restoration

Conditional Tree Matching for Inference-Time Adaptation of Tree Prediction Models

Robust Situational Reinforcement Learning in Face of Context Disturbances

Can Neural Network Memorization Be Localized?

Exphormer: Sparse Transformers for Graphs

The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning

Model-based Offline Reinforcement Learning with Count-based Conservatism

Topologically Faithful Image Segmentation via Induced Matching of Persistence Barcodes

Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels

Fair Neighbor Embedding

Robust Camera Pose Refinement for Multi-Resolution Hash Encoding

Effective and Efficient Structural Inference with Reservoir Computing

Ewald-based Long-Range Message Passing for Molecular Graphs

General Sequential Episodic Memory Model

End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization

Constrained Monotonic Neural Networks

SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge

Random Teachers are Good Teachers

Provably Convergent Schrödinger Bridge with Applications to Probabilistic Time Series Imputation

Improving Graph Neural Networks with Learnable Propagation Operators

Nearly-Optimal Hierarchical Clustering for Well-Clustered Graphs

Nonparametric Density Estimation under Distribution Drift

Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems

Exploring the Benefits of Training Expert Language Models over Instruction Tuning

Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single

Difference-in-Differences Meets Tree-based Methods: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding

Scaling Spherical CNNs

AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation

Iterative Approximate Cross-Validation

Conformal Prediction Sets for Graph Neural Networks

Locally Regularized Neural Differential Equations: Some Black Boxes were meant to remain closed!

TabDDPM: Modelling Tabular Data with Diffusion Models

Out-of-Distribution Generalization of Federated Learning via Implicit Invariant Relationships

Distributed Contextual Linear Bandits with Minimax Optimal Communication Cost

Approximate Causal Effect Identification under Weak Confounding

Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction

Probabilistic Categorical Adversarial Attack and Adversarial Training

Alternately Optimized Graph Neural Networks

Complexity of Block Coordinate Descent with Proximal Regularization and Applications to Wasserstein CP-dictionary Learning

Node Embedding from Neural Hamiltonian Orbits in Graph Neural Networks

Robust Speech Recognition via Large-Scale Weak Supervision

A/B Testing in Network Data with Covariate-Adaptive Randomization

Coarse-to-Fine: a Hierarchical Diffusion Model for Molecule Generation in 3D

MEWL: Few-shot multimodal word learning with referential uncertainty

Fair and Robust Estimation of Heterogeneous Treatment Effects for Policy Learning

Forget Unlearning: Towards True Data-Deletion in Machine Learning

Extending Conformal Prediction to Hidden Markov Models with Exact Validity via de Finetti's Theorem for Markov Chains

Gradient Descent Converges Linearly for Logistic Regression on Separable Data

Regions of Reliability in the Evaluation of Multivariate Probabilistic Forecasts

Learning-augmented private algorithms for multiple quantile release

Learning Deep Time-index Models for Time Series Forecasting

Spatial Implicit Neural Representations for Global-Scale Species Mapping

On Preemption and Learning in Stochastic Scheduling

Controlled Differential Equations on Long Sequences via Non-standard Wavelets

LookupFFN: Making Transformers Compute-lite for CPU inference

Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization

Compressing Tabular Data via Latent Variable Estimation

A Connection between One-Step RL and Critic Regularization in Reinforcement Learning

Denoising MCMC for Accelerating Diffusion-Based Generative Models

Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains

Few-Sample Feature Selection via Feature Manifold Learning

Representation-Driven Reinforcement Learning

How to Trust Your Diffusion Model: A Convex Optimization Approach to Conformal Risk Control

Taming graph kernels with random features

Causal Discovery with Latent Confounders Based on Higher-Order Cumulants

Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation

Neural Stochastic Differential Games for Time-series Analysis

Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation

Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation

Delayed Bandits: When Do Intermediate Observations Help?

On Over-Squashing in Message Passing Neural Networks: The Impact of Width, Depth, and Topology

Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs

Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL

PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search

Adversarially Robust PAC Learnability of Real-Valued Functions

Weighted Flow Diffusion for Local Graph Clustering with Node Attributes: an Algorithm and Statistical Guarantees

Mixing Predictions for Online Metric Algorithms

Reinforcement Learning in Low-rank MDPs with Density Features

The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing

Sequential Monte Carlo Learning for Time Series Structure Discovery

Multicalibration as Boosting for Regression

Reasons for the Superiority of Stochastic Estimators over Deterministic Ones: Robustness, Consistency and Perceptual Quality

Cut your Losses with Squentropy

Sparse Learning of Dynamical Systems in RKHS: An Operator-Theoretic Approach

K-SHAP: Policy Clustering Algorithm for Anonymous Multi-Agent State-Action Pairs

Towards Understanding Generalization of Macro-AUC in Multi-label Learning

Controllable Neural Symbolic Regression

Stable Estimation of Heterogeneous Treatment Effects

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond

Constraint Reasoning Embedded Structured Prediction

Optimal Convergence Rates for Agnostic Nyström Kernel Learning

Active Ranking of Experts Based on their Performances in Many Tasks

Efficient Parametric Approximations of Neural Network Function Space Distance

Improving Expert Predictions with Conformal Prediction

Sketched Ridgeless Linear Regression: The Role of Downsampling

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

Shortest Edit Path Crossover: A Theory-driven Solution to the Permutation Problem in Evolutionary Neural Architecture Search

Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models

Fisher Information Embedding for Node and Graph Learning

Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization

Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming

Naive imputation implicitly regularizes high-dimensional linear models

Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function

Likelihood Adjusted Semidefinite Programs for Clustering Heterogeneous Data

Dual Focal Loss for Calibration

Coin Sampling: Gradient-Based Bayesian Inference without Learning Rates

Distribution Free Domain Generalization

Why Target Networks Stabilise Temporal Difference Methods

Are Gaussian Data All You Need? The Extents and Limits of Universality in High-Dimensional Generalized Linear Estimation

Machine Learning Force Fields with Data Cost Aware Training

Principled Offline RL in the Presence of Rich Exogenous Information

Bayesian online change point detection with Hilbert space approximate Student-t process

BNN-DP: Robustness Certification of Bayesian Neural Networks via Dynamic Programming

Why do Nearest Neighbor Language Models Work?

QAS-Bench: Rethinking Quantum Architecture Search and A Benchmark

Bilevel Optimization with Coupled Decision-Dependent Distributions

Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments

Variational Mixture of HyperGenerators for Learning Distributions over Functions

Nearly Optimal Competitive Ratio for Online Allocation Problems with Two-sided Resource Constraints and Finite Requests

Statistical Indistinguishability of Learning Algorithms

Graph Mixup with Soft Alignments

Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources

Parallel Neurosymbolic Integration with Concordia

Posterior Sampling for Deep Reinforcement Learning

Reinforcement Learning Can Be More Efficient with Multiple Rewards

2D-Shapley: A Framework for Fragmented Data Valuation

One-Shot Federated Conformal Prediction

E$(n)$ Equivariant Message Passing Simplicial Networks

Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits

DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models

Recovery Bounds on Class-Based Optimal Transport: A Sum-of-Norms Regularization Framework

From Adaptive Query Release to Machine Unlearning

Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing

Optimal LP Rounding and Linear-Time Approximation Algorithms for Clustering Edge-Colored Hypergraphs

Surrogate Module Learning: Reduce the Gradient Error Accumulation in Training Spiking Neural Networks

Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization

Automatically Auditing Large Language Models via Discrete Optimization

Efficient Transformed Gaussian Processes for Non-Stationary Dependent Multi-class Classification

Learning to Initiate and Reason in Event-Driven Cascading Processes

Robust and Scalable Bayesian Online Changepoint Detection

In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation

Graph Positional Encoding via Random Feature Propagation

Estimating Possible Causal Effects with Latent Variables via Adjustment

Benign Overfitting in Deep Neural Networks under Lazy Training

A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints

Explainability as statistical inference

Learning Prescriptive ReLU Networks

Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation

On the Robustness of Text Vectorizers

Kernel Logistic Regression Approximation of an Understandable ReLU Neural Network

Learning Compiler Pass Orders using Coreset and Normalized Value Prediction

Robustness in Multimodal Learning under Train-Test Modality Mismatch

Achieving Linear Speedup in Non-IID Federated Bilevel Learning

Estimation Beyond Data Reweighting: Kernel Method of Moments

General Covariance Data Augmentation for Neural PDE Solvers

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

PASTA: Pessimistic Assortment Optimization

Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability

Thompson Sampling with Less Exploration is Fast and Optimal

FARE: Provably Fair Representation Learning with Practical Certificates

Stabilizing Transformer Training by Preventing Attention Entropy Collapse

Optimally-weighted Estimators of the Maximum Mean Discrepancy for Likelihood-Free Inference

Tuning Computer Vision Models With Task Rewards

Quantifying the Variability Collapse of Neural Networks

Polarity Is All You Need to Learn and Transfer Faster

A Unified Optimization Framework of ANN-SNN Conversion: Towards Optimal Mapping from Activation Values to Firing Rates

Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization

Learning to Design Analog Circuits to Meet Threshold Specifications

Constrained Causal Bayesian Optimization

Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning

A Modern Look at the Relationship between Sharpness and Generalization

Learning Unnormalized Statistical Models via Compositional Optimization

Half-Hop: A graph upsampling approach for slowing down message passing

Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs

Attributing Image Generative Models using Latent Fingerprints

Last Switch Dependent Bandits with Monotone Payoff Functions

Distribution Free Prediction Sets for Node Classification

StriderNet: A Graph Reinforcement Learning Approach to Optimize Atomic Structures on Rough Energy Landscapes

Deterministic equivalent and error universality of deep random features learning

Demystifying Disagreement-on-the-Line in High Dimensions

MixFlows: principled variational inference via mixed flows

GRAFENNE: Learning on Graphs with Heterogeneous and Dynamic Feature Sets

Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization

A Kernelized Stein Discrepancy for Biological Sequences

Minimax estimation of discontinuous optimal transport maps: The semi-discrete case

Estimating Causal Effects using a Multi-task Deep Ensemble

Unveiling the Latent Space Geometry of Push-Forward Generative Models

Model-Aware Contrastive Learning: Towards Escaping the Dilemmas

Theoretical Bounds on the Network Community Profile from Low-rank Semi-definite Programming

Conditionally Strongly Log-Concave Generative Models

Nugget: Neural Agglomerative Embeddings of Text

Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning

BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models

The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

PAL: Program-aided Language Models

Causal Proxy Models for Concept-based Model Explanations

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories

An Information-Theoretic Analysis of Nonstationary Bandit Learning

GuardHFL: Privacy Guardian for Heterogeneous Federated Learning

Federated Linear Contextual Bandits with User-level Differential Privacy

NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion

Context Consistency Regularization for Label Sparsity in Time Series

A Conditional Normalizing Flow for Accelerated Multi-Coil MR Imaging

Evaluating Self-Supervised Learning via Risk Decomposition

Homomorphism AutoEncoder --- Learning Group Structured Representations from Observed Transitions

Global optimality for Euclidean CCCP under Riemannian convexity

Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input

Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation

PaLM-E: An Embodied Multimodal Language Model

Universal Morphology Control via Contextual Modulation

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning

Learning Control by Iterative Inversion

Uncertainty Estimation by Fisher Information-based Evidential Deep Learning

Do Perceptually Aligned Gradients Imply Robustness?

Analyzing Diffusion as Serial Reproduction

Certified Robust Neural Networks: Generalization and Corruption Resistance

FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation

Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models

Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions

MANSA: Learning Fast and Slow in Multi-Agent Systems

Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes

Generative Decoding of Visual Stimuli

Beyond the Edge of Stability via Two-step Gradient Updates

Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

Contextual Conservative Interleaving Bandits

Efficient List-Decodable Regression using Batches

Do Machine Learning Models Learn Statistical Rules Inferred from Data?

Emergence of Adaptive Circadian Rhythms in Deep Reinforcement Learning

Deep Temporal Sets with Evidential Reinforced Attentions for Unique Behavioral Pattern Discovery

Learning Optimal Group-structured Individualized Treatment Rules with Many Treatments

Tighter Bounds on the Expressivity of Transformer Encoders

Parameter-Level Soft-Masking for Continual Learning

Learnability and Algorithm for Continual Learning

TGRL: An Algorithm for Teacher Guided Reinforcement Learning

Importance Weighted Expectation-Maximization for Protein Sequence Design

Graph Switching Dynamical Systems

Protecting Language Generation Models via Invisible Watermarking

ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval

Perturbation Analysis of Neural Collapse

Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers

Von Mises Mixture Distributions for Molecular Conformation Generation

The Power of Learned Locally Linear Models for Nonlinear Policy Optimization

Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks

System Identification of Neural Systems: If We Got It Right, Would We Know?

Multi-Agent Best Arm Identification with Private Communications

Muse: Text-To-Image Generation via Masked Generative Transformers

Learn to Accumulate Evidence from All Training Samples: Theory and Practice

Pairwise Ranking Losses of Click-Through Rates Prediction for Welfare Maximization in Ad Auctions

Addressing Budget Allocation and Revenue Allocation in Data Market Environments Using an Adaptive Sampling Algorithm

Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees

Communication-Constrained Bandits under Additive Gaussian Noise

Everyone's Preference Changes Differently: A Weighted Multi-Interest Model For Retrieval

Poisoning Generative Replay in Continual Learning to Promote Forgetting

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

The Regret of Exploration and the Control of Bad Episodes in Reinforcement Learning

STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition

Supervised Metric Learning to Rank for Retrieval via Contextual Similarity Optimization

Constrained Phi-Equilibria

Optimal Rates and Efficient Algorithms for Online Bayesian Persuasion

Flexible Phase Dynamics for Bio-Plausible Contrastive Learning

How much does Initialization Affect Generalization?

Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks

ClusterFuG: Clustering Fully connected Graphs by Multicut

Motion Question Answering via Modular Motion Programs

Statistical Foundations of Prior-Data Fitted Networks

RLEG: Vision-Language Representation Learning with Diffusion-based Embedding Generation

Partially Observable Multi-agent RL with (Quasi-)Efficiency: The Blessing of Information Sharing

Linear Causal Disentanglement via Interventions

Neural Algorithmic Reasoning with Causal Regularisation

A theory of representation learning gives a deep generalisation of kernel methods

PromptBoosting: Black-Box Text Classification with Ten Forward Passes

Hierarchies of Reward Machines

Nearly-tight Bounds for Deep Kernel Learning

Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning

Fractional Denoising for 3D Molecular Pre-training

GNN&GBDT-Guided Fast Optimizing Framework for Large-scale Integer Programming

Optimization for Amortized Inverse Problems

Causal Bounds in Quasi-Markovian Graphs

spred: Solving L1 Penalty with SGD

Evidential Interactive Learning for Medical Image Captioning

SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation

RGE: A Repulsive Graph Rectification for Node Classification via Influence

Investigating the Role of Model-Based Learning in Exploration and Transfer

Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks

Efficient RL via Disentangled Environment and Agent Representations

Divide and Conquer Dynamic Programming: An Almost Linear Time Change Point Detection Methodology in High Dimensions

Online Mechanism Design for Information Acquisition

Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

Directed Chain Generative Adversarial Networks

Layered State Discovery for Incremental Autonomous Exploration

Bayes-optimal Learning of Deep Random Networks of Extensive-width

CataBEEM: Integrating Latent Interaction Categories in Node-wise Community Detection Models for Network Data

Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming

Under-Counted Tensor Completion with Neural Incorporation of Attributes

Polyhedral Complex Extraction from ReLU Networks using Edge Subdivision

Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA

A Closer Look at Few-shot Classification Again

QuantumDARTS: Differentiable Quantum Architecture Search for Variational Quantum Algorithms

Sequential Multi-Dimensional Self-Supervised Learning for Clinical Time Series

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

Quantum 3D Graph Learning with Applications to Molecule Embedding

Submodular Order Functions and Assortment Optimization

Go Beyond Imagination: Maximizing Episodic Reachability with World Models

Fascinating Supervisory Signals and Where to Find Them: Deep Anomaly Detection with Scale Learning

Bit Allocation using Optimization

Interventional Causal Representation Learning

Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity

Tied-Augment: Controlling Representation Similarity Improves Data Augmentation

Mimetic Initialization of Self-Attention Layers

Text-To-4D Dynamic Scene Generation

Towards Quantum Machine Learning for Constrained Combinatorial Optimization: a Quantum QAP Solver

Minimalistic Predictions to Schedule Jobs with Online Precedence Constraints

Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction

Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues

Subset Selection Based On Multiple Rankings in the Presence of Bias: Effectiveness of Fairness Constraints for Multiwinner Voting Score Functions

Phase-aware Adversarial Defense for Improving Adversarial Robustness

Bayesian Design Principles for Frequentist Sequential Learning

Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality

Language Instructed Reinforcement Learning for Human-AI Coordination

Generated Graph Detection

The Catalog Problem: Clustering and Ordering Variable-Sized Sets

Adaptive Smoothing Gradient Learning for Spiking Neural Networks

A Theoretical Analysis of the Learning Dynamics under Class Imbalance

PCA-based Multi-Task Learning: a Random Matrix Approach

CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification

Efficient preconditioned stochastic gradient descent for estimation in latent variable models

Multi-Layer Neural Networks as Trainable Ladders of Hilbert Spaces

Lower Bounds for Learning in Revealing POMDPs

On the Relationship Between Explanation and Prediction: A Causal View

Revisiting Domain Randomization via Relaxed State-Adversarial Policy Optimization

SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models

Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data

Expected Gradients of Maxout Networks and Consequences to Parameter Initialization

Task-Specific Skill Localization in Fine-tuned Language Models

Facial Expression Recognition with Adaptive Frame Rate based on Multiple Testing Correction

Efficient Approximations of Complete Interatomic Potentials for Crystal Property Prediction

Integrating Prior Knowledge in Contrastive Learning with Kernel

Polynomial Preconditioning for Gradient Methods

From Robustness to Privacy and Back

Fast as CHITA: Neural Network Pruning with Combinatorial Optimization

Private Federated Learning with Autotuned Compression

Proper Scoring Rules for Survival Analysis

CRISP: Curriculum based Sequential neural decoders for Polar code family

FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems

Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

Graphically Structured Diffusion Models

On Many-Actions Policy Gradient

Monge, Bregman and Occam: Interpretable Optimal Transport in High-Dimensions with Feature-Sparse Maps

The Statistical Scope of Multicalibration

Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing

Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition

Stabilizing GANs' Training with Brownian Motion Controller

Featured Graph Coarsening with Similarity Guarantees

Biases in Evaluation of Molecular Optimization Methods and Bias Reduction Strategies

Fighting Fire with Fire: Contrastive Debiasing without Bias-free Data via Generative Bias-transformation

Improved Online Learning Algorithms for CTR Prediction in Ad Auctions

Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation

Width and Depth Limits Commute in Residual Networks

Continual Learners are Incremental Model Generalizers

Online Platt Scaling with Calibeating

SpENCNN: Orchestrating Encoding and Sparsity for Fast Homomorphically Encrypted Neural Network Inference

Data Feedback Loops: Model-driven Amplification of Dataset Biases

Variational Autoencoding Neural Operators

Meta-Learning the Inductive Bias of Simple Neural Circuits

Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization

Predictive Flows for Faster Ford-Fulkerson

Neural Latent Aligner: Cross-trial Alignment for Learning Representations of Complex, Naturalistic Neural Data

Can Large Language Models Reason about Program Invariants?

Provable Reset-free Reinforcement Learning by No-Regret Reduction

On the Impact of Algorithmic Recourse on Social Segregation

Understanding and Generalizing Contrastive Learning from the Inverse Optimal Transport Perspective

Dropout Reduces Underfitting

Towards a Persistence Diagram that is Robust to Noise and Varied Densities

Conformal Prediction with Missing Values

Diffusion Models are Minimax Optimal Distribution Estimators

DualHSIC: HSIC-Bottleneck and Alignment for Continual Learning

MODeL: Memory Optimizations for Deep Learning

Optimal Sets and Solution Paths of ReLU Networks

Traversing Between Modes in Function Space for Fast Ensembling

Fast Rates in Time-Varying Strongly Monotone Games

Tighter Information-Theoretic Generalization Bounds from Supersamples

Learning in POMDPs is Sample-Efficient with Hindsight Observability

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC

Atari-5: Distilling the Arcade Learning Environment down to Five Games

Data-Driven Subgroup Identification for Linear Regression

Supported Trust Region Optimization for Offline Reinforcement Learning

Multi-Objective Population Based Training

Differentially Private Sharpness-Aware Training

Nonparametric Extensions of Randomized Response for Private Confidence Sets

Statistical Learning under Heterogenous Distribution Shift

Spatial-Temporal Graph Learning with Adversarial Contrastive Adaptation

Lookahead When It Matters: Adaptive Non-causal Transformers for Streaming Neural Transducers

Adaptive Coordination in Social Embodied Rearrangement

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks

MonoFlow: Rethinking Divergence GANs via the Perspective of Wasserstein Gradient Flows

Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods

Bigger, Better, Faster: Human-level Atari with human-level efficiency

PLay: Parametrically Conditioned Layout Generation using Latent Diffusion

Repository-Level Prompt Generation for Large Language Models of Code

On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline

Bandits with Knapsacks: Advice on Time-Varying Demands

Does Sparsity Help in Learning Misspecified Linear Bandits?

The SSL Interplay: Augmentations, Inductive Bias, and Generalization

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

Improved Algorithms for White-Box Adversarial Streams

GREAD: Graph Neural Reaction-Diffusion Networks

Structural Re-weighting Improves Graph Domain Adaptation

One-Step Estimator for Permuted Sparse Recovery

Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems

PAC Generalization via Invariant Representations

Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts

Adaptive IMLE for Few-shot Pretraining-free Generative Modelling

Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models

On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits

On Excess Mass Behavior in Gaussian Mixture Models with Orlicz-Wasserstein Distances

Toward Large Kernel Models

Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs

Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes

Efficient Online Reinforcement Learning with Offline Data

Self-Repellent Random Walks on General Graphs - Achieving Minimal Sampling Variance via Nonlinear Markov Chains

Moccasin: Efficient Tensor Rematerialization for Neural Networks

On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures

"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts

Wrapped Cauchy Distributed Angular Softmax for Long-Tailed Visual Recognition

Change is Hard: A Closer Look at Subpopulation Shift

Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints

Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano

Cocktail Party Attack: Breaking Aggregation-Based Privacy in Federated Learning Using Independent Component Analysis

On Penalty-based Bilevel Gradient Descent Method

Fairness in Matching under Uncertainty

Meta Learning of Interface Conditions for Multi-Domain Physics-Informed Neural Networks

Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning

Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models

On Regularization and Inference with Label Constraints

Why does Throwing Away Data Improve Worst-Group Error?

Unsupervised Skill Discovery for Learning Shared Structures across Changing Environments

Leveraging Proxy of Training Data for Test-Time Adaptation

Generating Language Corrections for Teaching Physical Control Tasks

Predictable MDP Abstraction for Unsupervised Model-Based RL

Variance Control for Distributional Reinforcement Learning

Anti-Exploration by Random Network Distillation

Revisiting Bellman Errors for Offline Model Selection

Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?

Interactive Object Placement with Reinforcement Learning

Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data

GC-Flow: A Graph-Based Flow Network for Effective Clustering

Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap

Coordinate Descent Methods for Fractional Minimization

Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series

Network Effects in Performative Prediction Games

On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU Network

Best Arm Identification in Multi-Agent Multi-Armed Bandits

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

Approximation Algorithms for Fair Range Clustering

Probabilistic Attention-to-Influence Neural Models for Event Sequences

RankMe: Assessing the Downstream Performance of Pretrained Self-Supervised Representations by Their Rank

How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy

Improving Fair Training under Correlation Shifts

ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging

Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL

End-to-End Multi-Object Detection with a Regularized Mixture Model

Confidence and Dispersity Speak: Characterizing Prediction Matrix for Unsupervised Accuracy Estimation

Towards Explaining Distribution Shifts

RSC: Accelerate Graph Neural Networks Training via Randomized Sparse Computations

Delving into Noisy Label Detection with Clean Data

Smooth Non-stationary Bandits

On Kinetic Optimal Probability Paths for Generative Models

Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry

Controlling Type Confounding in Ad Hoc Teamwork with Instance-wise Teammate Feedback Rectification

Restoration based Generative Models

MAGANet: Achieving Combinatorial Generalization by Modeling a Group Action

Feature Programming for Multivariate Time Series Prediction

Reliable Measures of Spread in High Dimensional Latent Spaces

Bayesian Estimation of Differential Privacy

Learning useful representations for shifting tasks and distributions

Toward Efficient Gradient-Based Value Estimation

All in a Row: Compressed Convolution Networks for Graphs

Dynamics-inspired Neuromorphic Visual Representation Learning

Stable and Consistent Prediction of 3D Characteristic Orientation via Invariant Residual Learning

Neural networks trained with SGD learn distributions of increasing complexity

Rethinking Weak Supervision in Helping Contrastive Learning

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

Learning Intuitive Policies Using Action Features

DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

Sharper Bounds for $\ell_p$ Sensitivity Sampling

Gaussian Process Priors for Systems of Linear Partial Differential Equations with Constant Coefficients

Formalizing Preferences Over Runtime Distributions

Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

Pricing Experimental Design: Causal Effect, Expected Revenue and Tail Risk

Differentiable and Transportable Structure Learning

Proximal Causal Learning of Conditional Average Treatment Effects

Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

Weakly Supervised Regression with Interval Targets

Deep Latent State Space Models for Time-Series Generation

Implicit Neural Spatial Representations for Time-dependent PDEs

Improving Bi-level Optimization Based Methods with Inspiration from Humans' Classroom Study Techniques

DiscoBAX - Discovery of optimal intervention sets in genomic experiment design

Sample Complexity of Probability Divergences under Group Symmetry

Learning Instance-Specific Augmentations by Capturing Local Invariances

HarsanyiNet: Computing Accurate Shapley Values in a Single Forward Propagation

Learning Temporally Abstract World Models without Online Experimentation

Input uncertainty propagation through trained neural networks

Sequential Counterfactual Risk Minimization

Applied Online Algorithms with Heterogeneous Predictors

Omnipredictors for Constrained Optimization

Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees

What can online reinforcement learning with function approximation benefit from general coverage conditions?

An Effective Meaningful Way to Evaluate Survival Models

The Dormant Neuron Phenomenon in Deep Reinforcement Learning

On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization

Counterfactual Analysis in Dynamic Latent State Models

On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization

Fully-Adaptive Composition in Differential Privacy

Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning

Settling the Reward Hypothesis

Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

TAN Without a Burn: Scaling Laws of DP-SGD

Quantile Credit Assignment

The Benefits of Model-Based Generalization in Reinforcement Learning

SpeedDETR: Speed-aware Transformers for End-to-end Object Detection

Provably and Practically Efficient Neural Contextual Bandits

Quantum Ridgelet Transform: Winning Lottery Ticket of Neural Networks with Quantum Computation

On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing

Distributed Linear Bandits under Communication Constraints

A Unifying Framework to the Analysis of Interaction Methods using Synergy Functions

Sequential Kernelized Independence Testing

Sequential Strategic Screening

Automated Search for Conjectures on Mathematical Constants using Analysis of Integer Sequences

Provable Multi-instance Deep AUC Maximization with Stochastic Pooling

TIDE: Time Derivative Diffusion for Deep Learning on Graphs

Geometric Latent Diffusion Models for 3D Molecule Generation

On the Statistical Benefits of Temporal Difference Learning

Information-Theoretic State Space Model for Multi-View Reinforcement Learning

Continual Vision-Language Representation Learning with Off-Diagonal Information

Private Statistical Estimation of Many Quantiles

AbODE: Ab initio antibody design using conjoined ODEs

Trustworthy Policy Learning under the Counterfactual No-Harm Criterion

Propensity Matters: Measuring and Enhancing Balancing for Recommendation

Improving Graph Generation by Restricting Graph Bandwidth

Solving Linear Programs with Fast Online Learning Algorithms

LESS-VFL: Communication-Efficient Feature Selection for Vertical Federated Learning

Robust Collaborative Learning with Linear Gradient Overhead

Towards Understanding and Improving GFlowNet Training

MALTS: Matching After Learning to Stretch

PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation

Efficient Training of Language Models using Few-Shot Learning

A Universal Unbiased Method for Classification from Aggregate Observations

On the Convergence of SARSA with Linear Function Approximation

Mitigating Memorization of Noisy Labels by Clipping the Model Prediction

PAC-Bayesian Generalization Bounds for Adversarial Generative Models

Fairness in Streaming Submodular Maximization over a Matroid Constraint

Optimal randomized multilevel Monte Carlo for repeatedly nested expectations

PAC Prediction Sets for Large Language Models of Code

Scalable Adaptive Computation for Iterative Generation

ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction

Sequential Changepoint Detection via Backward Confidence Sequences

Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN

Distribution-dependent McDiarmid-type Inequalities for Functions of Unbounded Interaction

On the Privacy-Robustness-Utility Trilemma in Distributed Learning

Identifiability of Label Noise Transition Matrix

Model Transferability with Responsive Decision Subjects

Intrinsic Sliced Wasserstein Distances for Comparing Collections of Probability Distributions on Manifolds and Graphs

Concurrent Shuffle Differential Privacy Under Continual Observation

Weak Proxies are Sufficient and Preferable for Fairness with Missing Sensitive Attributes

Efficient displacement convex optimization with particle gradient descent

PPG Reloaded: An Empirical Study on What Matters in Phasic Policy Gradient

Collaborative Causal Inference with Fair Incentives

Fair yet Asymptotically Equal Collaborative Learning

Global Context Vision Transformers

Distortion and Uncertainty Aware Loss for Panoramic Depth Completion

A Kernel-Based View of Language Model Fine-Tuning

From Perception to Programs: Regularize, Overparameterize, and Amortize

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Geometric Clifford Algebra Networks

Efficiently predicting high resolution mass spectra with graph neural networks

Learning Mixtures of Markov Chains and MDPs

Generalized Implicit Follow-The-Regularized-Leader

Spurious Valleys and Clustering Behavior of Neural Networks

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

A Robust Test for the Stationarity Assumption in Sequential Decision Making

Towards a better understanding of representation dynamics under TD-learning

Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching

Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity

Explainable Data-Driven Optimization: From Context to Decision and Back Again

Why Random Pruning Is All We Need to Start Sparse

Direct Parameterization of Lipschitz-Bounded Deep Networks

FREDIS: A Fusion Framework of Refinement and Disambiguation for Unreliable Partial Label Learning

Generalization Analysis for Contrastive Representation Learning

An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning

Predicting Rare Events by Shrinking Towards Proportional Odds

Online Nonstochastic Control with Adversarial and Static Constraints

Multi-task Hierarchical Adversarial Inverse Reinforcement Learning

Surface Snapping Optimization Layer for Single Image Object Shape Reconstruction

Relevant Walk Search for Explaining Graph Neural Networks

VectorMapNet: End-to-end Vectorized HD Map Learning

Trading-Off Payments and Accuracy in Online Classification with Paid Stochastic Experts

Representer Point Selection for Explaining Regularized High-dimensional Models

Estimating the Contamination Factor's Distribution in Unsupervised Anomaly Detection

Communication-Efficient Federated Hypergradient Computation via Aggregated Iterative Differentiation

Cell-Free Latent Go-Explore

Towards Understanding Generalization of Graph Neural Networks

The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent

Fair and Optimal Classification via Post-Processing

Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels

Graph Neural Tangent Kernel: Convergence on Large Graphs

Tractable Control for Autoregressive Language Generation

Speed-Oblivious Online Scheduling: Knowing (Precise) Speeds is not Necessary

Graph Contrastive Backdoor Attacks

Jump-Start Reinforcement Learning

COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models

The Impact of Exploration on Convergence and Performance of Multi-Agent Q-Learning Dynamics

Vector-Valued Control Variates

Algorithmic Collective Action in Machine Learning

Causal Structure Learning for Latent Intervened Non-stationary Data

Neural Inverse Operators for Solving PDE Inverse Problems

A Distribution Optimization Framework for Confidence Bounds of Risk Measures

Exact Inference in High-order Structured Prediction

On the Complexity of Bayesian Generalization

Attribute-Efficient PAC Learning of Low-Degree Polynomial Threshold Functions with Nasty Noise

SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance

On the Convergence of Gradient Flow on Multi-layer Linear Models

Unscented Autoencoder

Individually Fair Learning with One-Sided Feedback

Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction

Training Deep Surrogate Models with Large Scale Online Learning

Quantum Lower Bounds for Finding Stationary Points of Nonconvex Functions

Near-Optimal Quantum Coreset Construction Algorithms for Clustering

CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations

Differentiable Tree Operations Promote Compositional Generalization

CrossSplit: Mitigating Label Noise Memorization through Data Splitting

Generalizing Neural Wave Functions

Deep Laplacian-based Options for Temporally-Extended Exploration

Fourmer: An Efficient Global Modeling Paradigm for Image Restoration

Shapley Based Residual Decomposition for Instance Analysis

A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining

Reachability-Aware Laplacian Representation in Reinforcement Learning

Provably Invariant Learning without Domain Information

Improved Online Conformal Prediction via Strongly Adaptive Online Learning

Total Variation Graph Neural Networks

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs

Hidden Symmetries of ReLU Networks

Topological Singularity Detection at Multiple Scales

Better Training of GFlowNets with Local Credit and Incomplete Trajectories

Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection

Symmetry-Aware Robot Design with Structured Subgroups

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

Provable Data Subset Selection For Efficient Neural Networks Training

GFlowOut: Dropout with Generative Flow Networks

FedCR: Personalized Federated Learning Based on Across-Client Common Representation with Conditional Mutual Information Regularization

Controllability-Aware Unsupervised Skill Discovery

ChiPFormer: Transferable Chip Placement via Offline Decision Transformer

Towards credible visual model interpretation with path attribution

Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping

Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space

Hyperbolic Image-text Representations

LongCoder: A Long-Range Pre-trained Language Model for Code Completion

WL meet VC

Regret-Minimizing Double Oracle for Extensive-Form Games

Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions

Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning

Personalized Federated Learning with Inferred Collaboration Graphs

Run-off Election: Improved Provable Defense against Data Poisoning Attacks

Regret Minimization and Convergence to Equilibria in General-sum Markov Games

A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel

Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks

Hyperbolic Representation Learning: Revisiting and Advancing

RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution

No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation

Magneto: A Foundation Transformer

Minimizing Trajectory Curvature of ODE-based Generative Models

How Jellyfish Characterise Alternating Group Equivariant Neural Networks

Hyperbolic Diffusion Embedding and Distance for Hierarchical Representation Learning

Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Improved Active Multi-Task Representation Learning via Lasso

Rethinking Visual Reconstruction: Experience-Based Content Completion Guided by Visual Cues

SlotGAT: Slot-based Message Passing for Heterogeneous Graphs

Hierarchical Diffusion for Offline Decision Making

Stochastic Gradient Descent under Markovian Sampling Schemes

Mitigating the Effects of Non-Identifiability on Inference for Bayesian Neural Networks with Latent Variables

Difference of submodular minimization via DC programming

Variational Sparse Inverse Cholesky Approximation for Latent Gaussian Processes via Double Kullback-Leibler Minimization

Nonparametric Iterative Machine Teaching

A Fast Optimistic Method for Monotone Variational Inequalities

Deep Regression Unlearning

A Robust Optimisation Perspective on Counterexample-Guided Repair of Neural Networks

StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity

Existence and Estimation of Critical Batch Size for Training Generative Adversarial Networks with Two Time-Scale Update Rule

Neural Prediction Errors enable Analogical Visual Reasoning in Human Standard Intelligence Tests

Minimal Width for Universal Property of Deep RNN

Mixture Proportion Estimation Beyond Irreducibility

Sliced-Wasserstein on Symmetric Positive Definite Matrices for M/EEG Signals

Reinforcement Learning from Passive Data via Latent Intentions

Refined Regret for Adversarial MDPs with Linear Function Approximation

Label differential privacy and private training data release

Metagenomic Binning using Connectivity-constrained Variational Autoencoders

Automatically marginalized MCMC in probabilistic programming

Calibrating Multimodal Learning

On the Optimality of Misspecified Kernel Ridge Regression

Actor-Critic Alignment for Offline-to-Online Reinforcement Learning

Harmonic Neural Networks

Approximately Optimal Core Shapes for Tensor Decompositions

COLA: Orchestrating Error Coding and Learning for Robust Neural Network Inference Against Hardware Defects

A Flexible Diffusion Model

LazyGNN: Large-Scale Graph Neural Networks via Lazy Propagation

Generative Graph Dictionary Learning

Solving High-Dimensional PDEs with Latent Spectral Models

Distance Weighted Supervised Learning for Offline Interaction Data

Optimal Arms Identification with Knapsacks

Effective Structured Prompting by Meta-Learning and Representative Verbalizer

Markovian Gaussian Process Variational Autoencoders

Active causal structure learning with advice

New metrics and search algorithms for weighted causal DAGs

End-to-end Differentiable Clustering with Associative Memories

Trainability, Expressivity and Interpretability in Gated Neural ODEs

Differentially Private Hierarchical Clustering with Provable Approximation Guarantees

Adapting to game trees in zero-sum imperfect information games

Fast Excess Risk Rates via Offset Rademacher Complexity

Alternating Local Enumeration (TnALE): Solving Tensor Network Structure Search with Fewer Evaluations

Reward-Mixing MDPs with Few Latent Contexts are Learnable

Maximal Initial Learning Rates in Deep ReLU Networks

Improving Visual Prompt Tuning for Self-supervised Vision Transformers

One-Shot Compression of Large Edge-Exchangeable Graphs using Bits-Back Coding

Instrumental Variable Estimation of Average Partial Causal Effects

Path Neural Networks: Expressive and Accurate Graph Neural Networks

Action Matching: Learning Stochastic Dynamics from Samples

How to address monotonicity for model risk management?

IncDSI: Incrementally Updatable Document Retrieval

Computational Asymmetries in Robust Classification

Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models

GNOT: A General Neural Operator Transformer for Operator Learning

NUNO: A General Framework for Learning Parametric PDEs with Non-Uniform Data

Robust Subtask Learning for Compositional Generalization

One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

DRew: Dynamically Rewired Message Passing with Delay

Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication

Are Large Kernels Better Teachers than Transformers for ConvNets?

GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations

The case for 4-bit precision: k-bit Inference Scaling Laws

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

Generating Private Synthetic Data with Genetic Algorithms

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

Causal Strategic Classification: A Tale of Two Shifts

Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

Bootstrapped Representations in Reinforcement Learning

CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms

VA-learning as a more efficient alternative to Q-learning

Robust Non-Linear Feedback Coding via Power-Constrained Deep Learning

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

An SDE for Modeling SAM: Theory and Insights

SeedGNN: Graph Neural Network for Supervised Seeded Graph Matching

Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift

Learning Unforeseen Robustness from Out-of-distribution Data Using Equivariant Domain Translator

Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs

Causal Modeling of Policy Interventions From Treatment–Outcome Sequences

Surrogate Model Extension (SME): A Fast and Accurate Weight Update Attack on Federated Learning

Towards Sustainable Learning: Coresets for Data-efficient Deep Learning

Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network

The Monge Gap: A Regularizer to Learn All Transport Maps

Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models

FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization

Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs

Marginalization is not Marginal: No Bad VAE Local Minima when Learning Optimal Sparse Representations

Extrapolative Controlled Sequence Generation via Iterative Refinement

Sampling-based Nyström Approximation and Kernel Quadrature

Global Optimization with Parametric Function Approximation

Towards Constituting Mathematical Structures for Learning to Optimize

On the Initialization of Graph Neural Networks

From Hypergraph Energy Functions to Hypergraph Neural Networks

Data Efficient Neural Scaling Law via Model Reusing

Learning to Optimize Differentiable Games

Cluster Explanation via Polyhedral Descriptions

Certifying Ensembles: A General Certification Theory with S-Lipschitzness

SurCo: Learning Linear SURrogates for COmbinatorial Nonlinear Optimization Problems

Rotation and Translation Invariant Representation Learning with Implicit Neural Representations

Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning

Non-stationary Reinforcement Learning under General Function Approximation

Mechanistic Mode Connectivity

Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer

PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning

CodeIPPrompt: Intellectual Property Infringement Assessment of Code Language Models

Differentially Private Distributed Bayesian Linear Regression with MCMC

Whose Opinions Do Language Models Reflect?

Emergence of Sparse Representations from Noise

Pareto Regret Analyses in Multi-objective Multi-armed Bandit

Transformers Learn In-Context by Gradient Descent

Which Tricks are Important for Learning to Rank?

Chemically Transferable Generative Backmapping of Coarse-Grained Proteins

Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions

Incentivizing Exploration with Linear Contexts and Combinatorial Actions

JAWS-X: Addressing Efficiency Bottlenecks of Conformal Prediction Under Standard and Feedback Covariate Shift

Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

HOPE: High-order Graph ODE For Modeling Interacting Dynamics

Moderately Distributional Exploration for Domain Generalization

Social learning spontaneously emerges by searching optimal heuristics with deep reinforcement learning

Unlocking Slot Attention by Changing Optimal Transport Costs

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective

On Heterogeneous Treatment Effects in Heterogeneous Causal Graphs

Faster Rates of Convergence to Stationary Points in Differentially Private Optimization

Adaptive Compositional Continual Meta-Learning

Learning Affinity with Hyperbolic Representation for Spatial Propagation

Dual Propagation: Accelerating Contrastive Hebbian Learning with Dyadic Neurons

Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables

On the Occupancy Measure of Non-Markovian Policies in Continuous MDPs

Loss Balancing for Fair Supervised Learning

Target-Aware Generative Augmentations for Single-Shot Adaptation

Who Needs to Know? Minimal Knowledge for Optimal Coordination

Online Learning with Feedback Graphs: The True Shape of Regret

AutoCoreset: An Automatic Practical Coreset Construction Framework

Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression

An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning

Eventual Discounting Temporal Logic Counterfactual Experience Replay

Extrapolated Random Tree for Regression

On the Identifiability and Estimation of Causal Location-Scale Noise Models

Orthogonality-Enforced Latent Space in Autoencoders: An Approach to Learning Disentangled Representations

Understanding the Impact of Adversarial Robustness on Accuracy Disparity

MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior

Pruning via Sparsity-indexed ODE: a Continuous Sparsity Viewpoint

LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework

Adaptive Estimation of Graphical Models under Total Positivity

Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions

Additive Causal Bandits with Unknown Graph

Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal

FedVS: Straggler-Resilient and Privacy-Preserving Vertical Federated Learning for Split Models

Quantitative Universal Approximation Bounds for Deep Belief Networks

Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations

Expectation-Complete Graph Representations with Homomorphisms

Group Equivariant Fourier Neural Operators for Partial Differential Equations

SGD with Large Step Sizes Learns Sparse Features

MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations

STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning

Auxiliary Learning as an Asymmetric Bargaining Game

Equivariant Architectures for Learning in Deep Weight Spaces

InGram: Inductive Knowledge Graph Embedding via Relation Graphs

CoDi: Co-evolving Contrastive Diffusion Models for Mixed-type Tabular Synthesis

Reconstructive Neuron Pruning for Backdoor Defense

Learning Deductive Reasoning from Synthetic Corpus based on Formal Logic

Unifying Molecular and Textual Representations via Multi-task Language Modelling

A Toy Model of Universality: Reverse Engineering how Networks Learn Group Operations

Revisiting Data-Free Knowledge Distillation with Poisoned Teachers

Adaptive Computation with Elastic Input Sequence

The Ideal Continual Learner: An Agent That Never Forgets

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

Deep Anomaly Detection under Labeling Budget Constraints

Differentiable Simulations for Enhanced Sampling of Rare Events

Coupled Variational Autoencoder

On Second-Order Scoring Rules for Epistemic Uncertainty Quantification

Dynamic Constrained Submodular Optimization with Polylogarithmic Update Time

$H$-Consistency Bounds for Pairwise Misranking Loss Surrogates

Fast Algorithms for Distributed k-Clustering with Outliers

Revisiting Weighted Aggregation in Federated Learning with Neural Networks

FedBR: Improving Federated Learning on Heterogeneous Data via Local Learning Bias Reduction

When does Privileged information Explain Away Label Noise?

On Pitfalls of Test-Time Adaptation

Distributional Offline Policy Evaluation with Predictive Error Guarantees

Distilling Internet-Scale Vision-Language Models into Embodied Agents

Forward-Backward Gaussian Variational Inference via JKO in the Bures-Wasserstein Space

Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise

On the Training Instability of Shuffling SGD with Batch Normalization

Doubly Adversarial Federated Bandits

Measuring the Impact of Programming Language Distribution

Expertise Trees Resolve Knowledge Limitations in Collective Decision-Making

DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization

Why Is Public Pretraining Necessary for Private Model Training?

Prototype-Sample Relation Distillation: Towards Replay-Free Continual Learning

Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss

Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score

Detecting Out-of-distribution Data through In-distribution Class Prior

Large Language Models Can Be Easily Distracted by Irrelevant Context

PFNs4BO: In-Context Learning for Bayesian Optimization

Meta-learning Parameterized Skills

Learning Globally Smooth Functions on Manifolds

MyoDex: A Generalizable Prior for Dexterous Manipulation

Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference

FAENet: Frame Averaging Equivariant GNN for Materials Modeling

Beyond In-Domain Scenarios: Robust Density-Aware Calibration

Learning to Decouple Complex Systems

Linkless Link Prediction via Relational Distillation

Cross-Entropy Loss Functions: Theoretical Analysis and Applications

Can Forward Gradient Match Backpropagation?

Identifying Interpretable Subspaces in Image Representations

Global Selection of Contrastive Batches via Optimization on Sample Permutations

Differentiable Multi-Target Causal Bayesian Experimental Design

Quantifying the Knowledge in GNNs for Reliable Distillation into MLPs

Bandit Online Linear Optimization with Hints and Queries

OMS-DPM: Optimizing the Model Schedule for Diffusion Probabilistic Models

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

ClimaX: A foundation model for weather and climate

TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation

Generative Adversarial Symmetry Discovery

The Benefits of Mixup for Feature Learning

Towards Robust Graph Incremental Learning on Evolving Graphs

FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

Learning GFlowNets From Partial Episodes For Improved Convergence And Stability

A theory of continuous generative flow networks

N$\text{A}^\text{2}$Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning

The Saddle-Point Method in Differential Privacy

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning

Unsupervised Out-of-Distribution Detection with Diffusion Inpainting

SinFusion: Training Diffusion Models on a Single Image or Video

Simple Hardware-Efficient Long Convolutions for Sequence Modeling

LIV: Language-Image Representations and Rewards for Robotic Control

Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning

Sampling-Based Accuracy Testing of Posterior Estimators for General Inference

Leveraging Offline Data in Online Reinforcement Learning

Learning Noisy OR Bayesian Networks with Max-Product Belief Propagation

RLSbench: Domain Adaptation Under Relaxed Label Shift

Learning to Boost Training by Periodic Nowcasting Near Future Weights

Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models

Differentially Private Optimization on Large Model at Small Cost

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark

Predicting Ordinary Differential Equations with Transformers

DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian Inference

Understanding the Distillation Process from Deep Generative Models to Tractable Probabilistic Circuits

I$^2$SB: Image-to-Image Schrödinger Bridge

GFlowNet-EM for Learning Compositional Latent Variable Models

FeDXL: Provable Federated Learning for Deep X-Risk Optimization

Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for Multi-Block Bilevel Optimization

Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization

Text-To-Concept (and Back) via Cross-Model Alignment

Conformal Inference is (almost) Free for Neural Networks Trained with Early Stopping

Continuously Parameterized Mixture Models

FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels

Regression with Sensor Data Containing Incomplete Observations

Superhuman Fairness

Extending Kernel PCA through Dualization: Sparsity, Robustness and Fast Algorithms

PWSHAP: A Path-Wise Explanation Model for Targeted Variables

Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation

Discovering Object-Centric Generalized Value Functions From Pixels

Multi-channel Autobidding with Budget and ROI Constraints

Principled Acceleration of Iterative Numerical Methods Using Machine Learning

Beam Tree Recursive Cells

Monotonic Location Attention for Length Generalization

GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks

UPSCALE: Unconstrained Channel Pruning

Trompt: Towards a Better Deep Neural Network for Tabular Data

Gibbsian Polar Slice Sampling

Graph Reinforcement Learning for Network Control via Bi-Level Optimization

DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation

On the Estimation of Gaussian Mixture Copula Models

Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes

A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs

Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes

B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding

Efficient and Equivariant Graph Networks for Predicting Quantum Hamiltonian

On the Connection Between MPNN and Graph Transformer

Policy Gradient in Robust MDPs with Global Convergence Guarantee

Unit Scaling: Out-of-the-Box Low-Precision Training

Masked Trajectory Models for Prediction, Representation, and Control

A Three-regime Model of Network Pruning

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

Data Structures for Density Estimation

Training Normalizing Flows from Dependent Data

Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory

Overcoming Simplicity Bias in Deep Networks using a Feature Sieve

Multi-Objective GFlowNets

Discrete Key-Value Bottleneck

Hyena Hierarchy: Towards Larger Convolutional Language Models

EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression

Tighter Analysis for ProxSkip

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance

KDEformer: Accelerating Transformers via Kernel Density Estimation

Streaming Submodular Maximization with Differential Privacy

Learning Neural PDE Solvers with Parameter-Guided Channel Attention

Margin-based sampling in high dimensions: When being active is less efficient than staying passive

MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Poses

Stochastic Gradient Descent-Induced Drift of Representation in a Two-Layer Neural Network

Large Language Models Struggle to Learn Long-Tail Knowledge

Sequential Predictive Conformal Inference for Time Series

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Hierarchical Neural Coding for Controllable CAD Model Generation

Opponent-Limited Online Search for Imperfect Information Games

Fair Densities via Boosting the Sufficient Statistics of Exponential Families

User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems

Matrix Estimation for Individual Fairness

Better Diffusion Models Further Improve Adversarial Training

Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic

Variational Open-Domain Question Answering

Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees

Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

Generalized Teacher Forcing for Learning Chaotic Dynamics

Structure Learning of Latent Factors via Clique Search on Correlation Thresholded Graphs

Understanding Self-Distillation in the Presence of Label Noise

Beyond Uniform Lipschitz Condition in Differentially Private Optimization

Sequential Underspecified Instrument Selection for Cause-Effect Estimation

Diffusion Based Representation Learning

Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression

Coder Reviewer Reranking for Code Generation

Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least

Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data

Conformalization of Sparse Generalized Linear Models

Fast Rates for Maximum Entropy Exploration

The Unreasonable Effectiveness of Few-shot Learning for Machine Translation

Scaling Laws for Multilingual Neural Machine Translation

Secure Federated Correlation Test and Entropy Estimation

Image generation with shortest path diffusion

Multiply Robust Off-policy Evaluation and Learning under Truncation by Death

On Provable Copyright Protection for Generative Models

Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting

Input Perturbation Reduces Exposure Bias in Diffusion Models

Demystifying Uneven Vulnerability of Link Stealing Attacks against Graph Neural Networks

Tight Regret Bounds for Single-pass Streaming Multi-armed Bandits

Understanding Backdoor Attacks through the Adaptability Hypothesis

Learning to Maximize Mutual Information for Dynamic Feature Selection

Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance

Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability

Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling

Accelerated Infeasibility Detection of Constrained Optimization and Fixed-Point Iterations

Pretraining Language Models with Human Preferences

Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting

NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations

Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning

Fast Online Value-Maximizing Prediction Sets with Conformal Cost Control

Unconstrained Online Learning with Unbounded Losses

Sample and Predict Your Latent: Modality-free Sequential Disentanglement via Contrastive Estimation

Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback

Towards Deep Attention in Graph Neural Networks: Problems and Remedies

Adversarial Parameter Attack on Deep Neural Networks

Phase Transitions in the Detection of Correlated Databases

Understanding the Complexity Gains of Single-Task RL with a Curriculum

When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction

OCD: Learning to Overfit with Conditional Diffusion Models

Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten

Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally Coupled Oscillatory Recurrent Neural Networks

Latent Traversals in Generative Models as Potential Flows

DUET: 2D Structured and Approximately Equivariant Representations

ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts

Contextual Reliability: When Different Features Matter in Different Contexts

HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

Hierarchical Imitation Learning with Vector Quantized Models

Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies

Deep linear networks can benignly overfit when shallow ones do

Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach

Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism

Data-Derived Weak Universal Consistency

Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks

Adversarial Classification: Necessary Conditions and Geometric Flows

On Generalizations of Some Distance Based Classifiers for HDLSS Data

Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning

CLUSTSEG: Clustering for Universal Segmentation

Bi-directional Masks for Efficient N:M Sparse Training

Composer: Creative and Controllable Image Synthesis with Composable Conditions

Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning

Semiparametrically Efficient Off-Policy Evaluation in Linear Markov Decision Processes

Enabling First-Order Gradient-Based Learning for Equilibrium Computation in Markets

Non-autoregressive Conditional Diffusion Models for Time Series Prediction

On the Power of Foundation Models

Neural Diffusion Processes

Building Neural Networks on Matrix Manifolds: A Gyrovector Space Approach

Contrastive Learning Meets Homophily: Two Birds with One Stone

Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments

On the Generalization of Multi-modal Contrastive Learning

Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR

ContraBAR: Contrastive Bayes-Adaptive Deep RL

Are Diffusion Models Vulnerable to Membership Inference Attacks?

Data Representations' Study of Latent Image Manifolds

Is Consensus Acceleration Possible in Decentralized Optimization over Slowly Time-Varying Networks?

Benign Overfitting in Two-layer ReLU Convolutional Neural Networks

Attention-Based Recurrence for Multi-Agent Reinforcement Learning under Stochastic Partial Observability

Second-order regression models exhibit progressive sharpening to the edge of stability

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon

Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning

Guiding Pretraining in Reinforcement Learning with Large Language Models

A Mathematical Model for Curriculum Learning for Parities

Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees

Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy

MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation

NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation

Learning Representations without Compositional Assumptions

Fair and Accurate Decision Making through Group-Aware Learning

CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks

Adversarial Learning of Distributional Reinforcement Learning

Online Local Differential Private Quantile Inference via Self-normalization

Horizon-free Learning for Markov Decision Processes and Games: Stochastically Bounded Rewards and Improved Bounds

Approximate Stein Classes for Truncated Density Estimation

SinDDM: A Single Image Denoising Diffusion Model

Robustly Learning a Single Neuron via Sharpness

Neural Markov Jump Processes

A Model-Based Method for Minimizing CVaR and Beyond

Revisiting Structured Variational Autoencoders

Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning

Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability