ICML
2024 Papers
Position: Towards Unified Alignment Between Agents, Humans, and Environment
Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
Fair Off-Policy Learning from Observational Data
Consistent Submodular Maximization
Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering
Automated Statistical Model Discovery with Language Models
Model-based Reinforcement Learning for Confounded POMDPs
Position: A Call for Embodied AI
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
WARM: On the Benefits of Weight Averaged Reward Models
MusicRL: Aligning Music Generation to Human Preferences
Nash Learning from Human Feedback
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
Kernel-Based Evaluation of Conditional Biological Sequence Models
TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge
Learning Associative Memories with Gradient Descent
InfoNet: Neural Estimation of Mutual Information without Test-Time Optimization
Calibration Bottleneck: Over-compressed Representations are Less Calibratable
Discovering Environments with XRM
Batch and match: black-box variational inference with a score-based divergence
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Combinatorial Approximations for Cluster Deletion: Simpler, Faster, and Better
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts
Hybrid$^2$ Neural ODE Causal Modeling and an Application to Glycemic Response
Stereographic Spherical Sliced Wasserstein Distances
Distribution Alignment Optimization through Neural Collapse for Long-tailed Classification
How Learning by Reconstruction Produces Uninformative Features For Perception
Effects of Exponential Gaussian Distribution on (Double Sampling) Randomized Smoothing
Sparse Inducing Points in Deep Gaussian Processes: Enhancing Modeling with Denoising Diffusion Variational Inference
Fool Your (Vision and) Language Model with Embarrassingly Simple Permutations
DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners
Balanced Data, Imbalanced Spectra: Unveiling Class Disparities with Spectral Imbalance
The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective
Active Preference Learning for Large Language Models
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay
Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts
Incorporating Information into Shapley Values: Reweighting via a Maximum Entropy Approach
Position: Why We Must Rethink Empirical Research in Machine Learning
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
Q-value Regularized Transformer for Offline Reinforcement Learning
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization
Boundary Exploration for Bayesian Optimization With Unknown Physical Constraints
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Peeking with PEAK: Sequential, Nonparametric Composite Hypothesis Tests for Means of Multiple Data Streams
From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation
Multi-group Learning for Hierarchical Groups
Fast Decision Boundary based Out-of-Distribution Detector
Ensemble Pruning for Out-of-distribution Generalization
Diffuse, Sample, Project: Plug-And-Play Controllable Graph Generation
On the Asymptotic Distribution of the Minimum Empirical Risk
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Exploiting Code Symmetries for Learning Program Semantics
Identification and Estimation for Nonignorable Missing Data: A Data Fusion Approach
Encodings for Prediction-based Neural Architecture Search
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Towards Compositionality in Concept Learning
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision
Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics
Behavior Generation with Latent Actions
Measures of diversity and space-filling designs for categorical data
Pre-Training Protein Bi-level Representation Through Span Mask Strategy On 3D Protein Chains
QuRating: Selecting High-Quality Data for Training Language Models
On the Maximal Local Disparity of Fairness-Aware Classifiers
Conditional Common Entropy for Instrumental Variable Testing and Partial Identification
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
Benchmarking Deletion Metrics with the Principled Explanations
Online Matrix Completion: A Collaborative Approach with Hott Items
Scaling Laws for Fine-Grained Mixture of Experts
Benign Overfitting in Adversarial Training of Neural Networks
Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee
Sample as you Infer: Predictive Coding with Langevin Dynamics
NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction
Efficient PAC Learnability of Dynamical Systems Over Multilayer Networks
How Language Model Hallucinations Can Snowball
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
Extracting Training Data From Document-Based VQA Models
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data
Prediction Accuracy of Learning in Games : Follow-the-Regularized-Leader meets Heisenberg
PARCv2: Physics-aware Recurrent Convolutional Neural Networks for Spatiotemporal Dynamics Modeling
Unsupervised Concept Discovery Mitigates Spurious Correlations
Scalable AI Safety via Doubly-Efficient Debate
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning
Model Assessment and Selection under Temporal Distribution Shift
The Fundamental Limits of Least-Privilege Learning
Sequential Disentanglement by Extracting Static Information From A Single Sequence Element
Minimizing $f$-Divergences by Interpolating Velocity Fields
Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models
Bridging Mini-Batch and Asymptotic Analysis in Contrastive Learning: From InfoNCE to Kernel-Based Losses
Subgoal-based Demonstration Learning for Formal Theorem Proving
Vector Quantization Pretraining for EEG Time Series with Random Projection and Phase Alignment
Emergence of In-Context Reinforcement Learning from Noise Distillation
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Robust Universal Adversarial Perturbations
Low-Cost High-Power Membership Inference Attacks
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Early Time Classification with Accumulated Accuracy Gap Control
Evaluating Instrument Validity using the Principle of Independent Mechanisms
Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding
CogBench: a large language model walks into a psychology lab
Human-like Category Learning by Injecting Ecological Priors from Large Language Models into Neural Networks
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data
Optimally Improving Cooperative Learning in a Social Setting
$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Simple linear attention language models balance the recall-throughput tradeoff
Principled Preferential Bayesian Optimization
SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning
On Positivity Condition for Causal Inference
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Linguistic Calibration of Long-Form Generations
A Unified Framework for Learning with Nonlinear Model Classes from Arbitrary Linear Samples
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
The Expressive Power of Path-Based Graph Neural Networks
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Environment Design for Inverse Reinforcement Learning
Stable Differentiable Causal Discovery
Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Noise-Aware Algorithm for Heterogeneous Differentially Private Federated Learning
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Adversarial Attacks on Combinatorial Multi-Armed Bandits
Practical Hamiltonian Monte Carlo on Riemannian Manifolds via Relativity Theory
A Dynamic Algorithm for Weighted Submodular Cover Problem
On The Fairness Impacts of Hardware Selection in Machine Learning
Time-Series Forecasting for Out-of-Distribution Generalization Using Invariant Learning
Graph Attention Retrospective
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
From Coarse to Fine: Enable Comprehensive Graph Self-supervised Learning with Multi-granular Semantic Ensemble
Parameterized Physics-informed Neural Networks for Parameterized PDEs
Sparser, Better, Deeper, Stronger: Improving Static Sparse Training with Exact Orthogonal Initialization
Safe Exploration in Dose Finding Clinical Trials with Heterogeneous Participants
TIC-TAC: A Framework For Improved Covariance Estimation In Deep Heteroscedastic Regression
Challenges in Training PINNs: A Loss Landscape Perspective
High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise
Position: The Causal Revolution Needs Scientific Pragmatism
A Unified Recipe for Deriving (Time-Uniform) PAC-Bayes Bounds
RankSEG: A Consistent Ranking-based Framework for Segmentation
PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency
Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models
Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation
Conformal Validity Guarantees Exist for Any Data Distribution (and How to Find Them)
Learning to Continually Learn with the Bayesian Principle
Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
Repoformer: Selective Retrieval for Repository-Level Code Completion
Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach
On the Generalization of Stochastic Gradient Descent with Momentum
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
A Geometric Explanation of the Likelihood OOD Detection Paradox
Leveraging Attractor Dynamics in Spatial Navigation for Better Language Parsing
Disparate Impact on Group Accuracy of Linearization for Private Inference
SqueezeLLM: Dense-and-Sparse Quantization
Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization
An LLM Compiler for Parallel Function Calling
Unbiased Multi-Label Learning from Crowdsourced Annotations
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
Position: Evolving AI Collectives Enhance Human Diversity and Enable Self-Regulation
Revisiting Context Aggregation for Image Matting
Learning Scale-Aware Spatio-temporal Implicit Representation for Event-based Motion Deblurring
Mind the Boundary: Coreset Selection via Reconstructing the Decision Boundary
Position: Standardization of Behavioral Use Clauses is Necessary for the Adoption of Responsible Licensing of AI
Infinite-Horizon Distributionally Robust Regret-Optimal Control
Accelerating Heterogeneous Federated Learning with Closed-form Classifiers
Irregular Multivariate Time Series Forecasting: A Transformable Patching Graph Neural Networks Approach
Simplicity Bias via Global Convergence of Sharpness Minimization
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition
An Intrinsic Vector Heat Network
Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Neural-Kernel Conditional Mean Embeddings
Stochastic Quantum Sampling for Non-Logconcave Distributions and Estimating Partition Functions
Enabling Uncertainty Estimation in Iterative Neural Networks
MultiMax: Sparse and Multi-Modal Attention Learning
Improving Neural Additive Models with Bayesian Principles
Averaging $n$-step Returns Reduces Variance in Reinforcement Learning
Implicit meta-learning may lead language models to trust more reliable sources
Multi-Track Message Passing: Tackling Oversmoothing and Oversquashing in Graph Learning via Preventing Heterophily Mixing
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration
Prometheus: Out-of-distribution Fluid Dynamics Modeling with Disentangled Graph ODE
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
Total Variation Distance Meets Probabilistic Inference
Improving Adversarial Energy-Based Model via Diffusion Process
A3S: A General Active Clustering Method with Pairwise Constraints
An Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Online Cascade Learning for Efficient Inference over Streams
Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning
Learning from Streaming Data when Users Choose
Scaling Speech Technology to 1,000+ Languages
Robustness of Nonlinear Representation Learning
Conformal Prediction with Learned Features
Symmetry Induces Structure and Constraint of Learning
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
T-Cal: An Optimal Test for the Calibration of Predictive Models
Watermark Stealing in Large Language Models
Prior Specification for Bayesian Matrix Factorization via Prior Predictive Matching
Fair Data Representation for Machine Learning at the Pareto Frontier
A Dynamical Model of Neural Scaling Laws
Adaptive Learning of Density Ratios in RKHS
Adaptively Perturbed Mirror Descent for Learning in Games
Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
FrameQuant: Flexible Low-Bit Quantization for Transformers
BeigeMaps: Behavioral Eigenmaps for Reinforcement Learning from Images
Optimal Coresets for Low-Dimensional Geometric Median
Learning to Play Atari in a World of Tokens
Probabilistic Generating Circuits - Demystified
The Non-linear $F$-Design and Applications to Interactive Learning
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
Distinguishing the Knowable from the Unknowable with Language Models
How to Escape Sharp Minima with Random Perturbations
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Not all distributional shifts are equal: Fine-grained robust conformal inference
Triple Changes Estimator for Targeted Policies
Learning Mixtures of Gaussian Processes through Random Projection
Nonlinear Filtering with Brenier Optimal Transport Maps
Revisiting Inexact Fixed-Point Iterations for Min-Max Problems: Stochasticity and Structured Nonconvexity
Gaussian Processes on Cellular Complexes
No Dimensional Sampling Coresets for Classification
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates
The Privacy Power of Correlated Noise in Decentralized Learning
Robust and Conjugate Gaussian Process Regression
Position: Stop Making Unscientific AGI Performance Claims
Hyperbolic Optimizer as a Dynamical System
Stationarity without mean reversion in improper Gaussian processes
Robust Graph Matching when Nodes are Corrupt
Fast Algorithms for Hypergraph PageRank with Applications to Semi-Supervised Learning
Scalable Online Exploration via Coverability
Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing
Online conformal prediction with decaying step sizes
A Rate-Distortion View of Uncertainty Quantification
Practical Performance Guarantees for Pipelined DNN Inference
Causal Action Influence Aware Counterfactual Data Augmentation
An amortized approach to non-linear mixed-effects modeling based on neural posterior estimation
Learning the Target Network in Function Space
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
Delaunay Graph: Addressing Over-Squashing and Over-Smoothing Using Delaunay Triangulation
How Free is Parameter-Free Stochastic Optimization?
Random features models: a way to study the success of naive imputation
Bipartite Matching in Massive Graphs: A Tight Analysis of EDCS
Simulation of Graph Algorithms with Looped Transformers
Diffusion Models Demand Contrastive Guidance for Adversarial Purification to Advance
On the Complexity of Finite-Sum Smooth Optimization under the Polyak–Łojasiewicz Condition
Constrained Ensemble Exploration for Unsupervised Skill Discovery
Memory Consolidation Enables Long-Context Video Understanding
On the Identifiability of Switching Dynamical Systems
Analyzing $D^\alpha$ seeding for $k$-means
Relational DNN Verification With Cross Executional Bound Refinement
VNN: Verification-Friendly Neural Networks with Hard Robustness Guarantees
Scale-Free Image Keypoints Using Differentiable Persistent Homology
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
Neural Diffusion Models
Generalization in Kernel Regression Under Realistic Assumptions
On Mechanistic Knowledge Localization in Text-to-Image Generative Models
Monotone Individual Fairness
Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions
Standardized Interpretable Fairness Measures for Continuous Risk Scores
Neural Networks Learn Statistics of Increasing Complexity
The Role of Learning Algorithms in Collective Action
CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling of parameterized partial differential equations
By Tying Embeddings You Are Assuming the Distributional Hypothesis
Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning
Refining Minimax Regret for Unsupervised Environment Design
Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
Position: Scaling Simulation is Neither Necessary Nor Sufficient for In-the-Wild Robot Manipulation
Why do Variational Autoencoders Really Promote Disentanglement?
Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization
Multi-Patch Prediction: Adapting Language Models for Time Series Representation Learning
Naive Bayes Classifiers over Missing Data: Decision and Poisoning
Improving fine-grained understanding in image-text pre-training
Position: Explain to Question not to Justify
Biharmonic Distance of Graphs and its Higher-Order Variants: Theoretical Properties with Applications to Centrality and Clustering
Stability Evaluation through Distributional Perturbation Analysis
Dynamic Survival Analysis with Controlled Latent States
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling
Shifted Interpolation for Differential Privacy
How Spurious Features are Memorized: Precise Analysis for Random and NTK Features
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Position: Machine Learning-powered Assessments of the EU Digital Services Act Aid Quantify Policy Impacts on Online Harms
Random matrix theory improved Fréchet mean of symmetric positive definite matrices
On dimensionality of feature vectors in MPNNs
Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees
Applying language models to algebraic topology: generating simplicial cycles using multi-labeling in Wu's formula
Private Gradient Descent for Linear Regression: Tighter Error Bounds and Instance-Specific Uncertainty Estimation
Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples
Tackling Prevalent Conditions in Unsupervised Combinatorial Optimization: Cardinality, Minimum, Covering, and More
Differentially Private Bias-Term Fine-tuning of Foundation Models
Langevin Policy for Safe Reinforcement Learning
Semantically-correlated memories in a dense associative model
How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation
Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone Inclusion
Sample-specific Masks for Visual Reprogramming-based Prompting
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design
Successor Features for Efficient Multi-Subject Controlled Text Generation
Limited Preference Aided Imitation Learning from Imperfect Demonstrations
Predictive Dynamic Fusion
Can a Few Decide for Many? The Metric Distortion of Sortition
AI Alignment with Changing and Influenceable Reward Functions
Online Learning under Budget and ROI Constraints via Weak Adaptivity
On the Implicit Bias of Adam
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
Simple Ingredients for Offline Reinforcement Learning
Auditing Private Prediction
Scribble-Supervised Semantic Segmentation with Prototype-based Feature Augmentation
Feature Importance Disparities for Data Bias Investigations
Inferring Dynamic Networks from Marginals with Iterative Proportional Fitting
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
How Interpretable Are Interpretable Graph Neural Networks?
Doubly Robust Causal Effect Estimation under Networked Interference via Targeted Learning
Feature Attribution with Necessity and Sufficiency via Dual-stage Perturbation Test for Causal Explanation
InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
MaSS: Multi-attribute Selective Suppression for Utility-preserving Data Transformation from an Information-theoretic Perspective
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Robust Classification via a Single Diffusion Model
Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective
Towards AutoAI: Optimizing a Machine Learning System with Black-box and Differentiable Components
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding
Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise
Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces
Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy
Offline Transition Modeling via Contrastive Energy Learning
Efficient Pareto Manifold Learning with Low-Rank Structure
Identifiability Matters: Revealing the Hidden Recoverable Condition in Unbiased Learning to Rank
High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization
DiJiang: Efficient Large Language Models through Compact Kernelization
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
GRATH: Gradual Self-Truthifying for Large Language Models
Performative Prediction with Bandit Feedback: Learning through Reparameterization
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
Recovering Labels from Local Updates in Federated Learning
Locally Differentially Private Decentralized Stochastic Bilevel Optimization with Guaranteed Convergence Accuracy
Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments
A General Framework for Learning from Weak Supervision
Diffusion Model-Augmented Behavioral Cloning
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
DRCT: Diffusion Reconstruction Contrastive Training towards Universal Detection of Diffusion Generated Images
FedMBridge: Bridgeable Multimodal Federated Learning
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Diffusive Gibbs Sampling
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation
LLaGA: Large Language and Graph Assistant
Compact Optimality Verification for Optimization Proxies
Enhancing Implicit Shape Generators Using Topological Regularizations
Stacking Deep Set Networks and Pooling by Quantiles
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks
GaussianPro: 3D Gaussian Splatting with Progressive Propagation
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Kernel Semi-Implicit Variational Inference
Creative Text-to-Audio Generation via Synthesizer Programming
Leveraging (Biased) Information: Multi-armed Bandits with Offline Data
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning
MS-TIP: Imputation Aware Pedestrian Trajectory Prediction
Enhancing Trajectory Prediction through Self-Supervised Waypoint Distortion Prediction
How Flawed Is ECE? An Analysis via Logit Smoothing
Kernel Debiased Plug-in Estimation: Simultaneous, Automated Debiasing without Influence Functions for Many Target Parameters
Hard Tasks First: Multi-Task Reinforcement Learning Through Task Scheduling
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
Neurodegenerative Brain Network Classification via Adaptive Diffusion with Temporal Regularization
Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning
Online bipartite matching with imperfect advice
A connection between Tempering and Entropic Mirror Descent
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
How Private are DP-SGD Implementations?
Prompt-tuning Latent Diffusion Models for Inverse Problems
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
$\mathtt{VITS}$ : Variational Inference Thompson Sampling for contextual bandits
Improving Token-Based World Models with Parallel Observation Prediction
Multi-View Stochastic Block Models
Weighted distance nearest neighbor condensing
A Near-Linear Time Approximation Algorithm for Beyond-Worst-Case Graph Clustering
Dynamic Correlation Clustering in Sublinear Update Time
A2Q+: Improving Accumulator-Aware Weight Quantization
Statistical Inference Under Constrained Selection Bias
Conformal Prediction Sets Improve Human Decision Making
Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
Harmonizing Generalization and Personalization in Federated Prompt Learning
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
Position: Beyond Personhood: Agency, Accountability, and the Limits of Anthropomorphic Ethical Analysis
Multi-View Clustering by Inter-cluster Connectivity Guided Reward
High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model
Boosting Offline Optimizers with Surrogate Sensitivity
Test-Time Degradation Adaptation for Open-Set Image Restoration
A decoder-only foundation model for time-series forecasting
New Bounds on the Cohesion of Complete-link and Other Linkage Methods for Agglomerative Clustering
Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
Global Reinforcement Learning : Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods
Provably Better Explanations with Optimized Aggregation of Feature Attributions
Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments
Asymptotically Optimal and Computationally Efficient Average Treatment Effect Estimation in A/B testing
Predicting Lagrangian Multipliers for Mixed Integer Linear Programs
Prediction-powered Generalization of Causal Inferences
An Unsupervised Approach for Periodic Source Detection in Time Series
Collaborative Learning with Different Labeling Functions
Exploring the Low-Pass Filtering Behavior in Image Super-Resolution
Network Tight Community Detection
Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations
Multicalibration for Confidence Scoring in LLMs
Contextualized Policy Recovery: Modeling and Interpreting Medical Decisions with Adaptive Imitation Learning
Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies
Trust Regions for Explanations via Black-Box Probabilistic Certification
Double Stochasticity Gazes Faster: Snap-Shot Decentralized Stochastic Gradient Tracking Methods
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
Robust Sparse Estimation for Gaussians with Optimal Error under Huber Contamination
Fast Co-Training under Weak Dependence via Stream-Based Active Learning
Convex and Bilevel Optimization for Neural-Symbolic Inference and Learning
Efficient Algorithms for Sum-Of-Minimum Optimization
Robust Stable Spiking Neural Networks
Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds
Consistent Adversarially Robust Linear Classification: Non-Parametric Setting
Precise Accuracy / Robustness Tradeoffs in Regression: Case of General Norms
Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions
Impact of Decentralized Learning on Player Utilities in Stackelberg Games
Towards Generalization beyond Pointwise Learning: A Unified Information-theoretic Perspective
Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models
Position: Building Guardrails for Large Language Models Requires Systematic Design
Accelerating PDE Data Generation via Differential Operator Action in Solution Space
TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Privacy-Preserving Data Release Leveraging Optimal Transport and Particle Gradient Descent
Spike Distance Function as a Learning Objective for Spike Prediction
On the Universality of Volume-Preserving and Coupling-Based Normalizing Flows
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
Principled Gradient-Based MCMC for Conditional Sampling of Text
Position: Compositional Generative Modeling: A Single Model is Not All You Need
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Learning Iterative Reasoning through Energy Diffusion
When and How Does In-Distribution Label Help Out-of-Distribution Detection?
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
MF-CLR: Multi-Frequency Contrastive Learning Representation for Time Series
DE-COP: Detecting Copyrighted Content in Language Models Training Data
Unveiling the Potential of AI for Nanomaterial Morphology Prediction
Sharpness-Aware Data Generation for Zero-shot Quantization
Making Old Things New: A Unified Algorithm for Differentially Private Clustering
Generalization Bounds for Heavy-Tailed SDEs through the Fractional Fokker-Planck Equation
Outlier-robust Kalman Filtering through Generalised Bayes
Barrier Algorithms for Constrained Non-Convex Optimization
Equivariant Frames and the Impossibility of Continuous Canonicalization
Position: Insights from Survey Methodology can Improve Training Data
Efficient Error Certification for Physics-Informed Neural Networks
Scalable Pre-training of Large Autoregressive Image Models
TSLANet: Rethinking Transformers for Time Series Representation Learning
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning
Approximate Nearest Neighbor Search with Window Filters
DsDm: Model-Aware Dataset Selection with Datamodels
Compositional Curvature Bounds for Deep Neural Networks
PAC-Bayesian Error Bound, via Rényi Divergence, for a Class of Linear Time-Invariant State-Space Models
Model Alignment as Prospect Theoretic Optimization
Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift
Revisit the Essence of Distilling Knowledge through Calibration
DOGE: Domain Reweighting with Generalization Estimation
Bayesian Knowledge Distillation: A Bayesian Perspective of Distillation with Uncertainty Quantification
Exploring Correlations of Self-Supervised Tasks for Graphs
INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
From Geometry to Causality- Ricci Curvature and the Reliability of Causal Inference on Networks
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree
Fast White-Box Adversarial Streaming Without a Random Oracle
DSD-DA: Distillation-based Source Debiasing for Domain Adaptive Object Detection
Keypoint-based Progressive Chain-of-Thought Distillation for LLMs
UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning
Sliced-Wasserstein Estimation with Spherical Harmonics as Control Variates
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
Reservoir Computing for Short High-Dimensional Time Series: an Application to SARS-CoV-2 Hospitalization Forecast
Position: Relational Deep Learning - Graph Representation Learning on Relational Databases
Critical feature learning in deep neural networks
Neuroexplicit Diffusion Models for Inpainting of Optical Flow Fields
Inverse-Variance Weighting for Estimation of Heterogeneous Treatment Effects
Explaining Probabilistic Models with Distributional Values
Hyperbolic Active Learning for Semantic Segmentation under Domain Shift
Weisfeiler-Leman at the margin: When more expressivity matters
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
Trust the Model Where It Trusts Itself - Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Trustworthy Actionable Perturbations
Interpretability Illusions in the Generalization of Simplified Models
Hyperbolic Geometric Latent Diffusion Model for Graph Generation
Language-guided Skill Learning with Temporal Variational Inference
PinNet: Pinpoint Instructive Information for Retrieval Augmented Code-to-Text Generation
Towards Theoretical Understandings of Self-Consuming Generative Models
Positive Concave Deep Equilibrium Models
Let Go of Your Labels with Unsupervised Transfer
Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning
Reflective Policy Optimization
Testing the Feasibility of Linear Programs with Bandit Feedback
A Doubly Recursive Stochastic Compositional Gradient Descent Method for Federated Multi-Level Compositional Optimization
Stochastic Weakly Convex Optimization beyond Lipschitz Continuity
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
Multi-Agent Reinforcement Learning Meets Leaf Sequencing in Radiotherapy
DMTG: One-Shot Differentiable Multi-Task Grouping
Rethinking Specificity in SBDD: Leveraging Delta Score and Energy-Guided Diffusion
Non-convex Stochastic Composite Optimization with Polyak Momentum
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations
Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods
Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation
Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model
Projection-Free Online Convex Optimization with Time-Varying Constraints
LLark: A Multimodal Instruction-Following Language Model for Music
Position: Categorical Deep Learning is an Algebraic Theory of All Architectures
Safe and Robust Subgame Exploitation in Imperfect Information Games
Don't trust your eyes: on the (un)reliability of feature visualizations
Learning with 3D rotations, a hitchhiker's guide to SO(3)
Graph-Triggered Rising Bandits
Reinforcement Learning within Tree Search for Fast Macro Placement
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Individualized Privacy Accounting via Subsampling with Applications in Combinatorial Optimization
State-Constrained Zero-Sum Differential Games with One-Sided Information
Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms
Optimal Eye Surgeon: Finding image priors through sparse generators at initialization
Self-Correcting Self-Consuming Loops for Generative Model Training
Position: The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
Does Label Smoothing Help Deep Partial Label Learning?
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
Evolution-Inspired Loss Functions for Protein Representation Learning
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
Long Range Propagation on Continuous-Time Dynamic Graphs
Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates
Fine-grained Classes and How to Find Them
AI Control: Improving Safety Despite Intentional Subversion
Scaling Down Deep Learning with MNIST-1D
A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models
EDISON: Enhanced Dictionary-Induced Tensorized Incomplete Multi-View Clustering with Gaussian Error Rank Minimization
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Predictive Performance Comparison of Decision Policies Under Confounding
On the Diminishing Returns of Width for Continual Learning
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
Collaborative Heterogeneous Causal Inference Beyond Meta-analysis
ACM-MILP: Adaptive Constraint Modification via Grouping and Selection for Hardness-Preserving MILP Instance Generation
FedRC: Tackling Diverse Distribution Shifts Challenge in Federated Learning by Robust Clustering
Compressing Large Language Models by Joint Sparsification and Quantization
Automated Loss function Search for Class-imbalanced Node Classification
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations
Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
Pursuing Overall Welfare in Federated Learning through Sequential Decision Making
Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
Riemannian coordinate descent algorithms on matrix manifolds
Prototypical Transformer As Unified Motion Learners
SIN: Selective and Interpretable Normalization for Long-Term Time Series Forecasting
Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning
Binary Decomposition: A Problem Transformation Perspective for Open-Set Semi-Supervised Learning
Data-efficient Large Vision Models through Sequential Autoregression
MGit: A Model Versioning and Management System
DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training
Convergence Guarantees for the DeepWalk Embedding on Block Models
Estimating the Permanent by Nesting Importance Sampling
Position: $C^*$-Algebraic Machine Learning $-$ Moving in a New Direction
Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
MAGNOLIA: Matching Algorithms via GNNs for Online Value-to-go Approximation
LoRA+: Efficient Low Rank Adaptation of Large Models
Deep Neural Room Acoustics Primitive
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
ReDiffuser: Reliable Decision-Making Using a Diffuser with Confidence Estimation
Domain-wise Data Acquisition to Improve Performance under Distribution Shift
Quantum Algorithm for Online Exp-concave Optimization
Riemannian Accelerated Zeroth-order Algorithm: Improved Robustness and Lower Query Complexity
Ambiguity-Aware Abductive Learning
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised Learning
Robust Multi-Task Learning with Excess Risks
DynSyn: Dynamical Synergistic Representation for Efficient Learning and Control in Overactuated Embodied Systems
Learning Useful Representations of Recurrent Neural Network Weight Matrices
Randomized Confidence Bounds for Stochastic Partial Monitoring
Understanding Diffusion Models by Feynman's Path Integral
Learning Surrogates for Offline Black-Box Optimization via Gradient Matching
Estimating Unknown Population Sizes Using the Hypergeometric Distribution
Two Tales of Single-Phase Contrastive Hebbian Learning
Verifying message-passing neural networks via topology-based bounds tightening
Criterion Collapse and Loss Distribution Control
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Enhancing Sufficient Dimension Reduction via Hellinger Correlation
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs
Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates
Graph Neural PDE Solvers with Conservation and Similarity-Equivariance
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Equilibrium of Data Markets with Externality
Multi-Sender Persuasion: A Computational Perspective
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency
PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs
Loss Shaping Constraints for Long-Term Time Series Forecasting
Careful with that Scalpel: Improving Gradient Surgery with an EMA
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning
Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning
An Information Theoretic Approach to Interaction-Grounded Learning
SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code
Pseudo-Calibration: Improving Predictive Uncertainty Estimation in Unsupervised Domain Adaptation
Improving Interpretation Faithfulness for Vision Transformers
Multigroup Robustness
Provable Privacy with Non-Private Pre-Processing
Case-Based or Rule-Based: How Do Transformers Do the Math?
Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications
High-Performance Temporal Reversible Spiking Neural Networks with $\mathcal{O}(L)$ Training Memory and $\mathcal{O}(1)$ Inference Cost
Accelerating Transformer Pre-training with 2:4 Sparsity
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
ReconBoost: Boosting Can Achieve Modality Reconcilement
Auctionformer: A Unified Deep Learning Algorithm for Solving Equilibrium Strategies in Auction Games
In-context Convergence of Transformers
Near-Linear Time Approximation Algorithms for k-means with Outliers
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
InstructSpeech: Following Speech Editing Instructions via Large Language Models
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks
Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
An Empirical Examination of Balancing Strategy for Counterfactual Estimation on Time Series
MFTN: A Multi-scale Feature Transfer Network Based on IMatchFormer for Hyperspectral Image Super-Resolution
On Which Nodes Does GCN Fail? Enhancing GCN From the Node Perspective
Quasi-Monte Carlo Features for Kernel Approximation
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
How Universal Polynomial Bases Enhance Spectral Graph Neural Networks: Heterophily, Over-smoothing, and Over-squashing
Interaction-based Retrieval-augmented Diffusion Models for Protein-specific 3D Molecule Generation
Triadic-OCD: Asynchronous Online Change Detection with Provable Robustness, Optimality, and Convergence
An Embodied Generalist Agent in 3D World
Faster Adaptive Decentralized Learning Algorithms
Faster Sampling via Stochastic Gradient Proximal Sampler
Position: The Platonic Representation Hypothesis
Nash Incentive-compatible Online Mechanism Learning via Weakly Differentially Private Online Learning
Make-A-Shape: a Ten-Million-scale 3D Shape Model
Residual Quantization with Implicit Neural Codebooks
Theoretical Guarantees for Variational Inference with Fixed-Variance Mixture of Gaussians
Vanilla Bayesian Optimization Performs Great in High Dimensions
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Smooth Min-Max Monotonic Networks
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Understanding the Learning Dynamics of Alignment with Human Feedback
Zero-Shot Reinforcement Learning via Function Encoders
PASOA- PArticle baSed Bayesian Optimal Adaptive design
Attribution-based Explanations that Provide Recourse Cannot be Robust
Learning to Reach Goals via Diffusion
Online Non-stochastic Control with Partial Feedback
Rethinking DP-SGD in Discrete Domain: Exploring Logistic Distribution in the Realm of signSGD
An Independence-promoting Loss for Music Generation with Language Models
Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method
Repeat After Me: Transformers are Better than State Space Models at Copying
Finite Volume Features, Global Geometry Representations, and Residual Training for Deep Learning-based CFD Simulation
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Towards Efficient Exact Optimization of Language Model Alignment
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Discrete Latent Perspective Learning for Segmentation and Detection
Simulation-Based Inference with Quantile Regression
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Chain-of-Thought Predictive Control
NDOT: Neuronal Dynamics-based Online Training for Spiking Neural Networks
On the Origins of Linear Representations in Large Language Models
Federated Optimization with Doubly Regularized Drift Correction
Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization
Generalized Neural Collapse for a Large Number of Classes
SuDA: Support-based Domain Adaptation for Sim2Real Hinge Joint Tracking with Flexible Sensors
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
Homomorphism Counts for Graph Neural Networks: All About That Basis
What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning
Language Models as Semantic Indexers
Position: What Can Large Language Models Tell Us about Time Series Analysis
FedSC: Provable Federated Self-supervised Learning with Spectral Contrastive Objective over Non-i.i.d. Data
Graph Generation with Diffusion Mixture
Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs
IW-GAE: Importance weighted group accuracy estimation for improved calibration and model selection in unsupervised domain adaptation
Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks
Position: Benchmarking is Limited in Reinforcement Learning Research
Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods?
Unsupervised Episode Generation for Graph Meta-learning
Beyond the Calibration Point: Mechanism Comparison in Differential Privacy
Replicable Learning of Large-Margin Halfspaces
Tell, Don't Show: Language Guidance Eases Transfer Across Domains in Images and Videos
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models
Think Before You Act: Decision Transformers with Working Memory
Certifiably Byzantine-Robust Federated Conformal Prediction
Neural Tangent Kernels for Axis-Aligned Tree Ensembles
On the Generalization of Equivariant Graph Neural Networks
Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions
Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choice
Pluvial Flood Emulation with Hydraulics-informed Message Passing
Accelerating Convergence in Bayesian Few-Shot Classification
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
DUPLEX: Dual GAT for Complex Embedding of Directed Graphs
A Universal Transfer Theorem for Convex Optimization Algorithms Using Inexact First-order Oracles
Fair Classification with Partial Feedback: An Exploration-Based Data Collection Approach
Neural Tangent Kernels Motivate Cross-Covariance Graphs in Neural Networks
Breaking through the learning plateaus of in-context learning in Transformer
Tuning-Free Stochastic Optimization
Off-policy Evaluation Beyond Overlap: Sharp Partial Identification Under Smoothness
Can Machines Learn the True Probabilities?
Gaussian Plane-Wave Neural Operator for Electron Density Estimation
LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging
CARTE: Pretraining and Transfer for Tabular Learning
Achieving Lossless Gradient Sparsification via Mapping to Alternative Space in Federated Learning
Active Label Correction for Semantic Segmentation with Foundation Models
ODIM: Outlier Detection via Likelihood of Under-Fitted Generative Models
Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs
Learning to Explore for Stochastic Gradient MCMC
Improving Robustness to Multiple Spurious Correlations by Multi-Objective Optimization
Clustered Federated Learning via Gradient-based Partitioning
Demystifying SGD with Doubly Stochastic Gradients
Attribute Based Interpretable Evaluation Metrics for Generative Models
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Risk-Sensitive Policy Optimization via Predictive CVaR Policy Gradient
Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning
Privacy-Preserving Embedding via Look-up Table Evaluation with Fully Homomorphic Encryption
Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time
Polynomial-based Self-Attention for Table Representation Learning
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Discovering Features with Synergistic Interactions in Multiple Views
An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network
One Size Fits All for Semantic Shifts: Adaptive Prompt Tuning for Continual Learning
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design
Universal Consistency of Wide and Deep ReLU Neural Networks and Minimax Optimal Convergence Rates for Kolmogorov-Donoho Optimal Function Classes
DistiLLM: Towards Streamlined Distillation for Large Language Models
Provably Scalable Black-Box Variational Inference with Structured Variational Families
Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth
Estimating Barycenters of Distributions with Neural Optimal Transport
AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion
On Convergence of Incremental Gradient for Non-convex Smooth Functions
Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning
The Computational Complexity of Finding Second-Order Stationary Points
convSeq: Fast and Scalable Method for Detecting Patterns in Spike Data
A General Online Algorithm for Optimizing Complex Performance Metrics
CLLMs: Consistency Large Language Models
KISA: A Unified Keyframe Identifier and Skill Annotator for Long-Horizon Robotics Demonstrations
PcLast: Discovering Plannable Continuous Latent States
Sobolev Space Regularised Pre Density Models
Geometry-Aware Instrumental Variable Regression
Understanding the Effects of Iterative Prompting on Truthfulness
Mean Estimation in the Add-Remove Model of Differential Privacy
No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
Collective Certified Robustness against Graph Injection Attacks
Privately Learning Smooth Distributions on the Hypercube by Projections
Single-Model Attribution of Generative Models Through Final-Layer Inversion
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Offline Inverse RL: New Solution Concepts and Provably Efficient Algorithms
Generalized Sobolev Transport for Probability Measures on a Graph
Robust Inverse Graphics via Probabilistic Inference
Knowledge Graphs Can be Learned with Just Intersection Features
Run-Time Task Composition with Safety Semantics
Chasing Convex Functions with Long-term Constraints
Stationary Latent Weight Inference for Unreliable Observations from Online Test-Time Adaptation
Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks
Fundamental Benefit of Alternating Updates in Minimax Optimization
DataFreeShield: Defending Adversarial Attacks without Training Data
SelMatch: Effectively Scaling Up Dataset Distillation via Selection-Based Initialization and Partial Updates by Trajectory Matching
Pausing Policy Learning in Non-stationary Reinforcement Learning
Feature Distribution on Graph Topology Mediates the Effect of Graph Convolution: Homophily Perspective
Drug Discovery with Dynamic Goal-aware Fragments
Supervised Matrix Factorization: Local Landscape Analysis and Applications
Defining Neural Network Architecture through Polytope Structures of Datasets
3D Geometric Shape Assembly via Efficient Point Cloud Matching
StrWAEs to Invariant Representations
Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains
Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
Improving Gradient-Guided Nested Sampling for Posterior Inference
Winner-takes-all learners are geometry-aware conditional density estimators
Eluder-based Regret for Stochastic Contextual MDPs
Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms
DetKDS: Knowledge Distillation Search for Object Detectors
Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation
Critical windows: non-asymptotic theory for feature emergence in diffusion models
Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition
Completing Visual Objects via Bridging Generation and Segmentation
Evolving Subnetwork Training for Large Language Models
Data Poisoning Attacks against Conformal Prediction
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
Two-sided Competing Matching Recommendation Markets With Quota and Complementary Preferences Constraints
Full-Atom Peptide Design based on Multi-modal Flow Matching
Positive and Unlabeled Learning with Controlled Probability Boundary Fence
FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning
Debiased Distribution Compression
RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning
Vague Prototype-Oriented Diffusion Model for Multi-Class Anomaly Detection
Graph Structure Extrapolation for Out-of-Distribution Generalization
Value-Evolutionary-Based Reinforcement Learning
Image Clustering with External Guidance
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Accelerating Convergence of Score-Based Diffusion Models, Provably
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
Neural Collapse in Multi-label Learning with Pick-all-label Loss
A Differentiable Partially Observable Generalized Linear Model with Forward-Backward Message Passing
Multi-Region Markovian Gaussian Process: An Efficient Method to Discover Directional Communications Across Multiple Brain Regions
A Generative Approach for Treatment Effect Estimation under Collider Bias: From an Out-of-Distribution Perspective
Learning Shadow Variable Representation for Treatment Effect Estimation under Collider Bias
Configurable Mirror Descent: Towards a Unification of Decision Making
Enhancing Class-Imbalanced Learning with Pre-Trained Guidance through Class-Conditional Knowledge Distillation
A Neural-Guided Dynamic Symbolic Network for Exploring Mathematical Expressions from Data
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Preventing Model Collapse in Gaussian Process Latent Variable Models
Measuring Stochastic Data Complexity with Boltzmann Influence Functions
Concentration Inequalities for General Functions of Heavy-Tailed Random Variables
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Learning Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
PID: Prompt-Independent Data Protection Against Latent Diffusion Models
A Contextual Combinatorial Bandit Approach to Negotiation
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Privacy Preserving Adaptive Experiment Design
Combining Experimental and Historical Data for Policy Evaluation
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models
DiffFPR: Diffusion Prior for Oversampled Fourier Phase Retrieval
Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning
Compress Clean Signal from Noisy Raw Image: A Self-Supervised Approach
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Emergent Representations of Program Semantics in Language Models Trained on Programs
OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift
Improved Bounds for Pure Private Agnostic Learning: Item-Level and User-Level Privacy
$\bf{\Phi}_\textrm{Flow}$: Differentiable Simulations for PyTorch, TensorFlow and Jax
The Good, The Bad, and Why: Unveiling Emotions in Generative AI
Two-Stage Shadow Inclusion Estimation: An IV Approach for Causal Inference under Latent Confounding and Collider Bias
FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction
Towards efficient deep spiking neural networks construction with spiking activity based pruning
FedBAT: Communication-Efficient Federated Learning via Learnable Binarization
Beyond Point Prediction: Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process
Statistical Properties of Robust Satisficing
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
KernelWarehouse: Rethinking the Design of Dynamic Convolution
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model
Learning the Uncertainty Sets of Linear Control Systems via Set Membership: A Non-asymptotic Analysis
Seesaw: Compensating for Nonlinear Reduction with Linear Computations for Private Inference
Predicting and Interpreting Energy Barriers of Metallic Glasses with Graph Neural Networks
From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems
EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search
Algorithmic Stability Unleashed: Generalization Bounds with Unbounded Losses
Receptive Fields As Experts in Convolutional Neural Architectures
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset
Graph External Attention Enhanced Transformer
Single-Trajectory Distributionally Robust Reinforcement Learning
Realistic Unsupervised CLIP Fine-tuning with Universal Entropy Optimization
On the Error-Propagation of Inexact Hotelling's Deflation for Principal Component Analysis
Bootstrapping Fisher Market Equilibrium and First-Price Pacing Equilibrium
Graph Geometry-Preserving Autoencoders
Momentum Particle Maximum Likelihood
An Effective Dynamic Gradient Calibration Method for Continual Learning
Revisiting the Role of Language Priors in Vision-Language Models
Non-confusing Generation of Customized Concepts in Diffusion Models
Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC
Robustness of Deep Learning for Accelerated MRI: Benefits of Diverse Training Data
Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
HGAP: Boosting Permutation Invariant and Permutation Equivariant in Multi-Agent Reinforcement Learning via Graph Attention Network
Parsimonious Learning-Augmented Approximations for Dense Instances of $\mathcal{NP}$-hard Problems
SparseTSF: Modeling Long-term Time Series Forecasting with *1k* Parameters
On Hypothesis Transfer Learning of Functional Linear Models
GeoAB: Towards Realistic Antibody Design and Reliable Affinity Maturation
A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency
Autonomous Sparse Mean-CVaR Portfolio Optimization
More Flexible PAC-Bayesian Meta-Learning by Learning Learning Algorithms
Graph Neural Stochastic Diffusion for Estimating Uncertainty in Node Classification
PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching
Lie Neurons: Adjoint-Equivariant Neural Networks for Semisimple Lie Algebras
Position: A Call to Action for a Human-Centered AutoML Paradigm
Beyond Regular Grids: Fourier-Based Neural Operators on Arbitrary Domains
Scaling Tractable Probabilistic Circuits: A Systems Perspective
Graph Distillation with Eigenbasis Matching
Adaptive Text Watermark for Large Language Models
Graph Adversarial Diffusion Convolution
Zeroth-Order Methods for Constrained Nonconvex Nonsmooth Stochastic Optimization
ESNet: Evolution and Succession Network for High-Resolution Salient Object Detection
Unifying Image Processing as Visual Prompting Question Answering
The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks
Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
Decoding-time Realignment of Language Models
DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation
Unlock the Cognitive Generalization of Deep Reinforcement Learning via Granular Ball Representation
PAPM: A Physics-aware Proxy Model for Process Systems
ELTA: An Enhancer against Long-Tail for Aesthetics-oriented Models
On the Feasibility of Single-Pass Full-Capacity Learning in Linear Threshold Neurons with Binary Input Vectors
Online Speculative Decoding
Tuning-free Estimation and Inference of Cumulative Distribution Function under Local Differential Privacy
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
Stereo Risk: A Continuous Modeling Approach to Stereo Matching
Multi-Source Conformal Inference Under Distribution Shift
From Generalization Analysis to Optimization Designs for State Space Models
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Convergence of Online Learning Algorithm for a Mixture of Multiple Linear Regressions
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-Loop and Hessian-Free Solution Strategy
Position: Foundation Agents as the Paradigm Shift for Decision Making
Amortized Equation Discovery in Hybrid Dynamical Systems
Floating Anchor Diffusion Model for Multi-motif Scaffolding
Class-Imbalanced Graph Learning without Class Rebalancing
Generative Marginalization Models
Federated Representation Learning in the Under-Parameterized Regime
Causal Discovery via Conditional Independence Testing with Proxy Variables
Perfect Alignment May be Poisonous to Graph Contrastive Learning
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
Symmetric Matrix Completion with ReLU Sampling
DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation
High-Probability Bound for Non-Smooth Non-Convex Stochastic Optimization with Heavy Tails
Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications
Correlation-Induced Label Prior for Semi-Supervised Multi-Label Learning
Causality Based Front-door Defense Against Backdoor Attack on Language Models
Partial Multi-View Multi-Label Classification via Semantic Invariance Learning and Prototype Modeling
Building Socially-Equitable Public Models
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Timer: Generative Pre-trained Transformers Are Large Time Series Models
On the Last-Iterate Convergence of Shuffling Gradient Methods
Pairwise Alignment Improves Graph Domain Adaptation
Neural Operators with Localized Integral and Differential Kernels
A Tensor Decomposition Perspective on Second-order RNNs
Restoring balance: principled under/oversampling of data for optimal classification
Reparameterized Importance Sampling for Robust Variational Bayesian Neural Networks
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Non-Vacuous Generalization Bounds for Large Language Models
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
Optimal Differentially Private Model Training with Public Data
How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models
HumanTOMATO: Text-aligned Whole-body Motion Generation
Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum
Position: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time
CauDiTS: Causal Disentangled Domain Adaptation of Multivariate Time Series
FiT: Flexible Vision Transformer for Diffusion Model
Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search
OxyGenerator: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Potential Based Diffusion Motion Planning
Cluster-Aware Similarity Diffusion for Instance Retrieval
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
Contamination-Resilient Anomaly Detection via Adversarial Learning on Partially-Observed Normal and Anomalous Data
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
Efficient and Effective Time-Series Forecasting with Spiking Neural Networks
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
Sampling is as easy as keeping the consistency: convergence guarantee for Consistency Models
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
Better Locally Private Sparse Estimation Given Multiple Samples Per User
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation
Neighboring Perturbations of Knowledge Editing on Large Language Models
HarmonyDream: Task Harmonization Inside World Models
High-dimensional Linear Bandits with Knapsacks
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder
A Provable Decision Rule for Out-of-Distribution Detection
Faithfulness Measurable Masked Language Models
On the Hardness of Probabilistic Neurosymbolic Learning
Split-and-Denoise: Protect large language model inference with local differential privacy
tinyBenchmarks: evaluating LLMs with fewer examples
SCoRe: Submodular Combinatorial Representation Learning
LASER: Linear Compression in Wireless Distributed Optimization
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
Auto-Regressive Next-Token Predictors are Universal Learners
Self-Composing Policies for Scalable Continual Reinforcement Learning
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Submodular framework for structured-sparse optimal transport
Large Language Models are Geographically Biased
Position: Graph Foundation Models Are Already Here
Towards General Neural Surrogate Solvers with Specialized Neural Accelerators
$H$-Consistency Guarantees for Regression
Regression with Multi-Expert Deferral
No-Regret Reinforcement Learning in Smooth MDPs
Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Convergence and Trade-Offs in Riemannian Gradient Descent and Riemannian Proximal Point
Using AI Uncertainty Quantification to Improve Human Decision-Making
On the Tractability of SHAP Explanations under Markovian Distributions
On the Consistency of Kernel Methods with Dependent Observations
Delving into Differentially Private Transformer
Deep Fusion: Efficient Network Training via Pre-trained Initializations
Roping in Uncertainty: Robustness and Regularization in Markov Games
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
O$n$ Learning Deep O($n$)-Equivariant Hyperspheres
Position: Tensor Networks are a Valuable Asset for Green AI
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification
The Illusion of State in State-Space Models
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics
Rethinking Independent Cross-Entropy Loss For Graph-Structured Data
Rethinking Momentum Knowledge Distillation in Online Continual Learning
Efficient World Models with Context-Aware Tokenization
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes
Can Implicit Bias Imply Adversarial Robustness?
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
RODEO: Robust Outlier Detection via Exposing Adaptive Out-of-Distribution Samples
Prodigy: An Expeditiously Adaptive Parameter-Free Learner
From Inverse Optimization to Feasibility to ERM
Provable Interactive Learning with Hindsight Instruction Feedback
TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors
Straight-Through Meets Sparse Recovery: the Support Exploration Algorithm
OAK: Enriching Document Representations using Auxiliary Knowledge for Extreme Classification
Language Models with Conformal Factuality Guarantees
Finding NEM-U: Explaining unsupervised representation learning through neural network generated explanation masks
Slot Abstractors: Toward Scalable Abstract Visual Reasoning
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Causal Representation Learning Made Identifiable by Grouping of Observational Variables
Position: Levels of AGI for Operationalizing Progress on the Path to AGI
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
SiBBlInGS: Similarity-driven Building-Block Inference using Graphs across States
Truly No-Regret Learning in Constrained MDPs
Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation
Turnstile $\ell_p$ leverage score sampling with applications
BAGEL: Bootstrapping Agents by Guiding Exploration with Language
Factored-Reward Bandits with Intermediate Observations
Best Arm Identification for Stochastic Rising Bandits
Test-Time Regret Minimization in Meta Reinforcement Learning
Learning in Deep Factor Graphs with Gaussian Belief Propagation
PairNet: Training with Observed Pairs to Estimate Individual Treatment Effect
Density Ratio Estimation with Doubly Strong Robustness
Equivariant Deep Weight Space Alignment
Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
On Least Square Estimation in Softmax Gating Mixture of Experts
PIDformer: Transformer Meets Control Theory
Differentially private exact recovery for stochastic block models
Novel Spectral Algorithms for the Partial Credit Model
Sliced Wasserstein with Random-Path Projecting Directions
Risk-Sensitive Reward-Free Reinforcement Learning with CVaR
How Transformers Learn Causal Structure with Gradient Descent
Understanding the Impact of Introducing Constraints at Inference Time on Generalization Error
Test-Time Model Adaptation with Only Forward Passes
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming
RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy
In value-based deep reinforcement learning, a pruned network is a good network
Mixtures of Experts Unlock Parameter Scaling for Deep RL
The Perception-Robustness Tradeoff in Deterministic Image Restoration
Linear Explanations for Individual Neurons
Adaptive Proximal Gradient Methods Are Universal Without Approximation
Fair Resource Allocation in Multi-Task Learning
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
Deep Stochastic Mechanics
Variational Linearized Laplace Approximation for Bayesian Deep Learning
Differentiable Mapper for Topological Optimization of Data Representation
Structured Chemistry Reasoning with Large Language Models
MADA: Meta-Adaptive Optimizers Through Hyper-Gradient Descent
Implicit Representations via Operator Learning
Bayesian Program Learning by Decompiling Amortized Knowledge
$S^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting
Feedback Loops With Language Models Drive In-Context Reward Hacking
Stability and Generalization for Stochastic Recursive Momentum-based Algorithms for (Strongly-)Convex One to $K$-Level Stochastic Optimizations
RMIB: Representation Matching Information Bottleneck for Matching Text Representations
Auto-Encoding Morph-Tokens for Multimodal LLM
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation
Trainable Transformer in Transformer
Position: Topological Deep Learning is the New Frontier for Relational Learning
Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI
The Max-Min Formulation of Multi-Objective Reinforcement Learning: From Theory to a Model-Free Algorithm
The Linear Representation Hypothesis and the Geometry of Large Language Models
Mean-field Chaos Diffusion Models
Foundation Policies with Hilbert Representations
SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding
BOtied: Multi-objective Bayesian optimization with tied multivariate ranks
State-Free Inference of State-Space Models: The *Transfer Function* Approach
Variational Inference with Coverage Guarantees in Simulation-Based Inference
Optimal Ridge Regularization for Out-of-Distribution Prediction
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Graph Automorphism Group Equivariant Neural Networks
BetterV: Controlled Verilog Generation with Discriminative Guidance
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
Knowledge Distillation with Auxiliary Variable
UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
UPOCR: Towards Unified Pixel-Level OCR Interface
FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance
A Subquadratic Time Algorithm for Robust Sparse Mean Estimation
Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach
The Relative Value of Prediction in Algorithmic Decision Making
Interpreting and Improving Diffusion Models from an Optimization Perspective
Mechanistic Neural Networks for Scientific Machine Learning
Bayesian Regret Minimization in Offline Bandits
Prompting a Pretrained Transformer Can Be a Universal Approximator
Transport of Algebraic Structure to Latent Embeddings
Cross-view Masked Diffusion Transformers for Person Image Synthesis
Detecting Influence Structures in Multi-Agent Reinforcement Learning
Contrasting Multiple Representations with the Multi-Marginal Matching Gap
Adaptive Conformal Inference by Betting
Mechanistic Design and Scaling of Hybrid Architectures
Robust Data-driven Prescriptiveness Optimization
Learning Multiple Secrets in Mastermind
The Entropy Enigma: Success and Failure of Entropy Minimization
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling
Learning-Efficient Yet Generalizable Collaborative Filtering for Item Recommendation
Unsupervised Domain Adaptation for Anatomical Structure Detection in Ultrasound Images
Learning to Remove Cuts in Integer Linear Programming
Conformalized Survival Distributions: A Generic Post-Process to Increase Calibration
ByMI: Byzantine Machine Identification with False Discovery Rate Control
Efficient Non-stationary Online Learning by Wavelets with Applications to Online Distribution Shift Adaptation
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
ULAREF: A Unified Label Refinement Framework for Learning with Inaccurate Supervision
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Feasible Reachable Policy Iteration
Learning High-Order Relationships of Brain Regions
To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO
Transferring Knowledge From Large Foundation Models to Small Downstream Models
Compute Better Spent: Replacing Dense Layers with Structured Matrices
MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space
Connect Later: Improving Fine-tuning for Robustness with Targeted Augmentations
Learning Constraints from Offline Demonstrations via Superior Distribution Correction Estimation
Multiply-Robust Causal Change Attribution
Decomposable Submodular Maximization in Federated Setting
Subsampling is not Magic: Why Large Batch Sizes Work for Differentially Private Stochastic Optimisation
STEER: Assessing the Economic Rationality of Large Language Models
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Position: The Reasonable Person Standard for AI
Unveiling Privacy, Memorization, and Input Curvature Links
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Fair Federated Learning via the Proportional Veto Core
Optimal Batched Linear Bandits
TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules
Rejuvenating image-GPT as Strong Visual Representation Learners
CarbonNovo: Joint Design of Protein Structure and Sequence Using a Unified Energy-based Model
Plug-and-Play image restoration with Stochastic deNOising REgularization
Implicit Regularization in Feedback Alignment Learning Mechanisms for Neural Networks
Universal Gradient Methods for Stochastic Convex Optimization
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Position: Mission Critical – Satellite Data is a Distinct Modality in Machine Learning
Invariant Risk Minimization Is A Total Variation Model
Position: Application-Driven Innovation in Machine Learning
One-Shot Strategic Classification Under Unknown Costs
Modelling Microbial Communities with Graph Neural Networks
Position: Amazing Things Come From Having Many Good Models
Generalizing Orthogonalization for Models with Non-Linearities
Rolling Diffusion Models
Second-Order Uncertainty Quantification: A Distance-Based Approach
Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency
Predictive Coding beyond Correlations
Proactive Detection of Voice Cloning with Localized Watermarking
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization
Sparse and Structured Hopfield Networks
A sampling theory perspective on activations for implicit neural representations
A fast algorithm to simulate nonlinear resistive networks
Parallel Affine Transformation Tuning of Markov Chain Monte Carlo
Incentivized Learning in Principal-Agent Bandit Games
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference
Online Learning with Bounded Recall
Asymptotics of Learning with Deep Structured (Random) Features
Simultaneous identification of models and parameters of scientific simulators
Bayesian Adaptation of Network Depth and Width for Continual Learning
Towards Scalable and Versatile Weight Space Learning
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
Lessons from Generalization Error Analysis of Federated Learning: You May Communicate Less Often!
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models
A Multimodal Automated Interpretability Agent
The Balanced-Pairwise-Affinities Feature Transform
Improved Generalization of Weight Space Networks via Augmentations
On Multi-Armed Bandit with Impatient Arms
Language Generation with Strictly Proper Scoring Rules
Learning Decision Policies with Instrumental Variables through Double Machine Learning
How Far Can Fairness Constraints Help Recover From Biased Data?
Reducing sequential change detection to sequential estimation
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
ReLUs Are Sufficient for Learning Implicit Neural Representations
Double Momentum Method for Lower-Level Constrained Bilevel Optimization
LCA-on-the-Line: Benchmarking Out of Distribution Generalization with Class Taxonomies
Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Why Larger Language Models Do In-context Learning Differently?
Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts
Statistical Test for Attention Maps in Vision Transformers
Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation
IOI: Invisible One-Iteration Adversarial Attack on No-Reference Image- and Video-Quality Metrics
InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
Embarrassingly Parallel GFlowNets
Deletion-Anticipative Data Selection with a Limited Budget
Latent variable model for high-dimensional point process with structured missingness
Domain Generalisation via Imprecise Learning
Byzantine Resilient and Fast Federated Few-Shot Learning
Parallelized Spatiotemporal Slot Binding for Videos
In-Context Reinforcement Learning for Variable Action Spaces
Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
Inexact Newton-type Methods for Optimisation with Nonnegativity Constraints
Probabilistic Modeling of Interpersonal Coordination Processes
Connecting the Dots: Is Mode-Connectedness the Key to Feasible Sample-Based Inference in Bayesian Neural Networks?
Harnessing the Power of Neural Operators with Automatically Encoded Conservation Laws
Hybrid Reinforcement Learning from Offline Observation Alone
SurfPro: Functional Protein Design Based on Continuous Surface
OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
Position: Leverage Foundational Models for Black-Box Optimization
Latent Logic Tree Extraction for Event Sequence Explanation from LLMs
Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
Position: A Roadmap to Pluralistic Alignment
CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback
Harmonic Self-Conditioned Flow Matching for joint Multi-Ligand Docking and Binding Site Design
Learning to Intervene on Concept Bottlenecks
QORA: Zero-Shot Transfer via Interpretable Object-Relational Model Learning
Private Truly-Everlasting Robust-Prediction
ReGAL: Refactoring Programs to Discover Generalizable Abstractions
RLVF: Learning from Verbal Feedback without Overgeneralization
Online Learning in CMDPs: Handling Stochastic and Adversarial Constraints
Designing Decision Support Systems using Counterfactual Prediction Sets
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Rényi Pufferfish Privacy: General Additive Noise Mechanisms and Privacy Amplification by Iteration via Shift Reduction Lemmas
Networked Inequality: Preferential Attachment Bias in Graph Neural Network Link Prediction
ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance
Constrained Reinforcement Learning Under Model Mismatch
DFA-RAG: Conversational Semantic Router for Large Language Model with Definite Finite Automaton
LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering
Online Adaptive Anomaly Thresholding with Confidence Sequences
Learning Graph Representation via Graph Entropy Maximization
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Regression Learning with Limited Observations of Multivariate Outcomes and Features
Graph Neural Networks with a Distribution of Parametrized Graphs
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
On a Combinatorial Problem Arising in Machine Teaching
Interpretable Deep Clustering for Tabular Data
Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance
A Universal Class of Sharpness-Aware Minimization Algorithms
Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds
Deciphering RNA Secondary Structure Prediction: A Probabilistic K-Rook Matching Perspective
Community-Invariant Graph Contrastive Learning
Memorization Through the Lens of Curvature of Loss Function Around Samples
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
OTMatch: Improving Semi-Supervised Learning with Optimal Transport
Post-hoc Part-Prototype Networks
Rethinking Optimization and Architecture for Tiny Language Models
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
StrokeNUWA—Tokenizing Strokes for Vector Graphic Synthesis
SSL4Q: Semi-Supervised Learning of Quantum Data with Application to Quantum State Classification
Finite Smoothing Algorithm for High-Dimensional Support Vector Machines and Quantile Regression
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Position: What makes an image realistic?
A New Branch-and-Bound Pruning Framework for $\ell_0$-Regularized Problems
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Collapse-Aware Triplet Decoupling for Adversarially Robust Image Retrieval
MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence
Ranking-based Client Imitation Selection for Efficient Federated Learning
OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport
Copula-Nested Spectral Kernel Network
FRAPPÉ: A Group Fairness Framework for Post-Processing Everything
Faster Maximum Inner Product Search in High Dimensions
Position: Enforced Amnesia as a Way to Mitigate the Potential Risk of Silent Suffering in the Conscious AI
How Deep Networks Learn Sparse and Hierarchical Data: the Sparse Random Hierarchy Model
Position: Do Not Explain Vision Models Without Context
Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics
Inferring the Long-Term Causal Effects of Long-Term Treatments from Short-Term Experiments
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Exploration by Optimization with Hybrid Regularizers: Logarithmic Regret with Adversarial Robustness in Partial Monitoring
Coactive Learning for Large Language Models using Implicit User Feedback
An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems
Matroid Semi-Bandits in Sublinear Time
Improving Antibody Humanness Prediction using Patent Data
Feedback Efficient Online Fine-Tuning of Diffusion Models
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
Reward-Free Kernel-Based Reinforcement Learning
Federated Self-Explaining GNNs with Anti-shortcut Augmentations
How to Leverage Diverse Demonstrations in Offline Imitation Learning
Position: Why Tabular Foundation Models Should Be a Research Priority
Piecewise Constant and Linear Regression Trees: An Optimal Dynamic Programming Approach
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models
Proactive DP: A Multiple Target Optimization Framework for DP-SGD
When Representations Align: Universality in Representation Learning Dynamics
Generalized Smooth Variational Inequalities: Methods with Adaptive Stepsizes
Discovering Mixtures of Structural Causal Models from Time Series Data
Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Code as Reward: Empowering Reinforcement Learning with VLMs
Topological Neural Networks go Persistent, Equivariant, and Continuous
To the Max: Reinventing Reward in Reinforcement Learning
Imitation Learning in Discounted Linear MDPs without exploration assumptions
Parameter Estimation in DAGs from Incomplete Data via Optimal Transport
Optimal Transport for Structure Learning Under Missing Data
Convergence of Some Convex Message Passing Algorithms to a Fixed Point
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
Trustless Audits without Revealing Data or Models
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets
VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning
Non-stationary Online Convex Optimization with Arbitrary Delays
Towards Unified Multi-granularity Text Detection with Interactive Attention
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
On Universally Optimal Algorithms for A/B Testing
Adversarially Robust Hypothesis Transfer Learning
Towards Theoretical Understanding of Learning Large-scale Dependent Data via Random Features
A Circuit Domain Generalization Framework for Efficient Logic Synthesis in Chip Design
Revisiting the Power of Prompt for Visual Tuning
TVE: Learning Meta-attribution for Transferable Vision Explainer
Adaptively Learning to Select-Rank in Online Platforms
Imitation Learning from Purified Demonstrations
Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
An Efficient Maximal Ancestral Graph Listing Algorithm
Monotone, Bi-Lipschitz, and Polyak-Łojasiewicz Networks
Swallowing the Bitter Pill: Simplified Scalable Conformer Generation
Optimal Kernel Quantile Learning with Random Features
MEMORYLLM: Towards Self-Updatable Large Language Models
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments
Mollification Effects of Policy Gradient Methods
Optimal Kernel Choice for Score Function-based Causal Discovery
Rapid Learning without Catastrophic Forgetting in the Morris Water Maze
Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical
Total Variation Floodgate for Variable Importance Inference in Classification
In-context Learning on Function Classes Unveiled for Transformers
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Bootstrap AutoEncoders With Contrastive Paradigm for Self-supervised Gaze Estimation
Highway Value Iteration Networks
Improving Generalization in Offline Reinforcement Learning via Adversarial Data Splitting
Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach
An Iterative Min-Min Optimization Method for Sparse Bayesian Learning
Open Ad Hoc Teamwork with Cooperative Game Theory
Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models
Bridging Data Gaps in Diffusion Models with Adversarial Noise-Based Transfer Learning
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
A Dual-module Framework for Counterfactual Estimation over Time
Transforming and Combining Rewards for Aligning Large Language Models
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
Pi-DUAL: Using privileged information to distinguish clean from noisy labels
Sample Average Approximation for Conditional Stochastic Optimization with Dependent Data
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models
Efficient Online Set-valued Classification with Bandit Feedback
LLM-Empowered State Representation for Reinforcement Learning
Proteus: Exploring Protein Structure Generation for Enhanced Designability and Efficiency
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery
Generalization Analysis of Stochastic Weight Averaging with General Sampling
CW Complex Hypothesis for Image Data
Optimal Exact Recovery in Semi-Supervised Learning: A Study of Spectral Methods and Graph Convolutional Networks
Open-Vocabulary Calibration for Fine-tuned CLIP
A Hierarchical Adaptive Multi-Task Reinforcement Learning Framework for Multiplier Circuit Design
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Defense against Model Extraction Attack by Bayesian Active Watermarking
Autaptic Synaptic Circuit Enhances Spatio-temporal Predictive Learning of Spiking Neural Networks
Learning with Adaptive Resource Allocation
Exploring Intrinsic Dimension for Vision-Language Model Pruning
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Bridging Model Heterogeneity in Federated Learning via Uncertainty-based Asymmetrical Reciprocity Learning
Neural Collapse meets Differential Privacy: Curious behaviors of NoisyGD with Near-Perfect Representation Learning
Batch Singular Value Polarization and Weighted Semantic Augmentation for Universal Domain Adaptation
Mapping the Multiverse of Latent Representations
Exact Soft Analytical Side-Channel Attacks using Tractable Circuits
Learning Pseudo-Contractive Denoisers for Inverse Problems
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models
Magicoder: Empowering Code Generation with OSS-Instruct
Extending Test-Time Augmentation with Metamorphic Relations for Combinatorial Problems
Position: AI/ML Influencers Have a Place in the Academic Process
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Diffusion-based Missing-view Generation With the Application on Incomplete Multi-view Clustering
Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning
Provable Contrastive Continual Learning
Stability-Informed Initialization of Neural Ordinary Differential Equations
Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains
Unified Training of Universal Time Series Forecasting Transformers
Adaptive Accompaniment with ReaLchords
Ditto: Quantization-aware Secure Inference of Transformers upon MPC
NExT-GPT: Any-to-Any Multimodal LLM
Understanding Stochastic Natural Gradient Variational Inference
FAFE: Immune Complex Modeling with Geodesic Distance Loss on Noisy Group Frames
A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models
PointMC: Multi-instance Point Cloud Registration based on Maximal Cliques
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models
Borda Regret Minimization for Generalized Linear Dueling Bandits
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Surface-VQMAE: Vector-quantized Masked Auto-encoders on Molecular Surfaces
Learning Causal Relations from Subsampled Time Series with Two Time-Slices
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
Transolver: A Fast Transformer Solver for PDEs on General Geometries
Confidence-aware Contrastive Learning for Selective Classification
Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model
Profile Reconstruction from Private Sketches
Policy Learning for Balancing Short-Term and Long-Term Rewards
A Theory of Fault-Tolerant Learning
How to Explore with Belief: State Entropy Maximization in POMDPs
Mitigating Catastrophic Forgetting in Online Continual Learning by Modeling Previous Task Interrelations via Pareto Optimization
Detecting Any instruction-to-answer interaction relationship: Universal Instruction-to-Answer Navigator for Med-VQA
Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning
Mitigating Label Noise on Graphs via Topological Sample Selection
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels
Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints
LESS: Selecting Influential Data for Targeted Instruction Tuning
Contrastive Learning for Clinical Outcome Prediction with Partial Data Sources
Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems
Delving into the Convergence of Generalized Smooth Minimax Optimization
Category-Aware Active Domain Adaptation
Improved Operator Learning by Orthogonal Attention
Temporal Spiking Neural Networks with Synaptic Delay for Graph Reasoning
Efficient Contrastive Learning for Fast and Accurate Inference on Graphs
CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models
Intersecting-Boundary-Sensitive Fingerprinting for Tampering Detection of DNN Models
Automating the Selection of Proxy Variables of Unmeasured Confounders
FedREDefense: Defending against Model Poisoning Attacks for Federated Learning using Model Update Reconstruction Error
Improving SAM Requires Rethinking its Optimization Formulation
Implicit Bias of AdamW: $\ell_\infty$-Norm Constrained Optimization
Local Causal Structure Learning in the Presence of Latent Variables
Reflected Flow Matching
Federated Neuro-Symbolic Learning
HelmFluid: Learning Helmholtz Dynamics for Interpretable Fluid Prediction
See More Details: Efficient Image Super-Resolution by Experts Mining
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Stochastic Bandits with ReLU Neural Networks
Intersectional Unfairness Discovery
Semantic-Aware Human Object Interaction Image Generation
Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate
Equivariant Graph Neural Operator for Modeling 3D Dynamics
Aligned Objective for Soft-Pseudo-Label Generation in Supervised Learning
Non-clairvoyant Scheduling with Partial Predictions
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model
Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module
Meta-Reinforcement Learning Robust to Distributional Shift Via Performing Lifelong In-Context Learning
Prompt-guided Precise Audio Editing with Diffusion Models
Robust Inverse Constrained Reinforcement Learning under Model Misspecification
Soft Prompt Recovers Compressed LLMs, Transferably
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Adaptive Group Personalization for Federated Mutual Transfer Learning
Learning Exceptional Subgroups by End-to-End Maximizing KL-Divergence
Pricing with Contextual Elasticity and Heteroscedastic Valuation
Learning 1-Bit Tiny Object Detector with Discriminative Feature Refinement
SLOG: An Inductive Spectral Graph Neural Network Beyond Polynomial Filter
Libra: Building Decoupled Vision System on Large Language Models
Out-of-Distribution Detection via Deep Multi-Comprehension Ensemble
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Exponential Spectral Pursuit: An Effective Initialization Method for Sparse Phase Retrieval
Iterative Regularized Policy Optimization with Imperfect Demonstrations
Few-shot Adaptation to Distribution Shifts By Mixing Source and Target Embeddings
Offline Multi-Objective Optimization
FairProof: Confidential and Certifiable Fairness for Neural Networks
Balancing Similarity and Complementarity for Federated Learning
Probabilistic Time Series Modeling with Decomposable Denoising Diffusion Model
Exploring the LLM Journey from Cognition to Expression with Linear Representations
A Space Group Symmetry Informed Network for O(3) Equivariant Crystal Tensor Prediction
Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching
Handling Heterogeneous Curvatures in Bandit LQR Control
Foundations of Testing for Finite-Sample Causal Discovery
Retrieval Across Any Domains via Large-scale Pre-trained Model
Reducing Balancing Error for Causal Inference via Optimal Transport
Sample-Efficient Multiagent Reinforcement Learning with Reset Replay
Guidance with Spherical Gaussian Constraint for Conditional Diffusion
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation
Small-loss Adaptive Regret for Online Convex Optimization
Position: Towards Implicit Prompt For Text-To-Image Models
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Understanding Server-Assisted Federated Learning in the Presence of Incomplete Client Participation
Representation Surgery for Multi-Task Model Merging
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training
UniAudio: Towards Universal Audio Generation with Large Language Models
Explain Temporal Black-Box Models via Functional Decomposition
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
Neuro-Symbolic Temporal Point Processes
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Empowering Graph Invariance Learning with Deep Spurious Infomax
Human vs. Generative AI in Content Creation Competition: Symbiosis or Conflict?
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
Socialized Learning: Making Each Other Better Through Multi-Agent Collaboration
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
StableMask: Refining Causal Masking in Decoder-only Transformer
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
Uncertainty Estimation by Density Aware Evidential Deep Learning
FRAG: Frequency Adapting Group for Diffusion Video Editing
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Activation-Descent Regularization for Input Optimization of ReLU Networks
Privacy-Preserving Instructions for Aligning Large Language Models
Learning Latent Structures in Network Games via Data-Dependent Gated-Prior Graph Variational Autoencoders
Enabling Few-Shot Learning with PID Control: A Layer Adaptive Optimizer
Generalization Bound and New Algorithm for Clean-Label Backdoor Attack
Learning Causal Dynamics Models in Object-Oriented Environments
ViP: A Differentially Private Foundation Model for Computer Vision
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Improving Sharpness-Aware Minimization by Lookahead
SHINE: Shielding Backdoors in Deep Reinforcement Learning
Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators
DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation
Robustly Learning Single-Index Models via Alignment Sharpness
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought
tnGPS: Discovering Unknown Tensor Network Structure Search Algorithms via Large Language Models (LLMs)
Token-level Direct Preference Optimization
Learning Reward for Robot Skills Using Large Language Models via Self-Alignment
Graph Mixup on Approximate Gromov–Wasserstein Geodesics
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
Differentiable Annealed Importance Sampling Minimizes The Jensen-Shannon Divergence Between Initial and Target Distribution
Robust Learning-Augmented Dictionaries
Tight Partial Identification of Causal Effects with Marginal Distribution of Unmeasured Confounders
DAG-Based Column Generation for Adversarial Team Games
Efficient Stochastic Approximation of Minimax Excess Risk Optimization
Discounted Adaptive Online Learning: Towards Better Regularization
Provably Efficient Partially Observable Risk-sensitive Reinforcement Learning with Hindsight Observation
Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Random Scaling and Momentum for Non-smooth Non-convex Optimization
Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining
CaM: Cache Merging for Memory-efficient LLMs Inference
Watermarks in the Sand: Impossibility of Strong Watermarking for Language Models
MILP-FBGen: LP/MILP Instance Generation with Feasibility/Boundedness
ILILT: Implicit Learning of Inverse Lithography Technologies
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment
Parameter-Efficient Fine-Tuning with Controls
Deep Regression Representation Learning with Topology
Understanding Unimodal Bias in Multimodal Deep Linear Networks
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
S3O: A Dual-Phase Approach for Reconstructing Dynamic Shape and Skeleton of Articulated Objects from Single Monocular Video
Understanding and Diagnosing Deep Reinforcement Learning
Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation
Multi-Factor Adaptive Vision Selection for Egocentric Video Question Answering
UP2ME: Univariate Pre-training to Multivariate Fine-tuning as a General-purpose Framework for Multivariate Time Series Analysis
Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Wukong: Towards a Scaling Law for Large-Scale Recommendation
Nonparametric Teaching of Implicit Neural Representations
Sparse-to-dense Multimodal Image Registration via Multi-Task Learning
In-Context Principle Learning from Mistakes
A Federated Stochastic Multi-level Compositional Minimax Algorithm for Deep AUC Maximization
Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models
Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models
Online Resource Allocation with Non-Stationary Customers
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks
Online Matching with Stochastic Rewards: Provable Better Bound via Adversarial Reinforcement Learning
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning
Switchable Decision: Dynamic Neural Generation Networks
Interpreting and Improving Large Language Models in Arithmetic Calculation
GroupCover: A Secure, Efficient and Scalable Inference Framework for On-device Model Protection based on TEEs
Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data
Exploring the Benefit of Activation Sparsity in Pre-training
Causal Representation Learning from Multiple Distributions: A General Setting
FESSNC: Fast Exponentially Stable and Safe Neural Controller
Rethinking Guidance Information to Utilize Unlabeled Samples: A Label Encoding Perspective
Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions
Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness
Beyond the ROC Curve: Classification Trees Using Cost-Optimal Curves, with Application to Imbalanced Datasets
Distributionally Robust Data Valuation
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
Efficient Contextual Bandits with Uninformed Feedback Graphs
Efficient Denoising Diffusion via Probabilistic Masking
Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases
Uncertainty-Aware Reward-Free Exploration with General Function Approximation
Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize
On the Expressive Power of Spectral Invariant Graph Neural Networks
Neural Jump-Diffusion Temporal Point Processes
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization
Accelerating Iterative Retrieval-augmented Language Model Serving with Speculation
Position: Measure Dataset Diversity, Don't Just Claim It
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Absolute Policy Optimization: Enhancing Lower Probability Bound of Performance with High Confidence
Spider: A Unified Framework for Context-dependent Concept Segmentation
Rethinking Adversarial Robustness in the Context of the Right to be Forgotten
Quantum Implicit Neural Representations
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
A Statistical Theory of Regularization-Based Continual Learning
Unsupervised Representation Learning of Brain Activity via Bridging Voxel Activity and Functional Connectivity
Double-Step Alternating Extragradient with Increasing Timescale Separation for Finding Local Minimax Points: Provable Improvements
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents
LangCell: Language-Cell Pre-training for Cell Identity Understanding
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting
Exploiting Negative Samples: A Catalyst for Cohort Discovery in Healthcare Analytics
Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
Conformal Predictions under Markovian Data
Learning Latent Space Hierarchical EBM Diffusion Models
DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems
On Prompt-Driven Safeguarding for Large Language Models
Self-Infilling Code Generation
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
GNNs Also Deserve Editing, and They Need It More Than Once
Causal-IQA: Towards the Generalization of Image Quality Assessment Based on Causal Inference
On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm
Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning
Pedestrian Attribute Recognition as Label-balanced Multi-label Learning
Conformalized Adaptive Forecasting of Heterogeneous Trajectories
Sequential Kernel Goodness-of-fit Testing
RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation
CurBench: Curriculum Learning Benchmark
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process
DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Graphon Mean Field Games with a Representative Player: Analysis and Learning Algorithm
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation
Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters
Iterative Search Attribution for Deep Neural Networks
Generative Active Learning for Long-tailed Instance Segmentation
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Switched Flow Matching: Eliminating Singularities via Switching ODEs
Toward Availability Attacks in 3D Point Clouds
Antibody Design Using a Score-based Diffusion Model Guided by Evolutionary, Physical and Geometric Constraints
Online Learning in Betting Markets: Profit versus Prediction
Dynamic Evaluation of Large Language Models by Meta Probing Agents
CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection
Translation Equivariant Transformer Neural Processes
Language Models Represent Beliefs of Self and Others
Stealthy Imitation: Reward-guided Environment-free Policy Stealing
Reinformer: Max-Return Sequence Modeling for Offline RL
Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Viewing Transformers Through the Lens of Long Convolutions Layers
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
Compositional Few-Shot Class-Incremental Learning
BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
Improving Equivariant Graph Neural Networks on Large Geometric Graphs via Virtual Nodes Learning
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates
Amend to Alignment: Decoupled Prompt Tuning for Mitigating Spurious Correlation in Vision-Language Models
Visual Representation Learning with Stochastic Frame Prediction
Exploration and Anti-Exploration with Distributional Random Network Distillation
Position: Is machine learning good or bad for the natural sciences?
Multi-class Probabilistic Bounds for Majority Vote Classifiers with Partially Labeled Data
Sample Complexity Bounds for Estimating Probability Divergences under Invariances
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
On the Embedding Collapse when Scaling up Recommendation Models
Revisiting Character-level Adversarial Attacks for Language Models
Position: Quo Vadis, Unsupervised Time Series Anomaly Detection?
Scaling Beyond the GPU Memory Limit for Large Mixture-of-Experts Model Training
Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game
COPAL: Continual Pruning in Large Language Generative Models
Position: Understanding LLMs Requires More Than Statistical Generalization
Understanding Inter-Concept Relationships in Concept-Based Models
Online Isolation Forest
Multimodal Prototyping for cancer survival prediction
Differentially Private Sum-Product Networks
Log Neural Controlled Differential Equations: The Lie Brackets Make A Difference
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
Improved Differentially Private and Lazy Online Convex Optimization: Lower Regret without Smoothness Requirements
MorphGrower: A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation
Expand-and-Cluster: Parameter Recovery of Neural Networks
REMEDI: Corrective Transformations for Improved Neural Entropy Estimation
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
Counterfactual Metarules for Local and Global Recourse
UGrid: An Efficient-And-Rigorous Neural Multigrid Solver for Linear PDEs
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction
Unmasking Vulnerabilities: Cardinality Sketches under Adaptive Inputs
StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation
Disentangled Continual Graph Neural Architecture Search with Invariant Modular Supernet
Disentangled Graph Self-supervised Learning for Out-of-Distribution Generalization
Knowledge-aware Reinforced Language Models for Protein Directed Evolution
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
On the Effectiveness of Supervision in Asymmetric Non-Contrastive Learning
DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency
DNCs Require More Planning Steps
I/O Complexity of Attention, or How Optimal is FlashAttention?
Position: Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
Lookbehind-SAM: k steps back, 1 step forward
A Distributional Analogue to the Successor Representation
From Neurons to Neutrons: A Case Study in Interpretability
A Theoretical Analysis of Backdoor Poisoning Attacks in Convolutional Neural Networks
Do Transformer World Models Give Better Policy Gradients?
Conformal prediction for multi-dimensional time series by ellipsoidal sets
Conditionally-Conjugate Gaussian Process Factor Analysis for Spike Count Data via Data Augmentation
Understanding the Training Speedup from Sampling with Approximate Losses
Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills
Mathematical Framework for Online Social Media Auditing
Low-Rank Similarity Mining for Multimodal Dataset Distillation
MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
Enforcing Constraints in RNA Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem
Reshape and Adapt for Output Quantization (RAOQ): Quantization-aware Training for In-memory Computing Systems
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Position: Scarce Resource Allocations That Rely On Machine Learning Should Be Randomized
Compositional Text-to-Image Generation with Dense Blob Representations
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding
Deep Functional Factor Models: Forecasting High-Dimensional Functional Time Series via Bayesian Nonparametric Factorization
A Neural-Preconditioned Poisson Solver for Mixed Dirichlet and Neumann Boundary Conditions
Diffusion Posterior Sampling is Computationally Intractable
PerceptAnon: Exploring the Human Perception of Image Anonymization Beyond Pseudonymization for GDPR
Do Topological Characteristics Help in Knowledge Distillation?
Stochastic Optimization with Arbitrary Recurrent Data Sampling
Partially Stochastic Infinitely Deep Bayesian Neural Networks
Neuro-Visualizer: A Novel Auto-Encoder-Based Loss Landscape Visualization Method With an Application in Knowledge-Guided Machine Learning
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach
Centralized Selection with Preferences in the Presence of Biases
Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Rethinking Transformers in Solving POMDPs
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions
Embodied CoT Distillation From LLM To Off-the-shelf Agents
A General Framework for Sequential Decision-Making under Adaptivity Constraints
How Does Goal Relabeling Improve Sample Efficiency?
Theory of Consistency Diffusion Models: Distribution Estimation Meets Fast Sampling
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Layerwise Change of Knowledge in Neural Networks
Analysis for Abductive Learning and Neural-Symbolic Reasoning Shortcuts
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis
Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
Smoothness Adaptive Hypothesis Transfer Learning
Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection
Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers
WISER: Weak Supervision and Supervised Representation Learning to Improve Drug Response Prediction in Cancer
Interacting Diffusion Processes for Event Sequence Forecasting
Recurrent Early Exits for Federated Learning with Heterogeneous Clients
On Interpolating Experts and Multi-Armed Bandits
Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning
Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling
NeuralIndicator: Implicit Surface Reconstruction from Neural Indicator Priors
On the Calibration of Human Pose Estimation
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection
The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
Smooth Tchebycheff Scalarization for Multi-Objective Optimization
DFD: Distilling the Feature Disparity Differently for Detectors
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
In-Context Unlearning: Language Models as Few-Shot Unlearners
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs
KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions
Reference Neural Operators: Learning the Smooth Dependence of Solutions of PDEs on Geometric Deformations
Auto-Linear Phenomenon in Subsurface Imaging
Accelerating Parallel Sampling of Diffusion Models
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Distributed Bilevel Optimization with Communication Compression
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
On the Minimal Degree Bias in Generalization on the Unseen for non-Boolean Functions
On the Weight Dynamics of Deep Normalized Networks
Jacobian Regularizer-based Neural Granger Causality
Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations
Projecting Molecules into Synthesizable Chemical Spaces
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Energy-based Backdoor Defense without Task-Specific Samples and Model Retraining
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Don’t Label Twice: Quantity Beats Quality when Comparing Binary Classifiers on a Budget
Causal Inference from Competing Treatments
Denoising Autoregressive Representation Learning
On a Neural Implementation of Brenier's Polar Factorization
Causally Motivated Personalized Federated Invariant Learning with Shortcut-Averse Information-Theoretic Regularization
Privacy Attacks in Decentralized Learning
Membership Inference Attacks on Diffusion Models via Quantile Regression
Hybrid Neural Representations for Spherical Data
Premise Order Matters in Reasoning with Large Language Models
Graph As Point Set
Mastering Zero-Shot Interactions in Cooperative and Competitive Simultaneous Games
SparQ Attention: Bandwidth-Efficient LLM Inference
An Analysis of Linear Time Series Forecasting Models
Adaptive Stabilization Based on Machine Learning for Column Generation
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
Optimizing Watermarks for Large Language Models
A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs
Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
Equivariant Diffusion for Crystal Structure Prediction
Online Variational Sequential Monte Carlo
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach
MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data
DiffDA: a Diffusion model for weather-scale Data Assimilation
Understanding Heterophily for Graph Neural Networks
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
Dynamic Spectral Clustering with Provable Approximation Guarantee
Adversarially Robust Deep Multi-View Clustering: A Novel Attack and Defense Framework
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
On Stronger Computational Separations Between Multimodal and Unimodal Machine Learning
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
Ai-sampler: Adversarial Learning of Markov kernels with involutive maps
Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations
Sparse Dimensionality Reduction Revisited
A New Theoretical Perspective on Data Heterogeneity in Federated Optimization
Subhomogeneous Deep Equilibrium Models
Speech Self-Supervised Learning Using Diffusion Model Synthetic Data
RoboDreamer: Learning Compositional World Models for Robot Imagination
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
3D-VLA: A 3D Vision-Language-Action Generative World Model
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
Compositional Image Decomposition with Diffusion Models
Diffusion Rejection Sampling
Information-Directed Pessimism for Offline Reinforcement Learning
Exploring the Enigma of Neural Dynamics Through A Scattering-Transform Mixer Landscape for Riemannian Manifold
Partial Optimality in the Linear Ordering Problem
Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials
Differentially Private Domain Adaptation with Theoretical Guarantees
Differentially Private Worst-group Risk Minimization
Time Series Diffusion in the Frequency Domain
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion
Position: Optimization in SciML Should Employ the Function Space Geometry
Memory Efficient Neural Processes via Constant Memory Attention Block
Box Facets and Cut Facets of Lifted Multicut Polytopes
Improving Computational Complexity in Statistical Models with Local Curvature Information
Online Algorithms with Uncertainty-Quantified Predictions
A Statistical Framework for Data-dependent Retrieval-Augmented Models
A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks
Trained Random Forests Completely Reveal your Dataset
Generalization Analysis of Deep Non-linear Matrix Completion
Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers
R2E: Turning any Github Repository into a Programming Agent Environment
Enhancing Value Function Estimation through First-Order State-Action Dynamics in Offline Reinforcement Learning
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
Verification of Machine Unlearning is Fragile
Towards Certified Unlearning for Deep Neural Networks
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Dirichlet Flow Matching with Applications to DNA Sequence Design
Reward Shaping for Reinforcement Learning with An Assistant Reward Agent
Human Alignment of Large Language Models through Online Preference Optimisation
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
Expressivity and Generalization: Fragment-Biases for Molecular GNNs
SelfIE: Self-Interpretation of Large Language Model Embeddings
Listenable Maps for Audio Classifiers
QBMK: Quantum-based Matching Kernels for Un-attributed Graphs
Unsupervised Parameter-free Simplicial Representation Learning with Scattering Transforms
Regularized Q-learning through Robust Averaging
Sparsest Models Elude Pruning: An Exposé of Pruning’s Current Capabilities
Efficient Precision and Recall Metrics for Assessing Generative Models using Hubness-aware Sampling
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling
Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Differentially Private Post-Processing for Fair Regression
An Empirical Study Into What Matters for Calibrating Vision-Language Models
Position: Automatic Environment Shaping is the Next Frontier in RL
LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning
Evaluating Model Bias Requires Characterizing its Mistakes
Probabilistic Subgoal Representations for Hierarchical Reinforcement Learning
LAGMA: LAtent Goal-guided Multi-Agent Reinforcement Learning
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
Image Fusion via Vision-Language Model
Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss
Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
A Computational Framework for Solving Wasserstein Lagrangian Flows
Contextual Feature Selection with Conditional Stochastic Gates
Weisfeiler Leman for Euclidean Equivariant Machine Learning
Learning to Scale Logits for Temperature-Conditional GFlowNets
Learning in Feature Spaces via Coupled Covariances: Asymmetric Kernel SVD and Nyström method
Reinforcement Learning and Regret Bounds for Admission Control
Large Scale Dataset Distillation with Domain Shift
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models
Autoformalizing Euclidean Geometry
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
QuIP$\#$: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
Revealing Vision-Language Integration in the Brain with Multimodal Networks
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Learning Optimal Projection for Forecast Reconciliation of Hierarchical Time Series
Saliency strikes back: How filtering out high frequencies improves white-box explanations
Switching the Loss Reduces the Cost in Batch Reinforcement Learning
Sampling-based Multi-dimensional Recalibration
NExT: Teaching Large Language Models to Reason about Code Execution
Time Weaver: A Conditional Time Series Generation Model
Hierarchical Novelty Detection via Fine-Grained Evidence Allocation
Executable Code Actions Elicit Better LLM Agents
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
PANDA: Expanded Width-Aware Message Passing Beyond Rewiring
Towards Efficient Training and Evaluation of Robust Models against $l_0$ Bounded Adversarial Perturbations
Position: AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research
Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
Uniformly Stable Algorithms for Adversarial Training and Beyond
Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization
Position: Intent-aligned AI Systems Must Optimize for Agency Preservation
Model-Based Minimum Bayes Risk Decoding for Text Generation
On the Convergence of Projected Bures-Wasserstein Gradient Descent under Euclidean Strong Convexity
Minimum-Norm Interpolation Under Covariate Shift
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Collage: Light-Weight Low-Precision Strategy for LLM Training
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Evaluating Quantized Large Language Models
PhAST: Physics-Aware, Scalable, and Task-Specific GNNs for Accelerated Catalyst Design
First-Order Manifold Data Augmentation for Regression Learning
Recurrent Distance Filtering for Graph Representation Learning
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Causal Discovery with Fewer Conditional Independence Tests
What is the Long-Run Distribution of Stochastic Gradient Descent? A Large Deviations Analysis
Exploiting Human-AI Dependence for Learning to Defer
Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
Towards General Algorithm Discovery for Combinatorial Optimization: Learning Symbolic Branching Policy from Bipartite Graph
Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization
Fundamental Limitations of Alignment in Large Language Models
Flexible Residual Binarization for Image Super-Resolution
SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic
Predicting Dose-Response Curves with Deep Neural Networks
Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization
Causal Effect Identification in LiNGAM Models with Latent Confounders
Transformers, parallel computation, and logarithmic depth
Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions
Tilt and Average: Geometric Adjustment of the Last Layer for Recalibration
Weakly-Supervised Residual Evidential Learning for Multi-Instance Uncertainty Estimation
Sliding Down the Stairs: How Correlated Latent Variables Accelerate Learning with Neural Networks
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Implicit Representations for Constrained Image Segmentation
Integrating Multimodal Data for Joint Generative Modeling of Complex Dynamics
When is Transfer Learning Possible?
Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction
InferCept: Efficient Intercept Support for Augmented Large Language Model Inference
Scalable Safe Policy Improvement for Factored Multi-Agent MDPs
Local Feature Selection without Label or Feature Leakage for Interpretable Machine Learning Predictions
Generalization Analysis for Multi-Label Learning
On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery
Coarse-To-Fine Tensor Trains for Compact Visual Representations
How Smooth Is Attention?
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits
Lightweight Image Super-Resolution via Flexible Meta Pruning
On the Nonlinearity of Layer Normalization
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
Fundamental Limits of Distributed Covariance Matrix Estimation Under Communication Constraints
Sign Rank Limitations for Inner Product Graph Decoders
LoCoCo: Dropping In Convolutions for Long Context Compression
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
GenCO: Generating Diverse Designs with Combinatorial Constraints
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Dense Reward for Free in Reinforcement Learning from Human Feedback
Training-Free Long-Context Scaling of Large Language Models
MD tree: a model-diagnostic tree grown on loss landscape
ReLU Network with Width $d+\mathcal{O}(1)$ Can Achieve Optimal Approximation Rate
Characterizing ResNet's Universal Approximation Capability
Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition
Minimum Norm Interpolation Meets The Local Theory of Banach Spaces
Individual Fairness in Graph Decomposition
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Bivariate Causal Discovery using Bayesian Model Selection
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Adaptive Online Experimental Design for Causal Discovery
Joint Composite Latent Space Bayesian Optimization
Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents
Spectral Phase Transition and Optimal PCA in Block-Structured Spiked Models
A Bayesian Approach to Online Planning
Theoretical insights for diffusion guidance: A case study for Gaussian mixture models
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning
Position: Video as the New Language for Real-World Decision Making
Agent Instructs Large Language Models to be General Zero-Shot Reasoners
Aligning Transformers with Weisfeiler-Leman
Position: Will we run out of data? Limits of LLM scaling based on human-generated data
Learning Linear Block Error Correction Codes
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Local vs. Global Interpretability: A Computational Complexity Perspective
A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle
Fault Tolerant ML: Efficient Meta-Aggregation and Synchronous Training
Private and Federated Stochastic Convex Optimization: Efficient Strategies for Centralized Systems
Efficient Value Iteration for s-rectangular Robust Markov Decision Processes
Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers
Retrieval-Augmented Score Distillation for Text-to-3D Generation
Parameter-Dependent Competitive Analysis for Online Capacitated Coverage Maximization through Boostings and Attenuations
Promoting External and Internal Equities Under Ex-Ante/Ex-Post Metrics in Online Resource Allocation
Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems
Stochastic Q-learning for Large Discrete Action Spaces
Federated Combinatorial Multi-Agent Multi-Armed Bandits
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization
Bayesian Optimization of Function Networks with Partial Evaluations
Autoencoding Conditional Neural Processes for Representation Learning
BLO-SAM: Bi-level Optimization Based Finetuning of the Segment Anything Model for Overfitting-Preventing Semantic Segmentation
A Touch, Vision, and Language Dataset for Multimodal Alignment
Prospective Side Information for Latent MDPs
On The Complexity of First-Order Methods in Stochastic Bilevel Optimization
Neural NeRF Compression
SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
Graph Neural Network Explanations are Fragile
AlphaFold Meets Flow Matching for Generating Protein Ensembles
Generalization Error of Graph Neural Networks in the Mean-field Regime
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models
Explorations of Self-Repair in Language Models
Regularizing with Pseudo-Negatives for Continual Self-Supervised Learning
Block Acceleration Without Momentum: On Optimal Stepsizes of Block Gradient Descent for Least-Squares
Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference
Effective Federated Graph Matching
Self-cognitive Denoising in the Presence of Multiple Noisy Label Sources
Explaining Graph Neural Networks via Structure-aware Interaction Index
GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding
Generative Conditional Distributions by Neural (Entropic) Optimal Transport
Federated Continual Learning via Prompt-based Dual Knowledge Transfer
An Interpretable Evaluation of Entropy-based Novelty of Generative Models
xT: Nested Tokenization for Larger Context in Large Images
Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation
Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization
Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring
BayOTIDE: Bayesian Online Multivariate Time Series Imputation with Functional Decomposition
Toward Adaptive Reasoning in Large Language Models with Thought Rollback
When Will Gradient Regularization Be Harmful?
Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Privacy Profiles for Private Selection
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Improving Neural Logic Machines via Failure Reflection
Less is More: on the Over-Globalizing Problem in Graph Transformers
Quantum Algorithms and Lower Bounds for Finite-Sum Optimization
Stochastic Localization via Iterative Posterior Sampling
Learning Modality Knowledge Alignment for Cross-Modality Transfer
A Nearly Optimal Single Loop Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
Differentiable Model Scaling using Differentiable Topk
Energy-Efficient Gaussian Processes Using Low-Precision Arithmetic
LoRA Training in the NTK Regime has No Spurious Local Minima
Optimal Acceleration for Minimax and Fixed-Point Problems is Not Unique
Mol-AE: Auto-Encoder Based Molecular Representation Learning With 3D Cloze Test Objective
ESM All-Atom: Multi-Scale Protein Language Model for Unified Molecular Modeling
Pruned Pivot: Correlation Clustering Algorithm for Dynamic, Parallel, and Local Computation Models
MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis
Differentially Private Decentralized Learning with Random Walks
Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning
Towards Resource-friendly, Extensible and Stable Incomplete Multi-view Clustering
Adaptive Robust Learning using Latent Bernoulli Variables
Confidence Aware Inverse Constrained Reinforcement Learning
Scalable Multiple Kernel Clustering: Learning Clustering Structure from Expectation
Decouple then Classify: A Dynamic Multi-view Labeling Strategy with Shared and Specific Information
Solving Poisson Equations using Neural Walk-on-Spheres
Mean-field Underdamped Langevin Dynamics and its Spacetime Discretization
DoRA: Weight-Decomposed Low-Rank Adaptation
Flextron: Many-in-One Flexible Large Language Model
Bayesian Exploration Networks
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Prior Mismatch and Adaptation in PnP-ADMM with a Nonconvex Convergence Analysis
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Learning to Compile Programs to Neural Networks
FedLMT: Tackling System Heterogeneity of Federated Learning via Low-Rank Model Training with Theoretical Guarantees
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Towards the Theory of Unsupervised Federated Learning: Non-asymptotic Analysis of Federated EM Algorithms
SAPG: Split and Aggregate Policy Gradients
Attack-free Evaluating and Enhancing Adversarial Robustness on Categorical Data
Bridging Environments and Language with Rendering Functions and Vision-Language Models
MoMo: Momentum Models for Adaptive Learning Rates
Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
Fine-grained Local Sensitivity Analysis of Standard Dot-Product Self-Attention
Gambling-Based Confidence Sequences for Bounded Random Vectors
Operator SVD with Neural Networks via Nested Low-Rank Approximation
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments
Mitigating Oversmoothing Through Reverse Process of GNNs for Heterophilic Graphs
Hypergraph-enhanced Dual Semi-supervised Graph Classification
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
Online Linear Regression in Dynamic Environments via Discounting
PGODE: Towards High-quality System Dynamics Modeling
Uncertainty for Active Learning on Graphs
Decomposing and Editing Predictions by Modeling Model Computation
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs
Asymptotics of feature learning in two-layer networks after one gradient-step
Initial Guessing Bias: How Untrained Networks Favor Some Classes
Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks
Taylor Videos for Action Recognition
On Statistical Learning Theory for Distributional Inputs
Structure-based drug design by denoising voxel grids
Towards Realistic Model Selection for Semi-supervised Learning
On the Second-Order Convergence of Biased Policy Gradient Algorithms
Finite Time Logarithmic Regret Bounds for Self-Tuning Regulation
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
A Unified View of FANOVA: A Comprehensive Bayesian Framework for Component Selection and Estimation
Non-parametric Online Change Point Detection on Riemannian Manifolds
On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation
DFlow: A Generative Model Combining Denoising AutoEncoder and Normalizing Flow for High Fidelity Waveform Generation
On Online Experimentation without Device Identifiers
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity
Transferable Facial Privacy Protection against Blind Face Restoration via Domain-Consistent Adversarial Obfuscation
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Emergent Equivariance in Deep Ensembles
Do Efficient Transformers Really Save Computation?
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
GPTSwarm: Language Agents as Optimizable Graphs
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Balanced Resonate-and-Fire Neurons
Path-Guided Particle-based Sampling
GFlowNet Training by Policy Gradients
Logistic Variational Bayes Revisited
Cross-domain Open-world Discovery
Decentralized Convex Finite-Sum Optimization with Better Dependence on Condition Numbers
Fewer Truncations Improve Language Modeling
Nesting Particle Filters for Experimental Design in Dynamical Systems
An Information-Theoretic Analysis of In-Context Learning
Balancing Feature Similarity and Label Variability for Optimal Size-Aware One-shot Subset Selection
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Classification under Nuisance Parameters and Generalized Label Shift in Likelihood-Free Inference
A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
Beyond the Norms: Detecting Prediction Errors in Regression Models
Improving Open-Ended Text Generation via Adaptive Decoding
A New Robust Partial p-Wasserstein-Based Metric for Comparing Distributions
Fast Timing-Conditioned Latent Audio Diffusion
Overcoming the Optimizer's Curse: Obtaining Realistic Prescriptions from Neural Networks
Density-Softmax: Efficient Test-time Model for Uncertainty Estimation and Robustness under Distribution Shifts
Causal Inference out of Control: Estimating Performativity without Treatment Randomization
Differentiability and Optimization of Multiparameter Persistent Homology
Hybrid Inverse Reinforcement Learning
Cooperative Graph Neural Networks
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Allocation Requires Prediction Only if Inequality Is Low
Latent Space Symmetry Discovery
Can Gaussian Sketching Converge Faster on a Preconditioned Landscape?
Leverage Class-Specific Accuracy to Guide Data Generation for Improving Image Classification
Data-free Neural Representation Compression with Riemannian Neural Dynamics
FlowMM: Generating Materials with Riemannian Flow Matching
Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models
Variational Schrödinger Diffusion Models
Improving Transformers with Dynamically Composable Multi-Head Attention
Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics
Learning Universal Predictors
Subgraphormer: Unifying Subgraph GNNs and Graph Transformers via Graph Products
Efficient Exploration for LLMs
Overcoming Data and Model heterogeneities in Decentralized Federated Learning via Synthetic Anchors
Amortizing Pragmatic Program Synthesis with Rankings
Memoria: Resolving Fateful Forgetting Problem through Human-Inspired Memory Architecture
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Surprisingly Strong Performance Prediction with Neural Graph Features
Stability and Multigroup Fairness in Ranking with Uncertain Predictions
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
GiLOT: Interpreting Generative Language Models via Optimal Transport
Transitional Uncertainty with Layered Intermediate Predictions
Prompt Sketching for Large Language Models
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models
Mitigating Privacy Risk in Membership Inference by Convex-Concave Loss
Diversified Batch Selection for Training Acceleration
MALIBO: Meta-learning for Likelihood-free Bayesian Optimization
A Geometric Decomposition of Finite Games: Convergence vs. Recurrence under Exponential Weights
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
Scaling Laws for the Value of Individual Data Points in Machine Learning
Learning and Forgetting Unsafe Examples in Large Language Models
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Prospector Heads: Generalized Feature Attribution for Large Models & Data
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions
Assessing Large Language Models on Climate Information
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
An Explicit Frame Construction for Normalizing 3D Point Clouds
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Distributional Bellman Operators over Mean Embeddings
Comparing Graph Transformers via Positional Encodings
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections
DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving
Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses
Incremental Topological Ordering and Cycle Detection with Predictions
Reweighted Solutions for Weighted Low Rank Approximation
Coresets for Multiple $\ell_p$ Regression
Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching
Learning from Integral Losses in Physics Informed Neural Networks
Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data
A Linear Time and Space Local Point Cloud Geometry Encoder via Vectorized Kernel Mixture (VecKM)
WAVES: Benchmarking the Robustness of Image Watermarks
How Deep Do We Need: Accelerating Training and Inference of Neural ODEs via Control Perspective
GPT-4V(ision) is a Generalist Web Agent, if Grounded
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Scalable and Flexible Causal Discovery with an Efficient Test for Adjacency
Continuous Treatment Effects with Surrogate Outcomes
Editing Partially Observable Networks via Graph Diffusion Models
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity
NExT-Chat: An LMM for Chat, Detection and Segmentation
CKGConv: General Graph Convolution with Continuous Kernels
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
Position: Social Environment Design Should be Further Developed for AI-based Policy-Making
CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources
${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning
Grokking Group Multiplication with Cosets
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
Cell2Sentence: Teaching Large Language Models the Language of Biology
The Effect of Weight Precision on the Neuron Count in Deep ReLU Networks
Augmenting Decision with Hypothesis in Reinforcement Learning
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
High-Dimensional Geometric Streaming for Nearly Low Rank Data
PriorBoost: An Adaptive Algorithm for Learning from Aggregate Responses
Perturb-and-Project: Differentially Private Similarities and Marginals
Data-Efficient Learning via Clustering-Based Sensitivity Sampling: Foundation Models and Beyond
A Field Guide for Pacing Budget and ROS Constraints
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
Deep Networks Always Grok and Here is Why
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
Contrastive Predict-and-Search for Mixed Integer Linear Programs
Learning Decision Trees and Forests with Algorithmic Recourse
Deep Equilibrium Models are Almost Equivalent to Not-so-deep Explicit Models for High-dimensional Gaussian Mixtures
AMPA: Adaptive Mixed Precision Allocation for Low-Bit Integer Training
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
Beyond the Federation: Topology-aware Federated Learning for Generalization to Unseen Clients
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Differentiable Distributionally Robust Optimization Layers
Generating In-Distribution Proxy Graphs for Explaining Graph Neural Networks
TimeX++: Learning Time-Series Explanations with Information Bottleneck
Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Position: Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Open-Domain Text Evaluation via Contrastive Distribution Methods
DiNADO: Norm-Disentangled Neurally-Decomposed Oracles for Controlling Language Models
Listening to the noise: Blind Denoising with Gibbs Diffusion
GATE: How to Keep Out Intrusive Neighbors
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations
Larimar: Large Language Models with Episodic Memory Control
What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian Benchmarks
Position: Data-driven Discovery with Large Generative Models
Matrix Information Theory for Self-Supervised Learning
Information Flow in Self-Supervised Learning
Better & Faster Large Language Models via Multi-token Prediction
ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks
Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution
Position: Open-Endedness is Essential for Artificial Superhuman Intelligence
Debating with More Persuasive LLMs Leads to More Truthful Answers
Genie: Generative Interactive Environments
HAMLET: Graph Transformer Neural Operator for Partial Differential Equations
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
Data-Efficient Molecular Generation with Hierarchical Textual Inversion
All-in-one simulation-based inference
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features
Position: A Safe Harbor for AI Evaluation and Red Teaming
Evaluation of Trajectory Distribution Predictions with Energy Score
Position: Future Directions in the Theory of Graph Machine Learning
Hierarchical Integral Probability Metrics: A distance on random probability measures with low sample complexity
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience
Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States
Stochastic positional embeddings improve masked image modeling
Differentially Private Representation Learning via Image Captioning
Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box
Tabular Insights, Visual Impacts: Transferring Expertise from Tables to Images
Neural operators meet conjugate gradients: The FCG-NO method for efficient PDE solving
No Double Descent in Principal Component Regression: A High-Dimensional Analysis
Dynamic Facility Location in High Dimensional Euclidean Spaces
Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normalization
Rethinking the Flat Minima Searching in Federated Learning
Bounded and Uniform Energy-based Out-of-distribution Detection for Graphs
Deconstructing the Goldilocks Zone of Neural Network Initialization
AutoOS: Make Your OS More Powerful by Exploiting Large Language Models
Prompt-based Visual Alignment for Zero-shot Policy Transfer
Gradient-based Visual Explanation for Transformer-based CLIP
Performance Bounds for Active Binary Testing with Information Maximization
Latent Noise Segmentation: How Neural Noise Leads to the Emergence of Segmentation and Grouping
Provable Benefits of Local Steps in Heterogeneous Federated Learning for Neural Networks: A Feature Learning Perspective
Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise
A Fixed-Point Approach for Causal Generative Modeling
Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention
Conditional Language Learning with Context
Getting the most out of your tokenizer for pre-training and domain adaptation
Optimization without Retraction on the Random Generalized Stiefel Manifold
Representing Molecules as Random Walks Over Interpretable Grammars
Towards a Better Theoretical Understanding of Independent Subnetwork Training
AegisFL: Efficient and Flexible Privacy-Preserving Byzantine-Robust Cross-silo Federated Learning
One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment
Forget Sharpness: Perturbed Forgetting of Model Biases Within SAM Dynamics
Just Cluster It: An Approach for Exploration in High-Dimensions using Clustering and Pre-Trained Representations
Risk Aware Benchmarking of Large Language Models
Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss
Towards a Self-contained Data-driven Global Weather Forecasting Framework
Two-timescale Derivative Free Optimization for Performative Prediction with Markovian Data
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models
S$\Omega$I: Score-based O-INFORMATION Estimation
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
DeepPolar: Inventing Nonlinear Large-Kernel Polar Codes via Deep Learning
Two Heads are Actually Better than One: Towards Better Adversarial Robustness via Transduction and Rejection
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Position: On the Possibilities of AI-Generated Text Detection
PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling
MaxMin-RLHF: Alignment with Diverse Human Preferences
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
Don't be so Negative! Score-based Generative Modeling with Oracle-assisted Guidance
Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples
Quantum Theory and Application of Contextual Optimal Transport
Few-Shot Unsupervised Implicit Neural Shape Representation Learning with Spatial Adversaries
Learning with Partial-Label and Unlabeled Data: A Uniform Treatment for Supervision Redundancy and Insufficiency
HyperFields: Towards Zero-Shot Generation of NeRFs from Text
Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
Active Ranking and Matchmaking, with Perfect Matchings
Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals
Towards Neural Architecture Search through Hierarchical Generative Modeling
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Amortized Variational Deep Kernel Learning
Disentanglement Learning via Topology
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics
Position: Embracing Negative Results in Machine Learning
Position: Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination
Learning Divergence Fields for Shift-Robust Graph Representations
How Graph Neural Networks Learn: Lessons from Training Dynamics
Graph Out-of-Distribution Detection Goes Neighborhood Shaping
Stay on Topic with Classifier-Free Guidance
Position: On the Societal Impact of Open Foundation Models
Learning Label Shift Correction for Test-Agnostic Long-Tailed Recognition
On the Recoverability of Causal Relations from Temporally Aggregated I.I.D. Data
Evaluation of Test-Time Adaptation Under Computational Time Constraints
Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation
An Empirical Study of Realized GNN Expressiveness
Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning
Understanding MLP-Mixer as a wide and sparse MLP
Self-attention Networks Localize When QK-eigenspectrum Concentrates
Faster Streaming and Scalable Algorithms for Finding Directed Dense Subgraphs in Large Graphs
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks
Effect-Invariant Mechanisms for Policy Generalization
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Differentiable Combinatorial Scheduling at Scale
Bottleneck-Minimal Indexing for Generative Document Retrieval
Model-based Reinforcement Learning for Parameterized Action Spaces
Efficient Mixture Learning in Black-Box Variational Inference
Indirectly Parameterized Concrete Autoencoders
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
STELLA: Continual Audio-Video Pre-training with SpatioTemporal Localized Alignment
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation
What is Dataset Distillation Learning?
DPZero: Private Fine-Tuning of Language Models without Backpropagation
MLI Formula: A Nearly Scale-Invariant Solution with Noise Perturbation
Sampling in Unit Time with Kernel Fisher-Rao Flow
Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Quantum Positional Encodings for Graph Neural Networks
Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
Disentangled 3D Scene Generation with Layout Learning
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Improved Dimensionality Dependence for Zeroth-Order Optimisation over Cross-Polytopes
A New Computationally Efficient Algorithm to solve Feature Selection for Functional Data Classification in High-dimensional Spaces
A Sparsity Principle for Partially Observable Causal Representation Learning
Conditional Normalizing Flows for Active Learning of Coarse-Grained Molecular Representations
Probability Distribution of Hypervolume Improvement in Bi-objective Bayesian Optimization
Position: Opportunities Exist for Machine Learning in Magnetic Fusion Energy
Bounding the Excess Risk for Linear Models Trained on Marginal-Preserving, Differentially-Private, Synthetic Data
Major-Minor Mean Field Multi-Agent Reinforcement Learning
Understanding Forgetting in Continual Learning with Linear Regression
Quality-Diversity with Limited Resources
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
On the Independence Assumption in Neurosymbolic Learning
diff History for Neural Language Agents
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
CuTS: Customizable Tabular Synthetic Data Generation
Instruction Tuning for Secure Code Generation
Mimicking Better by Matching the Approximate Action Distribution
Asymmetry in Low-Rank Adapters of Foundation Models
Slicing Mutual Information Generalization Bounds for Neural Networks
Position: Fundamental Limitations of LLM Censorship Necessitate New Approaches
Information Complexity of Stochastic Convex Optimization: Applications to Generalization, Memorization, and Tracing
Bifurcated Attention for Single-Context Large-Batch Sampling
Breadth-First Exploration on Adaptive Grid for Reinforcement Learning
Smoothing Proximal Gradient Methods for Nonsmooth Sparsity Constrained Optimization: Optimality Conditions and Global Convergence
Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training
Learning to Model the World With Language
The Merit of River Network Topology for Neural Flood Forecasting
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Remembering to Be Fair: Non-Markovian Fairness in Sequential Decision Making
Disguised Copyright Infringement of Latent Diffusion Models
Predictive Linear Online Tracking for Unknown Targets
Learning Latent Dynamic Robust Representations for World Models
Stealing part of a production language model
Clifford-Steerable Convolutional Neural Networks
Sub-token ViT Embedding via Stochastic Resonance Transformers
Dynamic Metric Embedding into $\ell_p$ Space
Graph2Tac: Online Representation Learning of Formal Math Concepts
Diffusion Language Models Are Versatile Protein Learners
BWS: Best Window Selection Based on Sample Scores for Data Pruning across Broad Ranges
Localizing Task Information for Improved Model Merging and Compression
Detecting and Identifying Selection Structure in Sequential Data
Data Engineering for Scaling Language Models to 128K Context
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
The Emergence of Reproducibility and Consistency in Diffusion Models
Accelerated Speculative Sampling Based on Tree Monte Carlo
Random Latent Exploration for Deep Reinforcement Learning
Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning
EvoluNet: Advancing Dynamic Non-IID Transfer Learning on Graphs
Active Statistical Inference
Protein Conformation Generation via Force-Guided SE(3) Diffusion Models
CHAI: Clustered Head Attention for Efficient LLM Inference
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration
Policy-conditioned Environment Models are More Generalizable
SILVER: Single-loop variance reduction and application to federated learning
Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution
InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning
Using Left and Right Brains Together: Towards Vision and Language Planning
SMaRt: Improving GANs with Score Matching Regularity
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Modular Learning of Deep Causal Generative Models for High-dimensional Causal Inference
Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
On Discrete Prompt Optimization for Diffusion Models
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Directly Denoising Diffusion Models
Nearest Neighbour Score Estimators for Diffusion Generative Models
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
Thermometer: Towards Universal Calibration for Large Language Models
Out-of-Domain Generalization in Dynamical Systems Reconstruction
Bring Your Own (Non-Robust) Algorithm to Solve Robust MDPs by Estimating The Worst Kernel
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
Position: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks
New Sample Complexity Bounds for Sample Average Approximation in Heavy-Tailed Stochastic Programming
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models
Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
MOMENT: A Family of Open Time-series Foundation Models
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Counterfactual Image Editing
Learning to Route Among Specialized Experts for Zero-Shot Generalization
Fast Adversarial Attacks on Language Models In One GPU Minute
Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty
EvIL: Evolution Strategies for Generalisable Imitation Learning
Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
On The Statistical Complexity of Offline Decision-Making
Probabilistic Constrained Reinforcement Learning with Formal Interpretability
Bringing Motion Taxonomies to Continuous Domains via GPLVM on Hyperbolic manifolds
Variational Learning is Effective for Large Deep Networks
Light and Optimal Schrödinger Bridge Matching
Controlled Decoding from Language Models
On the Duality Between Sharpness-Aware Minimization and Adversarial Training
Liouville Flow Importance Sampler
Offline Training of Language Model Agents with Functions as Learnable Weights
Scaling Exponents Across Parameterizations and Optimizers
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Error Feedback Can Accurately Compress Preconditioners
Extreme Compression of Large Language Models via Additive Quantization
AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios
LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views
Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation
Ameliorate Spurious Correlations in Dataset Condensation
VideoPrism: A Foundational Visual Encoder for Video Understanding
Particle Denoising Diffusion Sampler
LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits
Optimistic Multi-Agent Policy Gradient
How do Transformers Perform In-Context Autoregressive Learning?
Vision Transformers as Probabilistic Expansion from Learngene
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Graph Positional and Structural Encoder
Modeling Language Tokens as Functionals of Semantic Fields
Inferring Change Points in High-Dimensional Linear Regression via Approximate Message Passing
Score-Based Causal Discovery of Latent Variable Causal Models
DITTO: Diffusion Inference-Time T-Optimization for Music Generation
Language Models as Science Tutors
Acquisition Conditioned Oracle for Nongreedy Active Feature Acquisition
Position: Technical Research and Talent is Needed for Effective AI Governance
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
Integrated Hardware Architecture and Device Placement Search
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization
Fast Sampling-Based Sketches for Tensors
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
Diffusion Models Encode the Intrinsic Dimension of Data Manifolds
Graph Neural Networks Use Graphs When They Shouldn't
Two Fists, One Heart: Multi-Objective Optimization Based Strategy Fusion for Long-tailed Learning
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
Data-free Distillation of Diffusion Models with Bootstrapping
What’s the score? Automated Denoising Score Matching for Nonlinear Diffusions
Stochastic Interpolants with Data-Dependent Couplings
Adaptive Sampling of k-Space in Magnetic Resonance for Rapid Pathology Prediction
Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior
Self-Rewarding Language Models
COALA: A Practical and Vision-Centric Federated Learning Platform
Arrows of Time for Large Language Models
Overcoming Saturation in Density Ratio Estimation by Iterated Regularization
In-Context Learning Agents Are Asymmetric Belief Updaters
Efficient Algorithms for Empirical Group Distributionally Robust Optimization and Beyond
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
FADAS: Towards Federated Adaptive Asynchronous Optimization
Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits
A Closer Look at the Limitations of Instruction Tuning
Sign Gradient Descent-based Neuronal Dynamics: ANN-to-SNN Conversion Beyond ReLU Network
Theoretical Analysis of Learned Database Operations under Distribution Shift through Distribution Learnability
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction
Is Kernel Prediction More Powerful than Gating in Convolutional Neural Networks?
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
Incorporating probabilistic domain knowledge into deep multiple instance learning
Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
In-Context Language Learning: Architectures and Algorithms
Gated Linear Attention Transformers with Hardware-Efficient Training
Agnostic Sample Compression Schemes for Regression
ArtWhisperer: A Dataset for Characterizing Human-AI Interactions in Artistic Creations
Dealing With Unbounded Gradients in Stochastic Saddle-point Optimization
Reducing Item Discrepancy via Differentially Private Robust Embedding Alignment for Privacy-Preserving Cross Domain Recommendation
Masked Face Recognition with Generative-to-Discriminative Representations
Recovering the Pre-Fine-Tuning Weights of Generative Models
Plug-in Performative Optimization
Understanding Finetuning for Factual Knowledge Extraction
Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge
Estimating Canopy Height at Scale
Learning to Predict Mutational Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning
Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities
A Global Geometric Analysis of Maximal Coding Rate Reduction
The Pitfalls of Next-Token Prediction
Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
A Language Model’s Guide Through Latent Space
PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming
One Meta-tuned Transformer is What You Need for Few-shot Learning
Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought
Conformal Prediction for Deep Classifier via Label Ranking
Position: TrustLLM: Trustworthiness in Large Language Models
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Multiplicative Weights Update, Area Convexity and Random Coordinate Descent for Densest Subgraph Problems
Representation Surgery: Theory and Practice of Affine Steering
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Accelerating Federated Learning with Quick Distributed Mean Estimation
From Classification Accuracy to Proper Scoring Rules: Elicitability of Probabilistic Top List Predictions
Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Rate-Optimal Policy Optimization for Linear Markov Decision Processes
Fast Peer Adaptation with Context-aware Exploration
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
Learning to Infer Generative Template Programs for Visual Concepts
Gibbs Sampling of Continuous Potentials on a Quantum Computer
Dual Operating Modes of In-Context Learning
D-Flow: Differentiating through Flows for Controlled Generation
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
Unveiling the Dynamics of Information Interplay in Supervised Learning
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment
Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity
Can AI Assistants Know What They Don't Know?
Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery
Classification Under Strategic Self-Selection
Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration
Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction
USTAD: Unified Single-model Training Achieving Diverse Scores for Information Retrieval
StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
Multi-layer Rehearsal Feature Augmentation for Class-Incremental Learning
On the Role of Edge Dependency in Graph Generative Models
Consistent Long-Term Forecasting of Ergodic Dynamical Systems
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Learning to Explore in POMDPs with Informational Rewards
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks
Structure Your Data: Towards Semantic Graph Counterfactuals
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Accelerating Look-ahead in Bayesian Optimization: Multilevel Monte Carlo is All you Need
A Persuasive Approach to Combating Misinformation
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Self-Supervised Interpretable End-to-End Learning via Latent Functional Modularity
High-Dimensional Bayesian Optimization via Semi-Supervised Learning with Optimized Unlabeled Data Sampling
A Unified Adaptive Testing System Enabled by Hierarchical Structure Search
Observable Propagation: Uncovering Feature Vectors in Transformers
Complexity Matters: Feature Learning in the Presence of Spurious Correlations
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Copyright Traps for Large Language Models
PAGER: Accurate Failure Characterization in Deep Regression Models
Policy Evaluation for Variance in Average Reward Reinforcement Learning
Interpreting Equivariant Representations
Risk Estimation in a Markov Cost Process: Lower and Upper Bounds
Physics and Lie symmetry informed Gaussian processes
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Robust Yet Efficient Conformal Prediction Sets
On the Trajectory Regularity of ODE-based Diffusion Sampling
CF-OPT: Counterfactual Explanations for Structured Prediction
Differentiable Weightless Neural Networks
Adaptive Observation Cost Control for Variational Quantum Eigensolvers
Learning Low-dimensional Latent Dynamics from High-dimensional Observations: Non-asymptotics and Lower Bounds
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation
Kepler codebook
Tandem Transformers for Inference Efficient LLMs