ICML
2024
Papers
Learning to Compile Programs to Neural Networks
MADA: Meta-Adaptive Optimizers Through Hyper-Gradient Descent
The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
Faithfulness Measurable Masked Language Models
Graphon Mean Field Games with a Representative Player: Analysis and Learning Algorithm
Controlled Decoding from Language Models
Position: Stop Making Unscientific AGI Performance Claims
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Learning Optimal Projection for Forecast Reconciliation of Hierarchical Time Series
ULAREF: A Unified Label Refinement Framework for Learning with Inaccurate Supervision
Swallowing the Bitter Pill: Simplified Scalable Conformer Generation
StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
Neural Networks Learn Statistics of Increasing Complexity
Do Transformer World Models Give Better Policy Gradients?
Augmenting Decision with Hypothesis in Reinforcement Learning
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
Detecting Any instruction-to-answer interaction relationship: Universal Instruction-to-Answer Navigator for Med-VQA
Position: Categorical Deep Learning is an Algebraic Theory of All Architectures
Leverage Class-Specific Accuracy to Guide Data Generation for Improving Image Classification
Position: The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
Reducing Item Discrepancy via Differentially Private Robust Embedding Alignment for Privacy-Preserving Cross Domain Recommendation
Improving Adversarial Energy-Based Model via Diffusion Process
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
Evaluating Instrument Validity using the Principle of Independent Mechanisms
Achieving Lossless Gradient Sparsification via Mapping to Alternative Space in Federated Learning
One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment
ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data
Graph Distillation with Eigenbasis Matching
How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization
Optimal Differentially Private Model Training with Public Data
Causal Action Influence Aware Counterfactual Data Augmentation
Modelling Microbial Communities with Graph Neural Networks
Learning with 3D rotations, a hitchhiker's guide to SO(3)
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Deep Stochastic Mechanics
PIDformer: Transformer Meets Control Theory
The Merit of River Network Topology for Neural Flood Forecasting
DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation
A Statistical Framework for Data-dependent Retrieval-Augmented Models
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Conditional Normalizing Flows for Active Learning of Coarse-Grained Molecular Representations
Scalable Online Exploration via Coverability
Empowering Graph Invariance Learning with Deep Spurious Infomax
Regularizing with Pseudo-Negatives for Continual Self-Supervised Learning
CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
PhAST: Physics-Aware, Scalable, and Task-Specific GNNs for Accelerated Catalyst Design
Multi-layer Rehearsal Feature Augmentation for Class-Incremental Learning
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Non-Vacuous Generalization Bounds for Large Language Models
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
USTAD: Unified Single-model Training Achieving Diverse Scores for Information Retrieval
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency
Private and Federated Stochastic Convex Optimization: Efficient Strategies for Centralized Systems
Sliced-Wasserstein Estimation with Spherical Harmonics as Control Variates
Diffusive Gibbs Sampling
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels
BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations
Sparse Inducing Points in Deep Gaussian Processes: Enhancing Modeling with Denoising Diffusion Variational Inference
Fair Classification with Partial Feedback: An Exploration-Based Data Collection Approach
Wukong: Towards a Scaling Law for Large-Scale Recommendation
In-Context Unlearning: Language Models as Few-Shot Unlearners
Deep Regression Representation Learning with Topology
Statistical Test for Attention Maps in Vision Transformers
Hybrid Reinforcement Learning from Offline Observation Alone
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching
PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders
MaxMin-RLHF: Alignment with Diverse Human Preferences
Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
HAMLET: Graph Transformer Neural Operator for Partial Differential Equations
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
By Tying Embeddings You Are Assuming the Distributional Hypothesis
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Generalizing Orthogonalization for Models with Non-Linearities
ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance
Jacobian Regularizer-based Neural Granger Causality
Position: Opportunities Exist for Machine Learning in Magnetic Fusion Energy
Simple linear attention language models balance the recall-throughput tradeoff
MGit: A Model Versioning and Management System
Multi-Factor Adaptive Vision Selection for Egocentric Video Question Answering
Consistent Long-Term Forecasting of Ergodic Dynamical Systems
Efficient Contrastive Learning for Fast and Accurate Inference on Graphs
Position: Automatic Environment Shaping is the Next Frontier in RL
Disentanglement Learning via Topology
Autoformalizing Euclidean Geometry
Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees
Bayesian Exploration Networks
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Error Feedback Can Accurately Compress Preconditioners
Feature Importance Disparities for Data Bias Investigations
Relational DNN Verification With Cross Executional Bound Refinement
OAK: Enriching Document Representations using Auxiliary Knowledge for Extreme Classification
Online conformal prediction with decaying step sizes
Reshape and Adapt for Output Quantization (RAOQ): Quantization-aware Training for In-memory Computing Systems
Probabilistic Constrained Reinforcement Learning with Formal Interpretability
Projection-Free Online Convex Optimization with Time-Varying Constraints
Prior Specification for Bayesian Matrix Factorization via Prior Predictive Matching
Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks
An Unsupervised Approach for Periodic Source Detection in Time Series
A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle
Scaling Tractable Probabilistic Circuits: A Systems Perspective
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis
Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Principled Gradient-Based MCMC for Conditional Sampling of Text
Adaptive Stabilization Based on Machine Learning for Column Generation
Sample as you Infer: Predictive Coding with Langevin Dynamics
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
Learning with Partial-Label and Unlabeled Data: A Uniform Treatment for Supervision Redundancy and Insufficiency
LCA-on-the-Line: Benchmarking Out of Distribution Generalization with Class Taxonomies
Position: On the Possibilities of AI-Generated Text Detection
AI Alignment with Changing and Influenceable Reward Functions
Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate
High-dimensional Linear Bandits with Knapsacks
Faster Streaming and Scalable Algorithms for Finding Directed Dense Subgraphs in Large Graphs
Towards Neural Architecture Search through Hierarchical Generative Modeling
Contrastive Predict-and-Search for Mixed Integer Linear Programs
High-Dimensional Bayesian Optimization via Semi-Supervised Learning with Optimized Unlabeled Data Sampling
Fair Data Representation for Machine Learning at the Pareto Frontier
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition
Loss Shaping Constraints for Long-Term Time Series Forecasting
Exploring the Enigma of Neural Dynamics Through A Scattering-Transform Mixer Landscape for Riemannian Manifold
$\bf{\Phi}_\textrm{Flow}$: Differentiable Simulations for PyTorch, TensorFlow and Jax
Tackling Prevalent Conditions in Unsupervised Combinatorial Optimization: Cardinality, Minimum, Covering, and More
CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models
Realistic Unsupervised CLIP Fine-tuning with Universal Entropy Optimization
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
Repoformer: Selective Retrieval for Repository-Level Code Completion
Accelerating Federated Learning with Quick Distributed Mean Estimation
On Least Square Estimation in Softmax Gating Mixture of Experts
QBMK: Quantum-based Matching Kernels for Un-attributed Graphs
Sampling is as easy as keeping the consistency: convergence guarantee for Consistency Models
Conditional Common Entropy for Instrumental Variable Testing and Partial Identification
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
Effect-Invariant Mechanisms for Policy Generalization
A General Framework for Learning from Weak Supervision
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling
UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning
A Dual-module Framework for Counterfactual Estimation over Time
An Effective Dynamic Gradient Calibration Method for Continual Learning
Proteus: Exploring Protein Structure Generation for Enhanced Designability and Efficiency
Scribble-Supervised Semantic Segmentation with Prototype-based Feature Augmentation
RMIB: Representation Matching Information Bottleneck for Matching Text Representations
Modular Learning of Deep Causal Generative Models for High-dimensional Causal Inference
Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-Loop and Hessian-Free Solution Strategy
Autaptic Synaptic Circuit Enhances Spatio-temporal Predictive Learning of Spiking Neural Networks
Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Language Generation with Strictly Proper Scoring Rules
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Two Tales of Single-Phase Contrastive Hebbian Learning
D-Flow: Differentiating through Flows for Controlled Generation
Position: Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling
Overcoming Data and Model heterogeneities in Decentralized Federated Learning via Synthetic Anchors
From Geometry to Causality- Ricci Curvature and the Reliability of Causal Inference on Networks
FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler
Total Variation Floodgate for Variable Importance Inference in Classification
Position: The Reasonable Person Standard for AI
Learning High-Order Relationships of Brain Regions
Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts
Prototypical Transformer As Unified Motion Learners
LESS: Selecting Influential Data for Targeted Instruction Tuning
Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
Model Assessment and Selection under Temporal Distribution Shift
Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
Causal Inference out of Control: Estimating Performativity without Treatment Randomization
Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation
ILILT: Implicit Learning of Inverse Lithography Technologies
Discovering Mixtures of Structural Causal Models from Time Series Data
StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation
CogBench: a large language model walks into a psychology lab
Density Ratio Estimation with Doubly Strong Robustness
Reparameterized Importance Sampling for Robust Variational Bayesian Neural Networks
LLaGA: Large Language and Graph Assistant
A3S: A General Active Clustering Method with Pairwise Constraints
Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness
Tell, Don't Show: Language Guidance Eases Transfer Across Domains in Images and Videos
Estimating the Permanent by Nesting Importance Sampling
All-in-one simulation-based inference
High-Performance Temporal Reversible Spiking Neural Networks with $\mathcal{O}(L)$ Training Memory and $\mathcal{O}(1)$ Inference Cost
Training-Free Long-Context Scaling of Large Language Models
Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
Adaptive Group Personalization for Federated Mutual Transfer Learning
PairNet: Training with Observed Pairs to Estimate Individual Treatment Effect
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers
Feedback Efficient Online Fine-Tuning of Diffusion Models
Measures of diversity and space-filling designs for categorical data
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
HarmonyDream: Task Harmonization Inside World Models
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
Improving Transformers with Dynamically Composable Multi-Head Attention
Privacy-Preserving Instructions for Aligning Large Language Models
Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts
Do Topological Characteristics Help in Knowledge Distillation?
Revealing Vision-Language Integration in the Brain with Multimodal Networks
Risk-Sensitive Policy Optimization via Predictive CVaR Policy Gradient
Learning Scale-Aware Spatio-temporal Implicit Representation for Event-based Motion Deblurring
Position: TrustLLM: Trustworthiness in Large Language Models
How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers
A Differentiable Partially Observable Generalized Linear Model with Forward-Backward Message Passing
Position: Towards Implicit Prompt For Text-To-Image Models
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity
Auto-Encoding Morph-Tokens for Multimodal LLM
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
Tandem Transformers for Inference Efficient LLMs
Verification of Machine Unlearning is Fragile
Towards Certified Unlearning for Deep Neural Networks
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
Critical feature learning in deep neural networks
Efficient Online Set-valued Classification with Bandit Feedback
How Graph Neural Networks Learn: Lessons from Training Dynamics
Scalable AI Safety via Doubly-Efficient Debate
On the Expressive Power of Spectral Invariant Graph Neural Networks
Do Efficient Transformers Really Save Computation?
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth
Make-A-Shape: a Ten-Million-scale 3D Shape Model
Homomorphism Counts for Graph Neural Networks: All About That Basis
Position: Topological Deep Learning is the New Frontier for Relational Learning
Can AI Assistants Know What They Don't Know?
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Quasi-Monte Carlo Features for Kernel Approximation
Position: Future Directions in the Theory of Graph Machine Learning
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution
Using Left and Right Brains Together: Towards Vision and Language Planning
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization
Quantum Positional Encodings for Graph Neural Networks
Weighted distance nearest neighbor condensing
Pre-Training Protein Bi-level Representation Through Span Mask Strategy On 3D Protein Chains
Position: Leverage Foundational Models for Black-Box Optimization
Inferring Change Points in High-Dimensional Linear Regression via Approximate Message Passing
Finite Volume Features, Global Geometry Representations, and Residual Training for Deep Learning-based CFD Simulation
Efficient Error Certification for Physics-Informed Neural Networks
PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming
Amortized Variational Deep Kernel Learning
O$n$ Learning Deep O($n$)-Equivariant Hyperspheres
Position: Tensor Networks are a Valuable Asset for Green AI
ReLUs Are Sufficient for Learning Implicit Neural Representations
Learning in Feature Spaces via Coupled Covariances: Asymmetric Kernel SVD and Nyström method
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition
Rethinking Specificity in SBDD: Leveraging Delta Score and Energy-Guided Diffusion
Sparser, Better, Deeper, Stronger: Improving Static Sparse Training with Exact Orthogonal Initialization
Scalable Safe Policy Improvement for Factored Multi-Agent MDPs
Evaluation of Test-Time Adaptation Under Computational Time Constraints
Graph Attention Retrospective
Confidence Aware Inverse Constrained Reinforcement Learning
Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise
Adaptive Learning of Density Ratios in RKHS
Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning
Amend to Alignment: Decoupled Prompt Tuning for Mitigating Spurious Correlation in Vision-Language Models
AI Control: Improving Safety Despite Intentional Subversion
Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation
RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching
Fast Adversarial Attacks on Language Models In One GPU Minute
SurfPro: Functional Protein Design Based on Continuous Surface
Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
An Analysis of Linear Time Series Forecasting Models
Position: Standardization of Behavioral Use Clauses is Necessary for the Adoption of Responsible Licensing of AI
Stealing part of a production language model
Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents
Sobolev Space Regularised Pre Density Models
Copyright Traps for Large Language Models
Stability-Informed Initialization of Neural Ordinary Differential Equations
Position: Measure Dataset Diversity, Don't Just Claim It
Position: Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Position: Building Guardrails for Large Language Models Requires Systematic Design
Indirectly Parameterized Concrete Autoencoders
Distinguishing the Knowable from the Unknowable with Language Models
Watermarks in the Sand: Impossibility of Strong Watermarking for Language Models
Supervised Matrix Factorization: Local Landscape Analysis and Applications
On The Complexity of First-Order Methods in Stochastic Bilevel Optimization
Compositional Curvature Bounds for Deep Neural Networks
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Variational Inference with Coverage Guarantees in Simulation-Based Inference
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
From Classification Accuracy to Proper Scoring Rules: Elicitability of Probabilistic Top List Predictions
How Language Model Hallucinations Can Snowball
The Illusion of State in State-Space Models
Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
Pursuing Overall Welfare in Federated Learning through Sequential Decision Making
Path-Guided Particle-based Sampling
Graph Positional and Structural Encoder
Structure-based drug design by denoising voxel grids
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Improving Generalization in Offline Reinforcement Learning via Adversarial Data Splitting
A Hierarchical Adaptive Multi-Task Reinforcement Learning Framework for Multiplier Circuit Design
Interpreting and Improving Large Language Models in Arithmetic Calculation
Fast Co-Training under Weak Dependence via Stream-Based Active Learning
A Dynamic Algorithm for Weighted Submodular Cover Problem
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
Evolution-Inspired Loss Functions for Protein Representation Learning
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Position: Open-Endedness is Essential for Artificial Superhuman Intelligence
Position: A Safe Harbor for AI Evaluation and Red Teaming
Bringing Motion Taxonomies to Continuous Domains via GPLVM on Hyperbolic manifolds
Position: Technical Research and Talent is Needed for Effective AI Governance
Overcoming Saturation in Density Ratio Estimation by Iterated Regularization
Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee
Environment Design for Inverse Reinforcement Learning
Position: On the Societal Impact of Open Foundation Models
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
BAGEL: Bootstrapping Agents by Guiding Exploration with Language
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics
Ambiguity-Aware Abductive Learning
Code as Reward: Empowering Reinforcement Learning with VLMs
RankSEG: A Consistent Ranking-based Framework for Segmentation
DUPLEX: Dual GAT for Complex Embedding of Directed Graphs
Straight-Through Meets Sparse Recovery: the Support Exploration Algorithm
Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity
Latent Space Symmetry Discovery
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming
StrokeNUWA—Tokenizing Strokes for Vector Graphic Synthesis
Towards AutoAI: Optimizing a Machine Learning System with Black-box and Differentiable Components
Language Models as Science Tutors
Hyperbolic Geometric Latent Diffusion Model for Graph Generation
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
Principled Preferential Bayesian Optimization
Offline Multi-Objective Optimization
A Tale of Tails: Model Collapse as a Change of Scaling Laws
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
Centralized Selection with Preferences in the Presence of Biases
Adaptive Observation Cost Control for Variational Quantum Eigensolvers
StrWAEs to Invariant Representations
An Efficient Maximal Ancestral Graph Listing Algorithm
Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments
Sequential Disentanglement by Extracting Static Information From A Single Sequence Element
Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
Deep Fusion: Efficient Network Training via Pre-trained Initializations
Taylor Videos for Action Recognition
Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI
Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions
DetKDS: Knowledge Distillation Search for Object Detectors
Positive Concave Deep Equilibrium Models
Decouple then Classify: A Dynamic Multi-view Labeling Strategy with Shared and Specific Information
A General Online Algorithm for Optimizing Complex Performance Metrics
Data-Efficient Learning via Clustering-Based Sensitivity Sampling: Foundation Models and Beyond
Scalable Multiple Kernel Clustering: Learning Clustering Structure from Expectation
Learning 1-Bit Tiny Object Detector with Discriminative Feature Refinement
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models
Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics
SiBBlInGS: Similarity-driven Building-Block Inference using Graphs across States
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Shifted Interpolation for Differential Privacy
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Online Learning with Bounded Recall
Enhancing Class-Imbalanced Learning with Pre-Trained Guidance through Class-Conditional Knowledge Distillation
Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing
Time Series Diffusion in the Frequency Domain
Detecting and Identifying Selection Structure in Sequential Data
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
PcLast: Discovering Plannable Continuous Latent States
Neural Operators with Localized Integral and Differential Kernels
In-Context Reinforcement Learning for Variable Action Spaces
Convex and Bilevel Optimization for Neural-Symbolic Inference and Learning
Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses
Scaling Laws for the Value of Individual Data Points in Machine Learning
The Relative Value of Prediction in Algorithmic Decision Making
Auto-Linear Phenomenon in Subsurface Imaging
Statistical Inference Under Constrained Selection Bias
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
MFTN: A Multi-scale Feature Transfer Network Based on IMatchFormer for Hyperspectral Image Super-Resolution
Unified Training of Universal Time Series Forecasting Transformers
Efficient and Effective Time-Series Forecasting with Spiking Neural Networks
Ensemble Pruning for Out-of-distribution Generalization
An Empirical Study Into What Matters for Calibrating Vision-Language Models
Rate-Optimal Policy Optimization for Linear Markov Decision Processes
QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Total Variation Distance Meets Probabilistic Inference
A General Framework for Sequential Decision-Making under Adaptivity Constraints
COALA: A Practical and Vision-Centric Federated Learning Platform
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Ditto: Quantization-aware Secure Inference of Transformers upon MPC
Socialized Learning: Making Each Other Better Through Multi-Agent Collaboration
Automating the Selection of Proxy Variables of Unmeasured Confounders
Local Causal Structure Learning in the Presence of Latent Variables
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding
Generalization Bound and New Algorithm for Clean-Label Backdoor Attack
Learning Universal Predictors
Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates
Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics
Constrained Ensemble Exploration for Unsupervised Skill Discovery
Position: Towards Unified Alignment Between Agents, Humans, and Environment
The Linear Representation Hypothesis and the Geometry of Large Language Models
Auctionformer: A Unified Deep Learning Algorithm for Solving Equilibrium Strategies in Auction Games
Transforming and Combining Rewards for Aligning Large Language Models
Extreme Compression of Large Language Models via Additive Quantization
Detecting Influence Structures in Multi-Agent Reinforcement Learning
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Can Implicit Bias Imply Adversarial Robustness?
Position: Application-Driven Innovation in Machine Learning
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Mathematical Framework for Online Social Media Auditing
Generalization in Kernel Regression Under Realistic Assumptions
$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Incentivized Learning in Principal-Agent Bandit Games
Rejuvenating image-GPT as Strong Visual Representation Learners
How Private are DP-SGD Implementations?
Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions
Chain-of-Thought Predictive Control
Data-free Neural Representation Compression with Riemannian Neural Dynamics
Cooperative Graph Neural Networks
Modeling Language Tokens as Functionals of Semantic Fields
Conformal Predictions under Markovian Data
On Universally Optimal Algorithms for A/B Testing
Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery
Verifying message-passing neural networks via topology-based bounds tightening
TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
Imitation Learning from Purified Demonstrations
Roping in Uncertainty: Robustness and Regularization in Markov Games
Scaling Speech Technology to 1,000+ Languages
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
NExT-GPT: Any-to-Any Multimodal LLM
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Understanding Server-Assisted Federated Learning in the Presence of Incomplete Client Participation
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning
How Does Goal Relabeling Improve Sample Efficiency?
Don’t Label Twice: Quantity Beats Quality when Comparing Binary Classifiers on a Budget
Conditionally-Conjugate Gaussian Process Factor Analysis for Spike Count Data via Data Augmentation
Handling Heterogeneous Curvatures in Bandit LQR Control
Agent Instructs Large Language Models to be General Zero-Shot Reasoners
Impact of Decentralized Learning on Player Utilities in Stackelberg Games
Transitional Uncertainty with Layered Intermediate Predictions
Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples
Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning
Hypergraph-enhanced Dual Semi-supervised Graph Classification
What is Dataset Distillation Learning?
Demystifying SGD with Doubly Stochastic Gradients
Amortizing Pragmatic Program Synthesis with Rankings
Learning Shadow Variable Representation for Treatment Effect Estimation under Collider Bias
CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks
${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning
RODEO: Robust Outlier Detection via Exposing Adaptive Out-of-Distribution Samples
Performative Prediction with Bandit Feedback: Learning through Reparameterization
Provable Benefits of Local Steps in Heterogeneous Federated Learning for Neural Networks: A Feature Learning Perspective
On the Implicit Bias of Adam
No Double Descent in Principal Component Regression: A High-Dimensional Analysis
Learning Graph Representation via Graph Entropy Maximization
Drug Discovery with Dynamic Goal-aware Fragments
Memory Efficient Neural Processes via Constant Memory Attention Block
Position: Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination
Membership Inference Attacks on Diffusion Models via Quantile Regression
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
Embodied CoT Distillation From LLM To Off-the-shelf Agents
Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
S3O: A Dual-Phase Approach for Reconstructing Dynamic Shape and Skeleton of Articulated Objects from Single Monocular Video
Robust Classification via a Single Diffusion Model
Model-based Reinforcement Learning for Parameterized Action Spaces
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
A New Branch-and-Bound Pruning Framework for $\ell_0$-Regularized Problems
Graph Neural Stochastic Diffusion for Estimating Uncertainty in Node Classification
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
Ai-sampler: Adversarial Learning of Markov kernels with involutive maps
Online Algorithms with Uncertainty-Quantified Predictions
Conformal Prediction for Deep Classifier via Label Ranking
Randomized Confidence Bounds for Stochastic Partial Monitoring
A Computational Framework for Solving Wasserstein Lagrangian Flows
Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Unsupervised Parameter-free Simplicial Representation Learning with Scattering Transforms
Improving Equivariant Graph Neural Networks on Large Geometric Graphs via Virtual Nodes Learning
An Interpretable Evaluation of Entropy-based Novelty of Generative Models
What’s the score? Automated Denoising Score Matching for Nonlinear Diffusions
A Unified Framework for Learning with Nonlinear Model Classes from Arbitrary Linear Samples
Position: A Call to Action for a Human-Centered AutoML Paradigm
Sample-Efficient Multiagent Reinforcement Learning with Reset Replay
Recurrent Early Exits for Federated Learning with Heterogeneous Clients
On the Hardness of Probabilistic Neurosymbolic Learning
What is the Long-Run Distribution of Stochastic Gradient Descent? A Large Deviations Analysis
Kernel Debiased Plug-in Estimation: Simultaneous, Automated Debiasing without Influence Functions for Many Target Parameters
DiffDA: a Diffusion model for weather-scale Data Assimilation
Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis
Stereographic Spherical Sliced Wasserstein Distances
Comparing Graph Transformers via Positional Encodings
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Simulation-Based Inference with Quantile Regression
UGrid: An Efficient-And-Rigorous Neural Multigrid Solver for Linear PDEs
Lookbehind-SAM: k steps back, 1 step forward
Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples
Delaunay Graph: Addressing Over-Squashing and Over-Smoothing Using Delaunay Triangulation
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
How do Transformers Perform In-Context Autoregressive Learning?
Understanding the Training Speedup from Sampling with Approximate Losses
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
Amortized Equation Discovery in Hybrid Dynamical Systems
High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization
Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits
Sparse Dimensionality Reduction Revisited
Fast White-Box Adversarial Streaming Without a Random Oracle
Language Models with Conformal Factuality Guarantees
Slicing Mutual Information Generalization Bounds for Neural Networks
BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks
Asymmetry in Low-Rank Adapters of Foundation Models
DNCs Require More Planning Steps
Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning
Pairwise Alignment Improves Graph Domain Adaptation
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach
Keypoint-based Progressive Chain-of-Thought Distillation for LLMs
MD tree: a model-diagnostic tree grown on loss landscape
Barrier Algorithms for Constrained Non-Convex Optimization
Mitigating Privacy Risk in Membership Inference by Convex-Concave Loss
Zero-Shot Reinforcement Learning via Function Encoders
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
Transferable Facial Privacy Protection against Blind Face Restoration via Domain-Consistent Adversarial Obfuscation
On a Combinatorial Problem Arising in Machine Teaching
Online Learning under Budget and ROI Constraints via Weak Adaptivity
Fast Algorithms for Hypergraph PageRank with Applications to Semi-Supervised Learning
In value-based deep reinforcement learning, a pruned network is a good network
Projecting Molecules into Synthesizable Chemical Spaces
Language Models as Semantic Indexers
Low-Cost High-Power Membership Inference Attacks
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning
Optimal Eye Surgeon: Finding image priors through sparse generators at initialization
Stochastic Quantum Sampling for Non-Logconcave Distributions and Estimating Partition Functions
Large Language Models are Geographically Biased
Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations
LASER: Linear Compression in Wireless Distributed Optimization
MS-TIP: Imputation Aware Pedestrian Trajectory Prediction
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Don't trust your eyes: on the (un)reliability of feature visualizations
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation
Designing Decision Support Systems using Counterfactual Prediction Sets
Position: Intent-aligned AI Systems Must Optimize for Agency Preservation
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
In-context Learning on Function Classes Unveiled for Transformers
Linguistic Calibration of Long-Form Generations
GNNs Also Deserve Editing, and They Need It More Than Once
PASOA- PArticle baSed Bayesian Optimal Adaptive design
SILVER: Single-loop variance reduction and application to federated learning
UPOCR: Towards Unified Pixel-Level OCR Interface
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model
Reweighted Solutions for Weighted Low Rank Approximation
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding
Implicit Regularization in Feedback Alignment Learning Mechanisms for Neural Networks
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Extracting Training Data From Document-Based VQA Models
Cluster-Aware Similarity Diffusion for Instance Retrieval
Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
Operator SVD with Neural Networks via Nested Low-Rank Approximation
Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty
Disguised Copyright Infringement of Latent Diffusion Models
Classification Under Strategic Self-Selection
Sparse-to-dense Multimodal Image Registration via Multi-Task Learning
Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search
Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective
An Intrinsic Vector Heat Network
More Flexible PAC-Bayesian Meta-Learning by Learning Learning Algorithms
Robust Inverse Constrained Reinforcement Learning under Model Misspecification
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation
Position: Understanding LLMs Requires More Than Statistical Generalization
Mechanistic Design and Scaling of Hybrid Architectures
SelMatch: Effectively Scaling Up Dataset Distillation via Selection-Based Initialization and Partial Updates by Trajectory Matching
Efficient Algorithms for Empirical Group Distributionally Robust Optimization and Beyond
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning
Position: Mission Critical – Satellite Data is a Distinct Modality in Machine Learning
Nesting Particle Filters for Experimental Design in Dynamical Systems
A Theory of Fault-Tolerant Learning
Generating In-Distribution Proxy Graphs for Explaining Graph Neural Networks
Q-value Regularized Transformer for Offline Reinforcement Learning
BWS: Best Window Selection Based on Sample Scores for Data Pruning across Broad Ranges
Human-like Category Learning by Injecting Ecological Priors from Large Language Models into Neural Networks
On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery
Position: Amazing Things Come From Having Many Good Models
How Spurious Features are Memorized: Precise Analysis for Random and NTK Features
ReLU Network with Width $d+\mathcal{O}(1)$ Can Achieve Optimal Approximation Rate
Can a Few Decide for Many? The Metric Distortion of Sortition
Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models
Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints
Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
Conformalized Adaptive Forecasting of Heterogeneous Trajectories
Collage: Light-Weight Low-Precision Strategy for LLM Training
STEER: Assessing the Economic Rationality of Large Language Models
Predictive Coding beyond Correlations
Thermometer: Towards Universal Calibration for Large Language Models
Completing Visual Objects via Bridging Generation and Segmentation
Adaptive Online Experimental Design for Causal Discovery
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
Combining Experimental and Historical Data for Policy Evaluation
Decoding-time Realignment of Language Models
Learning the Uncertainty Sets of Linear Control Systems via Set Membership: A Non-asymptotic Analysis
Bayesian Regret Minimization in Offline Bandits
Soft Prompt Recovers Compressed LLMs, Transferably
Gambling-Based Confidence Sequences for Bounded Random Vectors
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning
Unsupervised Domain Adaptation for Anatomical Structure Detection in Ultrasound Images
A2Q+: Improving Accumulator-Aware Weight Quantization
Low-Rank Similarity Mining for Multimodal Dataset Distillation
Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance
Adaptive Accompaniment with ReaLchords
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
Smooth Min-Max Monotonic Networks
Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency
When Representations Align: Universality in Representation Learning Dynamics
Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration
Convergence and Trade-Offs in Riemannian Gradient Descent and Riemannian Proximal Point
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts
DistiLLM: Towards Streamlined Distillation for Large Language Models
Federated Combinatorial Multi-Agent Multi-Armed Bandits
SMaRt: Improving GANs with Score Matching Regularity
Density-Softmax: Efficient Test-time Model for Uncertainty Estimation and Robustness under Distribution Shifts
On the Complexity of Finite-Sum Smooth Optimization under the Polyak–Łojasiewicz Condition
Meta-Reinforcement Learning Robust to Distributional Shift Via Performing Lifelong In-Context Learning
Revisiting Inexact Fixed-Point Iterations for Min-Max Problems: Stochasticity and Structured Nonconvexity
Predictive Linear Online Tracking for Unknown Targets
Unsupervised Concept Discovery Mitigates Spurious Correlations
Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training
GaussianPro: 3D Gaussian Splatting with Progressive Propagation
Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning
Understanding Stochastic Natural Gradient Variational Inference
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Causal Discovery via Conditional Independence Testing with Proxy Variables
Semantically-correlated memories in a dense associative model
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
When and How Does In-Distribution Label Help Out-of-Distribution Detection?
High-Probability Bound for Non-Smooth Non-Convex Stochastic Optimization with Heavy Tails
Intersecting-Boundary-Sensitive Fingerprinting for Tampering Detection of DNN Models
R2E: Turning any Github Repository into a Programming Agent Environment
Towards the Theory of Unsupervised Federated Learning: Non-asymptotic Analysis of Federated EM Algorithms
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
A Generative Approach for Treatment Effect Estimation under Collider Bias: From an Out-of-Distribution Perspective
Variational Schrödinger Diffusion Models
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
State-Free Inference of State-Space Models: The Transfer Function Approach
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
Acquisition Conditioned Oracle for Nongreedy Active Feature Acquisition
Pluvial Flood Emulation with Hydraulics-informed Message Passing
Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning
Exploiting Negative Samples: A Catalyst for Cohort Discovery in Healthcare Analytics
Position: Foundation Agents as the Paradigm Shift for Decision Making
Minimizing $f$-Divergences by Interpolating Velocity Fields
AlphaFold Meets Flow Matching for Generating Protein Ensembles
Efficient Algorithms for Sum-Of-Minimum Optimization
FedMBridge: Bridgeable Multimodal Federated Learning
Does Label Smoothing Help Deep Partial Label Learning?
PGODE: Towards High-quality System Dynamics Modeling
Efficient Precision and Recall Metrics for Assessing Generative Models using Hubness-aware Sampling
Plug-in Performative Optimization
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
Simple Ingredients for Offline Reinforcement Learning
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation
How Transformers Learn Causal Structure with Gradient Descent
Averaging $n$-step Returns Reduces Variance in Reinforcement Learning
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
Non-clairvoyant Scheduling with Partial Predictions
Tight Partial Identification of Causal Effects with Marginal Distribution of Unmeasured Confounders
Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
FADAS: Towards Federated Adaptive Asynchronous Optimization
Distributional Bellman Operators over Mean Embeddings
Learning from Integral Losses in Physics Informed Neural Networks
Private Gradient Descent for Linear Regression: Tighter Error Bounds and Instance-Specific Uncertainty Estimation
Evaluating Model Bias Requires Characterizing its Mistakes
Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies
Learning Decision Policies with Instrumental Variables through Double Machine Learning
Beyond the Calibration Point: Mechanism Comparison in Differential Privacy
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Sampling-based Multi-dimensional Recalibration
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
On the Effectiveness of Supervision in Asymmetric Non-Contrastive Learning
Truly No-Regret Learning in Constrained MDPs
Nearest Neighbour Score Estimators for Diffusion Generative Models
Behavior Generation with Latent Actions
Privacy Profiles for Private Selection
Peeking with PEAK: Sequential, Nonparametric Composite Hypothesis Tests for Means of Multiple Data Streams
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Hyperbolic Active Learning for Semantic Segmentation under Domain Shift
Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency
Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching
Subsampling is not Magic: Why Large Batch Sizes Work for Differentially Private Stochastic Optimisation
Representing Molecules as Random Walks Over Interpretable Grammars
Data-free Distillation of Diffusion Models with Bootstrapping
Applying language models to algebraic topology: generating simplicial cycles using multi-labeling in Wu's formula
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Can Gaussian Sketching Converge Faster on a Preconditioned Landscape?
Policy-conditioned Environment Models are More Generalizable
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments
The Expressive Power of Path-Based Graph Neural Networks
On Mechanistic Knowledge Localization in Text-to-Image Generative Models
Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding
Differentially Private Bias-Term Fine-tuning of Foundation Models
Encodings for Prediction-based Neural Architecture Search
EDISON: Enhanced Dictionary-Induced Tensorized Incomplete Multi-View Clustering with Gaussian Error Rank Minimization
Lessons from Generalization Error Analysis of Federated Learning: You May Communicate Less Often!
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
Dense Reward for Free in Reinforcement Learning from Human Feedback
Asymptotically Optimal and Computationally Efficient Average Treatment Effect Estimation in A/B testing
Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Contrastive Learning for Clinical Outcome Prediction with Partial Data Sources
Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning
Interaction-based Retrieval-augmented Diffusion Models for Protein-specific 3D Molecule Generation
Leveraging Attractor Dynamics in Spatial Navigation for Better Language Parsing
On the Generalization of Stochastic Gradient Descent with Momentum
Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization
Theoretical Guarantees for Variational Inference with Fixed-Variance Mixture of Gaussians
Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds
Towards Theoretical Understanding of Learning Large-scale Dependent Data via Random Features
Few-shot Adaptation to Distribution Shifts By Mixing Source and Target Embeddings
Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
Generalization Bounds for Heavy-Tailed SDEs through the Fractional Fokker-Planck Equation
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
Saliency strikes back: How filtering out high frequencies improves white-box explanations
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
Robust Inverse Graphics via Probabilistic Inference
Community-Invariant Graph Contrastive Learning
Exploration by Optimization with Hybrid Regularizers: Logarithmic Regret with Adversarial Robustness in Partial Monitoring
Inferring the Long-Term Causal Effects of Long-Term Treatments from Short-Term Experiments
Causality Based Front-door Defense Against Backdoor Attack on Language Models
Balanced Resonate-and-Fire Neurons
ELTA: An Enhancer against Long-Tail for Aesthetics-oriented Models
Deep Equilibrium Models are Almost Equivalent to Not-so-deep Explicit Models for High-dimensional Gaussian Mixtures
On The Statistical Complexity of Offline Decision-Making
Contrasting Multiple Representations with the Multi-Marginal Matching Gap
Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Neural Jump-Diffusion Temporal Point Processes
Collapse-Aware Triplet Decoupling for Adversarially Robust Image Retrieval
Dynamic Spectral Clustering with Provable Approximation Guarantee
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes
Understanding the Impact of Introducing Constraints at Inference Time on Generalization Error
The Max-Min Formulation of Multi-Objective Reinforcement Learning: From Theory to a Model-Free Algorithm
Just Cluster It: An Approach for Exploration in High-Dimensions using Clustering and Pre-Trained Representations
FESSNC: Fast Exponentially Stable and Safe Neural Controller
SIN: Selective and Interpretable Normalization for Long-Term Time Series Forecasting
Forget Sharpness: Perturbed Forgetting of Model Biases Within SAM Dynamics
Accelerating Heterogeneous Federated Learning with Closed-form Classifiers
Bayesian Adaptation of Network Depth and Width for Continual Learning
Position: Insights from Survey Methodology can Improve Training Data
Precise Accuracy / Robustness Tradeoffs in Regression: Case of General Norms
Feature Distribution on Graph Topology Mediates the Effect of Graph Convolution: Homophily Perspective
Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
Batch and match: black-box variational inference with a score-based divergence
Sequential Kernel Goodness-of-fit Testing
Learning Decision Trees and Forests with Algorithmic Recourse
Riemannian coordinate descent algorithms on matrix manifolds
Fundamental Limits of Distributed Covariance Matrix Estimation Under Communication Constraints
Timer: Generative Pre-trained Transformers Are Large Time Series Models
VideoPrism: A Foundational Visual Encoder for Video Understanding
Differentiable Model Scaling using Differentiable Topk
Policy Evaluation for Variance in Average Reward Reinforcement Learning
Analyzing $D^\alpha$ seeding for $k$-means
Fine-grained Classes and How to Find Them
Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators
SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code
Category-Aware Active Domain Adaptation
Towards Theoretical Understandings of Self-Consuming Generative Models
Language Models Represent Beliefs of Self and Others
Toward Adaptive Reasoning in Large Language Models with Thought Rollback
Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization
EvoluNet: Advancing Dynamic Non-IID Transfer Learning on Graphs
Exploiting Human-AI Dependence for Learning to Defer
Gaussian Processes on Cellular Complexes
Graph Geometry-Preserving Autoencoders
Beyond Regular Grids: Fourier-Based Neural Operators on Arbitrary Domains
Improving Open-Ended Text Generation via Adaptive Decoding
A Universal Transfer Theorem for Convex Optimization Algorithms Using Inexact First-order Oracles
On the Embedding Collapse when Scaling up Recommendation Models
Position: Machine Learning-powered Assessments of the EU Digital Services Act Aid Quantify Policy Impacts on Online Harms
How Smooth Is Attention?
Protein Conformation Generation via Force-Guided SE(3) Diffusion Models
Rolling Diffusion Models
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
Reward Shaping for Reinforcement Learning with An Assistant Reward Agent
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
PAC-Bayesian Error Bound, via Rényi Divergence, for a Class of Linear Time-Invariant State-Space Models
Grokking Group Multiplication with Cosets
Listenable Maps for Audio Classifiers
tinyBenchmarks: evaluating LLMs with fewer examples
Improving Antibody Humanness Prediction using Patent Data
Estimating Canopy Height at Scale
Federated Self-Explaining GNNs with Anti-shortcut Augmentations
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Discounted Adaptive Online Learning: Towards Better Regularization
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
Hard Tasks First: Multi-Task Reinforcement Learning Through Task Scheduling
Optimal Acceleration for Minimax and Fixed-Point Problems is Not Unique
Score-Based Causal Discovery of Latent Variable Causal Models
Test-Time Regret Minimization in Meta Reinforcement Learning
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs
A Bayesian Approach to Online Planning
AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion
Decomposing and Editing Predictions by Modeling Model Computation
A Unified Adaptive Testing System Enabled by Hierarchical Structure Search
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
NExT-Chat: An LMM for Chat, Detection and Segmentation
Identification and Estimation for Nonignorable Missing Data: A Data Fusion Approach
How Universal Polynomial Bases Enhance Spectral Graph Neural Networks: Heterophily, Over-smoothing, and Over-squashing
Learning Reward for Robot Skills Using Large Language Models via Self-Alignment
Neural Tangent Kernels for Axis-Aligned Tree Ensembles
Non-stationary Online Convex Optimization with Arbitrary Delays
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
Sample Average Approximation for Conditional Stochastic Optimization with Dependent Data
Infinite-Horizon Distributionally Robust Regret-Optimal Control
Stability and Multigroup Fairness in Ranking with Uncertain Predictions
Two-Stage Shadow Inclusion Estimation: An IV Approach for Causal Inference under Latent Confounding and Collider Bias
Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation
Risk Aware Benchmarking of Large Language Models
Subhomogeneous Deep Equilibrium Models
On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC
Towards a Self-contained Data-driven Global Weather Forecasting Framework
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning
Generalization Analysis of Stochastic Weight Averaging with General Sampling
Probability Distribution of Hypervolume Improvement in Bi-objective Bayesian Optimization
Value-Evolutionary-Based Reinforcement Learning
Pseudo-Calibration: Improving Predictive Uncertainty Estimation in Unsupervised Domain Adaptation
MAGNOLIA: Matching Algorithms via GNNs for Online Value-to-go Approximation
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Discovering Environments with XRM
CW Complex Hypothesis for Image Data
Magicoder: Empowering Code Generation with OSS-Instruct
Towards a Better Theoretical Understanding of Independent Subnetwork Training
Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
Harmonic Self-Conditioned Flow Matching for joint Multi-Ligand Docking and Binding Site Design
Clifford-Steerable Convolutional Neural Networks
On the Diminishing Returns of Width for Continual Learning
Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States
Network Tight Community Detection
Test-Time Degradation Adaptation for Open-Set Image Restoration
Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities
Enhancing Implicit Shape Generators Using Topological Regularizations
Weakly-Supervised Residual Evidential Learning for Multi-Instance Uncertainty Estimation
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training
Best Arm Identification for Stochastic Rising Bandits
Exploring the LLM Journey from Cognition to Expression with Linear Representations
Cross-domain Open-world Discovery
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
FedREDefense: Defending against Model Poisoning Attacks for Federated Learning using Model Update Reconstruction Error
From Generalization Analysis to Optimization Designs for State Space Models
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model
Deep Neural Room Acoustics Primitive
Allocation Requires Prediction Only if Inequality Is Low
One Size Fits All for Semantic Shifts: Adaptive Prompt Tuning for Continual Learning
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions
A Subquadratic Time Algorithm for Robust Sparse Mean Estimation
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Position: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training
Why Larger Language Models Do In-context Learning Differently?
Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation
Position: AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research
Robust Graph Matching when Nodes are Corrupt
Linear Explanations for Individual Neurons
Novel Spectral Algorithms for the Partial Credit Model
Generalization Analysis for Multi-Label Learning
Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction
Improved Bounds for Pure Private Agnostic Learning: Item-Level and User-Level Privacy
Reinforcement Learning and Regret Bounds for Admission Control
Discrete Latent Perspective Learning for Segmentation and Detection
Simplicity Bias via Global Convergence of Sharpness Minimization
Equivariant Diffusion for Crystal Structure Prediction
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning
Second-Order Uncertainty Quantification: A Distance-Based Approach
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Sign Rank Limitations for Inner Product Graph Decoders
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
A Neural-Preconditioned Poisson Solver for Mixed Dirichlet and Neumann Boundary Conditions
An Embodied Generalist Agent in 3D World
Position: Graph Foundation Models Are Already Here
Enabling Few-Shot Learning with PID Control: A Layer Adaptive Optimizer
No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
Provable Contrastive Continual Learning
On dimensionality of feature vectors in MPNNs
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
SAPG: Split and Aggregate Policy Gradients
Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models
Irregular Multivariate Time Series Forecasting: A Transformable Patching Graph Neural Networks Approach
Active Ranking and Matchmaking, with Perfect Matchings
Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes
Towards General Algorithm Discovery for Combinatorial Optimization: Learning Symbolic Branching Policy from Bipartite Graph
Collaborative Learning with Different Labeling Functions
A Persuasive Approach to Combating Misinformation
Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization
Gaussian Plane-Wave Neural Operator for Electron Density Estimation
PinNet: Pinpoint Instructive Information for Retrieval Augmented Code-to-Text Generation
Testing the Feasibility of Linear Programs with Bandit Feedback
Genie: Generative Interactive Environments
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Energy-based Backdoor Defense without Task-Specific Samples and Model Retraining
Bridging Mini-Batch and Asymptotic Analysis in Contrastive Learning: From InfoNCE to Kernel-Based Losses
MLI Formula: A Nearly Scale-Invariant Solution with Noise Perturbation
Doubly Robust Causal Effect Estimation under Networked Interference via Targeted Learning
Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer
Reducing Balancing Error for Causal Inference via Optimal Transport
Absolute Policy Optimization: Enhancing Lower Probability Bound of Performance with High Confidence
Mastering Zero-Shot Interactions in Cooperative and Competitive Simultaneous Games
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Position: Compositional Generative Modeling: A Single Model is Not All You Need
Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
A Theoretical Analysis of Backdoor Poisoning Attacks in Convolutional Neural Networks
Reflected Flow Matching
Representation Surgery for Multi-Task Model Merging
An Explicit Frame Construction for Normalizing 3D Point Clouds
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay
Adaptive Proximal Gradient Methods Are Universal Without Approximation
A Multimodal Automated Interpretability Agent
A Geometric Decomposition of Finite Games: Convergence vs. Recurrence under Exponential Weights
Mapping the Multiverse of Latent Representations
Few-Shot Unsupervised Implicit Neural Shape Representation Learning with Spatial Adversaries
Benchmarking Deletion Metrics with the Principled Explanations
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Position: Social Environment Design Should be Further Developed for AI-based Policy-Making
Practical Performance Guarantees for Pipelined DNN Inference
Finite Smoothing Algorithm for High-Dimensional Support Vector Machines and Quantile Regression
TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules
Predictive Dynamic Fusion
Locally Differentially Private Decentralized Stochastic Bilevel Optimization with Guaranteed Convergence Accuracy
AutoOS: Make Your OS More Powerful by Exploiting Large Language Models
On the Second-Order Convergence of Biased Policy Gradient Algorithms
A Closer Look at the Limitations of Instruction Tuning
Memory Consolidation Enables Long-Context Video Understanding
Ameliorate Spurious Correlations in Dataset Condensation
Dealing With Unbounded Gradients in Stochastic Saddle-point Optimization
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Mitigating Oversmoothing Through Reverse Process of GNNs for Heterophilic Graphs
A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models
Online Learning in CMDPs: Handling Stochastic and Adversarial Constraints
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation
Naive Bayes Classifiers over Missing Data: Decision and Poisoning
Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
Nash Incentive-compatible Online Mechanism Learning via Weakly Differentially Private Online Learning
Interpretable Deep Clustering for Tabular Data
Accelerating Iterative Retrieval-augmented Language Model Serving with Speculation
How to Escape Sharp Minima with Random Perturbations
Delving into the Convergence of Generalized Smooth Minimax Optimization
Prediction-powered Generalization of Causal Inferences
DPZero: Private Fine-Tuning of Language Models without Backpropagation
Optimizing Watermarks for Large Language Models
Transformers, parallel computation, and logarithmic depth
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
Graph Mixup on Approximate Gromov–Wasserstein Geodesics
AegisFL: Efficient and Flexible Privacy-Preserving Byzantine-Robust Cross-silo Federated Learning
On the Origins of Linear Representations in Large Language Models
High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion
Conformal prediction for multi-dimensional time series by ellipsoidal sets
Bootstrapping Fisher Market Equilibrium and First-Price Pacing Equilibrium
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Limited Preference Aided Imitation Learning from Imperfect Demonstrations
Position: The Platonic Representation Hypothesis
Triadic-OCD: Asynchronous Online Change Detection with Provable Robustness, Optimality, and Convergence
Open Ad Hoc Teamwork with Cooperative Game Theory
A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials
Vanilla Bayesian Optimization Performs Great in High Dimensions
Structure Your Data: Towards Semantic Graph Counterfactuals
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
Compact Optimality Verification for Optimization Proxies
Counterfactual Image Editing
Universal Consistency of Wide and Deep ReLU Neural Networks and Minimax Optimal Convergence Rates for Kolmogorov-Donoho Optimal Function Classes
One-Shot Strategic Classification Under Unknown Costs
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Liouville Flow Importance Sampler
DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
Towards Resource-friendly, Extensible and Stable Incomplete Multi-view Clustering
Exploring Correlations of Self-Supervised Tasks for Graphs
Larimar: Large Language Models with Episodic Memory Control
Automated Loss function Search for Class-imbalanced Node Classification
Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers
What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian Benchmarks
Information Complexity of Stochastic Convex Optimization: Applications to Generalization, Memorization, and Tracing
Differentially Private Representation Learning via Image Captioning
Multigroup Robustness
Boundary Exploration for Bayesian Optimization With Unknown Physical Constraints
Making Old Things New: A Unified Algorithm for Differentially Private Clustering
A Federated Stochastic Multi-level Compositional Minimax Algorithm for Deep AUC Maximization
Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications
SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
Revisit the Essence of Distilling Knowledge through Calibration
Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module
LoCoCo: Dropping In Convolutions for Long Context Compression
Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters
Residual Quantization with Implicit Neural Codebooks
Enabling Uncertainty Estimation in Iterative Neural Networks
Trust the Model Where It Trusts Itself - Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Online Learning in Betting Markets: Profit versus Prediction
Matroid Semi-Bandits in Sublinear Time
Position: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks
Local Feature Selection without Label or Feature Leakage for Interpretable Machine Learning Predictions
On the Last-Iterate Convergence of Shuffling Gradient Methods
On the Convergence of Projected Bures-Wasserstein Gradient Descent under Euclidean Strong Convexity
LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits
Knowledge-aware Reinforced Language Models for Protein Directed Evolution
TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge
Optimistic Multi-Agent Policy Gradient
Unifying Image Processing as Visual Prompting Question Answering
Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment
Non-confusing Generation of Customized Concepts in Diffusion Models
Sign Gradient Descent-based Neuronal Dynamics: ANN-to-SNN Conversion Beyond ReLU Network
Position: Optimization in SciML Should Employ the Function Space Geometry
The Non-linear $F$-Design and Applications to Interactive Learning
Differentially private exact recovery for stochastic block models
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers
Collective Certified Robustness against Graph Injection Attacks
Decentralized Convex Finite-Sum Optimization with Better Dependence on Condition Numbers
Flexible Residual Binarization for Image Super-Resolution
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Collaborative Heterogeneous Causal Inference Beyond Meta-analysis
Federated Representation Learning in the Under-Parameterized Regime
Compressing Large Language Models by Joint Sparsification and Quantization
Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
CaM: Cache Merging for Memory-efficient LLMs Inference
Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise
Criterion Collapse and Loss Distribution Control
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Parallelized Spatiotemporal Slot Binding for Videos
Neural Collapse meets Differential Privacy: Curious behaviors of NoisyGD with Near-Perfect Representation Learning
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
$H$-Consistency Guarantees for Regression
Regression with Multi-Expert Deferral
CKGConv: General Graph Convolution with Continuous Kernels
Differentially Private Domain Adaptation with Theoretical Guarantees
Model-Based Minimum Bayes Risk Decoding for Text Generation
Exploring the Benefit of Activation Sparsity in Pre-training
convSeq: Fast and Scalable Method for Detecting Patterns in Spike Data
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Efficient Non-stationary Online Learning by Wavelets with Applications to Online Distribution Shift Adaptation
Embarrassingly Parallel GFlowNets
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
DFD: Distilling the Feature Disparity Differently for Detectors
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choice
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Prometheus: Out-of-distribution Fluid Dynamics Modeling with Disentangled Graph ODE
A Fixed-Point Approach for Causal Generative Modeling
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
FRAPPÉ: A Group Fairness Framework for Post-Processing Everything
Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments
Improving Neural Logic Machines via Failure Reflection
Differentially Private Post-Processing for Fair Regression
Measuring Stochastic Data Complexity with Boltzmann Influence Functions
Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
Box Facets and Cut Facets of Lifted Multicut Polytopes
Concentration Inequalities for General Functions of Heavy-Tailed Random Variables
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Mimicking Better by Matching the Approximate Action Distribution
Improving Sharpness-Aware Minimization by Lookahead
Learning Mixtures of Gaussian Processes through Random Projection
Clustered Federated Learning via Gradient-based Partitioning
Learning Label Shift Correction for Test-Agnostic Long-Tailed Recognition
Efficient Value Iteration for s-rectangular Robust Markov Decision Processes
Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces
Neural operators meet conjugate gradients: The FCG-NO method for efficient PDE solving
Provable Privacy with Non-Private Pre-Processing
Effects of Exponential Gaussian Distribution on (Double Sampling) Randomized Smoothing
Correlation-Induced Label Prior for Semi-Supervised Multi-Label Learning
Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
Learning to Continually Learn with the Bayesian Principle
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
A Neural-Guided Dynamic Symbolic Network for Exploring Mathematical Expressions from Data
Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics
Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction
Graph Adversarial Diffusion Convolution
Activation-Descent Regularization for Input Optimization of ReLU Networks
Finding NEM-U: Explaining unsupervised representation learning through neural network generated explanation masks
Out-of-Distribution Detection via Deep Multi-Comprehension Ensemble
Predictive Performance Comparison of Decision Policies Under Confounding
GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding
Disentangled Continual Graph Neural Architecture Search with Invariant Modular Supernet
Towards Unified Multi-granularity Text Detection with Interactive Attention
Causal Inference from Competing Treatments
Weisfeiler-Leman at the margin: When more expressivity matters
A Field Guide for Pacing Budget and ROS Constraints
Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty
EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Adversarially Robust Hypothesis Transfer Learning
Stealthy Imitation: Reward-guided Environment-free Policy Stealing
Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift
Active Label Correction for Semantic Segmentation with Foundation Models
Block Acceleration Without Momentum: On Optimal Stepsizes of Block Gradient Descent for Least-Squares
Representation Surgery: Theory and Practice of Affine Steering
Iterative Regularized Policy Optimization with Imperfect Demonstrations
Networked Inequality: Preferential Attachment Bias in Graph Neural Network Link Prediction
A Doubly Recursive Stochastic Compositional Gradient Descent Method for Federated Multi-Level Compositional Optimization
LangCell: Language-Cell Pre-training for Cell Identity Understanding
Constrained Reinforcement Learning Under Model Mismatch
Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks
Retrieval Across Any Domains via Large-scale Pre-trained Model
Neurodegenerative Brain Network Classification via Adaptive Diffusion with Temporal Regularization
QuRating: Selecting High-Quality Data for Training Language Models
Active Statistical Inference
StableMask: Refining Causal Masking in Decoder-only Transformer
DsDm: Model-Aware Dataset Selection with Datamodels
Differentiable Weightless Neural Networks
SCoRe: Submodular Combinatorial Representation Learning
GFlowNet Training by Policy Gradients
ByMI: Byzantine Machine Identification with False Discovery Rate Control
Learning Pseudo-Contractive Denoisers for Inverse Problems
Implicit meta-learning may lead language models to trust more reliable sources
On Multi-Armed Bandit with Impatient Arms
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
Accelerating PDE Data Generation via Differential Operator Action in Solution Space
3D Geometric Shape Assembly via Efficient Point Cloud Matching
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models
MOMENT: A Family of Open Time-series Foundation Models
Auditing Private Prediction
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Towards General Neural Surrogate Solvers with Specialized Neural Accelerators
Ranking-based Client Imitation Selection for Efficient Federated Learning
An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning
How Interpretable Are Interpretable Graph Neural Networks?
The Role of Learning Algorithms in Collective Action
Feedback Loops With Language Models Drive In-Context Reward Hacking
Practical Hamiltonian Monte Carlo on Riemannian Manifolds via Relativity Theory
A Tensor Decomposition Perspective on Second-order RNNs
MusicRL: Aligning Music Generation to Human Preferences
Random Latent Exploration for Deep Reinforcement Learning
Optimal Kernel Quantile Learning with Random Features
Dual Operating Modes of In-Context Learning
On the Identifiability of Switching Dynamical Systems
Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery
On Online Experimentation without Device Identifiers
Asymptotics of feature learning in two-layer networks after one gradient-step
3D-VLA: A 3D Vision-Language-Action Generative World Model
Reducing sequential change detection to sequential estimation
Position: Video as the New Language for Real-World Decision Making
A Geometric Explanation of the Likelihood OOD Detection Paradox
Federated Neuro-Symbolic Learning
SuDA: Support-based Domain Adaptation for Sim2Real Hinge Joint Tracking with Flexible Sensors
Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone Inclusion
Defense against Model Extraction Attack by Bayesian Active Watermarking
Fine-grained Local Sensitivity Analysis of Standard Dot-Product Self-Attention
Contextual Feature Selection with Conditional Stochastic Gates
From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems
Recovering Labels from Local Updates in Federated Learning
Robust Sparse Estimation for Gaussians with Optimal Error under Huber Contamination
KISA: A Unified Keyframe Identifier and Skill Annotator for Long-Horizon Robotics Demonstrations
Faster Adaptive Decentralized Learning Algorithms
Stability and Generalization for Stochastic Recursive Momentum-based Algorithms for (Strongly-)Convex One to $K$-Level Stochastic Optimizations
Fundamental Benefit of Alternating Updates in Minimax Optimization
Position: Why We Must Rethink Empirical Research in Machine Learning
Model-based Reinforcement Learning for Confounded POMDPs
GenCO: Generating Diverse Designs with Combinatorial Constraints
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Viewing Transformers Through the Lens of Long Convolutions Layers
Evolving Subnetwork Training for Large Language Models
Localizing Task Information for Improved Model Merging and Compression
Evaluating Quantized Large Language Models
Deconstructing the Goldilocks Zone of Neural Network Initialization
Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss
Adversarially Robust Deep Multi-View Clustering: A Novel Attack and Defense Framework
Generalized Neural Collapse for a Large Number of Classes
A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models
Outlier-robust Kalman Filtering through Generalised Bayes
Standardized Interpretable Fairness Measures for Continuous Risk Scores
Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction
Unlock the Cognitive Generalization of Deep Reinforcement Learning via Granular Ball Representation
WARM: On the Benefits of Weight Averaged Reward Models
Floating Anchor Diffusion Model for Multi-motif Scaffolding
Nash Learning from Human Feedback
Hybrid Neural Representations for Spherical Data
Reflective Policy Optimization
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
Beyond Point Prediction: Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process
Learning with Adaptive Resource Allocation
Adaptive Robust Learning using Latent Bernoulli Variables
Rethinking DP-SGD in Discrete Domain: Exploring Logistic Distribution in the Realm of signSGD
Provable Interactive Learning with Hindsight Instruction Feedback
Self-Composing Policies for Scalable Continual Reinforcement Learning
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Reference Neural Operators: Learning the Smooth Dependence of Solutions of PDEs on Geometric Deformations
Learning Iterative Reasoning through Energy Diffusion
Online Isolation Forest
Bayesian Program Learning by Decompiling Amortized Knowledge
Understanding Unimodal Bias in Multimodal Deep Linear Networks
Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior
Quantum Algorithms and Lower Bounds for Finite-Sum Optimization
How Far Can Fairness Constraints Help Recover From Biased Data?
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
Replicable Learning of Large-Margin Halfspaces
Enforcing Constraints in RNA Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Attribute Based Interpretable Evaluation Metrics for Generative Models
TSLANet: Rethinking Transformers for Time Series Representation Learning
Self-cognitive Denoising in the Presence of Multiple Noisy Label Sources
Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation
Multi-Agent Reinforcement Learning Meets Leaf Sequencing in Radiotherapy
On Which Nodes Does GCN Fail? Enhancing GCN From the Node Perspective
Parameter-Efficient Fine-Tuning with Controls
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
Toward Availability Attacks in 3D Point Clouds
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Uncertainty-Aware Reward-Free Exploration with General Function Approximation
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Proactive Detection of Voice Cloning with Localized Watermarking
Efficient World Models with Context-Aware Tokenization
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
Out-of-Domain Generalization in Dynamical Systems Reconstruction
TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision
Multi-View Stochastic Block Models
Position: Relational Deep Learning - Graph Representation Learning on Relational Databases
Random features models: a way to study the success of naive imputation
Learning Constraints from Offline Demonstrations via Superior Distribution Correction Estimation
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
Bayesian Knowledge Distillation: A Bayesian Perspective of Distillation with Uncertainty Quantification
On Statistical Learning Theory for Distributional Inputs
Weisfeiler Leman for Euclidean Equivariant Machine Learning
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Reservoir Computing for Short High-Dimensional Time Series: an Application to SARS-CoV-2 Hospitalization Forecast
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Probabilistic Subgoal Representations for Hierarchical Reinforcement Learning
How Deep Do We Need: Accelerating Training and Inference of Neural ODEs via Control Perspective
Why do Variational Autoencoders Really Promote Disentanglement?
SparQ Attention: Bandwidth-Efficient LLM Inference
Knowledge Graphs Can be Learned with Just Intersection Features
Consistent Submodular Maximization
Counterfactual Metarules for Local and Global Recourse
Learning Causal Dynamics Models in Object-Oriented Environments
Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks
Robustly Learning Single-Index Models via Alignment Sharpness
Matrix Information Theory for Self-Supervised Learning
Prior Mismatch and Adaptation in PnP-ADMM with a Nonconvex Convergence Analysis
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
Adaptive Sampling of k-Space in Magnetic Resonance for Rapid Pathology Prediction
Open-Vocabulary Calibration for Fine-tuned CLIP
DFlow: A Generative Model Combining Denoising AutoEncoder and Normalizing Flow for High Fidelity Waveform Generation
Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Tuning-Free Stochastic Optimization
Learning Multiple Secrets in Mastermind
When is Transfer Learning Possible?
Flextron: Many-in-One Flexible Large Language Model
Pi-DUAL: Using privileged information to distinguish clean from noisy labels
Sliding Down the Stairs: How Correlated Latent Variables Accelerate Learning with Neural Networks
Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution
Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts
A Universal Class of Sharpness-Aware Minimization Algorithms
Uncertainty for Active Learning on Graphs
A New Computationally Efficient Algorithm to solve Feature Selection for Functional Data Classification in High-dimensional Spaces
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Identifiability Matters: Revealing the Hidden Recoverable Condition in Unbiased Learning to Rank
A Unified View of FANOVA: A Comprehensive Bayesian Framework for Component Selection and Estimation
EvIL: Evolution Strategies for Generalisable Imitation Learning
QuIP$\#$: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
On the Independence Assumption in Neurosymbolic Learning
Topological Neural Networks go Persistent, Equivariant, and Continuous
ReconBoost: Boosting Can Achieve Modality Reconcilement
On the Generalization of Equivariant Graph Neural Networks
Knowledge Distillation with Auxiliary Variable
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models
Position: Quo Vadis, Unsupervised Time Series Anomaly Detection?
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
CARTE: Pretraining and Transfer for Tabular Learning
Equivariant Deep Weight Space Alignment
Sharpness-Aware Data Generation for Zero-shot Quantization
Optimal Coresets for Low-Dimensional Geometric Median
Generalization Error of Graph Neural Networks in the Mean-field Regime
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
On Convergence of Incremental Gradient for Non-convex Smooth Functions
On Stronger Computational Separations Between Multimodal and Unimodal Machine Learning
The Computational Complexity of Finding Second-Order Stationary Points
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Active Preference Learning for Large Language Models
Revisiting Character-level Adversarial Attacks for Language Models
HyperFields: Towards Zero-Shot Generation of NeRFs from Text
Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design
Data-Efficient Molecular Generation with Hierarchical Textual Inversion
An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network
Compositional Image Decomposition with Diffusion Models
Layerwise Change of Knowledge in Neural Networks
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Vector Quantization Pretraining for EEG Time Series with Random Projection and Phase Alignment
Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs
Predicting and Interpreting Energy Barriers of Metallic Glasses with Graph Neural Networks
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Faster Sampling via Stochastic Gradient Proximal Sampler
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Learning Useful Representations of Recurrent Neural Network Weight Matrices
Highway Value Iteration Networks
Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms
Symmetry Induces Structure and Constraint of Learning
Algorithmic Stability Unleashed: Generalization Bounds with Unbounded Losses
Diffusion Models Encode the Intrinsic Dimension of Data Manifolds
$\mathtt{VITS}$ : Variational Inference Thompson Sampling for contextual bandits
Improved Operator Learning by Orthogonal Attention
ViP: A Differentially Private Foundation Model for Computer Vision
DoRA: Weight-Decomposed Low-Rank Adaptation
Fair Federated Learning via the Proportional Veto Core
Disparate Impact on Group Accuracy of Linearization for Private Inference
Position: Do Not Explain Vision Models Without Context
Contamination-Resilient Anomaly Detection via Adversarial Learning on Partially-Observed Normal and Anomalous Data
Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks
MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis
High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise
Monotone Individual Fairness
Sub-token ViT Embedding via Stochastic Resonance Transformers
CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling of parameterized partial differential equations
Neural NeRF Compression
An Iterative Min-Min Optimization Method for Sparse Bayesian Learning
Quality-Diversity with Limited Resources
Tilt and Average : Geometric Adjustment of the Last Layer for Recalibration
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Understanding Diffusion Models by Feynman's Path Integral
Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes
Rethinking Guidance Information to Utilize Unlabeled Samples: A Label Encoding Perspective
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs
Learning Causal Relations from Subsampled Time Series with Two Time-Slices
Graph2Tac: Online Representation Learning of Formal Math Concepts
Full-Atom Peptide Design based on Multi-modal Flow Matching
Compress Clean Signal from Noisy Raw Image: A Self-Supervised Approach
Creative Text-to-Audio Generation via Synthesizer Programming
Unveiling the Potential of AI for Nanomaterial Morphology Prediction
Partial Multi-View Multi-Label Classification via Semantic Invariance Learning and Prototype Modeling
IW-GAE: Importance weighted group accuracy estimation for improved calibration and model selection in unsupervised domain adaptation
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
Iterative Search Attribution for Deep Neural Networks
Position: Data-driven Discovery with Large Generative Models
BOtied: Multi-objective Bayesian optimization with tied multivariate ranks
Diversified Batch Selection for Training Acceleration
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Parameter-Dependent Competitive Analysis for Online Capacitated Coverage Maximization through Boostings and Attenuations
The Privacy Power of Correlated Noise in Decentralized Learning
Kepler codebook
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Gated Linear Attention Transformers with Hardware-Efficient Training
Data Engineering for Scaling Language Models to 128K Context
Pricing with Contextual Elasticity and Heteroscedastic Valuation
Quantum Implicit Neural Representations
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Privacy-Preserving Data Release Leveraging Optimal Transport and Particle Gradient Descent
Bounding the Excess Risk for Linear Models Trained on Marginal-Preserving, Differentially-Private, Synthetic Data
An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems
Sample-specific Masks for Visual Reprogramming-based Prompting
Mean-field Underdamped Langevin Dynamics and its Spacetime Discretization
GroupCover: A Secure, Efficient and Scalable Inference Framework for On-device Model Protection based on TEEs
Equivariant Frames and the Impossibility of Continuous Canonicalization
Learning-Efficient Yet Generalizable Collaborative Filtering for Item Recommendation
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised Learning
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization
DRCT: Diffusion Reconstruction Contrastive Training towards Universal Detection of Diffusion Generated Images
Certifiably Byzantine-Robust Federated Conformal Prediction
Position: Beyond Personhood: Agency, Accountability, and the Limits of Anthropomorphic Ethical Analysis
Case-Based or Rule-Based: How Do Transformers Do the Math?
Conformal Validity Guarantees Exist for Any Data Distribution (and How to Find Them)
SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection
Conformal Prediction Sets Improve Human Decision Making
To the Max: Reinventing Reward in Reinforcement Learning
PerceptAnon: Exploring the Human Perception of Image Anonymization Beyond Pseudonymization for GDPR
A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs
Learning to Route Among Specialized Experts for Zero-Shot Generalization
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
Perturb-and-Project: Differentially Private Similarities and Marginals
Position: Scarce Resource Allocations That Rely On Machine Learning Should Be Randomized
Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation
Biharmonic Distance of Graphs and its Higher-Order Variants: Theoretical Properties with Applications to Centrality and Clustering
Improved Generalization of Weight Space Networks via Augmentations
Regression Learning with Limited Observations of Multivariate Outcomes and Features
Prompting a Pretrained Transformer Can Be a Universal Approximator
Diffusion-based Missing-view Generation With the Application on Incomplete Multi-view Clustering
Dynamic Correlation Clustering in Sublinear Update Time
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion
$S^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting
Borda Regret Minimization for Generalized Linear Dueling Bandits
Position: Embracing Negative Results in Machine Learning
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Neuroexplicit Diffusion Models for Inpainting of Optical Flow Fields
Expand-and-Cluster: Parameter Recovery of Neural Networks
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
Hybrid Inverse Reinforcement Learning
Switched Flow Matching: Eliminating Singularities via Switching ODEs
Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining
New Sample Complexity Bounds for Sample Average Approximation in Heavy-Tailed Stochastic Programming
Position: A Call for Embodied AI
Editing Partially Observable Networks via Graph Diffusion Models
Prompt Sketching for Large Language Models
Stacking Deep Set Networks and Pooling by Quantiles
Revisiting the Power of Prompt for Visual Tuning
Implicit Representations via Operator Learning
Explaining Graph Neural Networks via Structure-aware Interaction Index
DynSyn: Dynamical Synergistic Representation for Efficient Learning and Control in Overactuated Embodied Systems
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Human Alignment of Large Language Models through Online Preference Optimisation
Diffusion Models Demand Contrastive Guidance for Adversarial Purification to Advance
Estimating Barycenters of Distributions with Neural Optimal Transport
Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
Unveiling Privacy, Memorization, and Input Curvature Links
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
On the Duality Between Sharpness-Aware Minimization and Adversarial Training
ESM All-Atom: Multi-Scale Protein Language Model for Unified Molecular Modeling
Light and Optimal Schrödinger Bridge Matching
Coarse-To-Fine Tensor Trains for Compact Visual Representations
Non-convex Stochastic Composite Optimization with Polyak Momentum
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
Proactive DP: A Multiple Target Optimization Framework for DP-SGD
Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
Variational Linearized Laplace Approximation for Bayesian Deep Learning
Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations
Lie Neurons: Adjoint-Equivariant Neural Networks for Semisimple Lie Algebras
Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder
Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normalization
What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
Token-level Direct Preference Optimization
Privacy Preserving Adaptive Experiment Design
Promoting External and Internal Equities Under Ex-Ante/Ex-Post Metrics in Online Resource Allocation
A Circuit Domain Generalization Framework for Efficient Logic Synthesis in Chip Design
INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods
Exploiting Code Symmetries for Learning Program Semantics
Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method
Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
Tuning-free Estimation and Inference of Cumulative Distribution Function under Local Differential Privacy
Simultaneous identification of models and parameters of scientific simulators
Risk Estimation in a Markov Cost Process: Lower and Upper Bounds
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
Efficient Contextual Bandits with Uninformed Feedback Graphs
Log Neural Controlled Differential Equations: The Lie Brackets Make A Difference
Switching the Loss Reduces the Cost in Batch Reinforcement Learning
Position: Levels of AGI for Operationalizing Progress on the Path to AGI
Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency
LoRA+: Efficient Low Rank Adaptation of Large Models
FedSC: Provable Federated Self-supervised Learning with Spectral Contrastive Objective over Non-i.i.d. Data
DFA-RAG: Conversational Semantic Router for Large Language Model with Definite Finite Automaton
Exact Soft Analytical Side-Channel Attacks using Tractable Circuits
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes
Scaling Exponents Across Parameterizations and Optimizers
SqueezeLLM: Dense-and-Sparse Quantization
Dynamic Metric Embedding into lp Space
Domain-wise Data Acquisition to Improve Performance under Distribution Shift
The Entropy Enigma: Success and Failure of Entropy Minimization
Early Time Classification with Accumulated Accuracy Gap Control
On The Fairness Impacts of Hardware Selection in Machine Learning
TVE: Learning Meta-attribution for Transferable Vision Explainer
Towards Modular LLMs by Building and Reusing a Library of LoRAs
SLOG: An Inductive Spectral Graph Neural Network Beyond Polynomial Filter
Autonomous Sparse Mean-CVaR Portfolio Optimization
Global Reinforcement Learning : Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods
PointMC: Multi-instance Point Cloud Registration based on Maximal Cliques
OxyGenerator: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning
Large Scale Dataset Distillation with Domain Shift
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Regularized Q-learning through Robust Averaging
SHINE: Shielding Backdoors in Deep Reinforcement Learning
SSL4Q: Semi-Supervised Learning of Quantum Data with Application to Quantum State Classification
Hyperbolic Optimizer as a Dynamical System
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data
Deletion-Anticipative Data Selection with a Limited Budget
ODIN: Disentangled Reward Mitigates Hacking in RLHF
DITTO: Diffusion Inference-Time T-Optimization for Music Generation
Towards Generalization beyond Pointwise Learning: A Unified Information-theoretic Perspective
Neural Collapse in Multi-label Learning with Pick-all-label Loss
CHAI: Clustered Head Attention for Efficient LLM Inference
Bayesian Optimization of Function Networks with Partial Evaluations
Effective Federated Graph Matching
Restoring balance: principled under/oversampling of data for optimal classification
InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
Surprisingly Strong Performance Prediction with Neural Graph Features
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
Off-policy Evaluation Beyond Overlap: Sharp Partial Identification Under Smoothness
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Bounded and Uniform Energy-based Out-of-distribution Detection for Graphs
Feasible Reachable Policy Iteration
Variational Learning is Effective for Large Deep Networks
Scalable Pre-training of Large Autoregressive Image Models
Submodular framework for structured-sparse optimal transport
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
DAG-Based Column Generation for Adversarial Team Games
Language-guided Skill Learning with Temporal Variational Inference
Safe and Robust Subgame Exploitation in Imperfect Information Games
Configurable Mirror Descent: Towards a Unification of Decision Making
Learning to Explore for Stochastic Gradient MCMC
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets
The Perception-Robustness Tradeoff in Deterministic Image Restoration
DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems
Smooth Tchebycheff Scalarization for Multi-Objective Optimization
Position: Benchmarking is Limited in Reinforcement Learning Research
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
On the Calibration of Human Pose Estimation
Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
A decoder-only foundation model for time-series forecasting
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization
NExT: Teaching Large Language Models to Reason about Code Execution
Remembering to Be Fair: Non-Markovian Fairness in Sequential Decision Making
An LLM Compiler for Parallel Function Calling
Transport of Algebraic Structure to Latent Embeddings
Robust Learning-Augmented Dictionaries
RLVF: Learning from Verbal Feedback without Overgeneralization
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
Prompt-guided Precise Audio Editing with Diffusion Models
UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers
Learning the Target Network in Function Space
Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals
A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions
Time-Series Forecasting for Out-of-Distribution Generalization Using Invariant Learning
Neuro-Symbolic Temporal Point Processes
Classification under Nuisance Parameters and Generalized Label Shift in Likelihood-Free Inference
Latent Logic Tree Extraction for Event Sequence Explanation from LLMs
Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning
FrameQuant: Flexible Low-Bit Quantization for Transformers
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset
Think Before You Act: Decision Transformers with Working Memory
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Generative Active Learning for Long-tailed Instance Segmentation
FAFE: Immune Complex Modeling with Geodesic Distance Loss on Noisy Group Frames
How to Explore with Belief: State Entropy Maximization in POMDPs
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
Dirichlet Flow Matching with Applications to DNA Sequence Design
Bifurcated Attention for Single-Context Large-Batch Sampling
CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection
Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
Faster Maximum Inner Product Search in High Dimensions
Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models
Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration
Implicit Representations for Constrained Image Segmentation
On the Maximal Local Disparity of Fairness-Aware Classifiers
GeoAB: Towards Realistic Antibody Design and Reliable Affinity Maturation
Optimal Ridge Regularization for Out-of-Distribution Prediction
Cell2Sentence: Teaching Large Language Models the Language of Biology
On the Weight Dynamics of Deep Normalized Networks
Robust Yet Efficient Conformal Prediction Sets
Double Stochasticity Gazes Faster: Snap-Shot Decentralized Stochastic Gradient Tracking Methods
Scaling Laws for Fine-Grained Mixture of Experts
Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration
Noise-Aware Algorithm for Heterogeneous Differentially Private Federated Learning
Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs
A connection between Tempering and Entropic Mirror Descent
Debating with More Persuasive LLMs Leads to More Truthful Answers
Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification
Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation
LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering
Stochastic Weakly Convex Optimization beyond Lipschitz Continuity
Attack-free Evaluating and Enhancing Adversarial Robustness on Categorical Data
EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference
Learning in Deep Factor Graphs with Gaussian Belief Propagation
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Test-Time Model Adaptation with Only Forward Passes
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models
Accelerating Look-ahead in Bayesian Optimization: Multilevel Monte Carlo is All you Need
PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs
Kernel-Based Evaluation of Conditional Biological Sequence Models
Retrieval-Augmented Score Distillation for Text-to-3D Generation
An amortized approach to non-linear mixed-effects modeling based on neural posterior estimation
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference
Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Mol-AE: Auto-Encoder Based Molecular Representation Learning With 3D Cloze Test Objective
Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching
NDOT: Neuronal Dynamics-based Online Training for Spiking Neural Networks
Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning
Fair Off-Policy Learning from Observational Data
Prompt-based Visual Alignment for Zero-shot Policy Transfer
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning
Positive and Unlabeled Learning with Controlled Probability Boundary Fence
Aligned Objective for Soft-Pseudo-Label Generation in Supervised Learning
Rapid Learning without Catastrophic Forgetting in the Morris Water Maze
Assessing Large Language Models on Climate Information
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
Zeroth-Order Methods for Constrained Nonconvex Nonsmooth Stochastic Optimization
How Learning by Reconstruction Produces Uninformative Features For Perception
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
Deciphering RNA Secondary Structure Prediction: A Probabilistic K-Rook Matching Perspective
Surface-VQMAE: Vector-quantized Masked Auto-encoders on Molecular Surfaces
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
Learning from Streaming Data when Users Choose
Receptive Fields As Experts in Convolutional Neural Architectures
Learning Low-dimensional Latent Dynamics from High-dimensional Observations: Non-asymptotics and Lower Bounds
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
Neuro-Visualizer: A Novel Auto-Encoder-Based Loss Landscape Visualization Method With an Application in Knowledge-Guided Machine Learning
Diffusion Language Models Are Versatile Protein Learners
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Self-Rewarding Language Models
Potential Based Diffusion Motion Planning
Learning Exceptional Subgroups by End-to-End Maximizing KL-Divergence
Trained Random Forests Completely Reveal your Dataset
PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling
Lightweight Image Super-Resolution via Flexible Meta Pruning
Stereo Risk: A Continuous Modeling Approach to Stereo Matching
Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications
HGAP: Boosting Permutation Invariant and Permutation Equivariant in Multi-Agent Reinforcement Learning via Graph Attention Network
MF-CLR: Multi-Frequency Contrastive Learning Representation for Time Series
Accelerated Speculative Sampling Based on Tree Monte Carlo
Image Fusion via Vision-Language Model
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Accelerating Convergence of Score-Based Diffusion Models, Provably
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
Online Cascade Learning for Efficient Inference over Streams
FiT: Flexible Vision Transformer for Diffusion Model
Theoretical insights for diffusion guidance: A case study for Gaussian mixture models
Towards Efficient Training and Evaluation of Robust Models against $l_0$ Bounded Adversarial Perturbations
Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Self-attention Networks Localize When QK-eigenspectrum Concentrates
Is Kernel Prediction More Powerful than Gating in Convolutional Neural Networks?
On Prompt-Driven Safeguarding for Large Language Models
Data-efficient Large Vision Models through Sequential Autoregression
Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
Multi-Source Conformal Inference Under Distribution Shift
Trust Regions for Explanations via Black-Box Probabilistic Certification
New Bounds on the Cohesion of Complete-link and Other Linkage Methods for Agglomerative Clustering
ODIM: Outlier Detection via Likelihood of Under-Fitted Generative Models
Piecewise Constant and Linear Regression Trees: An Optimal Dynamic Programming Approach
Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization
Class-Imbalanced Graph Learning without Class Rebalancing
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
COPAL: Continual Pruning in Large Language Generative Models
Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical
Revisiting Context Aggregation for Image Matting
Defining Neural Network Architecture through Polytope Structures of Datasets
Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation
Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation
Bridging Model Heterogeneity in Federated Learning via Uncertainty-based Asymmetrical Reciprocity Learning
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
STELLA: Continual Audio-Video Pre-training with SpatioTemporal Localized Alignment
Tabular Insights, Visual Impacts: Transferring Expertise from Tables to Images
HumanTOMATO: Text-aligned Whole-body Motion Generation
Distribution Alignment Optimization through Neural Collapse for Long-tailed Classification
Equivariant Graph Neural Operator for Modeling 3D Dynamics
Better Locally Private Sparse Estimation Given Multiple Samples Per User
Polynomial-based Self-Attention for Table Representation Learning
DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design
DataFreeShield: Defending Adversarial Attacks without Training Data
Parameterized Physics-informed Neural Networks for Parameterized PDEs
Individualized Privacy Accounting via Subsampling with Applications in Combinatorial Optimization
MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space
Improving SAM Requires Rethinking its Optimization Formulation
Universal Gradient Methods for Stochastic Convex Optimization
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features
Initial Guessing Bias: How Untrained Networks Favor Some Classes
CurBench: Curriculum Learning Benchmark
Online bipartite matching with imperfect advice
Robust Stable Spiking Neural Networks
Predicting Dose-Response Curves with Deep Neural Networks
Improving Token-Based World Models with Parallel Observation Prediction
Emergence of In-Context Reinforcement Learning from Noise Distillation
Multi-Track Message Passing: Tackling Oversmoothing and Oversquashing in Graph Learning via Preventing Heterophily Mixing
Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations
Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation
Turnstile $\ell_p$ leverage score sampling with applications
Seesaw: Compensating for Nonlinear Reduction with Linear Computations for Private Inference
Enhancing Sufficient Dimension Reduction via Hellinger Correlation
Imitation Learning in Discounted Linear MDPs without exploration assumptions
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models
MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence
Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems
Expressivity and Generalization: Fragment-Biases for Molecular GNNs
Accelerating Convergence in Bayesian Few-Shot Classification
Sliced Wasserstein with Random-Path Projecting Directions
Differentiable Mapper for Topological Optimization of Data Representation
Balanced Data, Imbalanced Spectra: Unveiling Class Disparities with Spectral Imbalance
State-Constrained Zero-Sum Differential Games with One-Sided Information
GPTSwarm: Language Agents as Optimizable Graphs
Probabilistic Generating Circuits - Demystified
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
Mind the Boundary: Coreset Selection via Reconstructing the Decision Boundary
Differentiability and Optimization of Multiparameter Persistent Homology
Multi-Region Markovian Gaussian Process: An Efficient Method to Discover Directional Communications Across Multiple Brain Regions
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Learning Linear Block Error Correction Codes
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Random matrix theory improved Fréchet mean of symmetric positive definite matrices
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Optimally Improving Cooperative Learning in a Social Setting
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Inferring Dynamic Networks from Marginals with Iterative Proportional Fitting
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Individual Fairness in Graph Decomposition
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Learning to Predict Mutational Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning
GATE: How to Keep Out Intrusive Neighbors
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning
OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift
Vague Prototype-Oriented Diffusion Model for Multi-Class Anomaly Detection
TimeX++: Learning Time-Series Explanations with Information Bottleneck
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation
On the Asymptotic Distribution of the Minimum Empirical Risk
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
Coactive Learning for Large Language Models using Implicit User Feedback
Optimal Transport for Structure Learning Under Missing Data
Generalized Sobolev Transport for Probability Measures on a Graph
See More Details: Efficient Image Super-Resolution by Experts Mining
Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning
On the Role of Edge Dependency in Graph Generative Models
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Provably Scalable Black-Box Variational Inference with Structured Variational Families
Graph External Attention Enhanced Transformer
Improving Neural Additive Models with Bayesian Principles
Complexity Matters: Feature Learning in the Presence of Spurious Correlations
An Information Theoretic Approach to Interaction-Grounded Learning
Adversarial Attacks on Combinatorial Multi-Armed Bandits
DiJiang: Efficient Large Language Models through Compact Kernelization
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
Neural-Kernel Conditional Mean Embeddings
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation
On the Nonlinearity of Layer Normalization
Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning
Not all distributional shifts are equal: Fine-grained robust conformal inference
PID: Prompt-Independent Data Protection Against Latent Diffusion Models
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Nonparametric Teaching of Implicit Neural Representations
SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity
Physics and Lie symmetry informed Gaussian processes
Antibody Design Using a Score-based Diffusion Model Guided by Evolutionary, Physical and Geometric Constraints
Understanding MLP-Mixer as a wide and sparse MLP
Uncertainty Estimation by Density Aware Evidential Deep Learning
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations
Offline Inverse RL: New Solution Concepts and Provably Efficient Algorithms
On the Recoverability of Causal Relations from Temporally Aggregated I.I.D. Data
Explaining Probabilistic Models with Distributional Values
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling
Beyond the Federation: Topology-aware Federated Learning for Generalization to Unseen Clients
Stochastic Bandits with ReLU Neural Networks
A Nearly Optimal Single Loop Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
Offline Training of Language Model Agents with Functions as Learnable Weights
Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization
Single-Trajectory Distributionally Robust Reinforcement Learning
Temporal Spiking Neural Networks with Synaptic Delay for Graph Reasoning
Multimodal Prototyping for cancer survival prediction
Learning to Reach Goals via Diffusion
How Flawed Is ECE? An Analysis via Logit Smoothing
MALIBO: Meta-learning for Likelihood-free Bayesian Optimization
Differentiable Combinatorial Scheduling at Scale
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems
Sparse and Structured Hopfield Networks
Generalization Analysis of Deep Non-linear Matrix Completion
Eluder-based Regret for Stochastic Contextual MDPs
Preventing Model Collapse in Gaussian Process Latent Variable Models
Premise Order Matters in Reasoning with Large Language Models
Breadth-First Exploration on Adaptive Grid for Reinforcement Learning
Vision Transformers as Probabilistic Expansion from Learngene
Scalable and Flexible Causal Discovery with an Efficient Test for Adjacency
Robust and Conjugate Gaussian Process Regression
Distributed Bilevel Optimization with Communication Compression
Recurrent Distance Filtering for Graph Representation Learning
Improving fine-grained understanding in image-text pre-training
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
When Will Gradient Regularization Be Harmful?
Neural Tangent Kernels Motivate Cross-Covariance Graphs in Neural Networks
MaSS: Multi-attribute Selective Suppression for Utility-preserving Data Transformation from an Information-theoretic Perspective
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience
Towards Efficient Exact Optimization of Language Model Alignment
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
Stationarity without mean reversion in improper Gaussian processes
On Positivity Condition for Causal Inference
Geometry-Aware Instrumental Variable Regression
Rethinking the Flat Minima Searching in Federated Learning
Multicalibration for Confidence Scoring in LLMs
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
Subgraphormer: Unifying Subgraph GNNs and Graph Transformers via Graph Products
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Agnostic Sample Compression Schemes for Regression
The Pitfalls of Next-Token Prediction
Recovering the Pre-Fine-Tuning Weights of Generative Models
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
DiffFPR: Diffusion Prior for Oversampled Fourier Phase Retrieval
Implicit Bias of AdamW: $\ell_\infty$-Norm Constrained Optimization
Structured Chemistry Reasoning with Large Language Models
AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios
Online Matrix Completion: A Collaborative Approach with Hott Items
Adaptive Text Watermark for Large Language Models
NeuralIndicator: Implicit Surface Reconstruction from Neural Indicator Priors
Learning to Model the World With Language
DOGE: Domain Reweighting with Generalization Estimation
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
Policy Learning for Balancing Short-Term and Long-Term Rewards
DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
On the Error-Propagation of Inexact Hotelling's Deflation for Principal Component Analysis
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
Optimal Exact Recovery in Semi-Supervised Learning: A Study of Spectral Methods and Graph Convolutional Networks
HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming
WISER: Weak Supervision and Supervised Representation Learning to Improve Drug Response Prediction in Cancer
Graph Neural Network Explanations are Fragile
A Contextual Combinatorial Bandit Approach to Negotiation
In-context Convergence of Transformers
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates
Bipartite Matching in Massive Graphs: A Tight Analysis of EDCS
Open-Domain Text Evaluation via Contrastive Distribution Methods
Adaptively Perturbed Mirror Descent for Learning in Games
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
Online Linear Regression in Dynamic Environments via Discounting
A Statistical Theory of Regularization-Based Continual Learning
Quantum Theory and Application of Contextual Optimal Transport
Learning Associative Memories with Gradient Descent
On the Consistency of Kernel Methods with Dependent Observations
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization
Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs
Analysis for Abductive Learning and Neural-Symbolic Reasoning Shortcuts
Automated Statistical Model Discovery with Language Models
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Probabilistic Time Series Modeling with Decomposable Denoising Diffusion Model
A Space Group Symmetry Informed Network for O(3) Equivariant Crystal Tensor Prediction
Two-sided Competing Matching Recommendation Markets With Quota and Complementary Preferences Constraints
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Causal Effect Identification in LiNGAM Models with Latent Confounders
Learning Latent Dynamic Robust Representations for World Models
Fast Peer Adaptation with Context-aware Exploration
An Independence-promoting Loss for Music Generation with Language Models
Accelerating Parallel Sampling of Diffusion Models
Double Momentum Method for Lower-Level Constrained Bilevel Optimization
IOI: Invisible One-Iteration Adversarial Attack on No-Reference Image- and Video-Quality Metrics
Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought
How Deep Networks Learn Sparse and Hierarchical Data: the Sparse Random Hierarchy Model
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models
Improved Dimensionality Dependence for Zeroth-Order Optimisation over Cross-Polytopes
Mollification Effects of Policy Gradient Methods
Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Optimal Batched Linear Bandits
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance
Dynamic Evaluation of Large Language Models by Meta Probing Agents
Incorporating Information into Shapley Values: Reweighting via a Maximum Entropy Approach
Benign Overfitting in Adversarial Training of Neural Networks
Run-Time Task Composition with Safety Semantics
Near-Linear Time Approximation Algorithms for k-means with Outliers
FairProof: Confidential and Certifiable Fairness for Neural Networks
Observable Propagation: Uncovering Feature Vectors in Transformers
Diffusion Rejection Sampling
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights $\textit{Irreversibly}$ and $\textit{Monotonically}$ Impairs "Difficult" Downstream Tasks in LLMs
Differentially Private Worst-group Risk Minimization
Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Libra: Building Decoupled Vision System on Large Language Models
Stochastic Interpolants with Data-Dependent Couplings
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models
From Inverse Optimization to Feasibility to ERM
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation
Generative Conditional Distributions by Neural (Entropic) Optimal Transport
Intersectional Unfairness Discovery
ReGAL: Refactoring Programs to Discover Generalizable Abstractions
Combinatorial Approximations for Cluster Deletion: Simpler, Faster, and Better
On Discrete Prompt Optimization for Diffusion Models
Prodigy: An Expeditiously Adaptive Parameter-Free Learner
No-Regret Reinforcement Learning in Smooth MDPs
Privately Learning Smooth Distributions on the Hypercube by Projections
Hybrid$^2$ Neural ODE Causal Modeling and an Application to Glycemic Response
Learning to Scale Logits for Temperature-Conditional GFlowNets
Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning
Efficient Mixture Learning in Black-Box Variational Inference
EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
On the Trajectory Regularity of ODE-based Diffusion Sampling
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Stochastic Q-learning for Large Discrete Action Spaces
Differentiable Distributionally Robust Optimization Layers
AMPA: Adaptive Mixed Precision Allocation for Low-Bit Integer Training
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
Causal Discovery with Fewer Conditional Independence Tests
Single-Model Attribution of Generative Models Through Final-Layer Inversion
The Emergence of Reproducibility and Consistency in Diffusion Models
Debiased Distribution Compression
LLark: A Multimodal Instruction-Following Language Model for Music
Understanding the Learning Dynamics of Alignment with Human Feedback
MultiMax: Sparse and Multi-Modal Attention Learning
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss
Careful with that Scalpel: Improving Gradient Surgery with an EMA
Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models
Binary Decomposition: A Problem Transformation Perspective for Open-Set Semi-Supervised Learning
Revisiting the Role of Language Priors in Vision-Language Models
Feature Attribution with Necessity and Sufficiency via Dual-stage Perturbation Test for Causal Explanation
Understanding Inter-Concept Relationships in Concept-Based Models
Quantum Algorithm for Online Exp-concave Optimization
Stable Differentiable Causal Discovery
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
Information-Directed Pessimism for Offline Reinforcement Learning
Switchable Decision: Dynamic Neural Generation Networks
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Foundation Policies with Hilbert Representations
Stochastic Optimization with Arbitrary Recurrent Data Sampling
Trainable Transformer in Transformer
ArtWhisperer: A Dataset for Characterizing Human-AI Interactions in Artistic Creations
Position: Scaling Simulation is Neither Necessary Nor Sufficient for In-the-Wild Robot Manipulation
Robust Multi-Task Learning with Excess Risks
Fundamental Limitations of Alignment in Large Language Models
Understanding the Effects of Iterative Prompting on Truthfulness
Federated Continual Learning via Prompt-based Dual Knowledge Transfer
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Successor Features for Efficient Multi-Subject Controlled Text Generation
DE-COP: Detecting Copyrighted Content in Language Models Training Data
tnGPS: Discovering Unknown Tensor Network Structure Search Algorithms via Large Language Models (LLMs)
Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition
S$\Omega$I: Score-based O-INFORMATION Estimation
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation
Bottleneck-Minimal Indexing for Generative Document Retrieval
I/O Complexity of Attention, or How Optimal is FlashAttention?
A Sparsity Principle for Partially Observable Causal Representation Learning
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Consistent Adversarially Robust Linear Classification: Non-Parametric Setting
Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining
PARCv2: Physics-aware Recurrent Convolutional Neural Networks for Spatiotemporal Dynamics Modeling
Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring
Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning
Random Scaling and Momentum for Non-smooth Non-convex Optimization
DSD-DA: Distillation-based Source Debiasing for Domain Adaptive Object Detection
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions
An Empirical Examination of Balancing Strategy for Counterfactual Estimation on Time Series
Disentangled Graph Self-supervised Learning for Out-of-Distribution Generalization
Causal Representation Learning Made Identifiable by Grouping of Observational Variables
How Free is Parameter-Free Stochastic Optimization?
ReDiffuser: Reliable Decision-Making Using a Diffuser with Confidence Estimation
Enhancing Trajectory Prediction through Self-Supervised Waypoint Distortion Prediction
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Unsupervised Episode Generation for Graph Meta-learning
FRAG: Frequency Adapting Group for Diffusion Video Editing
First-Order Manifold Data Augmentation for Regression Learning
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
Diffusion Model-Augmented Behavioral Cloning
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
Speech Self-Supervised Learning Using Diffusion Model Synthetic Data
An Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization
Robust Universal Adversarial Perturbations
Prospector Heads: Generalized Feature Attribution for Large Models & Data
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data
Bridging Data Gaps in Diffusion Models with Adversarial Noise-Based Transfer Learning
Causal Representation Learning from Multiple Distributions: A General Setting
Efficient Exploration for LLMs
Building Socially-Equitable Public Models
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy
Asymptotics of Learning with Deep Structured (Random) Features
Let Go of Your Labels with Unsupervised Transfer
LAGMA: LAtent Goal-guided Multi-Agent Reinforcement Learning
Stay on Topic with Classifier-Free Guidance
Online Resource Allocation with Non-Stationary Customers
PAPM: A Physics-aware Proxy Model for Process Systems
Multi-Patch Prediction: Adapting Language Models for Time Series Representation Learning
Batch Singular Value Polarization and Weighted Semantic Augmentation for Universal Domain Adaptation
Winner-takes-all learners are geometry-aware conditional density estimators
Decomposable Submodular Maximization in Federated Setting
Confidence-aware Contrastive Learning for Selective Classification
RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation
A Provable Decision Rule for Out-of-Distribution Detection
UniAudio: Towards Universal Audio Generation with Large Language Models
SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization
Rethinking Transformers in Solving POMDPs
diff History for Neural Language Agents
OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos
Inverse-Variance Weighting for Estimation of Heterogeneous Treatment Effects
MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data
Online Matching with Stochastic Rewards: Provable Better Bound via Adversarial Reinforcement Learning
Can Machines Learn the True Probabilities?
Exponential Spectral Pursuit: An Effective Initialization Method for Sparse Phase Retrieval
Convergence of Online Learning Algorithm for a Mixture of Multiple Linear Regressions
WAVES: Benchmarking the Robustness of Image Watermarks
Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
Explain Temporal Black-Box Models via Functional Decomposition
Connect Later: Improving Fine-tuning for Robustness with Targeted Augmentations
Bring Your Own (Non-Robust) Algorithm to Solve Robust MDPs by Estimating The Worst Kernel
Minimum-Norm Interpolation Under Covariate Shift
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
Non-parametric Online Change Point Detection on Riemannian Manifolds
Position: Will we run out of data? Limits of LLM scaling based on human-generated data
Adaptively Learning to Select-Rank in Online Platforms
Towards Realistic Model Selection for Semi-supervised Learning
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Critical windows: non-asymptotic theory for feature emergence in diffusion models
FlowMM: Generating Materials with Riemannian Flow Matching
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
Symmetric Matrix Completion with ReLU Sampling
Spectral Phase Transition and Optimal PCA in Block-Structured Spiked Models
GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Fool Your (Vision and) Language Model with Embarrassingly Simple Permutations
Gradient-based Visual Explanation for Transformer-based CLIP
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Graph Neural PDE Solvers with Conservation and Similarity-Equivariance
Time Weaver: A Conditional Time Series Generation Model
Watermark Stealing in Large Language Models
Leveraging (Biased) Information: Multi-armed Bandits with Offline Data
Copula-Nested Spectral Kernel Network
Stability Evaluation through Distributional Perturbation Analysis
From Neurons to Neutrons: A Case Study in Interpretability
Conformal Prediction with Learned Features
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Graph Structure Extrapolation for Out-of-Distribution Generalization
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
Generative Marginalization Models
Transferring Knowledge From Large Foundation Models to Small Downstream Models
Autoencoding Conditional Neural Processes for Representation Learning
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model
Contextualized Policy Recovery: Modeling and Interpreting Medical Decisions with Adaptive Imitation Learning
Harmonizing Generalization and Personalization in Federated Prompt Learning
To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO
Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression
Improving Interpretation Faithfulness for Vision Transformers
Beyond the Norms: Detecting Prediction Errors in Regression Models
Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data
Transolver: A Fast Transformer Solver for PDEs on General Geometries
Getting the most out of your tokenizer for pre-training and domain adaptation
MorphGrower: A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation
CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources
Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time
Bridging Environments and Language with Rendering Functions and Vision-Language Models
Smoothing Proximal Gradient Methods for Nonsmooth Sparsity Constrained Optimization: Optimality Conditions and Global Convergence
MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
Efficient Pareto Manifold Learning with Low-Rank Structure
A Touch, Vision, and Language Dataset for Multimodal Alignment
Position: What makes an image realistic?
Simulation of Graph Algorithms with Looped Transformers
BayOTIDE: Bayesian Online Multivariate Time Series Imputation with Functional Decomposition
Predicting Lagrangian Multipliers for Mixed Integer Linear Programs
Position: Why Tabular Foundation Models Should Be a Research Priority
FedLMT: Tackling System Heterogeneity of Federated Learning via Low-Rank Model Training with Theoretical Guarantees
Privacy-Preserving Embedding via Look-up Table Evaluation with Fully Homomorphic Encryption
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
Integrating Multimodal Data for Joint Generative Modeling of Complex Dynamics
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
Graph As Point Set
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
Graph-Triggered Rising Bandits
Self-Infilling Code Generation
Split-and-Denoise: Protect large language model inference with local differential privacy
Unbiased Multi-Label Learning from Crowdsourced Annotations
A Language Model’s Guide Through Latent Space
Nonlinear Filtering with Brenier Optimal Transport Maps
Local vs. Global Interpretability: A Computational Complexity Perspective
Plug-and-Play image restoration with Stochastic deNOising REgularization
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems
Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention
Understanding Finetuning for Factual Knowledge Extraction
A sampling theory perspective on activations for implicit neural representations
Continuous Treatment Effects with Surrogate Outcomes
Graph Generation with Diffusion Mixture
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
Hierarchical Integral Probability Metrics: A distance on random probability measures with low sample complexity
Mean Estimation in the Add-Remove Model of Differential Privacy
Multiplicative Weights Update, Area Convexity and Random Coordinate Descent for Densest Subgraph Problems
GRATH: Gradual Self-Truthifying for Large Language Models
Position: The Causal Revolution Needs Scientific Pragmatism
Self-Supervised Interpretable End-to-End Learning via Latent Functional Modularity
Finite Time Logarithmic Regret Bounds for Self-Tuning Regulation
Deep Functional Factor Models: Forecasting High-Dimensional Functional Time Series via Bayesian Nonparametric Factorization
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport
Solving Poisson Equations using Neural Walk-on-Spheres
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Compositional Text-to-Image Generation with Dense Blob Representations
Denoising Autoregressive Representation Learning
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Offline Transition Modeling via Contrastive Energy Learning
Slot Abstractors: Toward Scalable Abstract Visual Reasoning
Repeat After Me: Transformers are Better than State Space Models at Copying
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach
Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning
Towards efficient deep spiking neural networks construction with spiking activity based pruning
Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment
Fast Decision Boundary based Out-of-Distribution Detector
Calibration Bottleneck: Over-compressed Representations are Less Calibratable
Learning Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms
Discovering Features with Synergistic Interactions in Multiple Views
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Scale-Free Image Keypoints Using Differentiable Persistent Homology
Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models
Graph Neural Networks Use Graphs When They Shouldn't
Riemannian Accelerated Zeroth-order Algorithm: Improved Robustness and Lower Query Complexity
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling
Using AI Uncertainty Quantification to Improve Human Decision-Making
SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
Small-loss Adaptive Regret for Online Convex Optimization
Learning to Intervene on Concept Bottlenecks
Causal-IQA: Towards the Generalization of Image Quality Assessment Based on Causal Inference
Position: A Roadmap to Pluralistic Alignment
VNN: Verification-Friendly Neural Networks with Hard Robustness Guarantees
Long Range Propagation on Continuous-Time Dynamic Graphs
Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
SelfIE: Self-Interpretation of Large Language Model Embeddings
Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning
Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization
From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Online Adaptive Anomaly Thresholding with Confidence Sequences
Stochastic positional embeddings improve masked image modeling
Prompt-tuning Latent Diffusion Models for Inverse Problems
On the Tractability of SHAP Explanations under Markovian Distributions
Self-Correcting Self-Consuming Loops for Generative Model Training
Auto-Regressive Next-Token Predictors are Universal Learners
Gibbs Sampling of Continuous Potentials on a Quantum Computer
Model Alignment as Prospect Theoretic Optimization
DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency
On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation
Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model
Position: What Can Large Language Models Tell Us about Time Series Analysis
Cross-view Masked Diffusion Transformers for Person Image Synthesis
Partially Stochastic Infinitely Deep Bayesian Neural Networks
Learning Divergence Fields for Shift-Robust Graph Representations
No Dimensional Sampling Coresets for Classification
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
Conditional Language Learning with Context
Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models
Unmasking Vulnerabilities: Cardinality Sketches under Adaptive Inputs
Online Variational Sequential Monte Carlo
Post-hoc Part-Prototype Networks
Learning to Remove Cuts in Integer Linear Programming
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
Directly Denoising Diffusion Models
Differentially Private Decentralized Learning with Random Walks
RoboDreamer: Learning Compositional World Models for Robot Imagination
Learning Latent Structures in Network Games via Data-Dependent Gated-Prior Graph Variational Autoencoders
Fewer Truncations Improve Language Modeling
Accelerating Transformer Pre-training with 2:4 Sparsity
Parameter Estimation in DAGs from Incomplete Data via Optimal Transport
FedRC: Tackling Diverse Distribution Shifts Challenge in Federated Learning by Robust Clustering
Multi-group Learning for Hierarchical Groups
On the Feasibility of Single-Pass Full-Capacity Learning in Linear Threshold Neurons with Binary Input Vectors
HelmFluid: Learning Helmholtz Dynamics for Interpretable Fluid Prediction
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Foundations of Testing for Finite-Sample Causal Discovery
DMTG: One-Shot Differentiable Multi-Task Grouping
Mean-field Chaos Diffusion Models
Spike Distance Function as a Learning Objective for Spike Prediction
Learning Modality Knowledge Alignment for Cross-Modality Transfer
TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors
E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
CauDiTS: Causal Disentangled Domain Adaptation of Multivariate Time Series
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Adaptive Conformal Inference by Betting
Reinformer: Max-Return Sequence Modeling for Offline RL
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion
On Hypothesis Transfer Learning of Functional Linear Models
Challenges in Training PINNs: A Loss Landscape Perspective
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
Rethinking Optimization and Architecture for Tiny Language Models
PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency
Spider: A Unified Framework for Context-dependent Concept Segmentation
Privacy Attacks in Decentralized Learning
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models
Major-Minor Mean Field Multi-Agent Reinforcement Learning
Learning Surrogates for Offline Black-Box Optimization via Gradient Matching
Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods?
BeigeMaps: Behavioral Eigenmaps for Reinforcement Learning from Images
Multiply-Robust Causal Change Attribution
Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach
Scaling Down Deep Learning with MNIST-1D
Position: Enforced Amnesia as a Way to Mitigate the Potential Risk of Silent Suffering in the Conscious AI
A fast algorithm to simulate nonlinear resistive networks
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
A Dynamical Model of Neural Scaling Laws
Inexact Newton-type Methods for Optimisation with Nonnegativity Constraints
Momentum Particle Maximum Likelihood
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling
LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning
Robustness of Deep Learning for Accelerated MRI: Benefits of Diverse Training Data
Interpreting and Improving Diffusion Models from an Optimization Perspective
How to Leverage Diverse Demonstrations in Offline Imitation Learning
Learning to Explore in POMDPs with Informational Rewards
A Linear Time and Space Local Point Cloud Geometry Encoder via Vectorized Kernel Mixture (VecKM)
Uniformly Stable Algorithms for Adversarial Training and Beyond
Mitigating Catastrophic Forgetting in Online Continual Learning by Modeling Previous Task Interrelations via Pareto Optimization
Position: Explain to Question not to Justify
A New Robust Partial p-Wasserstein-Based Metric for Comparing Distributions
Theoretical Analysis of Learned Database Operations under Distribution Shift through Distribution Learnability
Prospective Side Information for Latent MDPs
Theory of Consistency Diffusion Models: Distribution Estimation Meets Fast Sampling
Mechanistic Neural Networks for Scientific Machine Learning
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Understanding Heterophily for Graph Neural Networks
Graph Out-of-Distribution Detection Goes Neighborhood Shaping
Emergent Equivariance in Deep Ensembles
The Balanced-Pairwise-Affinities Feature Transform
DiNADO: Norm-Disentangled Neurally-Decomposed Oracles for Controlling Language Models
Byzantine Resilient and Fast Federated Few-Shot Learning
Pausing Policy Learning in Non-stationary Reinforcement Learning
Estimating Unknown Population Sizes Using the Hypergeometric Distribution
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
GiLOT: Interpreting Generative Language Models via Optimal Transport
On Interpolating Experts and Multi-Armed Bandits
PriorBoost: An Adaptive Algorithm for Learning from Aggregate Responses
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
Connecting the Dots: Is Mode-Connectedness the Key to Feasible Sample-Based Inference in Bayesian Neural Networks?
Listening to the noise: Blind Denoising with Gibbs Diffusion
Sampling in Unit Time with Kernel Fisher-Rao Flow
Dynamic Facility Location in High Dimensional Euclidean Spaces
Differentiable Annealed Importance Sampling Minimizes The Jensen-Shannon Divergence Between Initial and Target Distribution
LoRA Training in the NTK Regime has No Spurious Local Minima
DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation
Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments
Bivariate Causal Discovery using Bayesian Model Selection
Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Rethinking Adversarial Robustness in the Context of the Right to be Forgotten
Diffusion Posterior Sampling is Computationally Intractable
Learning to Infer Generative Template Programs for Visual Concepts
Masked Face Recognition with Generative-to-Discriminative Representations
A Global Geometric Analysis of Maximal Coding Rate Reduction
LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging
Multi-View Clustering by Inter-cluster Connectivity Guided Reward
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Integrated Hardware Architecture and Device Placement Search
Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
Towards Scalable and Versatile Weight Space Learning
Towards Compositionality in Concept Learning
Smoothness Adaptive Hypothesis Transfer Learning
Energy-Efficient Gaussian Processes Using Low-Precision Arithmetic
Interpreting Equivariant Representations
Semantic-Aware Human Object Interaction Image Generation
Particle Denoising Diffusion Sampler
Partial Optimality in the Linear Ordering Problem
Graph Automorphism Group Equivariant Neural Networks
Information Flow in Self-Supervised Learning
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Kernel Semi-Implicit Variational Inference
Learning to Play Atari in a World of Tokens
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Position: $C^*$-Algebraic Machine Learning $-$ Moving in a New Direction
InferCept: Efficient Intercept Support for Augmented Large Language Model Inference
Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions
Perfect Alignment May be Poisonous to Graph Contrastive Learning
InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning
Incremental Topological Ordering and Cycle Detection with Predictions
The Good, The Bad, and Why: Unveiling Emotions in Generative AI
FedBAT: Communication-Efficient Federated Learning via Learnable Binarization
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
Dynamic Survival Analysis with Controlled Latent States
LLM-Empowered State Representation for Reinforcement Learning
Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning
Langevin Policy for Safe Reinforcement Learning
InstructSpeech: Following Speech Editing Instructions via Large Language Models
Domain Generalisation via Imprecise Learning
Convergence Guarantees for the DeepWalk Embedding on Block Models
Exploring Intrinsic Dimension for Vision-Language Model Pruning
Neural Diffusion Models
Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts
Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration
High-Dimensional Geometric Streaming for Nearly Low Rank Data
Performance Bounds for Active Binary Testing with Information Maximization
Bootstrap AutoEncoders With Contrastive Paradigm for Self-supervised Gaze Estimation
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning
Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process
SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding
On a Neural Implementation of Brenier's Polar Factorization
Triple Changes Estimator for Targeted Policies
A Rate-Distortion View of Uncertainty Quantification
Deep Networks Always Grok and Here is Why
A Distributional Analogue to the Successor Representation
Differentially Private Sum-Product Networks
TIC-TAC: A Framework For Improved Covariance Estimation In Deep Heteroscedastic Regression
Trustworthy Actionable Perturbations
Rethinking Independent Cross-Entropy Loss For Graph-Structured Data
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Equilibrium of Data Markets with Externality
An Empirical Study of Realized GNN Expressiveness
Logistic Variational Bayes Revisited
Explorations of Self-Repair in Language Models
Interacting Diffusion Processes for Event Sequence Forecasting
Sparsest Models Elude Pruning: An Exposé of Pruning’s Current Capabilities
Fault Tolerant ML: Efficient Meta-Aggregation and Synchronous Training
A Unified Recipe for Deriving (Time-Uniform) PAC-Bayes Bounds
Attribution-based Explanations that Provide Recourse Cannot be Robust
Multi-class Probabilistic Bounds for Majority Vote Classifiers with Partially Labeled Data
Online Non-stochastic Control with Partial Feedback
T-Cal: An Optimal Test for the Calibration of Predictive Models
One Meta-tuned Transformer is What You Need for Few-shot Learning
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks
KernelWarehouse: Rethinking the Design of Dynamic Convolution
Exploring the Low-Pass Filtering Behavior in Image Super-Resolution
NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models
EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction
Stochastic Localization via Iterative Posterior Sampling
Breaking through the learning plateaus of in-context learning in Transformer
Efficient PAC Learnability of Dynamical Systems Over Multilayer Networks
Provably Better Explanations with Optimized Aggregation of Feature Attributions
In-Context Language Learning: Architectures and Algorithms
Aligning Transformers with Weisfeiler-Leman
CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback
InfoNet: Neural Estimation of Mutual Information without Test-Time Optimization
Coresets for Multiple $\ell_p$ Regression
LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views
Generalized Smooth Variational Inequalities: Methods with Adaptive Stepsizes
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
Probabilistic Modeling of Interpersonal Coordination Processes
PAGER: Accurate Failure Characterization in Deep Regression Models
SparseTSF: Modeling Long-term Time Series Forecasting with *1k* Parameters
Provably Efficient Partially Observable Risk-sensitive Reinforcement Learning with Hindsight Observation
Unveiling the Dynamics of Information Interplay in Supervised Learning
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Understanding Forgetting in Continual Learning with Linear Regression
Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training
Multi-Sender Persuasion: A Computational Perspective
Emergent Representations of Program Semantics in Language Models Trained on Programs
Approximate Nearest Neighbor Search with Window Filters
CLLMs: Consistency Large Language Models
Robustness of Nonlinear Representation Learning
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills
Parsimonious Learning-Augmented Approximations for Dense Instances of $\mathcal{NP}$-hard Problems
Reinforcement Learning within Tree Search for Fast Macro Placement
DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving
Two-timescale Derivative Free Optimization for Performative Prediction with Markovian Data
Profile Reconstruction from Private Sketches
Trustless Audits without Revealing Data or Models
In-Context Learning Agents Are Asymmetric Belief Updaters
Online Speculative Decoding
Mitigating Label Noise on Graphs via Topological Sample Selection
Private Truly-Everlasting Robust-Prediction
Factored-Reward Bandits with Intermediate Observations
Pedestrian Attribute Recognition as Label-balanced Multi-label Learning
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks
Convergence of Some Convex Message Passing Algorithms to a Fixed Point
Improving Robustness to Multiple Spurious Correlations by Multi-Objective Optimization
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum
Optimal Kernel Choice for Score Function-based Causal Discovery
Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
Evaluation of Trajectory Distribution Predictions with Energy Score
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
CarbonNovo: Joint Design of Protein Structure and Sequence Using a Unified Energy-based Model
Delving into Differentially Private Transformer
Minimum Norm Interpolation Meets The Local Theory of Banach Spaces
Incorporating probabilistic domain knowledge into deep multiple instance learning
Don't be so Negative! Score-based Generative Modeling with Oracle-assisted Guidance
Stationary Latent Weight Inference for Unreliable Observations from Online Test-Time Adaptation
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
PANDA: Expanded Width-Aware Message Passing Beyond Rewiring
Federated Optimization with Doubly Regularized Drift Correction
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
Image Clustering with External Guidance
From Coarse to Fine: Enable Comprehensive Graph Self-supervised Learning with Multi-granular Semantic Ensemble
Improving Gradient-Guided Nested Sampling for Posterior Inference
Neighboring Perturbations of Knowledge Editing on Large Language Models
Fair Resource Allocation in Multi-Task Learning
Causally Motivated Personalized Federated Invariant Learning with Shortcut-Averse Information-Theoretic Regularization
Hierarchical Novelty Detection via Fine-Grained Evidence Allocation
Improving Computational Complexity in Statistical Models with Local Curvature Information
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation
Refining Minimax Regret for Unsupervised Environment Design
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
Disentangled 3D Scene Generation with Layout Learning
Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize
Two Fists, One Heart: Multi-Objective Optimization Based Strategy Fusion for Long-tailed Learning
A Near-Linear Time Approximation Algorithm for Beyond-Worst-Case Graph Clustering
Instruction Tuning for Secure Code Generation
Balancing Feature Similarity and Label Variability for Optimal Size-Aware One-shot Subset Selection
Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
An Information-Theoretic Analysis of In-Context Learning
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Invariant Risk Minimization Is A Total Variation Model
In-Context Principle Learning from Mistakes
Extending Test-Time Augmentation with Metamorphic Relations for Combinatorial Problems
Optimization without Retraction on the Random Generalized Stiefel Manifold
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Reward-Free Kernel-Based Reinforcement Learning
Learning and Forgetting Unsafe Examples in Large Language Models
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
Parallel Affine Transformation Tuning of Markov Chain Monte Carlo
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits
ESNet: Evolution and Succession Network for High-Resolution Salient Object Detection
OTMatch: Improving Semi-Supervised Learning with Optimal Transport
Harnessing the Power of Neural Operators with Automatically Encoded Conservation Laws
Rethinking Momentum Knowledge Distillation in Online Continual Learning
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Arrows of Time for Large Language Models
Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
Rényi Pufferfish Privacy: General Additive Noise Mechanisms and Privacy Amplification by Iteration via Shift Reduction Lemmas
A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes
Guidance with Spherical Gaussian Constraint for Conditional Diffusion
Graph Neural Networks with a Distribution of Parametrized Graphs
Memorization Through the Lens of Curvature of Loss Function Around Samples
MoMo: Momentum Models for Adaptive Learning Rates
OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport
Risk-Sensitive Reward-Free Reinforcement Learning with CVaR
On the Minimal Degree Bias in Generalization on the Unseen for non-Boolean Functions
Prediction Accuracy of Learning in Games: Follow-the-Regularized-Leader meets Heisenberg
CuTS: Customizable Tabular Synthetic Data Generation
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning
Interpretability Illusions in the Generalization of Simplified Models
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
The Fundamental Limits of Least-Privilege Learning
Safe Exploration in Dose Finding Clinical Trials with Heterogeneous Participants
Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
Boosting Offline Optimizers with Surrogate Sensitivity
UP2ME: Univariate Pre-training to Multivariate Fine-tuning as a General-purpose Framework for Multivariate Time Series Analysis
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
Robust Data-driven Prescriptiveness Optimization
KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions
Human vs. Generative AI in Content Creation Competition: Symbiosis or Conflict?
Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
Data Poisoning Attacks against Conformal Prediction
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Latent variable model for high-dimensional point process with structured missingness
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation
Chasing Convex Functions with Long-term Constraints
DeepPolar: Inventing Nonlinear Large-Kernel Polar Codes via Deep Learning
Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers
Diffuse, Sample, Project: Plug-And-Play Controllable Graph Generation
Statistical Properties of Robust Satisficing
EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting
Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach
MILP-FBGen: LP/MILP Instance Generation with Feasibility/Boundedness
Position: Fundamental Limitations of LLM Censorship Necessitate New Approaches
Executable Code Actions Elicit Better LLM Agents
BetterV: Controlled Verilog Generation with Discriminative Guidance
Fast Timing-Conditioned Latent Audio Diffusion
Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought
Conformalized Survival Distributions: A Generic Post-Process to Increase Calibration
Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Monotone, Bi-Lipschitz, and Polyak-Łojasiewicz Networks
Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Understanding and Diagnosing Deep Reinforcement Learning
Beyond the ROC Curve: Classification Trees Using Cost-Optimal Curves, with Application to Imbalanced Datasets
Distributionally Robust Data Valuation
Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
Unsupervised Representation Learning of Brain Activity via Bridging Voxel Activity and Functional Connectivity
Enhancing Value Function Estimation through First-Order State-Action Dynamics in Offline Reinforcement Learning
Double-Step Alternating Extragradient with Increasing Timescale Separation for Finding Local Minimax Points: Provable Improvements
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Learning Latent Space Hierarchical EBM Diffusion Models
Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge
Efficient Stochastic Approximation of Minimax Excess Risk Optimization
MEMORYLLM: Towards Self-Updatable Large Language Models
QORA: Zero-Shot Transfer via Interpretable Object-Relational Model Learning
Better & Faster Large Language Models via Multi-token Prediction
Subgoal-based Demonstration Learning for Formal Theorem Proving
Translation Equivariant Transformer Neural Processes
Efficient Denoising Diffusion via Probabilistic Masking
Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs
ACM-MILP: Adaptive Constraint Modification via Grouping and Selection for Hardness-Preserving MILP Instance Generation
The Effect of Weight Precision on the Neuron Count in Deep ReLU Networks
BLO-SAM: Bi-level Optimization Based Finetuning of the Segment Anything Model for Overfitting-Preventing Semantic Segmentation
On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
Visual Representation Learning with Stochastic Frame Prediction
Exploration and Anti-Exploration with Distributional Random Network Distillation
Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis
Position: Is machine learning good or bad for the natural sciences?
A New Theoretical Perspective on Data Heterogeneity in Federated Optimization
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data
Sample Complexity Bounds for Estimating Probability Divergences under Invariances
Pruned Pivot: Correlation Clustering Algorithm for Dynamic, Parallel, and Local Computation Models
Compositional Few-Shot Class-Incremental Learning
Fast Sampling-Based Sketches for Tensors
Latent Noise Segmentation: How Neural Noise Leads to the Emergence of Segmentation and Grouping
Position: Evolving AI Collectives Enhance Human Diversity and Enable Self-Regulation
On the Universality of Volume-Preserving and Coupling-Based Normalizing Flows
Less is More: on the Over-Globalizing Problem in Graph Transformers
Scaling Beyond the GPU Memory Limit for Large Mixture-of-Experts Model Training
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game
Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases
Balancing Similarity and Complementarity for Federated Learning
Two Heads are Actually Better than One: Towards Better Adversarial Robustness via Transduction and Rejection
FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction
xT: Nested Tokenization for Larger Context in Large Images
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents
Joint Composite Latent Space Bayesian Optimization
CF-OPT: Counterfactual Explanations for Structured Prediction
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection
Improved Differentially Private and Lazy Online Convex Optimization: Lower Regret without Smoothness Requirements
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections
Memoria: Resolving Fateful Forgetting Problem through Human-Inspired Memory Architecture
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
Position: AI/ML Influencers Have a Place in the Academic Process
Characterizing ResNet's Universal Approximation Capability
Overcoming the Optimizer's Curse: Obtaining Realistic Prescriptions from Neural Networks
REMEDI: Corrective Transformations for Improved Neural Entropy Estimation