ICML
2024
Papers
Graph Structure Extrapolation for Out-of-Distribution Generalization
SuDA: Support-based Domain Adaptation for Sim2Real Hinge Joint Tracking with Flexible Sensors
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
Equivariant Diffusion for Crystal Structure Prediction
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training
Interaction-based Retrieval-augmented Diffusion Models for Protein-specific 3D Molecule Generation
Drug Discovery with Dynamic Goal-aware Fragments
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models
DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Scale-Free Image Keypoints Using Differentiable Persistent Homology
Position: A Roadmap to Pluralistic Alignment
Centralized Selection with Preferences in the Presence of Biases
Operator SVD with Neural Networks via Nested Low-Rank Approximation
BLO-SAM: Bi-level Optimization Based Finetuning of the Segment Anything Model for Overfitting-Preventing Semantic Segmentation
Learning to Infer Generative Template Programs for Visual Concepts
Online Learning in CMDPs: Handling Stochastic and Adversarial Constraints
Does Label Smoothing Help Deep Partial Label Learning?
The Merit of River Network Topology for Neural Flood Forecasting
Multi-Agent Reinforcement Learning Meets Leaf Sequencing in Radiotherapy
Ambiguity-Aware Abductive Learning
Predictive Dynamic Fusion
Reference Neural Operators: Learning the Smooth Dependence of Solutions of PDEs on Geometric Deformations
Towards General Neural Surrogate Solvers with Specialized Neural Accelerators
Log Neural Controlled Differential Equations: The Lie Brackets Make A Difference
TSLANet: Rethinking Transformers for Time Series Representation Learning
BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
CauDiTS: Causal Disentangled Domain Adaptation of Multivariate Time Series
Predicting Dose-Response Curves with Deep Neural Networks
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Exploring the Enigma of Neural Dynamics Through A Scattering-Transform Mixer Landscape for Riemannian Manifold
Practical Performance Guarantees for Pipelined DNN Inference
Revealing Vision-Language Integration in the Brain with Multimodal Networks
Listenable Maps for Audio Classifiers
Mollification Effects of Policy Gradient Methods
FedREDefense: Defending against Model Poisoning Attacks for Federated Learning using Model Update Reconstruction Error
To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO
Fine-grained Classes and How to Find Them
Fast Algorithms for Hypergraph PageRank with Applications to Semi-Supervised Learning
Bayesian Knowledge Distillation: A Bayesian Perspective of Distillation with Uncertainty Quantification
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
A2Q+: Improving Accumulator-Aware Weight Quantization
How Smooth Is Attention?
HyperFields: Towards Zero-Shot Generation of NeRFs from Text
Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Non-confusing Generation of Customized Concepts in Diffusion Models
How Learning by Reconstruction Produces Uninformative Features For Perception
Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding
SMaRt: Improving GANs with Score Matching Regularity
E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
What’s the score? Automated Denoising Score Matching for Nonlinear Diffusions
Position: Measure Dataset Diversity, Don't Just Claim It
Causal Representation Learning from Multiple Distributions: A General Setting
The Balanced-Pairwise-Affinities Feature Transform
On the Recoverability of Causal Relations from Temporally Aggregated I.I.D. Data
Detecting and Identifying Selection Structure in Sequential Data
Feature Distribution on Graph Topology Mediates the Effect of Graph Convolution: Homophily Perspective
Unsupervised Parameter-free Simplicial Representation Learning with Scattering Transforms
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis
CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Reward-Free Kernel-Based Reinforcement Learning
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Conditional Language Learning with Context
${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning
Agent Instructs Large Language Models to be General Zero-Shot Reasoners
Improving Factuality and Reasoning in Language Models through Multiagent Debate
MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
Prototypical Transformer As Unified Motion Learners
USTAD: Unified Single-model Training Achieving Diverse Scores for Information Retrieval
From Neurons to Neutrons: A Case Study in Interpretability
Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning
Fool Your (Vision and) Language Model with Embarrassingly Simple Permutations
Measuring Stochastic Data Complexity with Boltzmann Influence Functions
Exploring the Low-Pass Filtering Behavior in Image Super-Resolution
No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
Confidence Aware Inverse Constrained Reinforcement Learning
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Convergence Guarantees for the DeepWalk Embedding on Block Models
How Spurious Features are Memorized: Precise Analysis for Random and NTK Features
The Illusion of State in State-Space Models
Amortized Equation Discovery in Hybrid Dynamical Systems
Neural NeRF Compression
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
MADA: Meta-Adaptive Optimizers Through Hyper-Gradient Descent
Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions
Symmetric Matrix Completion with ReLU Sampling
Supervised Matrix Factorization: Local Landscape Analysis and Applications
Riemannian coordinate descent algorithms on matrix manifolds
A New Branch-and-Bound Pruning Framework for $\ell_0$-Regularized Problems
Faster Sampling via Stochastic Gradient Proximal Sampler
Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks
Bayesian Optimization of Function Networks with Partial Evaluations
Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers
Accelerating Federated Learning with Quick Distributed Mean Estimation
Taylor Videos for Action Recognition
Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
Position: Automatic Environment Shaping is the Next Frontier in RL
Distributional Bellman Operators over Mean Embeddings
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
Combining Experimental and Historical Data for Policy Evaluation
How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers
Online Isolation Forest
Bridging Environments and Language with Rendering Functions and Vision-Language Models
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks
Limited Preference Aided Imitation Learning from Imperfect Demonstrations
Neural Jump-Diffusion Temporal Point Processes
HGAP: Boosting Permutation Invariant and Permutation Equivariant in Multi-Agent Reinforcement Learning via Graph Attention Network
Sample-Efficient Multiagent Reinforcement Learning with Reset Replay
Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment
Position: Benchmarking is Limited in Reinforcement Learning Research
Position: Foundation Agents as the Paradigm Shift for Decision Making
Learning Causal Dynamics Models in Object-Oriented Environments
Density Ratio Estimation with Doubly Strong Robustness
Robust Inverse Graphics via Probabilistic Inference
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming
Sparse Inducing Points in Deep Gaussian Processes: Enhancing Modeling with Denoising Diffusion Variational Inference
Physics and Lie symmetry informed Gaussian processes
Amortized Variational Deep Kernel Learning
Positive and Unlabeled Learning with Controlled Probability Boundary Fence
Federated Combinatorial Multi-Agent Multi-Armed Bandits
Copula-Nested Spectral Kernel Network
Energy-Efficient Gaussian Processes Using Low-Precision Arithmetic
Stable Differentiable Causal Discovery
Adaptively Learning to Select-Rank in Online Platforms
REMEDI: Corrective Transformations for Improved Neural Entropy Estimation
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling
Non-parametric Online Change Point Detection on Riemannian Manifolds
Bayesian Program Learning by Decompiling Amortized Knowledge
Active Statistical Inference
Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples
Pseudo-Calibration: Improving Predictive Uncertainty Estimation in Unsupervised Domain Adaptation
Differentially Private Domain Adaptation with Theoretical Guarantees
Non-Vacuous Generalization Bounds for Large Language Models
Optimal Coresets for Low-Dimensional Geometric Median
Compositional Few-Shot Class-Incremental Learning
A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models
Data-efficient Large Vision Models through Sequential Autoregression
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Criterion Collapse and Loss Distribution Control
EDISON: Enhanced Dictionary-Induced Tensorized Incomplete Multi-View Clustering with Gaussian Error Rank Minimization
Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods
Exploration by Optimization with Hybrid Regularizers: Logarithmic Regret with Adversarial Robustness in Partial Monitoring
Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency
Hierarchical Integral Probability Metrics: A distance on random probability measures with low sample complexity
Eluder-based Regret for Stochastic Contextual MDPs
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Don’t Label Twice: Quantity Beats Quality when Comparing Binary Classifiers on a Budget
From Geometry to Causality- Ricci Curvature and the Reliability of Causal Inference on Networks
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Correlation-Induced Label Prior for Semi-Supervised Multi-Label Learning
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models
Conditional Common Entropy for Instrumental Variable Testing and Partial Identification
Modular Learning of Deep Causal Generative Models for High-dimensional Causal Inference
A Generative Approach for Treatment Effect Estimation under Collider Bias: From an Out-of-Distribution Perspective
Robust Sparse Estimation for Gaussians with Optimal Error under Huber Contamination
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
Neuro-Visualizer: A Novel Auto-Encoder-Based Loss Landscape Visualization Method With an Application in Knowledge-Guided Machine Learning
Efficient Online Set-valued Classification with Bandit Feedback
On Multi-Armed Bandit with Impatient Arms
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Reducing sequential change detection to sequential estimation
Learning Universal Predictors
Federated Neuro-Symbolic Learning
Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts
Auditing Private Prediction
Differentially private exact recovery for stochastic block models
Differentially Private Representation Learning via Image Captioning
How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization
Ditto: Quantization-aware Secure Inference of Transformers upon MPC
Privacy Profiles for Private Selection
Position: On the Possibilities of AI-Generated Text Detection
Fast Adversarial Attacks on Language Models In One GPU Minute
What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian Benchmarks
Fair Off-Policy Learning from Observational Data
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Learning-Efficient Yet Generalizable Collaborative Filtering for Item Recommendation
InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning
Total Variation Floodgate for Variable Importance Inference in Classification
Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value
Understanding Inter-Concept Relationships in Concept-Based Models
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Semantically-correlated memories in a dense associative model
Self-cognitive Denoising in the Presence of Multiple Noisy Label Sources
Transformers, parallel computation, and logarithmic depth
Probabilistic Time Series Modeling with Decomposable Denoising Diffusion Model
On the Expressive Power of Spectral Invariant Graph Neural Networks
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
Reflective Policy Optimization
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights $\textit{Irreversibly}$ and $\textit{Monotonically}$ Impairs "Difficult" Downstream Tasks in LLMs
The Emergence of Reproducibility and Consistency in Diffusion Models
Switchable Decision: Dynamic Neural Generation Networks
Trust Regions for Explanations via Black-Box Probabilistic Certification
Collaborative Heterogeneous Causal Inference Beyond Meta-analysis
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Collage: Light-Weight Low-Precision Strategy for LLM Training
Is Kernel Prediction More Powerful than Gating in Convolutional Neural Networks?
Towards Realistic Model Selection for Semi-supervised Learning
Sequential Kernel Goodness-of-fit Testing
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
Auto-Regressive Next-Token Predictors are Universal Learners
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
Sampling is as easy as keeping the consistency: convergence guarantee for Consistency Models
Mean-field Chaos Diffusion Models
Completing Visual Objects via Bridging Generation and Segmentation
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
diff History for Neural Language Agents
SILVER: Single-loop variance reduction and application to federated learning
Projecting Molecules into Synthesizable Chemical Spaces
Adaptive Sampling of k-Space in Magnetic Resonance for Rapid Pathology Prediction
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning
Position: Data-driven Discovery with Large Generative Models
UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning
KernelWarehouse: Rethinking the Design of Dynamic Convolution
Surface-VQMAE: Vector-quantized Masked Auto-encoders on Molecular Surfaces
Residual Quantization with Implicit Neural Codebooks
Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
DistiLLM: Towards Streamlined Distillation for Large Language Models
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels
ESNet: Evolution and Succession Network for High-Resolution Salient Object Detection
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Learning Low-dimensional Latent Dynamics from High-dimensional Observations: Non-asymptotics and Lower Bounds
MAGNOLIA: Matching Algorithms via GNNs for Online Value-to-go Approximation
Multi-Factor Adaptive Vision Selection for Egocentric Video Question Answering
In-Context Language Learning: Architectures and Algorithms
Global Reinforcement Learning: Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods
$\bf{\Phi}_\textrm{Flow}$: Differentiable Simulations for PyTorch, TensorFlow and Jax
Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions
Rethinking Transformers in Solving POMDPs
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
Equivariant Graph Neural Operator for Modeling 3D Dynamics
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
Learning Optimal Projection for Forecast Reconciliation of Hierarchical Time Series
Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning
Modelling Microbial Communities with Graph Neural Networks
CW Complex Hypothesis for Image Data
Efficient and Effective Time-Series Forecasting with Spiking Neural Networks
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
AutoOS: Make Your OS More Powerful by Exploiting Large Language Models
Cross-view Masked Diffusion Transformers for Person Image Synthesis
An Embodied Generalist Agent in 3D World
Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition
Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize
Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Overcoming Data and Model heterogeneities in Decentralized Federated Learning via Synthetic Anchors
Trust the Model Where It Trusts Itself - Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling
Compositional Image Decomposition with Diffusion Models
CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models
Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics
Improving Adversarial Energy-Based Model via Diffusion Process
Guidance with Spherical Gaussian Constraint for Conditional Diffusion
Feedback Efficient Online Fine-Tuning of Diffusion Models
Editing Partially Observable Networks via Graph Diffusion Models
Cooperative Graph Neural Networks
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
Enhancing Implicit Shape Generators Using Topological Regularizations
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning
A Statistical Framework for Data-dependent Retrieval-Augmented Models
Comparing Graph Transformers via Positional Encodings
Position: Future Directions in the Theory of Graph Machine Learning
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
CLLMs: Consistency Large Language Models
Accelerating Iterative Retrieval-augmented Language Model Serving with Speculation
Evaluating Quantized Large Language Models
SiBBlInGS: Similarity-driven Building-Block Inference using Graphs across States
Improving Antibody Humanness Prediction using Patent Data
Tell, Don't Show: Language Guidance Eases Transfer Across Domains in Images and Videos
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
CogBench: a large language model walks into a psychology lab
DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Protein Conformation Generation via Force-Guided SE(3) Diffusion Models
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Rethinking Optimization and Architecture for Tiny Language Models
Repoformer: Selective Retrieval for Repository-Level Code Completion
MEMORYLLM: Towards Self-Updatable Large Language Models
Language Models with Conformal Factuality Guarantees
Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models
PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning
Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments
Bottleneck-Minimal Indexing for Generative Document Retrieval
HarmonyDream: Task Harmonization Inside World Models
Monotone Individual Fairness
Enhancing Sufficient Dimension Reduction via Hellinger Correlation
Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers
Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators
Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows
When Representations Align: Universality in Representation Learning Dynamics
Timer: Generative Pre-trained Transformers Are Large Time Series Models
Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention
Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
Understanding the Training Speedup from Sampling with Approximate Losses
Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization
RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation
Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities
Mitigating Label Noise on Graphs via Topological Sample Selection
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Tackling Prevalent Conditions in Unsupervised Combinatorial Optimization: Cardinality, Minimum, Covering, and More
Measures of diversity and space-filling designs for categorical data
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Differentiability and Optimization of Multiparameter Persistent Homology
Non-clairvoyant Scheduling with Partial Predictions
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Privacy-Preserving Embedding via Look-up Table Evaluation with Fully Homomorphic Encryption
How Interpretable Are Interpretable Graph Neural Networks?
Understanding Unimodal Bias in Multimodal Deep Linear Networks
PcLast: Discovering Plannable Continuous Latent States
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling
Proactive DP: A Multiple Target Optimization Framework for DP-SGD
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization
MusicRL: Aligning Music Generation to Human Preferences
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners
Efficient Value Iteration for s-rectangular Robust Markov Decision Processes
An Information Theoretic Approach to Interaction-Grounded Learning
Multi-View Stochastic Block Models
Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation
Interpretability Illusions in the Generalization of Simplified Models
Robust Inverse Constrained Reinforcement Learning under Model Misspecification
Model-based Reinforcement Learning for Parameterized Action Spaces
Confidence-aware Contrastive Learning for Selective Classification
Model-based Reinforcement Learning for Confounded POMDPs
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
Differentially Private Worst-group Risk Minimization
In value-based deep reinforcement learning, a pruned network is a good network
Rate-Optimal Policy Optimization for Linear Markov Decision Processes
Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation
Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
Diffusion Posterior Sampling is Computationally Intractable
Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces
Reinforcement Learning within Tree Search for Fast Macro Placement
Rethinking the Flat Minima Searching in Federated Learning
Random matrix theory improved Fréchet mean of symmetric positive definite matrices
From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions
Graph Attention Retrospective
ACM-MILP: Adaptive Constraint Modification via Grouping and Selection for Hardness-Preserving MILP Instance Generation
Winner-takes-all learners are geometry-aware conditional density estimators
O$n$ Learning Deep O($n$)-Equivariant Hyperspheres
Universal Consistency of Wide and Deep ReLU Neural Networks and Minimax Optimal Convergence Rates for Kolmogorov-Donoho Optimal Function Classes
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models
Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States
How Transformers Learn Causal Structure with Gradient Descent
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding
Algorithmic Stability Unleashed: Generalization Bounds with Unbounded Losses
Private Gradient Descent for Linear Regression: Tighter Error Bounds and Instance-Specific Uncertainty Estimation
Predicting and Interpreting Energy Barriers of Metallic Glasses with Graph Neural Networks
On the Asymptotic Distribution of the Minimum Empirical Risk
ReDiffuser: Reliable Decision-Making Using a Diffuser with Confidence Estimation
No Double Descent in Principal Component Regression: A High-Dimensional Analysis
Generalization Analysis for Multi-Label Learning
Efficient Pareto Manifold Learning with Low-Rank Structure
Understanding the Impact of Introducing Constraints at Inference Time on Generalization Error
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms
$H$-Consistency Guarantees for Regression
Online Learning with Bounded Recall
Factored-Reward Bandits with Intermediate Observations
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization
Online Learning under Budget and ROI Constraints via Weak Adaptivity
Finite Time Logarithmic Regret Bounds for Self-Tuning Regulation
Randomized Confidence Bounds for Stochastic Partial Monitoring
A Unified Adaptive Testing System Enabled by Hierarchical Structure Search
Run-Time Task Composition with Safety Semantics
Reweighted Solutions for Weighted Low Rank Approximation
Jacobian Regularizer-based Neural Granger Causality
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks
COPAL: Continual Pruning in Large Language Generative Models
Mind the Boundary: Coreset Selection via Reconstructing the Decision Boundary
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
On a Combinatorial Problem Arising in Machine Teaching
Understanding MLP-Mixer as a wide and sparse MLP
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
Expand-and-Cluster: Parameter Recovery of Neural Networks
PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming
What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
Regularizing with Pseudo-Negatives for Continual Self-Supervised Learning
Subgoal-based Demonstration Learning for Formal Theorem Proving
Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation
Towards the Theory of Unsupervised Federated Learning: Non-asymptotic Analysis of Federated EM Algorithms
Meta-Reinforcement Learning Robust to Distributional Shift Via Performing Lifelong In-Context Learning
Multi-Source Conformal Inference Under Distribution Shift
Probabilistic Modeling of Interpersonal Coordination Processes
Mean Estimation in the Add-Remove Model of Differential Privacy
Differentially Private Bias-Term Fine-tuning of Foundation Models
Beyond the Calibration Point: Mechanism Comparison in Differential Privacy
Low-Cost High-Power Membership Inference Attacks
Differentially Private Sum-Product Networks
PID: Prompt-Independent Data Protection Against Latent Diffusion Models
Compact Optimality Verification for Optimization Proxies
Robust Universal Adversarial Perturbations
Position: Standardization of Behavioral Use Clauses is Necessary for the Adoption of Responsible Licensing of AI
Sobolev Space Regularised Pre Density Models
PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs
AI Alignment with Changing and Influenceable Reward Functions
Plug-in Performative Optimization
TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors
Extracting Training Data From Document-Based VQA Models
INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
A Multimodal Automated Interpretability Agent
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
Position: Building Guardrails for Large Language Models Requires Systematic Design
MaxMin-RLHF: Alignment with Diverse Human Preferences
AI Control: Improving Safety Despite Intentional Subversion
Indirectly Parameterized Concrete Autoencoders
Distinguishing the Knowable from the Unknowable with Language Models
Large Scale Dataset Distillation with Domain Shift
Watermarks in the Sand: Impossibility of Strong Watermarking for Language Models
DNCs Require More Planning Steps
High-dimensional Linear Bandits with Knapsacks
$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
Compositional Curvature Bounds for Deep Neural Networks
Evolution-Inspired Loss Functions for Protein Representation Learning
A Space Group Symmetry Informed Network for O(3) Equivariant Crystal Tensor Prediction
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Position: TrustLLM: Trustworthiness in Large Language Models
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Single-Trajectory Distributionally Robust Reinforcement Learning
Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty
Training-Free Long-Context Scaling of Large Language Models
LoRA+: Efficient Low Rank Adaptation of Large Models
OxyGenerator: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
Gaussian Plane-Wave Neural Operator for Electron Density Estimation
Explain Temporal Black-Box Models via Functional Decomposition
Simplicity Bias via Global Convergence of Sharpness Minimization
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Memory Efficient Neural Processes via Constant Memory Attention Block
Generalization Analysis of Stochastic Weight Averaging with General Sampling
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Hierarchical Novelty Detection via Fine-Grained Evidence Allocation
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Position: Enforced Amnesia as a Way to Mitigate the Potential Risk of Silent Suffering in the Conscious AI
A3S: A General Active Clustering Method with Pairwise Constraints
Fast Decision Boundary based Out-of-Distribution Detector
Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Offline Actor-Critic Reinforcement Learning Scales to Large Models
Graph Neural Stochastic Diffusion for Estimating Uncertainty in Node Classification
Flexible Residual Binarization for Image Super-Resolution
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Kernel Semi-Implicit Variational Inference
Intersectional Unfairness Discovery
Towards General Algorithm Discovery for Combinatorial Optimization: Learning Symbolic Branching Policy from Bipartite Graph
Accelerating PDE Data Generation via Differential Operator Action in Solution Space
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design
Adaptive Robust Learning using Latent Bernoulli Variables
Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior
Language Generation with Strictly Proper Scoring Rules
AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion
Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction
Prompt-based Visual Alignment for Zero-shot Policy Transfer
MFTN: A Multi-scale Feature Transfer Network Based on IMatchFormer for Hyperspectral Image Super-Resolution
Stereo Risk: A Continuous Modeling Approach to Stereo Matching
SCoRe: Submodular Combinatorial Representation Learning
Saliency strikes back: How filtering out high frequencies improves white-box explanations
On the Implicit Bias of Adam
From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems
FESSNC: Fast Exponentially Stable and Safe Neural Controller
GaussianPro: 3D Gaussian Splatting with Progressive Propagation
QuIP$\#$: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
Parameterized Physics-informed Neural Networks for Parameterized PDEs
An Analysis of Linear Time Series Forecasting Models
PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching
VideoPrism: A Foundational Visual Encoder for Video Understanding
Learning High-Order Relationships of Brain Regions
AegisFL: Efficient and Flexible Privacy-Preserving Byzantine-Robust Cross-silo Federated Learning
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation
Knowledge Distillation with Auxiliary Variable
A Neural-Guided Dynamic Symbolic Network for Exploring Mathematical Expressions from Data
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Clifford-Steerable Convolutional Neural Networks
Learning Latent Structures in Network Games via Data-Dependent Gated-Prior Graph Variational Autoencoders
Risk-Sensitive Reward-Free Reinforcement Learning with CVaR
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
D-Flow: Differentiating through Flows for Controlled Generation
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
Sign Gradient Descent-based Neuronal Dynamics: ANN-to-SNN Conversion Beyond ReLU Network
Viewing Transformers Through the Lens of Long Convolutions Layers
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model
Do Efficient Transformers Really Save Computation?
Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models
Accelerating Parallel Sampling of Diffusion Models
Light and Optimal Schrödinger Bridge Matching
Vague Prototype-Oriented Diffusion Model for Multi-Class Anomaly Detection
Accelerating Convergence of Score-Based Diffusion Models, Provably
Leverage Class-Specific Accuracy to Guide Data Generation for Improving Image Classification
Position: Compositional Generative Modeling: A Single Model is Not All You Need
Prompt-tuning Latent Diffusion Models for Inverse Problems
Semantic-Aware Human Object Interaction Image Generation
Knowledge Graphs Can be Learned with Just Intersection Features
Quantum Positional Encodings for Graph Neural Networks
On the Generalization of Equivariant Graph Neural Networks
Graph Neural Networks Use Graphs When They Shouldn't
On the Second-Order Convergence of Biased Policy Gradient Algorithms
The Expressive Power of Path-Based Graph Neural Networks
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Dual Operating Modes of In-Context Learning
Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models
In-Context Unlearning: Language Models as Few-Shot Unlearners
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
Assessing Large Language Models on Climate Information
Why Larger Language Models Do In-context Learning Differently?
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
RLVF: Learning from Verbal Feedback without Overgeneralization
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Linguistic Calibration of Long-Form Generations
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
InferCept: Efficient Intercept Support for Augmented Large Language Model Inference
From Classification Accuracy to Proper Scoring Rules: Elicitability of Probabilistic Top List Predictions
ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
Interpreting Equivariant Representations
Position: Stop Making Unscientific AGI Performance Claims
LCA-on-the-Line: Benchmarking Out of Distribution Generalization with Class Taxonomies
Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents
Few-shot Adaptation to Distribution Shifts By Mixing Source and Target Embeddings
Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis
Fair Federated Learning via the Proportional Veto Core
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Simulation of Graph Algorithms with Looped Transformers
Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective
Beyond Point Prediction: Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process
On the Identifiability of Switching Dynamical Systems
ReLUs Are Sufficient for Learning Implicit Neural Representations
On the Last-Iterate Convergence of Shuffling Gradient Methods
Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC
On The Complexity of First-Order Methods in Stochastic Bilevel Optimization
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
A Universal Transfer Theorem for Convex Optimization Algorithms Using Inexact First-order Oracles
Multiplicative Weights Update, Area Convexity and Random Coordinate Descent for Densest Subgraph Problems
Verifying message-passing neural networks via topology-based bounds tightening
A Dynamic Algorithm for Weighted Submodular Cover Problem
Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone Inclusion
Autonomous Sparse Mean-CVaR Portfolio Optimization
On Convergence of Incremental Gradient for Non-convex Smooth Functions
Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning
What is the Long-Run Distribution of Stochastic Gradient Descent? A Large Deviations Analysis
Differentiable Distributionally Robust Optimization Layers
Infinite-Horizon Distributionally Robust Regret-Optimal Control
High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise
Quantum Algorithm for Online Exp-concave Optimization
Towards AutoAI: Optimizing a Machine Learning System with Black-box and Differentiable Components
Clustered Federated Learning via Gradient-based Partitioning
DFlow: A Generative Model Combining Denoising AutoEncoder and Normalizing Flow for High Fidelity Waveform Generation
A Federated Stochastic Multi-level Compositional Minimax Algorithm for Deep AUC Maximization
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
SparseTSF: Modeling Long-term Time Series Forecasting with *1k* Parameters
The Max-Min Formulation of Multi-Objective Reinforcement Learning: From Theory to a Model-Free Algorithm
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration
Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains
Causal Action Influence Aware Counterfactual Data Augmentation
Q-value Regularized Transformer for Offline Reinforcement Learning
Do Transformer World Models Give Better Policy Gradients?
Robust Classification via a Single Diffusion Model
Floating Anchor Diffusion Model for Multi-motif Scaffolding
Mimicking Better by Matching the Approximate Action Distribution
Fast Peer Adaptation with Context-aware Exploration
Adversarial Attacks on Combinatorial Multi-Armed Bandits
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Policy Evaluation for Variance in Average Reward Reinforcement Learning
Sliced-Wasserstein Estimation with Spherical Harmonics as Control Variates
A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes
DOGE: Domain Reweighting with Generalization Estimation
Path-Guided Particle-based Sampling
An amortized approach to non-linear mixed-effects modeling based on neural posterior estimation
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance
Liouville Flow Importance Sampler
Variational Inference with Coverage Guarantees in Simulation-Based Inference
Robust Graph Matching when Nodes are Corrupt
Online Variational Sequential Monte Carlo
Understanding Stochastic Natural Gradient Variational Inference
Category-Aware Active Domain Adaptation
Network Tight Community Detection
MILP-FBGen: LP/MILP Instance Generation with Feasibility/Boundedness
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
Peeking with PEAK: Sequential, Nonparametric Composite Hypothesis Tests for Means of Multiple Data Streams
Lessons from Generalization Error Analysis of Federated Learning: You May Communicate Less Often!
Towards Generalization beyond Pointwise Learning: A Unified Information-theoretic Perspective
Online conformal prediction with decaying step sizes
Stochastic Bandits with ReLU Neural Networks
Active Ranking and Matchmaking, with Perfect Matchings
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Near-Linear Time Approximation Algorithms for k-means with Outliers
Inferring Change Points in High-Dimensional Linear Regression via Approximate Message Passing
How Flawed Is ECE? An Analysis via Logit Smoothing
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
A Dual-module Framework for Counterfactual Estimation over Time
Contamination-Resilient Anomaly Detection via Adversarial Learning on Partially-Observed Normal and Anomalous Data
Geometry-Aware Instrumental Variable Regression
Position: Machine Learning-powered Assessments of the EU Digital Services Act Aid Quantify Policy Impacts on Online Harms
Multi-View Clustering by Inter-cluster Connectivity Guided Reward
Fast Co-Training under Weak Dependence via Stream-Based Active Learning
An Unsupervised Approach for Periodic Source Detection in Time Series
Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning
Neural Tangent Kernels for Axis-Aligned Tree Ensembles
Consistent Long-Term Forecasting of Ergodic Dynamical Systems
RMIB: Representation Matching Information Bottleneck for Matching Text Representations
Learning in Feature Spaces via Coupled Covariances: Asymmetric Kernel SVD and Nyström method
Discovering Features with Synergistic Interactions in Multiple Views
Decouple then Classify: A Dynamic Multi-view Labeling Strategy with Shared and Specific Information
Cross-domain Open-world Discovery
New Bounds on the Cohesion of Complete-link and Other Linkage Methods for Agglomerative Clustering
Time Series Diffusion in the Frequency Domain
Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise
Bayesian Adaptation of Network Depth and Width for Continual Learning
Inferring Dynamic Networks from Marginals with Iterative Proportional Fitting
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
How Universal Polynomial Bases Enhance Spectral Graph Neural Networks: Heterophily, Over-smoothing, and Over-squashing
Data Engineering for Scaling Language Models to 128K Context
Unsupervised Episode Generation for Graph Meta-learning
Localizing Task Information for Improved Model Merging and Compression
Position: A Call to Action for a Human-Centered AutoML Paradigm
Low-Rank Similarity Mining for Multimodal Dataset Distillation
Differentiable Mapper for Topological Optimization of Data Representation
Position: The Reasonable Person Standard for AI
Position: Social Environment Design Should be Further Developed for AI-based Policy-Making
Make-A-Shape: a Ten-Million-scale 3D Shape Model
The Privacy Power of Correlated Noise in Decentralized Learning
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning
Interpreting and Improving Large Language Models in Arithmetic Calculation
On Discrete Prompt Optimization for Diffusion Models
Incorporating Information into Shapley Values: Reweighting via a Maximum Entropy Approach
Balanced Data, Imbalanced Spectra: Unveiling Class Disparities with Spectral Imbalance
Verification of Machine Unlearning is Fragile
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective
Optimistic Multi-Agent Policy Gradient
Benchmarking Deletion Metrics with the Principled Explanations
Counterfactual Metarules for Local and Global Recourse
Position: A Safe Harbor for AI Evaluation and Red Teaming
Don't trust your eyes: on the (un)reliability of feature visualizations
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Learning Decision Trees and Forests with Algorithmic Recourse
FRAPPÉ: A Group Fairness Framework for Post-Processing Everything
Bringing Motion Taxonomies to Continuous Domains via GPLVM on Hyperbolic manifolds
Large Language Models are Geographically Biased
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design
Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
Domain-wise Data Acquisition to Improve Performance under Distribution Shift
SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity
Detecting Any instruction-to-answer interaction relationship: Universal Instruction-to-Answer Navigator for Med-VQA
Vector Quantization Pretraining for EEG Time Series with Random Projection and Phase Alignment
Privately Learning Smooth Distributions on the Hypercube by Projections
Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee
Chain-of-Thought Predictive Control
A Statistical Theory of Regularization-Based Continual Learning
Position: Relational Deep Learning - Graph Representation Learning on Relational Databases
Adversarially Robust Deep Multi-View Clustering: A Novel Attack and Defense Framework
FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction
Promoting External and Internal Equities Under Ex-Ante/Ex-Post Metrics in Online Resource Allocation
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models
BAGEL: Bootstrapping Agents by Guiding Exploration with Language
Receptive Fields As Experts in Convolutional Neural Architectures
AMPA: Adaptive Mixed Precision Allocation for Low-Bit Integer Training
MultiMax: Sparse and Multi-Modal Attention Learning
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Bootstrapping Fisher Market Equilibrium and First-Price Pacing Equilibrium
LESS: Selecting Influential Data for Targeted Instruction Tuning
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
A Provable Decision Rule for Out-of-Distribution Detection
DetKDS: Knowledge Distillation Search for Object Detectors
Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
Critical feature learning in deep neural networks
On the Embedding Collapse when Scaling up Recommendation Models
Position: Opportunities Exist for Machine Learning in Magnetic Fusion Energy
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning
On dimensionality of feature vectors in MPNNs
NDOT: Neuronal Dynamics-based Online Training for Spiking Neural Networks
Causal-IQA: Towards the Generalization of Image Quality Assessment Based on Causal Inference
Grokking Group Multiplication with Cosets
Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
Critical windows: non-asymptotic theory for feature emergence in diffusion models
DUPLEX: Dual GAT for Complex Embedding of Directed Graphs
SHINE: Shielding Backdoors in Deep Reinforcement Learning
Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty
In-Context Reinforcement Learning for Variable Action Spaces
Byzantine Resilient and Fast Federated Few-Shot Learning
Test-Time Model Adaptation with Only Forward Passes
Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning
A Global Geometric Analysis of Maximal Coding Rate Reduction
An LLM Compiler for Parallel Function Calling
LLM-Empowered State Representation for Reinforcement Learning
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
Rethinking Specificity in SBDD: Leveraging Delta Score and Energy-Guided Diffusion
MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space
Learning to Predict Mutational Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning
Pre-Training Protein Bi-level Representation Through Span Mask Strategy On 3D Protein Chains
NeuralIndicator: Implicit Surface Reconstruction from Neural Indicator Priors
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations
ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance
Contextualized Policy Recovery: Modeling and Interpreting Medical Decisions with Adaptive Imitation Learning
Contrastive Learning for Clinical Outcome Prediction with Partial Data Sources
Language Models as Science Tutors
AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios
ILILT: Implicit Learning of Inverse Lithography Technologies
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation
Dynamic Survival Analysis with Controlled Latent States
Sample as you Infer: Predictive Coding with Langevin Dynamics
Identifiability Matters: Revealing the Hidden Recoverable Condition in Unbiased Learning to Rank
Offline Multi-Objective Optimization
Auctionformer: A Unified Deep Learning Algorithm for Solving Equilibrium Strategies in Auction Games
Autoformalizing Euclidean Geometry
Applying language models to algebraic topology: generating simplicial cycles using multi-labeling in Wu's formula
Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters
Multi-layer Rehearsal Feature Augmentation for Class-Incremental Learning
DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation
Improving Open-Ended Text Generation via Adaptive Decoding
TimeX++: Learning Time-Series Explanations with Information Bottleneck
SIN: Selective and Interpretable Normalization for Long-Term Time Series Forecasting
SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic
Smooth Min-Max Monotonic Networks
An Effective Dynamic Gradient Calibration Method for Continual Learning
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
How do Transformers Perform In-Context Autoregressive Learning?
Why do Variational Autoencoders Really Promote Disentanglement?
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
A Geometric Explanation of the Likelihood OOD Detection Paradox
Proteus: Exploring Protein Structure Generation for Enhanced Designability and Efficiency
StrokeNUWA—Tokenizing Strokes for Vector Graphic Synthesis
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
CARTE: Pretraining and Transfer for Tabular Learning
Weisfeiler-Leman at the margin: When more expressivity matters
Disentangled Continual Graph Neural Architecture Search with Invariant Modular Supernet
Swallowing the Bitter Pill: Simplified Scalable Conformer Generation
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
DoRA: Weight-Decomposed Low-Rank Adaptation
Position: Mission Critical – Satellite Data is a Distinct Modality in Machine Learning
Efficient Algorithms for Empirical Group Distributionally Robust Optimization and Beyond
Towards Efficient Exact Optimization of Language Model Alignment
Position: Open-Endedness is Essential for Artificial Superhuman Intelligence
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
StrWAEs to Invariant Representations
Interpretable Deep Clustering for Tabular Data
CaM: Cache Merging for Memory-efficient LLMs Inference
NExT-GPT: Any-to-Any Multimodal LLM
Position: Will we run out of data? Limits of LLM scaling based on human-generated data
Ameliorate Spurious Correlations in Dataset Condensation
A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Interpreting and Improving Diffusion Models from an Optimization Perspective
Gated Linear Attention Transformers with Hardware-Efficient Training
Partially Stochastic Infinitely Deep Bayesian Neural Networks
Do Topological Characteristics Help in Knowledge Distillation?
Partial Multi-View Multi-Label Classification via Semantic Invariance Learning and Prototype Modeling
Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Position: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training
LLark: A Multimodal Instruction-Following Language Model for Music
DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation
Efficient Contrastive Learning for Fast and Accurate Inference on Graphs
Deep Equilibrium Models are Almost Equivalent to Not-so-deep Explicit Models for High-dimensional Gaussian Mixtures
Data-Efficient Learning via Clustering-Based Sensitivity Sampling: Foundation Models and Beyond
High-Probability Bound for Non-Smooth Non-Convex Stochastic Optimization with Heavy Tails
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Deep Functional Factor Models: Forecasting High-Dimensional Functional Time Series via Bayesian Nonparametric Factorization
Scalable Multiple Kernel Clustering: Learning Clustering Structure from Expectation
One Size Fits All for Semantic Shifts: Adaptive Prompt Tuning for Continual Learning
SparQ Attention: Bandwidth-Efficient LLM Inference
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Adaptive Stabilization Based on Machine Learning for Column Generation
Random Scaling and Momentum for Non-smooth Non-convex Optimization
Boundary Exploration for Bayesian Optimization With Unknown Physical Constraints
Efficient Stochastic Approximation of Minimax Excess Risk Optimization
Bridging Model Heterogeneity in Federated Learning via Uncertainty-based Asymmetrical Reciprocity Learning
Shifted Interpolation for Differential Privacy
Offline Training of Language Model Agents with Functions as Learnable Weights
HAMLET: Graph Transformer Neural Operator for Partial Differential Equations
Empowering Graph Invariance Learning with Deep Spurious Infomax
Test-Time Regret Minimization in Meta Reinforcement Learning
Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations
Averaging $n$-step Returns Reduces Variance in Reinforcement Learning
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Simple Ingredients for Offline Reinforcement Learning
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Learning to Model the World With Language
Foundation Policies with Hilbert Representations
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments
Enhancing Class-Imbalanced Learning with Pre-Trained Guidance through Class-Conditional Knowledge Distillation
Learning Divergence Fields for Shift-Robust Graph Representations
Hard Tasks First: Multi-Task Reinforcement Learning Through Task Scheduling
How Graph Neural Networks Learn: Lessons from Training Dynamics
Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data
Learning from Integral Losses in Physics Informed Neural Networks
FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning
Sampling in Unit Time with Kernel Fisher-Rao Flow
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning
Risk Estimation in a Markov Cost Process: Lower and Upper Bounds
On Universally Optimal Algorithms for A/B Testing
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses
Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference
Scaling Laws for the Value of Individual Data Points in Machine Learning
The Relative Value of Prediction in Algorithmic Decision Making
Practical Hamiltonian Monte Carlo on Riemannian Manifolds via Relativity Theory
Estimating the Permanent by Nesting Importance Sampling
Particle Denoising Diffusion Sampler
Unified Training of Universal Time Series Forecasting Transformers
A Unified Framework for Learning with Nonlinear Model Classes from Arbitrary Linear Samples
Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model
A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle
Asymptotics of Learning with Deep Structured (Random) Features
A General Framework for Sequential Decision-Making under Adaptivity Constraints
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
Learning Iterative Reasoning through Energy Diffusion
Fundamental Limitations of Alignment in Large Language Models
On The Statistical Complexity of Offline Decision-Making
Learning the Uncertainty Sets of Linear Control Systems via Set Membership: A Non-asymptotic Analysis
Towards Scalable and Versatile Weight Space Learning
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
Probability Distribution of Hypervolume Improvement in Bi-objective Bayesian Optimization
Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification
A Tensor Decomposition Perspective on Second-order RNNs
Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization
Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery
On Interpolating Experts and Multi-Armed Bandits
Rethinking Momentum Knowledge Distillation in Online Continual Learning
Conformalized Survival Distributions: A Generic Post-Process to Increase Calibration
Online Algorithms with Uncertainty-Quantified Predictions
Conformal Validity Guarantees Exist for Any Data Distribution (and How to Find Them)
Minimizing $f$-Divergences by Interpolating Velocity Fields
Evaluating Instrument Validity using the Principle of Independent Mechanisms
Understanding Forgetting in Continual Learning with Linear Regression
Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization
Prediction-powered Generalization of Causal Inferences
Improving Interpretation Faithfulness for Vision Transformers
Position: The Causal Revolution Needs Scientific Pragmatism
Locally Differentially Private Decentralized Stochastic Bilevel Optimization with Guaranteed Convergence Accuracy
On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery
Barrier Algorithms for Constrained Non-Convex Optimization
Foundations of Testing for Finite-Sample Causal Discovery
Learning Mixtures of Gaussian Processes through Random Projection
Position: $C^*$-Algebraic Machine Learning $-$ Moving in a New Direction
Beyond the Norms: Detecting Prediction Errors in Regression Models
Stochastic Weakly Convex Optimization beyond Lipschitz Continuity
Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates
Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits
Wukong: Towards a Scaling Law for Large-Scale Recommendation
Provable Interactive Learning with Hindsight Instruction Feedback
Naive Bayes Classifiers over Missing Data: Decision and Poisoning
Learning with 3D rotations, a hitchhiker's guide to SO(3)
Detecting Influence Structures in Multi-Agent Reinforcement Learning
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Diffusive Gibbs Sampling
Language Models as Semantic Indexers
Discounted Adaptive Online Learning: Towards Better Regularization
Rapid Learning without Catastrophic Forgetting in the Morris Water Maze
LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views
Online Adaptive Anomaly Thresholding with Confidence Sequences
Learning to Continually Learn with the Bayesian Principle
Sliced Wasserstein with Random-Path Projecting Directions
Causality Based Front-door Defense Against Backdoor Attack on Language Models
Distribution Alignment Optimization through Neural Collapse for Long-tailed Classification
Optimal Transport for Structure Learning Under Missing Data
Parameter Estimation in DAGs from Incomplete Data via Optimal Transport
Graph External Attention Enhanced Transformer
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions
Privacy Preserving Adaptive Experiment Design
Neural Collapse meets Differential Privacy: Curious behaviors of NoisyGD with Near-Perfect Representation Learning
Equilibrium of Data Markets with Externality
OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning
Pluvial Flood Emulation with Hydraulics-informed Message Passing
How to Leverage Diverse Demonstrations in Offline Imitation Learning
Multi-Sender Persuasion: A Computational Perspective
Provable Privacy with Non-Private Pre-Processing
A Persuasive Approach to Combating Misinformation
From Generalization Analysis to Optimization Designs for State Space Models
Linear Explanations for Individual Neurons
A General Framework for Learning from Weak Supervision
Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs
Data-free Distillation of Diffusion Models with Bootstrapping
Estimating Distributional Treatment Effects in Randomized Experiments: Machine Learning for Variance Reduction
Discovering Environments with XRM
Better & Faster Large Language Models via Multi-token Prediction
CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources
Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
Inexact Newton-type Methods for Optimisation with Nonnegativity Constraints
Learning Shadow Variable Representation for Treatment Effect Estimation under Collider Bias
Provable Benefits of Local Steps in Heterogeneous Federated Learning for Neural Networks: A Feature Learning Perspective
Conformalized Adaptive Forecasting of Heterogeneous Trajectories
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Fast White-Box Adversarial Streaming Without a Random Oracle
Position: The Platonic Representation Hypothesis
Keypoint-based Progressive Chain-of-Thought Distillation for LLMs
Uncertainty Estimation by Density Aware Evidential Deep Learning
Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning
Transferable Facial Privacy Protection against Blind Face Restoration via Domain-Consistent Adversarial Obfuscation
Seesaw: Compensating for Nonlinear Reduction with Linear Computations for Private Inference
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation
Getting the most out of your tokenizer for pre-training and domain adaptation
Absolute Policy Optimization: Enhancing Lower Probability Bound of Performance with High Confidence
Stochastic Quantum Sampling for Non-Logconcave Distributions and Estimating Partition Functions
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers
Privacy Attacks in Decentralized Learning
Revisit the Essence of Distilling Knowledge through Calibration
MF-CLR: Multi-Frequency Contrastive Learning Representation for Time Series
How Language Model Hallucinations Can Snowball
Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation
Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Subsampling is not Magic: Why Large Batch Sizes Work for Differentially Private Stochastic Optimisation
Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy
A Contextual Combinatorial Bandit Approach to Negotiation
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
CF-OPT: Counterfactual Explanations for Structured Prediction
Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Convergence and Trade-Offs in Riemannian Gradient Descent and Riemannian Proximal Point
On the Independence Assumption in Neurosymbolic Learning
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
FRAG: Frequency Adapting Group for Diffusion Video Editing
Sliding Down the Stairs: How Correlated Latent Variables Accelerate Learning with Neural Networks
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations
Learning Latent Dynamic Robust Representations for World Models
Mastering Zero-Shot Interactions in Cooperative and Competitive Simultaneous Games
Tight Partial Identification of Causal Effects with Marginal Distribution of Unmeasured Confounders
IOI: Invisible One-Iteration Adversarial Attack on No-Reference Image- and Video-Quality Metrics
On the Universality of Volume-Preserving and Coupling-Based Normalizing Flows
Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training
Emergent Equivariance in Deep Ensembles
PhAST: Physics-Aware, Scalable, and Task-Specific GNNs for Accelerated Catalyst Design
Causal Inference from Competing Treatments
Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction
Information-Directed Pessimism for Offline Reinforcement Learning
Towards Certified Unlearning for Deep Neural Networks
A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models
Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees
S$\Omega$I: Score-based O-INFORMATION Estimation
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
Decomposable Submodular Maximization in Federated Setting
AlphaFold Meets Flow Matching for Generating Protein Ensembles
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
Learning Pseudo-Contractive Denoisers for Inverse Problems
Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates
Embarrassingly Parallel GFlowNets
Improving fine-grained understanding in image-text pre-training
Controlled Decoding from Language Models
Prediction Accuracy of Learning in Games: Follow-the-Regularized-Leader meets Heisenberg
Scalable AI Safety via Doubly-Efficient Debate
Generalization Error of Graph Neural Networks in the Mean-field Regime
Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases
A fast algorithm to simulate nonlinear resistive networks
Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment
Token-level Direct Preference Optimization
Logistic Variational Bayes Revisited
FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler
Conformal Prediction with Learned Features
On the Error-Propagation of Inexact Hotelling's Deflation for Principal Component Analysis
RODEO: Robust Outlier Detection via Exposing Adaptive Out-of-Distribution Samples
Fair Data Representation for Machine Learning at the Pareto Frontier
Position: Scarce Resource Allocations That Rely On Machine Learning Should Be Randomized
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Feature Importance Disparities for Data Bias Investigations
Evaluating Model Bias Requires Characterizing its Mistakes
Dense Reward for Free in Reinforcement Learning from Human Feedback
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model
GeoAB: Towards Realistic Antibody Design and Reliable Affinity Maturation
Data-Efficient Molecular Generation with Hierarchical Textual Inversion
Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
Structure-based drug design by denoising voxel grids
A Unified Recipe for Deriving (Time-Uniform) PAC-Bayes Bounds
Two-timescale Derivative Free Optimization for Performative Prediction with Markovian Data
Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution
Scribble-Supervised Semantic Segmentation with Prototype-based Feature Augmentation
Adaptively Perturbed Mirror Descent for Learning in Games
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
Deep Stochastic Mechanics
Position: Quo Vadis, Unsupervised Time Series Anomaly Detection?
MorphGrower: A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation
Multi-Region Markovian Gaussian Process: An Efficient Method to Discover Directional Communications Across Multiple Brain Regions
Enforcing Constraints in RNA Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations
Switched Flow Matching: Eliminating Singularities via Switching ODEs
An Independence-promoting Loss for Music Generation with Language Models
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
Towards Neural Architecture Search through Hierarchical Generative Modeling
Directly Denoising Diffusion Models
Learning to Explore in POMDPs with Informational Rewards
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
CKGConv: General Graph Convolution with Continuous Kernels
Graph Positional and Structural Encoder
Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs
An Intrinsic Vector Heat Network
In-Context Principle Learning from Mistakes
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation
R2E: Turning any Github Repository into a Programming Agent Environment
Exploring the LLM Journey from Cognition to Expression with Linear Representations
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Understanding Finetuning for Factual Knowledge Extraction
To the Max: Reinventing Reward in Reinforcement Learning
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Can AI Assistants Know What They Don't Know?
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Using Left and Right Brains Together: Towards Vision and Language Planning
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
Deep Regression Representation Learning with Topology
Weakly-Supervised Residual Evidential Learning for Multi-Instance Uncertainty Estimation
Fine-grained Local Sensitivity Analysis of Standard Dot-Product Self-Attention
Fault Tolerant ML: Efficient Meta-Aggregation and Synchronous Training
BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Position: Categorical Deep Learning is an Algebraic Theory of All Architectures
TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules
In-context Learning on Function Classes Unveiled for Transformers
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Learning to Compile Programs to Neural Networks
Delving into the Convergence of Generalized Smooth Minimax Optimization
Finite Smoothing Algorithm for High-Dimensional Support Vector Machines and Quantile Regression
Lookbehind-SAM: k steps back, 1 step forward
Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-Loop and Hessian-Free Solution Strategy
Submodular framework for structured-sparse optimal transport
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models
Scalable Online Exploration via Coverability
SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets
A Unified View of FANOVA: A Comprehensive Bayesian Framework for Component Selection and Estimation
Scaling Tractable Probabilistic Circuits: A Systems Perspective
Optimal Batched Linear Bandits
Sampling-based Multi-dimensional Recalibration
Nonparametric Teaching of Implicit Neural Representations
Nesting Particle Filters for Experimental Design in Dynamical Systems
Minimum-Norm Interpolation Under Covariate Shift
Can a Few Decide for Many? The Metric Distortion of Sortition
On Least Square Estimation in Softmax Gating Mixture of Experts
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Generalization Bounds for Heavy-Tailed SDEs through the Fractional Fokker-Planck Equation
Uniformly Stable Algorithms for Adversarial Training and Beyond
Online Learning in Betting Markets: Profit versus Prediction
Estimating Canopy Height at Scale
Incentivized Learning in Principal-Agent Bandit Games
Improved Dimensionality Dependence for Zeroth-Order Optimisation over Cross-Polytopes
On Online Experimentation without Device Identifiers
Generalizing Orthogonalization for Models with Non-Linearities
TIC-TAC: A Framework For Improved Covariance Estimation In Deep Heteroscedastic Regression
RankSEG: A Consistent Ranking-based Framework for Segmentation
Dynamic Spectral Clustering with Provable Approximation Guarantee
UGrid: An Efficient-And-Rigorous Neural Multigrid Solver for Linear PDEs
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Causal Inference out of Control: Estimating Performativity without Treatment Randomization
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
Tuning-free Estimation and Inference of Cumulative Distribution Function under Local Differential Privacy
Making Old Things New: A Unified Algorithm for Differentially Private Clustering
MaSS: Multi-attribute Selective Suppression for Utility-preserving Data Transformation from an Information-theoretic Perspective
Sparse Dimensionality Reduction Revisited
NExT-Chat: An LMM for Chat, Detection and Segmentation
Thermometer: Towards Universal Calibration for Large Language Models
Privacy-Preserving Instructions for Aligning Large Language Models
Membership Inference Attacks on Diffusion Models via Quantile Regression
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders
Relational DNN Verification With Cross Executional Bound Refinement
Adaptive Group Personalization for Federated Mutual Transfer Learning
One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment
Zero-Shot Reinforcement Learning via Function Encoders
Value-Evolutionary-Based Reinforcement Learning
Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
Harmonic Self-Conditioned Flow Matching for joint Multi-Ligand Docking and Binding Site Design
An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems
Using AI Uncertainty Quantification to Improve Human Decision-Making
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Multiply-Robust Causal Change Attribution
Classification Under Strategic Self-Selection
Principled Gradient-Based MCMC for Conditional Sampling of Text
Parameter-Efficient Fine-Tuning with Controls
Implicit meta-learning may lead language models to trust more reliable sources
LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering
Graphon Mean Field Games with a Representative Player: Analysis and Learning Algorithm
ArtWhisperer: A Dataset for Characterizing Human-AI Interactions in Artistic Creations
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Reinforcement Learning and Regret Bounds for Admission Control
Efficient Algorithms for Sum-Of-Minimum Optimization
Hyperbolic Optimizer as a Dynamical System
Learning Modality Knowledge Alignment for Cross-Modality Transfer
Impact of Decentralized Learning on Player Utilities in Stackelberg Games
Provable Contrastive Continual Learning
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Position: Application-Driven Innovation in Machine Learning
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
S3O: A Dual-Phase Approach for Reconstructing Dynamic Shape and Skeleton of Articulated Objects from Single Monocular Video
SurfPro: Functional Protein Design Based on Continuous Surface
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation
A Linear Time and Space Local Point Cloud Geometry Encoder via Vectorized Kernel Mixture (VecKM)
Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset
An Explicit Frame Construction for Normalizing 3D Point Clouds
DiffDA: a Diffusion model for weather-scale Data Assimilation
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
Few-Shot Unsupervised Implicit Neural Shape Representation Learning with Spatial Adversaries
Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation
The Pitfalls of Next-Token Prediction
Conditionally-Conjugate Gaussian Process Factor Analysis for Spike Count Data via Data Augmentation
Diversified Batch Selection for Training Acceleration
Sharpness-Aware Data Generation for Zero-shot Quantization
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Rolling Diffusion Models
Compositional Text-to-Image Generation with Dense Blob Representations
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Faster Streaming and Scalable Algorithms for Finding Directed Dense Subgraphs in Large Graphs
Learning the Target Network in Function Space
HumanTOMATO: Text-aligned Whole-body Motion Generation
A Neural-Preconditioned Poisson Solver for Mixed Dirichlet and Neumann Boundary Conditions
Weisfeiler Leman for Euclidean Equivariant Machine Learning
LoCoCo: Dropping In Convolutions for Long Context Compression
Topological Neural Networks go Persistent, Equivariant, and Continuous
GNNs Also Deserve Editing, and They Need It More Than Once
High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
Triadic-OCD: Asynchronous Online Change Detection with Provable Robustness, Optimality, and Convergence
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Adaptive Text Watermark for Large Language Models
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
Nonlinear Filtering with Brenier Optimal Transport Maps
Position: Towards Unified Alignment Between Agents, Humans, and Environment
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Magicoder: Empowering Code Generation with OSS-Instruct
A Closer Look at the Limitations of Instruction Tuning
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach
Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model
SelfIE: Self-Interpretation of Large Language Model Embeddings
STEER: Assessing the Economic Rationality of Large Language Models
Efficient Error Certification for Physics-Informed Neural Networks
Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
Benign Overfitting in Adversarial Training of Neural Networks
Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module
Generalization Bound and New Algorithm for Clean-Label Backdoor Attack
Collapse-Aware Triplet Decoupling for Adversarially Robust Image Retrieval
Robustness of Deep Learning for Accelerated MRI: Benefits of Diverse Training Data
Uncertainty-Aware Reward-Free Exploration with General Function Approximation
On the Convergence of Projected Bures-Wasserstein Gradient Descent under Euclidean Strong Convexity
On the Effectiveness of Supervision in Asymmetric Non-Contrastive Learning
An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network
Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
The Effect of Weight Precision on the Neuron Count in Deep ReLU Networks
DiNADO: Norm-Disentangled Neurally-Decomposed Oracles for Controlling Language Models
Deep Networks Always Grok and Here is Why
Loss Shaping Constraints for Long-Term Time Series Forecasting
Contextual Feature Selection with Conditional Stochastic Gates
MGit: A Model Versioning and Management System
What is Dataset Distillation Learning?
A Rate-Distortion View of Uncertainty Quantification
Prodigy: An Expeditiously Adaptive Parameter-Free Learner
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
Revisiting Inexact Fixed-Point Iterations for Min-Max Problems: Stochasticity and Structured Nonconvexity
How to Escape Sharp Minima with Random Perturbations
Non-convex Stochastic Composite Optimization with Polyak Momentum
Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds
MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence
A Doubly Recursive Stochastic Compositional Gradient Descent Method for Federated Multi-Level Compositional Optimization
Federated Representation Learning in the Under-Parameterized Regime
Decentralized Convex Finite-Sum Optimization with Better Dependence on Condition Numbers
Towards a Better Theoretical Understanding of Independent Subnetwork Training
Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching
Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration
Boosting Offline Optimizers with Surrogate Sensitivity
Bayesian Exploration Networks
Diffusion Model-Augmented Behavioral Cloning
Hybrid Reinforcement Learning from Offline Observation Alone
Bayesian Regret Minimization in Offline Bandits
Dynamic Evaluation of Large Language Models by Meta Probing Agents
Distributed Bilevel Optimization with Communication Compression
Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning
Augmenting Decision with Hypothesis in Reinforcement Learning
Constrained Ensemble Exploration for Unsupervised Skill Discovery
Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization
Regularized Q-learning through Robust Averaging
Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search
Simulation-Based Inference with Quantile Regression
Transitional Uncertainty with Layered Intermediate Predictions
Gaussian Processes on Cellular Complexes
Debiased Distribution Compression
An Interpretable Evaluation of Entropy-based Novelty of Generative Models
A Differentiable Partially Observable Generalized Linear Model with Forward-Backward Message Passing
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation
Symmetry Induces Structure and Constraint of Learning
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
UniAudio: Towards Universal Audio Generation with Large Language Models
When and How Does In-Distribution Label Help Out-of-Distribution Detection?
More Flexible PAC-Bayesian Meta-Learning by Learning Learning Algorithms
Prometheus: Out-of-distribution Fluid Dynamics Modeling with Disentangled Graph ODE
On Statistical Learning Theory for Distributional Inputs
The Non-linear $F$-Design and Applications to Interactive Learning
Collaborative Learning with Different Labeling Functions
HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples
Online Matrix Completion: A Collaborative Approach with Hott Items
Matroid Semi-Bandits in Sublinear Time
Non-stationary Online Convex Optimization with Arbitrary Delays
Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
Autaptic Synaptic Circuit Enhances Spatio-temporal Predictive Learning of Spiking Neural Networks
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization
Position: Tensor Networks are a Valuable Asset for Green AI
Learning with Adaptive Resource Allocation
A Fixed-Point Approach for Causal Generative Modeling
Reinformer: Max-Return Sequence Modeling for Offline RL
Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
Two-Stage Shadow Inclusion Estimation: An IV Approach for Causal Inference under Latent Confounding and Collider Bias
Identification and Estimation for Nonignorable Missing Data: A Data Fusion Approach
Continuous Treatment Effects with Surrogate Outcomes
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation
Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks
ReconBoost: Boosting Can Achieve Modality Reconcilement
Compressing Large Language Models by Joint Sparsification and Quantization
SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
A Field Guide for Pacing Budget and ROS Constraints
Surprisingly Strong Performance Prediction with Neural Graph Features
EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time
Adversarially Robust Hypothesis Transfer Learning
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Learning Graph Representation via Graph Entropy Maximization
Representation Surgery: Theory and Practice of Affine Steering
Iterative Regularized Policy Optimization with Imperfect Demonstrations
Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems
Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
Neurodegenerative Brain Network Classification via Adaptive Diffusion with Temporal Regularization
Provably Efficient Partially Observable Risk-sensitive Reinforcement Learning with Hindsight Observation
Position: Technical Research and Talent is Needed for Effective AI Governance
One-Shot Strategic Classification Under Unknown Costs
Optimizing Watermarks for Large Language Models
Improved Modelling of Federated Datasets using Mixtures-of-Dirichlet-Multinomials
MOMENT: A Family of Open Time-series Foundation Models
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
ViP: A Differentially Private Foundation Model for Computer Vision
Random Latent Exploration for Deep Reinforcement Learning
Improving Equivariant Graph Neural Networks on Large Geometric Graphs via Virtual Nodes Learning
DPZero: Private Fine-Tuning of Language Models without Backpropagation
Better Locally Private Sparse Estimation Given Multiple Samples Per User
Revisiting Character-level Adversarial Attacks for Language Models
Offline Inverse RL: New Solution Concepts and Provably Efficient Algorithms
An Empirical Examination of Balancing Strategy for Counterfactual Estimation on Time Series
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency
Subhomogeneous Deep Equilibrium Models
KISA: A Unified Keyframe Identifier and Skill Annotator for Long-Horizon Robotics Demonstrations
Iterative Search Attribution for Deep Neural Networks
Finding NEM-U: Explaining unsupervised representation learning through neural network generated explanation masks
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised Learning
Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining
Deconstructing the Goldilocks Zone of Neural Network Initialization
Position: AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research
On the Generalization of Stochastic Gradient Descent with Momentum
Scaling Exponents Across Parameterizations and Optimizers
Outlier-robust Kalman Filtering through Generalised Bayes
Aligned Objective for Soft-Pseudo-Label Generation in Supervised Learning
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion
Learning Constraints from Offline Demonstrations via Superior Distribution Correction Estimation
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Position: Video as the New Language for Real-World Decision Making
Mechanistic Design and Scaling of Hybrid Architectures
Straight-Through Meets Sparse Recovery: the Support Exploration Algorithm
Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity
Improving Sharpness-Aware Minimization by Lookahead
A Hierarchical Adaptive Multi-Task Reinforcement Learning Framework for Multiplier Circuit Design
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Quantum Algorithms and Lower Bounds for Finite-Sum Optimization
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
PIDformer: Transformer Meets Control Theory
Towards efficient deep spiking neural networks construction with spiking activity based pruning
Toward Adaptive Reasoning in Large Language Models with Thought Rollback
Rejuvenating image-GPT as Strong Visual Representation Learners
Mitigating Catastrophic Forgetting in Online Continual Learning by Modeling Previous Task Interrelations via Pareto Optimization
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
Reducing Item Discrepancy via Differentially Private Robust Embedding Alignment for Privacy-Preserving Cross Domain Recommendation
LASER: Linear Compression in Wireless Distributed Optimization
Efficient World Models with Context-Aware Tokenization
Batch Singular Value Polarization and Weighted Semantic Augmentation for Universal Domain Adaptation
MS-TIP: Imputation Aware Pedestrian Trajectory Prediction
Enhancing Trajectory Prediction through Self-Supervised Waypoint Distortion Prediction
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
How Does Goal Relabeling Improve Sample Efficiency?
GiLOT: Interpreting Generative Language Models via Optimal Transport
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
How Deep Do We Need: Accelerating Training and Inference of Neural ODEs via Control Perspective
Consistent Submodular Maximization
Robustly Learning Single-Index Models via Alignment Sharpness
Delaunay Graph: Addressing Over-Squashing and Over-Smoothing Using Delaunay Triangulation
Matrix Information Theory for Self-Supervised Learning
MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization
LLaGA: Large Language and Graph Assistant
Random features models: a way to study the success of naive imputation
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates
Learning Multiple Secrets in Mastermind
When is Transfer Learning Possible?
Generalized Neural Collapse for a Large Number of Classes
Flextron: Many-in-One Flexible Large Language Model
A Universal Class of Sharpness-Aware Minimization Algorithms
Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning
Neural Networks Learn Statistics of Increasing Complexity
Careful with that Scalpel: Improving Gradient Surgery with an EMA
A New Computationally Efficient Algorithm to solve Feature Selection for Functional Data Classification in High-dimensional Spaces
Evolving Subnetwork Training for Large Language Models
On the Nonlinearity of Layer Normalization
The Computational Complexity of Finding Second-Order Stationary Points
Position: Optimization in SciML Should Employ the Function Space Geometry
Automated Loss function Search for Class-imbalanced Node Classification
Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms
Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models
Community-Invariant Graph Contrastive Learning
$\mathtt{VITS}$: Variational Inference Thompson Sampling for contextual bandits
MLI Formula: A Nearly Scale-Invariant Solution with Noise Perturbation
Universal Gradient Methods for Stochastic Convex Optimization
The Good, The Bad, and Why: Unveiling Emotions in Generative AI
Can Gaussian Sketching Converge Faster on a Preconditioned Landscape?
Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis
Activation-Descent Regularization for Input Optimization of ReLU Networks
CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling of parameterized partial differential equations
Quality-Diversity with Limited Resources
Accelerating Heterogeneous Federated Learning with Closed-form Classifiers
Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding
Full-Atom Peptide Design based on Multi-modal Flow Matching
Creative Text-to-Audio Generation via Synthesizer Programming
IW-GAE: Importance weighted group accuracy estimation for improved calibration and model selection in unsupervised domain adaptation
Compress Clean Signal from Noisy Raw Image: A Self-Supervised Approach
Parameter-Dependent Competitive Analysis for Online Capacitated Coverage Maximization through Boostings and Attenuations
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Kepler codebook
Quantum Implicit Neural Representations
Graph Out-of-Distribution Detection Goes Neighborhood Shaping
Implicit Regularization in Feedback Alignment Learning Mechanisms for Neural Networks
GroupCover: A Secure, Efficient and Scalable Inference Framework for On-device Model Protection based on TEEs
Equivariant Frames and the Impossibility of Continuous Canonicalization
Certifiably Byzantine-Robust Federated Conformal Prediction
Position: Beyond Personhood: Agency, Accountability, and the Limits of Anthropomorphic Ethical Analysis
A Touch, Vision, and Language Dataset for Multimodal Alignment
PerceptAnon: Exploring the Human Perception of Image Anonymization Beyond Pseudonymization for GDPR
Bivariate Causal Discovery using Bayesian Model Selection
Learning to Route Among Specialized Experts for Zero-Shot Generalization
SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
Biharmonic Distance of Graphs and its Higher-Order Variants: Theoretical Properties with Applications to Centrality and Clustering
Regression Learning with Limited Observations of Multivariate Outcomes and Features
Diffusion Models Demand Contrastive Guidance for Adversarial Purification to Advance
Prompting a Pretrained Transformer Can Be a Universal Approximator
Diffusion-based Missing-view Generation With the Application on Incomplete Multi-view Clustering
Model Assessment and Selection under Temporal Distribution Shift
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Auto-Linear Phenomenon in Subsurface Imaging
Borda Regret Minimization for Generalized Linear Dueling Bandits
Robust Data-driven Prescriptiveness Optimization
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Hybrid Inverse Reinforcement Learning
High-Dimensional Geometric Streaming for Nearly Low Rank Data
Bootstrap AutoEncoders With Contrastive Paradigm for Self-supervised Gaze Estimation
Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining
SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code
New Sample Complexity Bounds for Sample Average Approximation in Heavy-Tailed Stochastic Programming
Position: A Call for Embodied AI
Prompt Sketching for Large Language Models
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Estimating Barycenters of Distributions with Neural Optimal Transport
A Theory of Fault-Tolerant Learning
On the Duality Between Sharpness-Aware Minimization and Adversarial Training
Coarse-To-Fine Tensor Trains for Compact Visual Representations
IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
Variational Linearized Laplace Approximation for Bayesian Deep Learning
Encodings for Prediction-based Neural Architecture Search
Counterfactual Image Editing
Effects of Exponential Gaussian Distribution on (Double Sampling) Randomized Smoothing
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Simultaneous identification of models and parameters of scientific simulators
Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing
Efficient Contextual Bandits with Uninformed Feedback Graphs
Ensemble Pruning for Out-of-distribution Generalization
SqueezeLLM: Dense-and-Sparse Quantization
DataFreeShield: Defending Adversarial Attacks without Training Data
TVE: Learning Meta-attribution for Transferable Vision Explainer
Less is More: on the Over-Globalizing Problem in Graph Transformers
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
SSL4Q: Semi-Supervised Learning of Quantum Data with Application to Quantum State Classification
DITTO: Diffusion Inference-Time T-Optimization for Music Generation
CHAI: Clustered Head Attention for Efficient LLM Inference
Effective Federated Graph Matching
Graph Distillation with Eigenbasis Matching
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Off-policy Evaluation Beyond Overlap: Sharp Partial Identification Under Smoothness
Bounded and Uniform Energy-based Out-of-distribution Detection for Graphs
Faithfulness Measurable Masked Language Models
3D Geometric Shape Assembly via Efficient Point Cloud Matching
Learning 1-Bit Tiny Object Detector with Discriminative Feature Refinement
Risk-Sensitive Policy Optimization via Predictive CVaR Policy Gradient
DAG-Based Column Generation for Adversarial Team Games
Safe and Robust Subgame Exploitation in Imperfect Information Games
LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging
Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
On the Calibration of Human Pose Estimation
Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise
NExT: Teaching Large Language Models to Reason about Code Execution
Mapping the Multiverse of Latent Representations
Robust Learning-Augmented Dictionaries
Prompt-guided Precise Audio Editing with Diffusion Models
UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers
A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks
Graph Automorphism Group Equivariant Neural Networks
Ranking-based Client Imitation Selection for Efficient Federated Learning
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Improving Gradient-Guided Nested Sampling for Posterior Inference
Self-Infilling Code Generation
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions
Reflected Flow Matching
Time-Series Forecasting for Out-of-Distribution Generalization Using Invariant Learning
Classification under Nuisance Parameters and Generalized Label Shift in Likelihood-Free Inference
Latent Logic Tree Extraction for Event Sequence Explanation from LLMs
FrameQuant: Flexible Low-Bit Quantization for Transformers
Dynamic Metric Embedding into $\ell_p$ Space
Configurable Mirror Descent: Towards a Unification of Decision Making
Coactive Learning for Large Language Models using Implicit User Feedback
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections
Generative Active Learning for Long-tailed Instance Segmentation
How to Explore with Belief: State Entropy Maximization in POMDPs
QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Potential Based Diffusion Motion Planning
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
Bring Your Own (Non-Robust) Algorithm to Solve Robust MDPs by Estimating The Worst Kernel
Dirichlet Flow Matching with Applications to DNA Sequence Design
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection
Restoring balance: principled under/oversampling of data for optimal classification
Faster Maximum Inner Product Search in High Dimensions
Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models
Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration
Implicit Representations for Constrained Image Segmentation
On the Maximal Local Disparity of Fairness-Aware Classifiers
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling
Cell2Sentence: Teaching Large Language Models the Language of Biology
Robust Yet Efficient Conformal Prediction Sets
Double Stochasticity Gazes Faster: Snap-Shot Decentralized Stochastic Gradient Tracking Methods
Probabilistic Subgoal Representations for Hierarchical Reinforcement Learning
Refining Minimax Regret for Unsupervised Environment Design
EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
Learning in Deep Factor Graphs with Gaussian Belief Propagation
Kernel Debiased Plug-in Estimation: Simultaneous, Automated Debiasing without Influence Functions for Many Target Parameters
Retrieval-Augmented Score Distillation for Text-to-3D Generation
Rethinking DP-SGD in Discrete Domain: Exploring Logistic Distribution in the Realm of signSGD
Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods?
Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference
Mol-AE: Auto-Encoder Based Molecular Representation Learning With 3D Cloze Test Objective
Roping in Uncertainty: Robustness and Regularization in Markov Games
DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
Learning from Streaming Data when Users Choose
DFA-RAG: Conversational Semantic Router for Large Language Model with Definite Finite Automaton
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
Learning Surrogates for Offline Black-Box Optimization via Gradient Matching
A Theoretical Analysis of Backdoor Poisoning Attacks in Convolutional Neural Networks
Diffusion Language Models Are Versatile Protein Learners
Self-Rewarding Language Models
Lightweight Image Super-Resolution via Flexible Meta Pruning
Image Fusion via Vision-Language Model
Can Machines Learn the True Probabilities?
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
Theoretical insights for diffusion guidance: A case study for Gaussian mixture models
Split-and-Denoise: Protect large language model inference with local differential privacy
Self-attention Networks Localize When QK-eigenspectrum Concentrates
ODIM: Outlier Detection via Likelihood of Under-Fitted Generative Models
Stealthy Imitation: Reward-guided Environment-free Policy Stealing
Piecewise Constant and Linear Regression Trees: An Optimal Dynamic Programming Approach
State-Free Inference of State-Space Models: The Transfer Function Approach
STELLA: Continual Audio-Video Pre-training with SpatioTemporal Localized Alignment
Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models
Polynomial-based Self-Attention for Table Representation Learning
Reward Shaping for Reinforcement Learning with An Assistant Reward Agent
Improving SAM Requires Rethinking its Optimization Formulation
Robust Stable Spiking Neural Networks
Improving Token-Based World Models with Parallel Observation Prediction
Emergence of In-Context Reinforcement Learning from Noise Distillation
Adaptive Proximal Gradient Methods Are Universal Without Approximation
Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation
Turnstile $\ell_p$ leverage score sampling with applications
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models
Stealing part of a production language model
See More Details: Efficient Image Super-Resolution by Experts Mining
Accelerating Convergence in Bayesian Few-Shot Classification
Language Models Represent Beliefs of Self and Others
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Learning with Partial-Label and Unlabeled Data: A Uniform Treatment for Supervision Redundancy and Insufficiency
Learning Linear Block Error Correction Codes
Optimally Improving Cooperative Learning in a Social Setting
A Near-Linear Time Approximation Algorithm for Beyond-Worst-Case Graph Clustering
Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Perturb-and-Project: Differentially Private Similarities and Marginals
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
GATE: How to Keep Out Intrusive Neighbors
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems
Approximate Nearest Neighbor Search with Window Filters
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion
Stability-Informed Initialization of Neural Ordinary Differential Equations
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
UPOCR: Towards Unified Pixel-Level OCR Interface
Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
Weighted distance nearest neighbor condensing
Position: Explain to Question not to Justify
DiJiang: Efficient Large Language Models through Compact Kernelization
Domain Generalisation via Imprecise Learning
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Position: Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization
Temporal Spiking Neural Networks with Synaptic Delay for Graph Reasoning
Multimodal Prototyping for cancer survival prediction
Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains
Premise Order Matters in Reasoning with Large Language Models
Scalable and Flexible Causal Discovery with an Efficient Test for Adjacency
Debating with More Persuasive LLMs Leads to More Truthful Answers
Recurrent Distance Filtering for Graph Representation Learning
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
When Will Gradient Regularization Be Harmful?
Neural Tangent Kernels Motivate Cross-Covariance Graphs in Neural Networks
Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
Stationarity without mean reversion in improper Gaussian processes
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
Representation Surgery for Multi-Task Model Merging
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Implicit Bias of AdamW: $\ell_\infty$-Norm Constrained Optimization
Optimal Exact Recovery in Semi-Supervised Learning: A Study of Spectral Methods and Graph Convolutional Networks
Bipartite Matching in Massive Graphs: A Tight Analysis of EDCS
In-context Convergence of Transformers
Online Linear Regression in Dynamic Environments via Discounting
Learning Associative Memories with Gradient Descent
Double Momentum Method for Lower-Level Constrained Bilevel Optimization
Asymptotically Optimal and Computationally Efficient Average Treatment Effect Estimation in A/B testing
Analysis for Abductive Learning and Neural-Symbolic Reasoning Shortcuts
Environment Design for Inverse Reinforcement Learning
Two-sided Competing Matching Recommendation Markets With Quota and Complementary Preferences Constraints
Position: Insights from Survey Methodology can Improve Training Data
Causal Effect Identification in LiNGAM Models with Latent Confounders
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
Optimal Kernel Choice for Score Function-based Causal Discovery
Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought
Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition
Rethinking Guidance Information to Utilize Unlabeled Samples: A Label Encoding Perspective
DFD: Distilling the Feature Disparity Differently for Detectors
Observable Propagation: Uncovering Feature Vectors in Transformers
Diffusion Rejection Sampling
Defining Neural Network Architecture through Polytope Structures of Datasets
Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications
ReGAL: Refactoring Programs to Discover Generalizable Abstractions
No-Regret Reinforcement Learning in Smooth MDPs
Learning Decision Policies with Instrumental Variables through Double Machine Learning
Stochastic Q-learning for Large Discrete Action Spaces
Causal Discovery with Fewer Conditional Independence Tests
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss
Revisiting the Role of Language Priors in Vision-Language Models
DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving
Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching
Convex and Bilevel Optimization for Neural-Symbolic Inference and Learning
Trainable Transformer in Transformer
Stochastic Optimization with Arbitrary Recurrent Data Sampling
Doubly Robust Causal Effect Estimation under Networked Interference via Targeted Learning
Position: Scaling Simulation is Neither Necessary Nor Sufficient for In-the-Wild Robot Manipulation
Understanding the Effects of Iterative Prompting on Truthfulness
Federated Continual Learning via Prompt-based Dual Knowledge Transfer
Successor Features for Efficient Multi-Subject Controlled Text Generation
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation
Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
A Sparsity Principle for Partially Observable Causal Representation Learning
PARCv2: Physics-aware Recurrent Convolutional Neural Networks for Spatiotemporal Dynamics Modeling
Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring
DSD-DA: Distillation-based Source Debiasing for Domain Adaptive Object Detection
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
Differentiable Combinatorial Scheduling at Scale
First-Order Manifold Data Augmentation for Regression Learning
An Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization
Prospector Heads: Generalized Feature Attribution for Large Models & Data
Class-Imbalanced Graph Learning without Class Rebalancing
Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution
Efficient Exploration for LLMs
Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normalization
Let Go of Your Labels with Unsupervised Transfer
LAGMA: LAtent Goal-guided Multi-Agent Reinforcement Learning
Implicit Representations via Operator Learning
Inverse-Variance Weighting for Estimation of Heterogeneous Treatment Effects
Trustworthy Actionable Perturbations
Online Matching with Stochastic Rewards: Provable Better Bound via Adversarial Reinforcement Learning
Convergence of Online Learning Algorithm for a Mixture of Multiple Linear Regressions
Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models
Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach
Connect Later: Improving Fine-tuning for Robustness with Targeted Augmentations
Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Learning to Explore for Stochastic Gradient MCMC
Spectral Phase Transition and Optimal PCA in Block-Structured Spiked Models
Graph Neural PDE Solvers with Conservation and Similarity-Equivariance
Equivariant Deep Weight Space Alignment
Watermark Stealing in Large Language Models
Behavior Generation with Latent Actions
Generative Marginalization Models
On the Role of Edge Dependency in Graph Generative Models
Transferring Knowledge From Large Foundation Models to Small Downstream Models
Balancing Similarity and Complementarity for Federated Learning
BetterV: Controlled Verilog Generation with Discriminative Guidance
Scaling Speech Technology to 1,000+ Languages
Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection
Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation
Unveiling the Dynamics of Information Interplay in Supervised Learning
FedLMT: Tackling System Heterogeneity of Federated Learning via Low-Rank Model Training with Theoretical Guarantees
Position: Why Tabular Foundation Models Should Be a Research Priority
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
Graph As Point Set
On Prompt-Driven Safeguarding for Large Language Models
A Language Model’s Guide Through Latent Space
Plug-and-Play image restoration with Stochastic deNOising REgularization
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
Graph Generation with Diffusion Mixture
Self-Supervised Interpretable End-to-End Learning via Latent Functional Modularity
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport
Representing Molecules as Random Walks Over Interpretable Grammars
A Nearly Optimal Single Loop Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
Denoising Autoregressive Representation Learning
DRCT: Diffusion Reconstruction Contrastive Training towards Universal Detection of Diffusion Generated Images
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning
Repeat After Me: Transformers are Better than State Space Models at Copying
Calibration Bottleneck: Over-compressed Representations are Less Calibratable
Learning Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
MD tree: a model-diagnostic tree grown on loss landscape
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Riemannian Accelerated Zeroth-order Algorithm: Improved Robustness and Lower Query Complexity
Improved Generalization of Weight Space Networks via Augmentations
Executable Code Actions Elicit Better LLM Agents
Stability and Multigroup Fairness in Ranking with Uncertain Predictions
Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning
Small-loss Adaptive Regret for Online Convex Optimization
Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs
VNN: Verification-Friendly Neural Networks with Hard Robustness Guarantees
Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
Stochastic positional embeddings improve masked image modeling
Gibbs Sampling of Continuous Potentials on a Quantum Computer
Model-Based Minimum Bayes Risk Decoding for Text Generation
On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation
Learning to Remove Cuts in Integer Linear Programming
Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
Differentially Private Decentralized Learning with Random Walks
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
Socialized Learning: Making Each Other Better Through Multi-Agent Collaboration
Multi-group Learning for Hierarchical Groups
Adaptive Conformal Inference by Betting
Challenges in Training PINNs: A Loss Landscape Perspective
Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method
PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency
Efficient Non-stationary Online Learning by Wavelets with Applications to Online Distribution Shift Adaptation
Recovering the Pre-Fine-Tuning Weights of Generative Models
Explaining Graph Neural Networks via Structure-aware Interaction Index
Generative Conditional Distributions by Neural (Entropic) Optimal Transport
BeigeMaps: Behavioral Eigenmaps for Reinforcement Learning from Images
Scaling Down Deep Learning with MNIST-1D
Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
A Dynamical Model of Neural Scaling Laws
A New Robust Partial p-Wasserstein-Based Metric for Comparing Distributions
Theory of Consistency Diffusion Models: Distribution Estimation Meets Fast Sampling
Mechanistic Neural Networks for Scientific Machine Learning
Understanding Heterophily for Graph Neural Networks
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
Connecting the Dots: Is Mode-Connectedness the Key to Feasible Sample-Based Inference in Bayesian Neural Networks?
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
Reshape and Adapt for Output Quantization (RAOQ): Quantization-aware Training for In-memory Computing Systems
Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments
Forget Sharpness: Perturbed Forgetting of Model Biases Within SAM Dynamics
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
Fast Timing-Conditioned Latent Audio Diffusion
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay
Information Flow in Self-Supervised Learning
Unbiased Multi-Label Learning from Crowdsourced Annotations
Learning to Play Atari in a World of Tokens
Incremental Topological Ordering and Cycle Detection with Predictions
Learning Reward for Robot Skills Using Large Language Models via Self-Alignment
Langevin Policy for Safe Reinforcement Learning
GRATH: Gradual Self-Truthifying for Large Language Models
InstructSpeech: Following Speech Editing Instructions via Large Language Models
How Far Can Fairness Constraints Help Recover From Biased Data?
Neural Diffusion Models
Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts
Conformal Prediction for Deep Classifier via Label Ranking
The Role of Learning Algorithms in Collective Action
Open-Vocabulary Calibration for Fine-tuned CLIP
Proactive Detection of Voice Cloning with Localized Watermarking
Mitigating Privacy Risk in Membership Inference by Convex-Concave Loss
Rethinking Independent Cross-Entropy Loss For Graph-Structured Data
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
Multi-Patch Prediction: Adapting Language Models for Time Series Representation Learning
Stacking Deep Set Networks and Pooling by Quantiles
Stability Evaluation through Distributional Perturbation Analysis
High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization
Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications
Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
A decoder-only foundation model for time-series forecasting
Accelerating Look-ahead in Bayesian Optimization: Multilevel Monte Carlo is All you Need
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Learning to Intervene on Concept Bottlenecks
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
Defense against Model Extraction Attack by Bayesian Active Watermarking
Sample Average Approximation for Conditional Stochastic Optimization with Dependent Data
RoboDreamer: Learning Compositional World Models for Robot Imagination
Memory Consolidation Enables Long-Context Video Understanding
Disentangled Graph Self-supervised Learning for Out-of-Distribution Generalization
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
DMTG: One-Shot Differentiable Multi-Task Grouping
Switching the Loss Reduces the Cost in Batch Reinforcement Learning
Efficient Precision and Recall Metrics for Assessing Generative Models using Hubness-aware Sampling
Scaling Laws for Fine-Grained Mixture of Experts
Partial Optimality in the Linear Ordering Problem
Genie: Generative Interactive Environments
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts
A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Soft Prompt Recovers Compressed LLMs, Transferably
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Major-Minor Mean Field Multi-Agent Reinforcement Learning
Robust Multi-Task Learning with Excess Risks
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
3D-VLA: A 3D Vision-Language-Action Generative World Model
Stationary Latent Weight Inference for Unreliable Observations from Online Test-Time Adaptation
Feasible Reachable Policy Iteration
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
Online Speculative Decoding
Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
State-Constrained Zero-Sum Differential Games with One-Sided Information
Projection-Free Online Convex Optimization with Time-Varying Constraints
Just Cluster It: An Approach for Exploration in High-Dimensions using Clustering and Pre-Trained Representations
PointMC: Multi-instance Point Cloud Registration based on Maximal Cliques
Gradient-based Visual Explanation for Transformer-based CLIP
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation
Embodied CoT Distillation From LLM To Off-the-shelf Agents
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Active Preference Learning for Large Language Models
Open Ad Hoc Teamwork with Cooperative Game Theory
Ai-sampler: Adversarial Learning of Markov kernels with involutive maps
Spider: A Unified Framework for Context-dependent Concept Segmentation
Preventing Model Collapse in Gaussian Process Latent Variable Models
Modeling Language Tokens as Functionals of Semantic Fields
Attribute Based Interpretable Evaluation Metrics for Generative Models
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
A Geometric Decomposition of Finite Games: Convergence vs. Recurrence under Exponential Weights
Online bipartite matching with imperfect advice
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Stability and Generalization for Stochastic Recursive Momentum-based Algorithms for (Strongly-)Convex One to $K$-Level Stochastic Optimizations
Nash Incentive-compatible Online Mechanism Learning via Weakly Differentially Private Online Learning
Online Resource Allocation with Non-Stationary Customers
Vanilla Bayesian Optimization Performs Great in High Dimensions
On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm
Quantum Theory and Application of Contextual Optimal Transport
Policy Learning for Balancing Short-Term and Long-Term Rewards
Local Causal Structure Learning in the Presence of Latent Variables
On the Minimal Degree Bias in Generalization on the Unseen for non-Boolean Functions
Position: Embracing Negative Results in Machine Learning
Federated Self-Explaining GNNs with Anti-shortcut Augmentations
Can Implicit Bias Imply Adversarial Robustness?
Initial Guessing Bias: How Untrained Networks Favor Some Classes
SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding
Graph Neural Network Explanations are Fragile
Momentum Particle Maximum Likelihood
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes
Position: Do Not Explain Vision Models Without Context
Long Range Propagation on Continuous-Time Dynamic Graphs
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Efficient Mixture Learning in Black-Box Variational Inference
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Improved Bounds for Pure Private Agnostic Learning: Item-Level and User-Level Privacy
Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning
Performance Bounds for Active Binary Testing with Information Maximization
TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision
On the Feasibility of Single-Pass Full-Capacity Learning in Linear Threshold Neurons with Binary Input Vectors
Unsupervised Concept Discovery Mitigates Spurious Correlations
Probabilistic Constrained Reinforcement Learning with Formal Interpretability
Complexity Matters: Feature Learning in the Presence of Spurious Correlations
Data Poisoning Attacks against Conformal Prediction
Hybrid$^2$ Neural ODE Causal Modeling and an Application to Glycemic Response
Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics
Rethinking Adversarial Robustness in the Context of the Right to be Forgotten
Decomposing and Editing Predictions by Modeling Model Computation
GFlowNet Training by Policy Gradients
Deciphering RNA Secondary Structure Prediction: A Probabilistic K-Rook Matching Perspective
Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
Neural Collapse in Multi-label Learning with Pick-all-label Loss
Attack-free Evaluating and Enhancing Adversarial Robustness on Categorical Data
RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching
Graph2Tac: Online Representation Learning of Formal Math Concepts
Causal Representation Learning Made Identifiable by Grouping of Observational Variables
Neuro-Symbolic Temporal Point Processes
Generalized Sobolev Transport for Probability Measures on a Graph
Improving Neural Logic Machines via Failure Reflection
Differentiable Model Scaling using Differentiable Topk
Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI
Position: Topological Deep Learning is the New Frontier for Relational Learning
Bridging Mini-Batch and Asymptotic Analysis in Contrastive Learning: From InfoNCE to Kernel-Based Losses
Reparameterized Importance Sampling for Robust Variational Bayesian Neural Networks
Position: What Can Large Language Models Tell Us about Time Series Analysis
Private and Federated Stochastic Convex Optimization: Efficient Strategies for Centralized Systems
Towards a Self-contained Data-driven Global Weather Forecasting Framework
Fundamental Limits of Distributed Covariance Matrix Estimation Under Communication Constraints
Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning
Early Time Classification with Accumulated Accuracy Gap Control
Theoretical Guarantees for Variational Inference with Fixed-Variance Mixture of Gaussians
A connection between Tempering and Entropic Mirror Descent
Zeroth-Order Methods for Constrained Nonconvex Nonsmooth Stochastic Optimization
On Positivity Condition for Causal Inference
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Disparate Impact on Group Accuracy of Linearization for Private Inference
Standardized Interpretable Fairness Measures for Continuous Risk Scores
FedBAT: Communication-Efficient Federated Learning via Learnable Binarization
MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data
Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations
On Mechanistic Knowledge Localization in Text-to-Image Generative Models
Multi-class Probabilistic Bounds for Majority Vote Classifiers with Partially Labeled Data
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
An Empirical Study Into What Matters for Calibrating Vision-Language Models
A Bayesian Approach to Online Planning
An Iterative Min-Min Optimization Method for Sparse Bayesian Learning
Autoencoding Conditional Neural Processes for Representation Learning
Exact Soft Analytical Side-Channel Attacks using Tractable Circuits
Spike Distance Function as a Learning Objective for Spike Prediction
Differentiable Annealed Importance Sampling Minimizes The Jensen-Shannon Divergence Between Initial and Target Distribution
$S^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting
Recurrent Early Exits for Federated Learning with Heterogeneous Clients
Subgraphormer: Unifying Subgraph GNNs and Graph Transformers via Graph Products
Networked Inequality: Preferential Attachment Bias in Graph Neural Network Link Prediction
ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data
Tilt and Average: Geometric Adjustment of the Last Layer for Recalibration
Hyperbolic Active Learning for Semantic Segmentation under Domain Shift
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
StableMask: Refining Causal Masking in Decoder-only Transformer
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
Improving Generalization in Offline Reinforcement Learning via Adversarial Data Splitting
Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks
Effect-Invariant Mechanisms for Policy Generalization
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
DiffFPR: Diffusion Prior for Oversampled Fourier Phase Retrieval
Disentanglement Learning via Topology
Position: Why We Must Rethink Empirical Research in Machine Learning
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
I/O Complexity of Attention, or How Optimal is FlashAttention?
Box Facets and Cut Facets of Lifted Multicut Polytopes
Adaptive Observation Cost Control for Variational Quantum Eigensolvers
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning
Towards Theoretical Understandings of Self-Consuming Generative Models
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs
Graph Mixup on Approximate Gromov–Wasserstein Geodesics
Beyond Regular Grids: Fourier-Based Neural Operators on Arbitrary Domains
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
Understanding the Learning Dynamics of Alignment with Human Feedback
Perfect Alignment May be Poisonous to Graph Contrastive Learning
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
A Computational Framework for Solving Wasserstein Lagrangian Flows
HelmFluid: Learning Helmholtz Dynamics for Interpretable Fluid Prediction
Think Before You Act: Decision Transformers with Working Memory
Language-guided Skill Learning with Temporal Variational Inference
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition
On the Trajectory Regularity of ODE-based Diffusion Sampling
On Which Nodes Does GCN Fail? Enhancing GCN From the Node Perspective
Generalization Analysis of Deep Non-linear Matrix Completion
Variational Schrödinger Diffusion Models
Structure Your Data: Towards Semantic Graph Counterfactuals
Solving Poisson Equations using Neural Walk-on-Spheres
Chasing Convex Functions with Long-term Constraints
Improving Neural Additive Models with Bayesian Principles
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Layerwise Change of Knowledge in Neural Networks
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
ContPhy: Continuum Physical Concept Learning and Reasoning from Videos
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
On the Tractability of SHAP Explanations under Markovian Distributions
FedSC: Provable Federated Self-supervised Learning with Spectral Contrastive Objective over Non-i.i.d. Data
StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation
Leveraging Attractor Dynamics in Spatial Navigation for Better Language Parsing
High-Performance Temporal Reversible Spiking Neural Networks with $\mathcal{O}(L)$ Training Memory and $\mathcal{O}(1)$ Inference Cost
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
Asymptotics of feature learning in two-layer networks after one gradient-step
Truly No-Regret Learning in Constrained MDPs
Testing the Feasibility of Linear Programs with Bandit Feedback
Handling Heterogeneous Curvatures in Bandit LQR Control
Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree
Generalization in Kernel Regression Under Realistic Assumptions
Dynamic Facility Location in High Dimensional Euclidean Spaces
Towards Resource-friendly, Extensible and Stable Incomplete Multi-view Clustering
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Individual Fairness in Graph Decomposition
Leveraging (Biased) Information: Multi-armed Bandits with Offline Data
Discrete Latent Perspective Learning for Segmentation and Detection
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
Revisiting the Power of Prompt for Visual Tuning
On The Fairness Impacts of Hardware Selection in Machine Learning
Simple linear attention language models balance the recall-throughput tradeoff
Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds
Auto-Encoding Morph-Tokens for Multimodal LLM
Optimal Acceleration for Minimax and Fixed-Point Problems is Not Unique
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization
Position: Intent-aligned AI Systems Must Optimize for Agency Preservation
Stochastic Interpolants with Data-Dependent Couplings
Improved Operator Learning by Orthogonal Attention
A Circuit Domain Generalization Framework for Efficient Logic Synthesis in Chip Design
Optimal Eye Surgeon: Finding image priors through sparse generators at initialization
Re-Dock: Towards Flexible and Realistic Molecular Docking with Diffusion Bridge
Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions
Finite Volume Features, Global Geometry Representations, and Residual Training for Deep Learning-based CFD Simulation
Transolver: A Fast Transformer Solver for PDEs on General Geometries
Batch and match: black-box variational inference with a score-based divergence
Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss
Best Arm Identification for Stochastic Rising Bandits
Predictive Linear Online Tracking for Unknown Targets
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems
Equivariance via Minimal Frame Averaging for More Symmetries and Efficiency
Sample-specific Masks for Visual Reprogramming-based Prompting
Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints
Position: Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
ULAREF: A Unified Label Refinement Framework for Learning with Inaccurate Supervision
Sparse and Structured Hopfield Networks
Position: Graph Foundation Models Are Already Here
QuRating: Selecting High-Quality Data for Training Language Models
By Tying Embeddings You Are Assuming the Distributional Hypothesis
Code as Reward: Empowering Reinforcement Learning with VLMs
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Novel Spectral Algorithms for the Partial Credit Model
How Deep Networks Learn Sparse and Hierarchical Data: the Sparse Random Hierarchy Model
Regression with Multi-Expert Deferral
Pricing with Contextual Elasticity and Heteroscedastic Valuation
Adaptive Online Experimental Design for Causal Discovery
QBMK: Quantum-based Matching Kernels for Un-attributed Graphs
Quasi-Monte Carlo Features for Kernel Approximation
Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation
Principled Preferential Bayesian Optimization
Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection
Stereographic Spherical Sliced Wasserstein Distances
Model Alignment as Prospect Theoretic Optimization
Evaluation of Trajectory Distribution Predictions with Energy Score
Replicable Learning of Large-Margin Halfspaces
Towards Theoretical Understanding of Learning Large-scale Dependent Data via Random Features
Allocation Requires Prediction Only if Inequality Is Low
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
Test-Time Degradation Adaptation for Open-Set Image Restoration
Position: The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Block Acceleration Without Momentum: On Optimal Stepsizes of Block Gradient Descent for Least-Squares
Multi-Track Message Passing: Tackling Oversmoothing and Oversquashing in Graph Learning via Preventing Heterophily Mixing
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Concentration Inequalities for General Functions of Heavy-Tailed Random Variables
Feedback Loops With Language Models Drive In-Context Reward Hacking
Optimal Ridge Regularization for Out-of-Distribution Prediction
Gambling-Based Confidence Sequences for Bounded Random Vectors
An Efficient Maximal Ancestral Graph Listing Algorithm
Automating the Selection of Proxy Variables of Unmeasured Confounders
High-Dimensional Bayesian Optimization via Semi-Supervised Learning with Optimized Unlabeled Data Sampling
DsDm: Model-Aware Dataset Selection with Datamodels
Faster Adaptive Decentralized Learning Algorithms
Position: Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
Fundamental Benefit of Alternating Updates in Minimax Optimization
Prospective Side Information for Latent MDPs
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Masked Face Recognition with Generative-to-Discriminative Representations
Tuning-Free Stochastic Optimization
Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models
On Stronger Computational Separations Between Multimodal and Unimodal Machine Learning
Provably Better Explanations with Optimized Aggregation of Feature Attributions
A Subquadratic Time Algorithm for Robust Sparse Mean Estimation
On the Complexity of Finite-Sum Smooth Optimization under the Polyak–Łojasiewicz Condition
Dynamic Correlation Clustering in Sublinear Update Time
Exploiting Code Symmetries for Learning Program Semantics
Position: Levels of AGI for Operationalizing Progress on the Path to AGI
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Variational Learning is Effective for Large Deep Networks
Transport of Algebraic Structure to Latent Embeddings
FAFE: Immune Complex Modeling with Geodesic Distance Loss on Noisy Group Frames
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
Tabular Insights, Visual Impacts: Transferring Expertise from Tables to Images
Explaining Probabilistic Models with Distributional Values
Robust and Conjugate Gaussian Process Regression
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
Second-Order Uncertainty Quantification: A Distance-Based Approach
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
Stay on Topic with Classifier-Free Guidance
Position: What makes an image realistic?
BayOTIDE: Bayesian Online Multivariate Time Series Imputation with Functional Decomposition
Local vs. Global Interpretability: A Computational Complexity Perspective
LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning
Estimating Unknown Population Sizes Using the Hypergeometric Distribution
Triple Changes Estimator for Targeted Policies
On a Neural Implementation of Brenier's Polar Factorization
The Perception-Robustness Tradeoff in Deterministic Image Restoration
Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering
PriorBoost: An Adaptive Algorithm for Learning from Aggregate Responses
Memoria: Resolving Fateful Forgetting Problem through Human-Inspired Memory Architecture
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
FiT: Flexible Vision Transformer for Diffusion Model
A Tale of Tails: Model Collapse as a Change of Scaling Laws
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data
Consistent Adversarially Robust Linear Classification: Non-Parametric Setting
Precise Accuracy / Robustness Tradeoffs in Regression: Case of General Norms
Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Statistical Inference Under Constrained Selection Bias
Graph-Triggered Rising Bandits
Integrated Hardware Architecture and Device Placement Search
Local Feature Selection without Label or Feature Leakage for Interpretable Machine Learning Predictions
Probabilistic Generating Circuits - Demystified
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features
Graph Adversarial Diffusion Convolution
Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
Lie Neurons: Adjoint-Equivariant Neural Networks for Semisimple Lie Algebras
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Towards Compositionality in Concept Learning
Human vs. Generative AI in Content Creation Competition: Symbiosis or Conflict?
Bifurcated Attention for Single-Context Large-Batch Sampling
Demystifying SGD with Doubly Stochastic Gradients
Structured Chemistry Reasoning with Large Language Models
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes
T-Cal: An Optimal Test for the Calibration of Predictive Models
GenCO: Generating Diverse Designs with Combinatorial Constraints
Contrastive Predict-and-Search for Mixed Integer Linear Programs
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis
Noise-Aware Algorithm for Heterogeneous Differentially Private Federated Learning
Multicalibration for Confidence Scoring in LLMs
Disguised Copyright Infringement of Latent Diffusion Models
Statistical Properties of Robust Satisficing
Provably Scalable Black-Box Variational Inference with Structured Variational Families
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Prior Mismatch and Adaptation in PnP-ADMM with a Nonconvex Convergence Analysis
Kernel-Based Evaluation of Conditional Biological Sequence Models
MALIBO: Meta-learning for Likelihood-free Bayesian Optimization
Total Variation Distance Meets Probabilistic Inference
Neural-Kernel Conditional Mean Embeddings
Listening to the noise: Blind Denoising with Gibbs Diffusion
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization
Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning
Evaluation of Test-Time Adaptation Under Computational Time Constraints
Cluster-Aware Similarity Diffusion for Instance Retrieval
Understanding Diffusion Models by Feynman's Path Integral
Mathematical Framework for Online Social Media Auditing
Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples
SLOG: An Inductive Spectral Graph Neural Network Beyond Polynomial Filter
Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
Nearest Neighbour Score Estimators for Diffusion Generative Models
DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training
Pausing Policy Learning in Non-stationary Reinforcement Learning
Conformal prediction for multi-dimensional time series by ellipsoidal sets
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation
Intersecting-Boundary-Sensitive Fingerprinting for Tampering Detection of DNN Models
PGODE: Towards High-quality System Dynamics Modeling
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Automated Statistical Model Discovery with Language Models
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
Pairwise Alignment Improves Graph Domain Adaptation
Hybrid Neural Representations for Spherical Data
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Self-Correcting Self-Consuming Loops for Generative Model Training
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
Discovering Mixtures of Structural Causal Models from Time Series Data
Latent Space Symmetry Discovery
Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling
Revisiting Context Aggregation for Image Matting
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Theoretical Analysis of Learned Database Operations under Distribution Shift through Distribution Learnability
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
EvoluNet: Advancing Dynamic Non-IID Transfer Learning on Graphs
Unlock the Cognitive Generalization of Deep Reinforcement Learning via Granular Ball Representation
Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models
Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
Realistic Unsupervised CLIP Fine-tuning with Universal Entropy Optimization
Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
Neighboring Perturbations of Knowledge Editing on Large Language Models
Constrained Reinforcement Learning Under Model Mismatch
OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos
Generating In-Distribution Proxy Graphs for Explaining Graph Neural Networks
CurBench: Curriculum Learning Benchmark
Open-Domain Text Evaluation via Contrastive Distribution Methods
A sampling theory perspective on activations for implicit neural representations
Hyperbolic Geometric Latent Diffusion Model for Graph Generation
Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models
CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks
Achieving Lossless Gradient Sparsification via Mapping to Alternative Space in Federated Learning
Collective Certified Robustness against Graph Injection Attacks
Fewer Truncations Improve Language Modeling
Density-Softmax: Efficient Test-time Model for Uncertainty Estimation and Robustness under Distribution Shifts
Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time
Unifying Image Processing as Visual Prompting Question Answering
No Dimensional Sampling Coresets for Classification
Smoothing Proximal Gradient Methods for Nonsmooth Sparsity Constrained Optimization: Optimality Conditions and Global Convergence
tnGPS: Discovering Unknown Tensor Network Structure Search Algorithms via Large Language Models (LLMs)
Differentiable Weightless Neural Networks
Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach
Information Complexity of Stochastic Convex Optimization: Applications to Generalization, Memorization, and Tracing
Hypergraph-enhanced Dual Semi-supervised Graph Classification
Enabling Few-Shot Learning with PID Control: A Layer Adaptive Optimizer
Agnostic Sample Compression Schemes for Regression
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
Toward Availability Attacks in 3D Point Clouds
Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts
Scalable Pre-training of Large Autoregressive Image Models
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Out-of-Distribution Detection via Deep Multi-Comprehension Ensemble
Exploring Intrinsic Dimension for Vision-Language Model Pruning
Irregular Multivariate Time Series Forecasting: A Transformable Patching Graph Neural Networks Approach
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
Predictive Coding beyond Correlations
Harmonizing Generalization and Personalization in Federated Prompt Learning
Slicing Mutual Information Generalization Bounds for Neural Networks
Asymmetry in Low-Rank Adapters of Foundation Models
Binary Decomposition: A Problem Transformation Perspective for Open-Set Semi-Supervised Learning
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
Mitigating Oversmoothing Through Reverse Process of GNNs for Heterophilic Graphs
Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Fast Sampling-Based Sketches for Tensors
Conformal Prediction Sets Improve Human Decision Making
Understanding Server-Assisted Federated Learning in the Presence of Incomplete Client Participation
Accelerating Transformer Pre-training with 2:4 Sparsity
Graph Geometry-Preserving Autoencoders
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Towards Efficient Training and Evaluation of Robust Models against $l_0$ Bounded Adversarial Perturbations
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Sub-token ViT Embedding via Stochastic Resonance Transformers
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
WISER: Weak Supervision and Supervised Representation Learning to Improve Drug Response Prediction in Cancer
CHEMREASONER: Heuristic Search over a Large Language Model’s Knowledge Space using Quantum-Chemical Feedback
Integrating Multimodal Data for Joint Generative Modeling of Complex Dynamics
Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning
Online Cascade Learning for Efficient Inference over Streams
Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
Towards Unified Multi-granularity Text Detection with Interactive Attention
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Vision Transformers as Probabilistic Expansion from Learngene
Exploiting Human-AI Dependence for Learning to Defer
Bounding the Excess Risk for Linear Models Trained on Marginal-Preserving, Differentially-Private, Synthetic Data
Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models
Pi-DUAL: Using privileged information to distinguish clean from noisy labels
Stochastic Localization via Iterative Posterior Sampling
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Position: Leverage Foundational Models for Black-Box Optimization
Single-Model Attribution of Generative Models Through Final-Layer Inversion
Balanced Resonate-and-Fire Neurons
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Learning to Reach Goals via Diffusion
Deletion-Anticipative Data Selection with a Limited Budget
PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning
Neuroexplicit Diffusion Models for Inpainting of Optical Flow Fields
LoRA Training in the NTK Regime has No Spurious Local Minima
Two Tales of Single-Phase Contrastive Hebbian Learning
Sparse-to-dense Multimodal Image Registration via Multi-Task Learning
Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical
Learning Scale-Aware Spatio-temporal Implicit Representation for Event-based Motion Deblurring
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization
How Private are DP-SGD Implementations?
On the Diminishing Returns of Width for Continual Learning
Individualized Privacy Accounting via Subsampling with Applications in Combinatorial Optimization
Predicting Lagrangian Multipliers for Mixed Integer Linear Programs
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
Analyzing $D^\alpha$ seeding for $k$-means
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization
Pursuing Overall Welfare in Federated Learning through Sequential Decision Making
Statistical Test for Attention Maps in Vision Transformers
Unveiling the Potential of AI for Nanomaterial Morphology Prediction
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
PASOA- PArticle baSed Bayesian Optimal Adaptive design
Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs
Optimal Kernel Quantile Learning with Random Features
A General Online Algorithm for Optimizing Complex Performance Metrics
Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery
How Free is Parameter-Free Stochastic Optimization?
Position: Towards Implicit Prompt For Text-To-Image Models
Bridging Data Gaps in Diffusion Models with Adversarial Noise-Based Transfer Learning
Libra: Building Decoupled Vision System on Large Language Models
SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Extreme Compression of Large Language Models via Additive Quantization
Error Feedback Can Accurately Compress Preconditioners
OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift
Designing Decision Support Systems using Counterfactual Prediction Sets
ESM All-Atom: Multi-Scale Protein Language Model for Unified Molecular Modeling
Latent variable model for high-dimensional point process with structured missingness
Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choice
On the Weight Dynamics of Deep Normalized Networks
FedRC: Tackling Diverse Distribution Shifts Challenge in Federated Learning by Robust Clustering
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization
Adaptive Learning of Density Ratios in RKHS
Overcoming Saturation in Density Ratio Estimation by Iterated Regularization
Self-Composing Policies for Scalable Continual Reinforcement Learning
Energy-based Backdoor Defense without Task-Specific Samples and Model Retraining
Case-Based or Rule-Based: How Do Transformers Do the Math?
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Performative Prediction with Bandit Feedback: Learning through Reparameterization
Conformal Predictions under Markovian Data
Policy-conditioned Environment Models are More Generalizable
convSeq: Fast and Scalable Method for Detecting Patterns in Spike Data
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Offline Transition Modeling via Contrastive Energy Learning
Reservoir Computing for Short High-Dimensional Time Series: an Application to SARS-CoV-2 Hospitalization Forecast
TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge
PAPM: A Physics-aware Proxy Model for Process Systems
DE-COP: Detecting Copyrighted Content in Language Models Training Data
Imitation Learning in Discounted Linear MDPs without exploration assumptions
Learning Causal Relations from Subsampled Time Series with Two Time-Slices
WARM: On the Benefits of Weight Averaged Reward Models
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
EvIL: Evolution Strategies for Generalisable Imitation Learning
Post-hoc Part-Prototype Networks
Knowledge-aware Reinforced Language Models for Protein Directed Evolution
Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution
DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency
Differentially Private Post-Processing for Fair Regression
Enabling Uncertainty Estimation in Iterative Neural Networks
LangCell: Language-Cell Pre-training for Cell Identity Understanding
Unveiling Privacy, Memorization, and Input Curvature Links
RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning
The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks
DynSyn: Dynamical Synergistic Representation for Efficient Learning and Control in Overactuated Embodied Systems
Positive Concave Deep Equilibrium Models
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
The Entropy Enigma: Success and Failure of Entropy Minimization
Human Alignment of Large Language Models through Online Preference Optimisation
Nash Learning from Human Feedback
Decoding-time Realignment of Language Models
Generalized Preference Optimization: A Unified Approach to Offline Alignment
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
COALA: A Practical and Vision-Centric Federated Learning Platform
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model
Human-like Category Learning by Injecting Ecological Priors from Large Language Models into Neural Networks
Trained Random Forests Completely Reveal your Dataset
ReLU Network with Width $d+\mathcal{O}(1)$ Can Achieve Optimal Approximation Rate
Recovering Labels from Local Updates in Federated Learning
Sign Rank Limitations for Inner Product Graph Decoders
Conditional Normalizing Flows for Active Learning of Coarse-Grained Molecular Representations
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Diffusion Models Encode the Intrinsic Dimension of Data Manifolds
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Position: Amazing Things Come From Having Many Good Models
Exploring Correlations of Self-Supervised Tasks for Graphs
FlowMM: Generating Materials with Riemannian Flow Matching
Exponential Spectral Pursuit: An Effective Initialization Method for Sparse Phase Retrieval
Causal Discovery via Conditional Independence Testing with Proxy Variables
Mean-field Underdamped Langevin Dynamics and its Spacetime Discretization
Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder
Copyright Traps for Large Language Models
Out-of-Domain Generalization in Dynamical Systems Reconstruction
Not all distributional shifts are equal: Fine-grained robust conformal inference
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
On the Hardness of Probabilistic Neurosymbolic Learning
Feature Attribution with Necessity and Sufficiency via Dual-stage Perturbation Test for Causal Explanation
FairProof: Confidential and Certifiable Fairness for Neural Networks
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
PAC-Bayesian Error Bound, via Rényi Divergence, for a Class of Linear Time-Invariant State-Space Models
Exploring the Benefit of Activation Sparsity in Pre-training
Exploiting Negative Samples: A Catalyst for Cohort Discovery in Healthcare Analytics
LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
Retrieval Across Any Domains via Large-scale Pre-trained Model
Learning Exceptional Subgroups by End-to-End Maximizing KL-Divergence
BWS: Best Window Selection Based on Sample Scores for Data Pruning across Broad Ranges
ByMI: Byzantine Machine Identification with False Discovery Rate Control
Learning Useful Representations of Recurrent Neural Network Weight Matrices
An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning
BOtied: Multi-objective Bayesian optimization with tied multivariate ranks
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
Highway Value Iteration Networks
GPTSwarm: Language Agents as Optimizable Graphs
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Scalable Safe Policy Improvement for Factored Multi-Agent MDPs
tinyBenchmarks: evaluating LLMs with fewer examples
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Larimar: Large Language Models with Episodic Memory Control
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box
Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks
Contrasting Multiple Representations with the Multi-Marginal Matching Gap
Trustless Audits without Revealing Data or Models
Tandem Transformers for Inference Efficient LLMs
Data-free Neural Representation Compression with Riemannian Neural Dynamics
FedMBridge: Bridgeable Multimodal Federated Learning
Expressivity and Generalization: Fragment-Biases for Molecular GNNs
Uncertainty for Active Learning on Graphs
QORA: Zero-Shot Transfer via Interpretable Object-Relational Model Learning
Prior Specification for Bayesian Matrix Factorization via Prior Predictive Matching
Risk Aware Benchmarking of Large Language Models
Amend to Alignment: Decoupled Prompt Tuning for Mitigating Spurious Correlation in Vision-Language Models
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
On the Consistency of Kernel Methods with Dependent Observations
From Inverse Optimization to Feasibility to ERM
Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale
Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression
Dealing With Unbounded Gradients in Stochastic Saddle-point Optimization
Position: On the Societal Impact of Open Foundation Models
FADAS: Towards Federated Adaptive Asynchronous Optimization
Sequential Disentanglement by Extracting Static Information From A Single Sequence Element
Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning
Deep Fusion: Efficient Network Training via Pre-trained Initializations
All-in-one simulation-based inference
Deep Neural Room Acoustics Primitive
Position: Understanding LLMs Requires More Than Statistical Generalization
Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
Sparser, Better, Deeper, Stronger: Improving Static Sparse Training with Exact Orthogonal Initialization
Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning
Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
Speech Self-Supervised Learning Using Diffusion Model Synthetic Data
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
A Distributional Analogue to the Successor Representation
Diffuse, Sample, Project: Plug-And-Play Controllable Graph Generation
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency
Slot Abstractors: Toward Scalable Abstract Visual Reasoning
Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning
Neural Operators with Localized Integral and Differential Kernels
Inferring the Long-Term Causal Effects of Long-Term Treatments from Short-Term Experiments
SelMatch: Effectively Scaling Up Dataset Distillation via Selection-Based Initialization and Partial Updates by Trajectory Matching
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Breadth-First Exploration on Adaptive Grid for Reinforcement Learning
Antibody Design Using a Score-based Diffusion Model Guided by Evolutionary, Physical and Geometric Constraints
CarbonNovo: Joint Design of Protein Structure and Sequence Using a Unified Energy-based Model
Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies
Fair Classification with Partial Feedback: An Exploration-Based Data Collection Approach
A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Unmasking Vulnerabilities: Cardinality Sketches under Adaptive Inputs
PairNet: Training with Observed Pairs to Estimate Individual Treatment Effect
InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling
WAVES: Benchmarking the Robustness of Image Watermarks
ELTA: An Enhancer against Long-Tail for Aesthetics-oriented Models
Remembering to Be Fair: Non-Markovian Fairness in Sequential Decision Making
Transforming and Combining Rewards for Aligning Large Language Models
The Linear Representation Hypothesis and the Geometry of Large Language Models
On the Origins of Linear Representations in Large Language Models
Active Label Correction for Semantic Segmentation with Foundation Models
Monotone, Bi-Lipschitz, and Polyak-Łojasiewicz Networks
Imitation Learning from Purified Demonstrations
SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models
Score-Based Causal Discovery of Latent Variable Causal Models
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning
Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift
Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction
Parallelized Spatiotemporal Slot Binding for Videos
Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
OAK: Enriching Document Representations using Auxiliary Knowledge for Extreme Classification
Position: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks
Beyond the Federation: Topology-aware Federated Learning for Generalization to Unseen Clients
Learning Label Shift Correction for Test-Agnostic Long-Tailed Recognition
Image Clustering with External Guidance
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
Predictive Performance Comparison of Decision Policies Under Confounding
Smooth Tchebycheff Scalarization for Multi-Objective Optimization
Two Heads are Actually Better than One: Towards Better Adversarial Robustness via Transduction and Rejection
PinNet: Pinpoint Instructive Information for Retrieval Augmented Code-to-Text Generation
Adaptive Accompaniment with ReaLchords
Acquisition Conditioned Oracle for Nongreedy Active Feature Acquisition
Multigroup Robustness
Amortizing Pragmatic Program Synthesis with Rankings
Time Weaver: A Conditional Time Series Generation Model
Optimal Differentially Private Model Training with Public Data
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations
Learning to Scale Logits for Temperature-Conditional GFlowNets
SAPG: Split and Aggregate Policy Gradients
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
Improving Transformers with Dynamically Composable Multi-Head Attention
Accelerated Speculative Sampling Based on Tree Monte Carlo
Privacy-Preserving Data Release Leveraging Optimal Transport and Particle Gradient Descent
Unsupervised Domain Adaptation for Anatomical Structure Detection in Ultrasound Images
On Hypothesis Transfer Learning of Functional Linear Models
Building Socially-Equitable Public Models
Combinatorial Approximations for Cluster Deletion: Simpler, Faster, and Better
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Reducing Balancing Error for Causal Inference via Optimal Transport
Homomorphism Counts for Graph Neural Networks: All About That Basis
Smoothness Adaptive Hypothesis Transfer Learning
Neural operators meet conjugate gradients: The FCG-NO method for efficient PDE solving
An Empirical Study of Realized GNN Expressiveness
Explorations of Self-Repair in Language Models
Interacting Diffusion Processes for Event Sequence Forecasting
Sparsest Models Elude Pruning: An Exposé of Pruning’s Current Capabilities
Attribution-based Explanations that Provide Recourse Cannot be Robust
Online Non-stochastic Control with Partial Feedback
One Meta-tuned Transformer is What You Need for Few-shot Learning
NeWRF: A Deep Learning Framework for Wireless Radiation Field Reconstruction and Channel Prediction
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction
Breaking through the learning plateaus of in-context learning in Transformer
Efficient PAC Learnability of Dynamical Systems Over Multilayer Networks
Aligning Transformers with Weisfeiler-Leman
InfoNet: Neural Estimation of Mutual Information without Test-Time Optimization
Coresets for Multiple $\ell_p$ Regression
Generalized Smooth Variational Inequalities: Methods with Adaptive Stepsizes
PAGER: Accurate Failure Characterization in Deep Regression Models
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Emergent Representations of Program Semantics in Language Models Trained on Programs
Robustness of Nonlinear Representation Learning
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills
Parsimonious Learning-Augmented Approximations for Dense Instances of $\mathcal{NP}$-hard Problems
Profile Reconstruction from Private Sketches
In-Context Learning Agents Are Asymmetric Belief Updaters
Private Truly-Everlasting Robust-Prediction
Pedestrian Attribute Recognition as Label-balanced Multi-label Learning
Convergence of Some Convex Message Passing Algorithms to a Fixed Point
Improving Robustness to Multiple Spurious Correlations by Multi-Objective Optimization
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
Delving into Differentially Private Transformer
Minimum Norm Interpolation Meets The Local Theory of Banach Spaces
Incorporating probabilistic domain knowledge into deep multiple instance learning
Don't be so Negative! Score-based Generative Modeling with Oracle-assisted Guidance
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
PANDA: Expanded Width-Aware Message Passing Beyond Rewiring
Federated Optimization with Doubly Regularized Drift Correction
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
From Coarse to Fine: Enable Comprehensive Graph Self-supervised Learning with Multi-granular Semantic Ensemble
Fair Resource Allocation in Multi-Task Learning
Causally Motivated Personalized Federated Invariant Learning with Shortcut-Averse Information-Theoretic Regularization
Improving Computational Complexity in Statistical Models with Local Curvature Information
Disentangled 3D Scene Generation with Layout Learning
Two Fists, One Heart: Multi-Objective Optimization Based Strategy Fusion for Long-tailed Learning
Instruction Tuning for Secure Code Generation
Balancing Feature Similarity and Label Variability for Optimal Size-Aware One-shot Subset Selection
Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
An Information-Theoretic Analysis of In-Context Learning
Invariant Risk Minimization Is A Total Variation Model
Extending Test-Time Augmentation with Metamorphic Relations for Combinatorial Problems
Optimization without Retraction on the Random Generalized Stiefel Manifold
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Learning and Forgetting Unsafe Examples in Large Language Models
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
Parallel Affine Transformation Tuning of Markov Chain Monte Carlo
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits
OTMatch: Improving Semi-Supervised Learning with Optimal Transport
Harnessing the Power of Neural Operators with Automatically Encoded Conservation Laws
Arrows of Time for Large Language Models
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
Rényi Pufferfish Privacy: General Additive Noise Mechanisms and Privacy Amplification by Iteration via Shift Reduction Lemmas
Graph Neural Networks with a Distribution of Parametrized Graphs
Memorization Through the Lens of Curvature of Loss Function Around Samples
MoMo: Momentum Models for Adaptive Learning Rates
OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport
CuTS: Customizable Tabular Synthetic Data Generation
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
The Fundamental Limits of Least-Privilege Learning
Safe Exploration in Dose Finding Clinical Trials with Heterogeneous Participants
UP2ME: Univariate Pre-training to Multivariate Fine-tuning as a General-purpose Framework for Multivariate Time Series Analysis
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation
DeepPolar: Inventing Nonlinear Large-Kernel Polar Codes via Deep Learning
Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers
EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting
Position: Fundamental Limitations of LLM Censorship Necessitate New Approaches
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought
Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Understanding and Diagnosing Deep Reinforcement Learning
Beyond the ROC Curve: Classification Trees Using Cost-Optimal Curves, with Application to Imbalanced Datasets
Distributionally Robust Data Valuation
Unsupervised Representation Learning of Brain Activity via Bridging Voxel Activity and Functional Connectivity
Enhancing Value Function Estimation through First-Order State-Action Dynamics in Offline Reinforcement Learning
Double-Step Alternating Extragradient with Increasing Timescale Separation for Finding Local Minimax Points: Provable Improvements
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting
Learning Latent Space Hierarchical EBM Diffusion Models
Translation Equivariant Transformer Neural Processes
Efficient Denoising Diffusion via Probabilistic Masking
Visual Representation Learning with Stochastic Frame Prediction
Exploration and Anti-Exploration with Distributional Random Network Distillation
Position: Is machine learning good or bad for the natural sciences?
A New Theoretical Perspective on Data Heterogeneity in Federated Optimization
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
Sample Complexity Bounds for Estimating Probability Divergences under Invariances
Pruned Pivot: Correlation Clustering Algorithm for Dynamic, Parallel, and Local Computation Models
Latent Noise Segmentation: How Neural Noise Leads to the Emergence of Segmentation and Grouping
Position: Evolving AI Collectives Enhance Human Diversity and Enable Self-Regulation
Scaling Beyond the GPU Memory Limit for Large Mixture-of-Experts Model Training
Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game
xT: Nested Tokenization for Larger Context in Large Images
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents
Joint Composite Latent Space Bayesian Optimization
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
Improved Differentially Private and Lazy Online Convex Optimization: Lower Regret without Smoothness Requirements
Position: AI/ML Influencers Have a Place in the Academic Process
Characterizing ResNet's Universal Approximation Capability
Overcoming the Optimizer's Curse: Obtaining Realistic Prescriptions from Neural Networks