

6 a.m.
Invited Talk:
Weinan E
(ends 7:00 AM)
7 a.m.
Coffee Break
7:30 a.m.
Spotlights 7:30-7:35
[7:30] Why the Rich Get Richer? On the Balancedness of Random Partition Models
Orals 7:35-8:15
[7:35] Tackling covariate shift with node-based Bayesian neural networks
[7:55] How Tempering Fixes Data Augmentation in Bayesian Neural Networks
Spotlights 8:15-8:25
[8:15] A Completely Tuning-Free and Robust Approach to Sparse Precision Matrix Estimation
[8:20] Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems
Orals 8:25-8:45
[8:25] Tractable Uncertainty for Structure Learning
Spotlights 8:45-8:55
[8:45] Calibrated Learning to Defer with One-vs-All Classifiers
[8:50] Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
(ends 9:00 AM)
Spotlights 7:30-7:35
[7:30] Exploring and Exploiting Hubness Priors for High-Quality GAN Latent Sampling
Orals 7:35-7:55
[7:35] Equivariant Diffusion for Molecule Generation in 3D
Spotlights 7:55-9:00
[7:55] Controlling Conditional Language Models without Catastrophic Forgetting
[8:00] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
[8:05] Structure-preserving GANs
[8:10] Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
[8:15] Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models
[8:20] ButterflyFlow: Building Invertible Layers with Butterfly Matrices
[8:25] Forward Operator Estimation in Generative Models with Kernel Transfer Operators
[8:30] Conditional GANs with Auxiliary Discriminative Classifier
[8:35] Improved StyleGAN-v2 based Inversion for Out-of-Distribution Images
[8:40] Matching Normalizing Flows and Probability Paths on Manifolds
[8:45] Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization
[8:50] Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization
[8:55] Region-Based Semantic Factorization in GANs
(ends 9:00 AM)
Spotlights 7:30-7:45
[7:30] Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
[7:35] NysADMM: faster composite convex optimization via low-rank approximation
[7:40] FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning
Orals 7:45-8:05
[7:45] Exact Optimal Accelerated Complexity for Fixed-Point Iterations
Spotlights 8:05-8:30
[8:05] Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers
[8:10] Sparser Kernel Herding with Pairwise Conditional Gradients without Swap Steps
[8:15] Only tails matter: Average-Case Universality and Robustness in the Convex Regime
[8:20] Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets
[8:25] Batch Greenkhorn Algorithm for Entropic-Regularized Multimarginal Optimal Transport: Linear Rate of Convergence and Iteration Complexity
Orals 8:30-8:50
[8:30] Continuous-Time Analysis of Accelerated Gradient Methods via Conservation Laws in Dilated Coordinate Systems
Spotlights 8:50-9:00
[8:50] Neural Fisher Discriminant Analysis: Optimal Neural Network Embeddings in Polynomial Time
[8:55] Active Sampling for Min-Max Fairness
(ends 9:00 AM)
Spotlights 7:30-7:45
[7:30] Differentially Private Approximate Quantiles
[7:35] Fairness Interventions as (Dis)Incentives for Strategic Manipulation
[7:40] Robust Models Are More Interpretable Because Attributions Look Normal
Orals 7:45-8:05
[7:45] Bounding Training Data Reconstruction in Private (Deep) Learning
Spotlights 8:05-9:00
[8:05] A Joint Exponential Mechanism For Differentially Private Top-$k$
[8:10] Transfer Learning In Differential Privacy's Hybrid-Model
[8:15] Robust Kernel Density Estimation with Median-of-Means principle
[8:20] Sequential Covariate Shift Detection Using Classifier Two-Sample Tests
[8:25] Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks
[8:30] FriendlyCore: Practical Differentially Private Aggregation
[8:35] ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder
[8:40] Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification
[8:45] Public Data-Assisted Mirror Descent for Private Model Training
[8:50] Low-Complexity Deep Convolutional Neural Networks on Fully Homomorphic Encryption Using Multiplexed Parallel Convolutions
[8:55] Differential Privacy Has Disparate Impact on Generative Models and Synthetic Data
(ends 9:00 AM)
Spotlights 7:30-8:55
[7:30] Optimally Controllable Perceptual Lossy Compression
[7:35] Fused Acoustic and Text Pretraining for Speech Synthesis and Editing
[7:40] On the Learning of Non-autoregressive Transformers
[7:45] Latent Diffusion Energy-Based Model for Interpretable Text Modelling
[7:50] UniREx: A Unified Learning Framework for Language Model Rationale Extraction
[7:55] Black-Box Tuning for Language-Model-as-a-Service
[8:00] Certified Robustness Against Natural Language Attacks by Causal Intervention
[8:05] Co-training Improves Prompt-based Learning for Large Language Models
[8:10] Directed Acyclic Transformer for Non-Autoregressive Machine Translation
[8:15] StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models
[8:20] Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology
[8:25] Generative Cooperative Networks for Natural Language Generation
[8:30] What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?
[8:35] Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
[8:40] Codeformer: Learning to Translate from C to CUDA
[8:45] Improving Self-Supervised Speech Representations by Disentangling Speakers
[8:50] On Distribution Shift in Learning-based Bug Detectors
(ends 9:00 AM)
Orals 7:30-7:50
[7:30] Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP
Spotlights 7:50-8:50
[7:50] On the Impossibility of Learning to Cooperate with Adaptive Partner Strategies in Repeated Games
[7:55] Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning
[8:00] Provable Reinforcement Learning with a Short-Term Memory
[8:05] Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer
[8:10] Mirror Learning: A Unifying Framework of Policy Optimisation
[8:15] Dynamic Regret of Online Markov Decision Processes
[8:20] Learning Infinite-horizon Average-reward Markov Decision Process with Constraints
[8:25] A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning
[8:30] Langevin Monte Carlo for Contextual Bandits
[8:35] Prompting Decision Transformer for Few-shot Policy Generalization
[8:40] Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
[8:45] Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
(ends 9:00 AM)
Orals 7:30-7:50
[7:30] Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them
Spotlights 7:50-7:55
[7:50] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
Orals 7:55-8:15
[7:55] To Smooth or Not? When Label Smoothing Meets Noisy Labels
Spotlights 8:15-8:55
[8:15] Certifying Out-of-Domain Generalization for Blackbox Functions
[8:20] Intriguing Properties of Input-Dependent Randomized Smoothing
[8:25] Provably Adversarially Robust Nearest Prototype Classifiers
[8:30] Evaluating the Adversarial Robustness of Adaptive Test-time Defenses
[8:35] On The Generalization Analysis of Adversarial Learning
[8:40] Demystifying the Adversarial Robustness of Random Transformation Defenses
[8:45] Double Sampling Randomized Smoothing
[8:50] TPC: Transformation-Specific Smoothing for Point Cloud Models
(ends 9:00 AM)
Spotlights 7:30-9:00
[7:30] Structural Entropy Guided Graph Hierarchical Pooling
[7:35] Self-Supervised Representation Learning via Latent Graph Prediction
[7:40] DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural Network for Traffic Flow Forecasting
[7:45] Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets
[7:50] Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning
[7:55] Analyzing and Mitigating Interference in Neural Architecture Search
[8:00] Adversarial robustness against multiple and single $l_p$-threat models via quick fine-tuning of robust classifiers
[8:05] On the Practicality of Deterministic Epistemic Uncertainty
[8:10] Combining Diverse Feature Priors
[8:15] Removing Batch Normalization Boosts Adversarial Training
[8:20] Reverse Engineering $\ell_p$ attacks: A block-sparse optimization approach with recovery guarantees
[8:25] DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks
[8:30] A deep convolutional neural network that is invariant to time rescaling
[8:35] LyaNet: A Lyapunov Framework for Training Neural ODEs
[8:40] Transfer and Marginalize: Explaining Away Label Noise with Privileged Information
[8:45] On Collective Robustness of Bagging Against Data Poisoning
[8:50] Hindering Adversarial Attacks with Implicit Neural Representations
[8:55] From Noisy Prediction to True Label: Noisy Prediction Calibration via Generative Model
(ends 9:00 AM)
Spotlights 7:30-9:00
[7:30] Multi-Task Learning as a Bargaining Game
[7:35] Frustratingly Easy Transferability Estimation
[7:40] Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
[7:45] A Difference Standardization Method for Mutual Transfer Learning
[7:50] Improving Task-free Continual Learning by Distributionally Robust Memory Evolution
[7:55] A multi-objective / multi-task learning framework induced by Pareto stationarity
[8:00] Sparse Invariant Risk Minimization
[8:05] Provable Domain Generalization via Invariant-Feature Subspace Recovery
[8:10] A Closer Look at Smoothness in Domain Adversarial Training
[8:15] Balancing Discriminability and Transferability for Source-Free Domain Adaptation
[8:20] Model Agnostic Sample Reweighting for Out-of-Distribution Learning
[8:25] Zero-shot AutoML with Pretrained Models
[8:30] Efficient Variance Reduction for Meta-learning
[8:35] Generalizing to Evolving Domains with Latent Structure-Aware Sequential Autoencoder
[8:40] Partial disentanglement for domain adaptation
[8:45] Subspace Learning for Effective Meta-Learning
[8:50] Continual Learning via Function-Space Variational Inference
[8:55] Efficient Test-Time Model Adaptation without Forgetting
(ends 9:00 AM)
Orals 7:30-7:50
[7:30] Online Learning for Min Sum Set Cover and Pandora’s Box
Spotlights 7:50-7:55
[7:50] Smoothed Adversarial Linear Contextual Bandits with Knapsacks
Orals 7:55-8:15
[7:55] A Simple yet Universal Strategy for Online Convex Optimization
Spotlights 8:15-8:40
[8:15] Thompson Sampling for (Combinatorial) Pure Exploration
[8:20] Revisiting Online Submodular Minimization: Gap-Dependent Regret Bounds, Best of Both Worlds and Adversarial Robustness
[8:25] Rotting infinitely many-armed bandits
[8:30] Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent
[8:35] Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback
Orals 8:40-9:00
[8:40] Batched Dueling Bandits
(ends 9:00 AM)
9 a.m.
Lunch Break - on your own
10:30 a.m.
Spotlights 10:30-11:25
[10:30] Meaningfully debugging model mistakes using conceptual counterfactual explanations
[10:35] Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments
[10:40] Robust Counterfactual Explanations for Tree-Based Ensembles
[10:45] A Rigorous Study of Integrated Gradients Method and Extensions to Internal Neuron Attributions
[10:50] Estimating and Penalizing Induced Preference Shifts in Recommender Systems
[10:55] Framework for Evaluating Faithfulness of Local Explanations
[11:00] A Consistent and Efficient Evaluation Strategy for Attribution Methods
[11:05] Interpretable Off-Policy Learning via Hyperbox Search
[11:10] Label-Descriptive Patterns and Their Application to Characterizing Classification Errors
[11:15] XAI for Transformers: Better Explanations through Conservative Propagation
[11:20] Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding
Orals 11:25-11:45
[11:25] Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four
Spotlights 11:45-12:00
[11:45] Neuron Dependency Graphs: A Causal Abstraction of Neural Networks
[11:50] On the Adversarial Robustness of Causal Algorithmic Recourse
[11:55] Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations
(ends 12:00 PM)
Spotlights 10:30-10:40
[10:30] Additive Gaussian Processes Revisited
[10:35] Probabilistic ODE Solutions in Millions of Dimensions
Orals 10:40-11:00
[10:40] Preconditioning for Scalable Gaussian Process Hyperparameter Optimization
Spotlights 11:00-11:50
[11:00] Volatility Based Kernels and Moving Average Means for Accurate Forecasting with Gaussian Processes
[11:05] Fenrir: Physics-Enhanced Regression for Initial Value Problems
[11:10] Adaptive Gaussian Process Change Point Detection
[11:15] Variational nearest neighbor Gaussian process
[11:20] Spectral Representation of Robustness Measures for Optimization Under Input Uncertainty
[11:25] Bayesian Optimization under Stochastic Delayed Feedback
[11:30] Bayesian Optimization for Distributionally Robust Chance-constrained Problem
[11:35] Efficient Distributionally Robust Bayesian Optimization with Worst-case Sensitivity
[11:40] Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning
[11:45] Efficient First-Order Bayesian Optimization via Structured Automatic Differentiation
(ends 12:00 PM)
Spotlights 10:30-10:40
[10:30] Stochastic Reweighted Gradient Descent
[10:35] Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood
Orals 10:40-11:00
[10:40] Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning
Spotlights 11:00-11:20
[11:00] Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging
[11:05] FedNL: Making Newton-Type Methods Applicable to Federated Learning
[11:10] Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems
[11:15] Dimension-free Complexity Bounds for High-order Nonconvex Finite-sum Optimization
Orals 11:20-11:40
[11:20] Solving Stackelberg Prediction Game with Least Squares Loss Via Spherically Constrained Least Squares Reformulation
Spotlights 11:40-11:55
[11:40] Probabilistic Bilevel Coreset Selection
[11:45] Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs
[11:50] On Implicit Bias in Overparameterized Bilevel Optimization
(ends 12:00 PM)
Spotlights 10:30-12:00
[10:30] An iterative clustering algorithm for the Contextual Stochastic Block Model with optimality guarantees
[10:35] Smoothed Adaptive Weighting for Imbalanced Semi-Supervised Learning: Improve Reliability Against Unknown Distribution Data
[10:40] Class-Imbalanced Semi-Supervised Learning with Adaptive Thresholding
[10:45] Correlation Clustering via Strong Triadic Closure Labeling: Fast Approximation Algorithms and Practical Lower Bounds
[10:50] Interactive Correlation Clustering with Existential Cluster Constraints
[10:55] Simultaneous Graph Signal Clustering and Graph Learning
[11:00] Bregman Power k-Means for Clustering Exponential Family Data
[11:05] On Finite-Sample Identifiability of Contrastive Learning-Based Nonlinear Independent Component Analysis
[11:10] Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework
[11:15] Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets
[11:20] Confidence Score for Source-Free Unsupervised Domain Adaptation
[11:25] Gradient based clustering
[11:30] Global Optimization of K-Center Clustering
[11:35] Latent Outlier Exposure for Anomaly Detection with Contaminated Data
[11:40] Understanding Doubly Stochastic Clustering
[11:45] A Tighter Analysis of Spectral Clustering, and Beyond
[11:50] SpaceMAP: Visualizing High-Dimensional Data by Space Expansion
[11:55] Unsupervised Ground Metric Learning Using Wasserstein Singular Vectors
(ends 12:00 PM)
Orals 10:30-10:50
[10:30] Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
Spotlights 10:50-11:05
[10:50] Learning Transferable Policies By Inferring Agent Morphology
[10:55] DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations
[11:00] Stabilizing Off-Policy Deep Reinforcement Learning from Pixels
Orals 11:05-11:25
[11:05] Offline RL Policies Should Be Trained to be Adaptive
Spotlights 11:25-12:00
[11:25] CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer
[11:30] Influence-Augmented Local Simulators: a Scalable Solution for Fast Deep RL in Large Networked Systems
[11:35] Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control
[11:40] PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration
[11:45] Supervised Off-Policy Ranking
[11:50] The Primacy Bias in Deep Reinforcement Learning
[11:55] Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
(ends 12:00 PM)
Spotlights 10:30-12:00
[10:30] DNA: Domain Generalization with Diversified Neural Averaging
[10:35] Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces
[10:40] DynaMixer: A Vision MLP Architecture with Dynamic Mixing
[10:45] Fishr: Invariant Gradient Variances for Out-of-distribution Generalization
[10:50] Robust Group Synchronization via Quadratic Programming
[10:55] UAST: Uncertainty-Aware Siamese Tracking
[11:00] Improving Generic Models for Image-Goal Navigation
[11:05] You Only Cut Once: Boosting Data Augmentation with a Single Cut
[11:10] Generative Modeling for Multitask Visual Learning
[11:15] HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
[11:20] Parametric Visual Program Induction with Function Modularization
[11:25] Deep Neural Network Fusion via Graph Matching with Applications to Model Ensemble and Federated Learning
[11:30] VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
[11:35] Neural Implicit Dictionary Learning via Mixture-of-Expert Training
[11:40] Time Is MattEr: Temporal Self-supervision for Video Transformers
[11:45] Channel Importance Matters in Few-Shot Image Classification
[11:50] Benchmarking and Analyzing Point Cloud Classification under Corruptions
[11:55] Understanding The Robustness in Vision Transformers
(ends 12:00 PM)
Spotlights 10:30-11:00
[10:30] Online Continual Learning through Mutual Information Maximization
[10:35] Learning Iterative Reasoning through Energy Minimization
[10:40] DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks
[10:45] PoF: Post-Training of Feature Extractor for Improving Generalization
[10:50] Improving Ensemble Distillation with Weight Averaging and Diversifying Perturbation
[10:55] Set Based Stochastic Subsampling
Orals 11:00-11:20
[11:00] Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Spotlights 11:20-11:55
[11:20] Generalizing to New Physical Systems via Context-Informed Dynamics Model
[11:25] Self-conditioning Pre-Trained Language Models
[11:30] TAM: Topology-Aware Margin Loss for Class-Imbalanced Node Classification
[11:35] Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization
[11:40] Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
[11:45] Semiparametric Subgraph Reasoning for Question Answering over Large Knowledge Bases
[11:50] When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee
(ends 12:00 PM)
Orals 10:30-10:50
[10:30] Learning Mixtures of Linear Dynamical Systems
Spotlights 10:50-11:00
[10:50] Massively Parallel $k$-Means Clustering for Perturbation Resilient Instances
[10:55] Residual-based Sampling for Online Outlier Robust PCA
Orals 11:00-11:20
[11:00] Generalized Results for the Existence and Consistency of the MLE in the Bradley-Terry-Luce Model
Spotlights 11:20-12:00
[11:20] Streaming Algorithms for Support-Aware Histograms
[11:25] Power-law escape rate of SGD
[11:30] Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times
[11:35] Faster Algorithms for Learning Convex Functions
[11:40] Feature selection using e-values
[11:45] ActiveHedge: Hedge meets Active Learning
[11:50] One-Pass algorithms for MAP Inference of Nonsymmetric Determinantal Point Processes
[11:55] Deciphering Lasso-based Classification Through a Large Dimensional Analysis of the Iterative Soft-Thresholding Algorithm
(ends 12:00 PM)
Spotlights 10:30-11:55
[10:30] pathGCN: Learning General Graph Spatial Operators from Paths
[10:35] Graph-Coupled Oscillator Networks
[10:40] HousE: Knowledge Graph Embedding with Householder Parameterization
[10:45] Information Bottleneck-Guided Stochastic Attention Mechanism for Interpretable Graph Learning
[10:50] ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning
[10:55] G$^2$CN: Graph Gaussian Convolution Networks with Concentrated Graph Filters
[11:00] SpeqNets: Sparsity-aware permutation-equivariant graph networks
[11:05] Let Invariant Rationale Discovery Inspire Graph Contrastive Learning
[11:10] Graph Neural Architecture Search Under Distribution Shifts
[11:15] How Powerful are Spectral Graph Neural Networks?
[11:20] Constraint-based graph network simulator
[11:25] PACE: A Parallelizable Computation Encoder for Directed Acyclic Graphs
[11:30] Structure-Aware Transformer for Graph Representation Learning
[11:35] GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks
[11:40] Large-Scale Graph Neural Architecture Search
[11:45] Optimization-induced Implicit Graph Diffusion
[11:50] Deep and Flexible Graph Neural Architecture Search
(ends 12:00 PM)
Spotlights 10:30-10:35
[10:30] Learning to Hash Robustly, Guaranteed
Orals 10:35-10:55
[10:35] An Improved Analysis of Algorithmic Robustness
Spotlights 10:55-11:30
[10:55] Policy Gradient Method For Robust Reinforcement Learning
[11:00] A query-optimal algorithm for finding counterfactuals
[11:05] Linear Bandit Algorithms with Sublinear Time Complexity
[11:10] Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra
[11:15] Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms
[11:20] Correlated quantization for distributed mean estimation and optimization
[11:25] Multiple-Play Stochastic Bandits with Shareable Finite-Capacity Arms
Orals 11:30-11:50
[11:30] Individual Preference Stability for Clustering
Spotlights 11:50-12:00
[11:50] The algebraic path problem for graph metrics
[11:55] Steerable 3D Spherical Neurons
(ends 12:00 PM)
Coffee Break
1 p.m.
Short Break
1:15 p.m.
Spotlights 1:15-1:50
[1:15] Rethinking Convergence in Deep Learning: Beyond Stationary Points
[1:20] Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering
[1:25] Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity
[1:30] Understanding the unstable convergence of gradient descent
[1:35] Federated Minimax Optimization: Improved Convergence Analyses and Algorithms
[1:40] Inductive Matrix Completion: No Bad Local Minima and a Fast Algorithm
[1:45] AdaGrad Avoids Saddle Points
Orals 1:50-2:10
[1:50] FEDNEST: Federated Bilevel Optimization
Spotlights 2:10-2:35
[2:10] Fast and Provable Nonconvex Tensor RPCA
[2:15] Towards Understanding Convergence of Simultaneous Gradient Descent-Ascent in Minimax Optimization
[2:20] Convergence Rates of Non-Convex Stochastic Gradient Descent Under a Generic Lojasiewicz Condition and Local Smoothness
[2:25] A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for Nonconvex Functional Constrained Optimization
[2:30] Anticorrelated Noise Injection for Improved Generalization
(ends 2:45 PM)
Spotlights 1:15-2:05
[1:15] Model-Free Opponent Shaping
[1:20] Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning
[1:25] Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation
[1:30] Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning
[1:35] Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
[1:40] Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning
[1:45] Greedy when Sure and Conservative when Uncertain about the Opponents
[1:50] Self-Organized Polynomial-Time Coordination Graphs
[1:55] Individual Reward Assisted Multi-Agent Reinforcement Learning
[2:00] Generalized Beliefs for Cooperative AI
Orals 2:05-2:25
[2:05] Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
Spotlights 2:25-2:45
[2:25] Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning
[2:30] Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
[2:35] Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games
[2:40] Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis
(ends 2:45 PM)
Orals 1:15-2:15
[1:15] H-Consistency Estimation Error of Surrogate Loss Minimizers
[1:35] Refined Convergence Rates for Maximum Likelihood Estimation under Finite Mixture Models
[1:55] Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
Spotlights 2:15-2:45
[2:15] Sparse Mixed Linear Regression with Guarantees: Taming an Intractable Problem with Invex Relaxation
[2:20] TURF: Two-Factor, Universal, Robust, Fast Distribution Learning Algorithm
[2:25] Learning General Halfspaces with Adversarial Label Noise via Online Gradient Descent
[2:30] The Teaching Dimension of Regularized Kernel Learners
[2:35] Multiclass learning with margin: exponential rates with no bias-variance trade-off
[2:40] High Probability Guarantees for Nonconvex Stochastic Gradient Descent with Heavy Tails
(ends 2:45 PM)
Spotlights 1:15-2:45
[1:15] Bayesian Nonparametric Learning for Point Processes with Spatial Homogeneity: A Spatial Analysis of NBA Shot Locations
[1:20] On the Effects of Artificial Data Modification
[1:25] Deep Squared Euclidean Approximation to the Levenshtein Distance for DNA Storage
[1:30] How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models
[1:35] Error-driven Input Modulation: Solving the Credit Assignment Problem without a Backward Pass
[1:40] How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective
[1:45] MAE-DET: Revisiting Maximal Entropy Principle in Zero-Shot NAS for Efficient Object Detection
[1:50] Describing Differences between Text Distributions with Natural Language
[1:55] Distinguishing rule- and exemplar-based generalization in learning systems
[2:00] Burst-dependent plasticity and dendritic amplification support target-based learning and hierarchical imitation learning
[2:05] Examining Scaling and Transfer of Language Model Architectures for Machine Translation
[2:10] State Transition of Dendritic Spines Improves Learning of Sparse Spiking Neural Networks
[2:15] How to Fill the Optimal Set? Population Gradient Descent with Harmless Diversity
[2:20] A Deep Learning Approach for the Segmentation of Electroencephalography Data in Eye Tracking Applications
[2:25] Minimizing Control for Credit Assignment with Strong Feedback
[2:30] Self-supervised models of audio effectively explain human cortical responses to speech
[2:35] Towards Scaling Difference Target Propagation by Learning Backprop Targets
[2:40] Content addressable memory without catastrophic forgetting by heteroassociation with a fixed scaffold
(ends 2:45 PM)
Spotlights 1:15-1:40
[1:15] Coordinated Double Machine Learning
[1:20] Exploiting Independent Instruments: Identification and Distribution Generalization
[1:25] Partial Counterfactual Identification from Observational and Experimental Data
[1:30] On Measuring Causal Contributions via do-interventions
[1:35] The Role of Deconfounding in Meta-learning
Orals 1:40-2:00
[1:40] Minimum Cost Intervention Design for Causal Effect Identification
Spotlights 2:00-2:45
[2:00] Online Balanced Experimental Design
[2:05] CITRIS: Causal Identifiability from Temporal Intervened Sequences
[2:10] Causal structure-based root cause analysis of outliers
[2:15] Instrumental Variable Regression with Confounder Balancing
[2:20] Causal Transformer for Estimating Counterfactual Outcomes
[2:25] Causal Inference Through the Structural Causal Marginal Problem
[2:30] Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions
[2:35] Matching Learned Causal Effects of Neural Networks with Domain Priors
[2:40] Heteroscedastic Noise Based Causal Inference
(ends 2:45 PM)
Spotlights 1:15-2:45
[1:15] Prototype Based Classification from Hierarchy to Fairness
[1:20] Neural-Symbolic Models for Logical Queries on Knowledge Graphs
[1:25] Deep Probability Estimation
[1:30] Uncertainty Modeling in Generative Compressed Sensing
[1:35] Going Deeper into Permutation-Sensitive Graph Neural Networks
[1:40] Learning from Counterfactual Links for Link Prediction
[1:45] Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
[1:50] Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack
[1:55] Principal Component Flows
[2:00] Bit Prioritization in Variational Autoencoders via Progressive Coding
[2:05] Generative Flow Networks for Discrete Probabilistic Modeling
[2:10] Diffusion bridges vector quantized variational autoencoders
[2:15] Mitigating modality collapse in multimodal VAEs via impartial optimization
[2:20] Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation
[2:25] Maximum Likelihood Training for Score-based Diffusion ODEs by High Order Denoising Score Matching
[2:30] Fast Lossless Neural Compression with Integer-Only Discrete Flows
[2:35] SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization
[2:40] Hierarchical Few-Shot Generative Models
(ends 2:45 PM)
Spotlights 1:15-1:25
[1:15] Neural Network Pruning Denoises the Features and Makes Local Connectivity Emerge in Visual Tasks
[1:20] Simple and near-optimal algorithms for hidden stratification and multi-group learning
Orals 1:25-1:45
[1:25] Cooperative Online Learning in Stochastic and Adversarial MDPs
Spotlights 1:45-2:10
[1:45] Being Properly Improper
[1:50] On the Finite-Time Complexity and Practical Computation of Approximate Stationarity Concepts of Lipschitz Functions
[1:55] Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
[2:00] Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits
[2:05] Minimax M-estimation under Adversarial Contamination
Orals 2:10-2:30
[2:10] Efficient Contextual Bandits with CVaR Regret
Spotlights 2:30-2:45
[2:30] Efficiently Learning the Topology and Behavior of a Networked Dynamical System Via Active Queries
[2:35] Boosting Graph Structure Learning with Dummy Nodes
[2:40] Lazy Estimation of Variable Importance for Large Neural Networks
(ends 2:45 PM)
Spotlights 1:15-1:25
[1:15] Modeling Irregular Time Series with Continuous Recurrent Units
[1:20] TACTiS: Transformer-Attentional Copulas for Time Series
Orals 1:25-1:45
[1:25] Neural Laplace: Learning diverse classes of differential equations in the Laplace domain
Spotlights 1:45-2:15
[1:45] Approximately Equivariant Networks for Imperfectly Symmetric Dynamics
[1:50] IDYNO: Learning Nonparametric DAGs from Interventional Dynamic Data
[1:55] GSmooth: Certified Robustness against Semantic Transformations via Generalized Randomized Smoothing
[2:00] CerDEQ: Certifiable Deep Equilibrium Model
[2:05] Improving language models by retrieving from trillions of tokens
[2:10] Closed-Form Diffeomorphic Transformations for Time Series Alignment
Orals 2:15-2:35
[2:15] Unified Scaling Laws for Routed Language Models
Spotlights 2:35-2:45
[2:35] Forgetting-free Continual Learning with Winning Subnetworks
[2:40] FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
(ends 2:45 PM)
Spotlights 1:15-1:55
[1:15] Algorithms for the Communication of Samples
[1:20] Robust SDE-based variational formulations for solving linear PDEs via deep learning
[1:25] Hessian-Free High-Resolution Nesterov Acceleration For Sampling
[1:30] LSB: Local Self-Balancing MCMC in Discrete Spaces
[1:35] A Langevin-like Sampler for Discrete Distributions
[1:40] Scalable Spike-and-Slab
[1:45] Low-Precision Stochastic Gradient Langevin Dynamics
[1:50] Continual Repeated Annealed Flow Transport Monte Carlo
Orals 1:55-2:35
[1:55] Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes
[2:15] Nonparametric Involutive Markov Chain Monte Carlo
Spotlights 2:35-2:45
[2:35] Fast Relative Entropy Coding with A* coding
[2:40] Accurate Quantization of Measures via Interacting Particle-based Optimization
(ends 2:45 PM)
Spotlights 1:15-1:20
[1:15] Selective Network Linearization for Efficient Private Inference
Orals 1:20-2:00
[1:20] Out-of-Distribution Detection with Posterior Sampling
[1:40] Rethinking Image-Scaling Attacks: The Interplay Between Vulnerabilities in Machine Learning Systems
Spotlights 2:00-2:20
[2:00] Efficient Computation of Higher-Order Subgraph Attribution via Message Passing
[2:05] A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization
[2:10] Modular Conformal Calibration
[2:15] Context-Aware Drift Detection
Orals 2:20-2:40
[2:20] Privacy for Free: How does Dataset Condensation Help Privacy?
(ends 2:45 PM)
3:30 p.m.
(ends 5:30 PM)
6 a.m.
Invited Talk:
Regina Barzilay
(ends 7:00 AM)
7 a.m.
Coffee Break
7:30 a.m.
Spotlights 7:30-8:20
[7:30] Towards understanding how momentum improves generalization in deep learning
[7:35] What Can Linear Interpolation of Neural Network Loss Landscapes Tell Us?
[7:40] Deep equilibrium networks are sensitive to initialization statistics
[7:45] Scaling-up Diverse Orthogonal Convolutional Networks by a Paraunitary Framework
[7:50] Stability Based Generalization Bounds for Exponential Family Langevin Dynamics
[7:55] Local Augmentation for Graph Neural Networks
[8:00] On Non-local Convergence Analysis of Deep Linear Networks
[8:05] On the Equivalence Between Temporal and Static Equivariant Graph Representations
[8:10] Diversified Adversarial Attacks based on Conjugate Gradient Method
[8:15] On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features
Orals 8:20-8:40
[8:20] Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum
Spotlights 8:40-9:00
[8:40] Robust Training of Deep Networks by Sparse Over-parameterization
[8:45] Implicit Bias of the Step Size in Linear Diagonal Neural Networks
[8:50] Extended Unconstrained Features Model for Exploring Deep Neural Collapse
[8:55] Score-Guided Intermediate Level Optimization: Fast Langevin Mixing for Inverse Problems
(ends 9:00 AM)
Spotlights 7:30-8:30
[7:30] On Numerical Integration in Neural Ordinary Differential Equations
[7:35] Reverse Engineering the Neural Tangent Kernel
[7:40] Principled Knowledge Extrapolation with GANs
[7:45] Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
[7:50] Data Augmentation as Feature Manipulation: a story of desert cows and grass cows
[7:55] Convolutional and Residual Networks Provably Contain Lottery Tickets
[8:00] Feature Learning and Signal Propagation in Deep Neural Networks
[8:05] Benefits of Deep and Wide Convolutional Residual Networks: Function Approximation under Smoothness Constraint
[8:10] Understanding Contrastive Learning Requires Incorporating Inductive Biases
[8:15] Implicit Regularization with Polynomial Growth in Deep Tensor Factorization
[8:20] Deep Network Approximation in Terms of Intrinsic Parameters
[8:25] Coin Flipping Neural Networks
Orals 8:30-8:50
[8:30] Robust Training of Neural Networks using Scale Invariant Architectures
Spotlights 8:50-9:00
[8:50] More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
[8:55] Equivariant graph neural networks with complete local frames
(ends 9:00 AM)
Spotlights 7:30-8:15
[7:30] Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings
[7:35] Label-Free Explainability for Unsupervised Models
[7:40] Towards Theoretical Analysis of Transformation Complexity of ReLU DNNs
[7:45] A Study of Face Obfuscation in ImageNet
[7:50] Fair Representation Learning through Implicit Path Alignment
[7:55] Mitigating Neural Network Overconfidence with Logit Normalization
[8:00] Learning fair representation with a parametric integral probability metric
[8:05] Fairness with Adaptive Weights
[8:10] Fair Generalized Linear Models with a Convex Penalty
Orals 8:15-8:35
[8:15] Causal Conceptions of Fairness and their Consequences
Spotlights 8:35-9:00
[8:35] Understanding Instance-Level Impact of Fairness Constraints
[8:40] Achieving Fairness at No Utility Cost via Data Reweighing
[8:45] Mitigating Gender Bias in Face Recognition using the von Mises-Fisher Mixture Model
[8:50] Selective Regression under Fairness Criteria
[8:55] Input-agnostic Certified Group Fairness via Gaussian Parameter Smoothing
(ends 9:00 AM)
Spotlights 7:30-8:35
[7:30] Modeling Strong and Human-Like Gameplay with KL-Regularized Search
[7:35] Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters
[7:40] Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning
[7:45] Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search
[7:50] Generalized Data Distribution Iteration
[7:55] Optimizing Tensor Network Contraction Using Reinforcement Learning
[8:00] History Compression via Language Models in Reinforcement Learning
[8:05] Model-Value Inconsistency as a Signal for Epistemic Uncertainty
[8:10] LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation
[8:15] Efficient Learning for Alpha Zero via Path Consistency
[8:20] A data-driven approach for learning to control computers
[8:25] Zero-Shot Reward Specification via Grounded Natural Language
[8:30] How to Stay Curious while avoiding Noisy TVs using Aleatoric Uncertainty Estimation
Orals 8:35-8:55
[8:35] REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer
Spotlights 8:55-9:00
[8:55] Improving Policy Optimization with Generalist-Specialist Learning
(ends 9:00 AM)
Spotlights 7:30-7:55
[7:30] Centroid Approximation for Bootstrap: Improving Particle Quality at Inference
[7:35] Surrogate Likelihoods for Variational Annealed Importance Sampling
[7:40] Nonparametric Sparse Tensor Factorization with Hierarchical Gamma Processes
[7:45] Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows
[7:50] Variational Sparse Coding with Learned Thresholding
Orals 7:55-8:15
[7:55] BAMDT: Bayesian Additive Partial Multivariate Decision Trees for Nonparametric Regression
Spotlights 8:15-8:25
[8:15] Structured Stochastic Gradient MCMC
[8:20] Variational Inference with Locally Enhanced Bounds for Hierarchical Models
Orals 8:25-8:45
[8:25] Path-Gradient Estimators for Continuous Normalizing Flows
Spotlights 8:45-9:00
[8:45] Deep Reference Priors: What is the best way to pretrain a model?
[8:50] Variational Feature Pyramid Networks
[8:55] Reparametrisation Gradient and Convergent SGD for Non-Differentiable Models via Smoothing: A Programming Language Approach
(ends 9:00 AM)
Spotlights 7:30-7:55
[7:30] Weisfeiler-Lehman meets Gromov-Wasserstein
[7:35] GenLabel: Mixup Relabeling using Generative Models
[7:40] When and How Mixup Improves Calibration: A Theoretical View
[7:45] On Transportation of Mini-batches: A Hierarchical Approach
[7:50] VariGrow: Variational Architecture Growing for Task-Agnostic Continual Learning based on Bayesian Novelty
Orals 7:55-8:15
[7:55] Stable Conformal Prediction Sets
Spotlights 8:15-9:00
[8:15] A Model-Agnostic Randomized Learning Framework based on Random Hypothesis Subspace Sampling
[8:20] Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features
[8:25] Rethinking Fano’s Inequality in Ensemble Learning
[8:30] FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers
[8:35] Improving Mini-batch Optimal Transport via Partial Transportation
[8:40] Minimax rate of consistency for linear models with missing values
[8:45] Permutation Search of Tensor Network Structures via Local Sampling
[8:50] Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?
[8:55] DNNR: Differential Nearest Neighbors Regression
(ends 9:00 AM)
Orals 7:30-8:10
[7:30] Adapting to Mixing Time in Stochastic Optimization with Markovian Data
[7:50] Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Spotlights 8:10-9:00
[8:10] Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
[8:15] Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization
[8:20] Optimal Algorithms for Stochastic Multi-Level Compositional Optimization
[8:25] Finite-Sum Coupled Compositional Stochastic Optimization: Theories and Applications
[8:30] Fast Composite Optimization and Statistical Recovery in Federated Learning
[8:35] Statistical inference with implicit SGD: proximal Robbins-Monro vs. Polyak-Ruppert
[8:40] ProxSkip: A Simple and Provably Effective Communication-Acceleration Technique for Federated Learning
[8:45] Communication-Efficient Adaptive Federated Learning
[8:50] RECAPP: Crafting a More Efficient Catalyst for Convex Optimization
[8:55] Kill a Bird with Two Stones: Closing the Convergence Gaps in Non-Strongly Convex Optimization by Directly Accelerated SVRG with Double Compensation and Snapshots
(ends 9:00 AM)
Spotlights 7:30-8:15
[7:30] Learning Domain Adaptive Object Detection with Probabilistic Teacher
[7:35] Adaptive Data Analysis with Correlated Observations
[7:40] Efficient PAC Learning from the Crowd with Pairwise Comparisons
[7:45] On the Statistical Benefits of Curriculum Learning
[7:50] Feature and Parameter Selection in Stochastic Linear Bandits
[7:55] Disentangled Federated Learning for Tackling Attributes Skew via Invariant Aggregation and Diversity Transferring
[8:00] Streaming Algorithms for High-Dimensional Robust Statistics
[8:05] Contextual Bandits with Large Action Spaces: Made Practical
[8:10] Identifiability Conditions for Domain Adaptation
Orals 8:15-8:35
[8:15] A new similarity measure for covariate shift with applications to nonparametric regression
Spotlights 8:35-9:00
[8:35] Popular decision tree algorithms are provably noise tolerant
[8:40] Understanding and Improving Knowledge Graph Embedding for Entity Alignment
[8:45] Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning
[8:50] Robust Fine-tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
[8:55] Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond
(ends 9:00 AM)
Orals 7:30-8:10
[7:30] A Minimax Learning Approach to Off-Policy Evaluation in Partially Observable Markov Decision Processes
[7:50] Federated Reinforcement Learning: Communication-Efficient Algorithms and Convergence Analysis
Spotlights 8:10-8:20
[8:10] The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
[8:15] Extracting Latent State Representations with Linear Dynamics from Rich Observations
Orals 8:20-8:40
[8:20] Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits
Spotlights 8:40-8:55
[8:40] Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation
[8:45] Near-Optimal Learning of Extensive-Form Games with Imperfect Information
[8:50] Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
(ends 9:00 AM)
Spotlights 7:30-8:05
[7:30] Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification
[7:35] One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams
[7:40] Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration
[7:45] ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases
[7:50] Variational Mixtures of ODEs for Inferring Cellular Gene Expression Dynamics
[7:55] Bayesian Imitation Learning for End-to-End Mobile Manipulation
[8:00] De novo mass spectrometry peptide sequencing with a transformer model
Orals 8:05-8:25
[8:05] Learning inverse folding from millions of predicted structures
Spotlights 8:25-8:30
[8:25] Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Orals 8:30-8:50
[8:30] Contrastive Mixture of Posteriors for Counterfactual Inference, Data Integration and Fairness
Spotlights 8:50-9:00
[8:50] Proximal Exploration for Model-guided Protein Sequence Design
[8:55] Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval
(ends 9:00 AM)
9 a.m.
Lunch Break On Your Own
10:15 a.m.
Spotlights 10:15-11:15
[10:15] Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits
[10:20] On the Finite-Time Performance of the Knowledge Gradient Algorithm
[10:25] Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control
[10:30] Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
[10:35] No-Regret Learning in Time-Varying Zero-Sum Games
[10:40] Achieving Minimax Rates in Pool-Based Batch Active Learning
[10:45] Active Multi-Task Representation Learning
[10:50] Thompson Sampling for Robust Transfer in Multi-Task Bandits
[10:55] Metric-Fair Active Learning
[11:00] Metric-Fair Classifier Derandomization
[11:05] Interactively Learning Preference Constraints in Linear Bandits
[11:10] Convergence of Uncertainty Sampling for Active Learning
Orals 11:15-11:35
[11:15] Active fairness auditing
Spotlights 11:35-11:45
[11:35] Constants Matter: The Performance Gains of Active Learning
[11:40] Cross-Space Active Learning on Graph Convolutional Networks
(ends 11:45 AM)
Spotlights 10:15-11:15
[10:15] MemSR: Training Memory-efficient Lightweight Model for Image Super-Resolution
[10:20] PINs: Progressive Implicit Networks for Multi-Scale Neural Representations
[10:25] Translating Robot Skills: Learning Unsupervised Skill Correspondences Across Robots
[10:30] Causal Inference Principles for Reasoning about Commonsense Causality
[10:35] Generative Coarse-Graining of Molecular Conformations
[10:40] LIMO: Latent Inceptionism for Targeted Molecule Generation
[10:45] Learning to Separate Voices by Spatial Regions
[10:50] Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense
[10:55] 3D Infomax improves GNNs for Molecular Property Prediction
[11:00] Biological Sequence Design with GFlowNets
[11:05] Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets
[11:10] Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer
Orals 11:15-11:35
[11:15] 3DLinker: An E(3) Equivariant Variational Autoencoder for Molecular Linker Design
Spotlights 11:35-11:45
[11:35] Path-aware and structure-preserving generation of synthetically accessible molecules
[11:40] EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
(ends 11:45 AM)
Spotlights 10:15-10:20
[10:15] Generating Distributional Adversarial Examples to Evade Statistical Detectors
Orals 10:20-10:40
[10:20] A Dynamical System Perspective for Lipschitz Neural Networks
Spotlights 10:40-11:20
[10:40] Modeling Adversarial Noise for Adversarial Training
[10:45] Improving Adversarial Robustness via Mutual Information Estimation
[10:55] Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization
[11:00] Test-Time Training Can Close the Natural Distribution Shift Performance Gap in Deep Learning Based Compressed Sensing
[11:05] Improving Out-of-Distribution Robustness via Selective Augmentation
[11:10] Data Determines Distributional Robustness in Contrastive Language Image Pre-training
[11:15] Neurotoxin: Durable Backdoors in Federated Learning
Orals 11:20-11:40
[11:20] Correct-N-Contrast: a Contrastive Approach for Improving Robustness to Spurious Correlations
Spotlights 11:40-11:45
[11:40] Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense
(ends 11:45 AM)
Spotlights 10:15-11:40
[10:15] Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning
[10:20] Analysis of Stochastic Processes through Replay Buffers
[10:25] Cascaded Gaps: Towards Logarithmic Regret for Risk-Sensitive Reinforcement Learning
[10:30] Communicating via Maximum Entropy Reinforcement Learning
[10:35] PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
[10:40] DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning
[10:45] Denoised MDPs: Learning World Models Better Than the World Itself
[10:50] A Temporal-Difference Approach to Policy Gradient Estimation
[10:55] MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
[11:00] Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations
[11:05] Actor-Critic based Improper Reinforcement Learning
[11:10] On the Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs
[11:15] The Geometry of Robust Value Functions
[11:20] Delayed Reinforcement Learning by Imitation
[11:25] Reachability Constrained Reinforcement Learning
[11:30] Adaptive Model Design for Markov Decision Process
[11:35] Objective Robustness in Deep Reinforcement Learning
(ends 11:45 AM)
Spotlights 10:15-10:25
[10:15] Online Learning and Pricing with Reusable Resources: Linear Bandit with Sub-exponential Rewards
[10:20] A Resilient Distributed Boosting Algorithm
Orals 10:25-10:45
[10:25] Generative Trees: Adversarial and Copycat
Spotlights 10:45-11:10
[10:45] On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
[10:50] Congested Bandits: Optimal Routing via Short-term Resets
[10:55] Stochastic Rising Bandits
[11:00] PDE-Based Optimal Strategy for Unconstrained Online Learning
[11:05] Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension
Orals 11:10-11:30
[11:10] Agnostic Learnability of Halfspaces via Logistic Loss
Spotlights 11:30-11:45
[11:30] Provable Acceleration of Heavy Ball beyond Quadratics for a class of Polyak-Lojasiewicz Functions when the Non-Convexity is Averaged-Out
[11:35] On Learning Mixture of Linear Regressions in the Non-Realizable Setting
[11:40] Random Forest Density Estimation
(ends 11:45 AM)
Spotlights 10:15-10:20
[10:15] QSFL: A Two-Level Uplink Communication Optimization Framework for Federated Learning
Orals 10:20-11:00
[10:20] Tight and Robust Private Mean Estimation with Few Users
[10:40] Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data
Spotlights 11:00-11:45
[11:00] Sanity Simulations for Saliency Methods
[11:05] Out-of-distribution Detection with Deep Nearest Neighbors
[11:10] Differentially Private Maximal Information Coefficients
[11:15] Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
[11:20] On the Difficulty of Defending Self-Supervised Learning against Model Extraction
[11:25] Adversarial Attacks and Defenses for Non-Parametric Two-Sample Tests
[11:30] Certified Adversarial Robustness Under the Bounded Support Set
[11:35] Predicting Out-of-Distribution Error with the Projection Norm
[11:40] Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization
(ends 11:45 AM)
Spotlights 10:15-11:35
[10:15] Decomposing Temporal High-Order Interactions via Latent ODEs
[10:20] Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets
[10:25] DRIBO: Robust Deep Reinforcement Learning via Multi-View Information Bottleneck
[10:30] End-to-End Balancing for Causal Continuous Treatment-Effect Estimation
[10:35] Role-based Multiplex Network Embedding
[10:40] Measure Estimation in the Barycentric Coding Model
[10:45] COAT: Measuring Object Compositionality in Emergent Representations
[10:50] Counterfactual Transportability: A Formal Approach
[10:55] Estimation of Linear Non-Gaussian Latent Hierarchical Structure
[11:00] Wide Neural Networks Forget Less Catastrophically
[11:05] MAML and ANIL Provably Learn Representations
[11:10] NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning
[11:15] Action-Sufficient State Representation Learning for Control with Structural Constraints
[11:20] C-MinHash: Improving Minwise Hashing with Circulant Permutation
[11:25] Proximal denoiser for convergent plug-and-play optimization with nonconvex regularization
[11:30] Generalization and Robustness Implications in Object-Centric Learning
(ends 11:45 AM)
Orals 10:15-10:55
[10:15] Bayesian Continuous-Time Tucker Decomposition
[10:35] Function-space Inference with Sparse Implicit Processes
Spotlights 10:55-11:45
[10:55] Probabilistic Inverse Optimal Transport
[11:00] Easy Variational Inference for Categorical Models via an Independent Binary Approximation
[11:05] Streaming Inference for Infinite Feature Models
[11:10] Optimizing Sequential Experimental Design with Deep Reinforcement Learning
[11:15] Approximate Bayesian Computation with Domain Expert in the Loop
[11:20] Variational Inference for Infinitely Deep Neural Networks
[11:25] Personalized Federated Learning via Variational Bayesian Inference
[11:30] Sampling from wide Bayesian neural networks
[11:35] Bayesian Deep Embedding Topic Meta-Learner
[11:40] Efficient Approximate Inference for Stationary Kernel on Frequency domain
(ends 11:45 AM)
Spotlights 10:15-10:20
[10:15] From data to functa: Your data point is a function and you should treat it like one
Orals 10:20-10:40
[10:20] Generating 3D Molecules for Target Protein Binding
Spotlights 10:40-11:45
[10:40] Differentiable Top-k Classification Learning
[10:45] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks
[10:50] Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks
[10:55] Training Your Sparse Neural Network Better with Any Mask
[11:00] Federated Learning with Positive and Unlabeled Data
[11:05] DisPFL: Towards Communication-Efficient Personalized Federated learning via Decentralized Sparse Training
[11:10] Sparse Double Descent: Where Network Pruning Aggravates Overfitting
[11:15] Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs
[11:20] Revisiting Consistency Regularization for Deep Partial Label Learning
[11:25] Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification
[11:30] A Unified Weight Initialization Paradigm for Tensorial Convolutional Neural Networks
[11:35] PLATINUM: Semi-Supervised Model Agnostic Meta-Learning using Submodular Mutual Information
[11:40] Multicoated Supermasks Enhance Hidden Networks
(ends 11:45 AM)
Spotlights 10:15-10:55
[10:15] DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning
[10:20] Unsupervised Time-Series Representation Learning with Iterative Bilinear Temporal-Spectral Fusion
[10:25] RetrievalGuard: Provably Robust 1-Nearest Neighbor Image Retrieval
[10:30] Modeling Structure with Undirected Neural Networks
[10:35] Certified Neural Network Watermarks with Randomized Smoothing
[10:40] Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation
[10:45] Adversarial Vulnerability of Randomized Ensembles
[10:50] The CLRS Algorithmic Reasoning Benchmark
Orals 10:55-11:15
[10:55] Robustness Verification for Contrastive Learning
Spotlights 11:15-11:45
[11:15] Finding Global Homophily in Graph Neural Networks When Meeting Heterophily
[11:20] Understanding Robust Generalization in Learning Regular Languages
[11:25] Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification
[11:30] AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems
[11:35] A Modern Self-Referential Weight Matrix That Learns to Modify Itself
[11:40] Short-Term Plasticity Neurons Learning to Learn and Forget
(ends 11:45 AM)
11:30 a.m.
Spotlights 11:30-12:25
[11:30] Improved Regret for Differentially Private Exploration in Linear MDP
[11:35] Differentially Private Community Detection for Stochastic Block Models
[11:40] Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy
[11:45] Hermite Polynomial Features for Private Data Generation
[11:50] How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection
[11:55] Deduplicating Training Data Mitigates Privacy Risks
[12:00] Private frequency estimation via projective geometry
[12:05] Secure Quantized Training for Deep Learning
[12:10] Faster Privacy Accounting via Evolving Discretization
[12:15] The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning
[12:20] Private Adaptive Optimization with Side Information
Orals 12:25-12:45
[12:25] The Poisson Binomial Mechanism for Unbiased Federated Learning with Secure Aggregation
Spotlights 12:45-1:00
[12:45] Private optimization in the interpolation regime: faster rates and hardness results
[12:50] Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
[12:55] Private Streaming SCO in ℓ_p geometry with Applications in High Dimensional Online Decision Making
(ends 1:00 PM)
Spotlights 11:30-12:25
[11:30] p-Laplacian Based Graph Neural Networks
[11:35] Equivariant Quantum Graph Circuits
[11:40] A Theoretical Comparison of Graph Neural Network Extensions
[11:45] Variational On-the-Fly Personalization
[11:50] Deep symbolic regression for recurrence prediction
[11:55] GMC - Geometric Multimodal Contrastive Representation Learning
[12:00] Universality of Winning Tickets: A Renormalization Group Perspective
[12:05] A Differential Entropy Estimator for Training Neural Networks
[12:10] Loss Function Learning for Domain Generalization by Implicit Gradient
[12:15] GraphFM: Improving Large-Scale GNN Training via Feature Momentum
[12:20] Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling
Orals 12:25-12:45
[12:25] Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition
Spotlights 12:45-1:00
[12:45] Improving and Assessing Anomaly Detectors for Large-Scale Settings
[12:50] Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations
[12:55] SPECTRE: Spectral Conditioning Overcomes the Expressivity Limits of One-shot Graph Generators
(ends 1:00 PM)
Spotlights 11:30-12:15
[11:30] Gradient Descent on Neurons and its Link to Approximate Second-order Optimization
[11:35] A Tree-based Model Averaging Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources
[11:40] Efficient Online ML API Selection for Multi-Label Classification Tasks
[11:45] Entropic Causal Inference: Graph Identifiability
[11:50] Architecture Agnostic Federated Learning for Neural Networks
[11:55] Conformal Prediction Sets with Limited False Positives
[12:00] HyperImpute: Generalized Iterative Imputation with Automatic Model Selection
[12:05] Learning Pseudometric-based Action Representations for Offline Reinforcement Learning
[12:10] A Statistical Manifold Framework for Point Cloud Data
Orals 12:15-12:35
[12:15] LIDL: Local Intrinsic Dimension estimation using approximate Likelihood
Spotlights 12:35-12:55
[12:35] A Natural Actor-Critic Framework for Zero-Sum Markov Games
[12:40] Distributionally Robust Q-Learning
[12:45] Sparsity in Partially Controllable Linear Systems
[12:50] SAUTE RL: Toward Almost Surely Safe Reinforcement Learning Using State Augmentation
(ends 1:00 PM)
Spotlights 11:30-12:00
[11:30] NISPA: Neuro-Inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks
[11:35] Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
[11:40] Auxiliary Learning with Joint Task and Data Scheduling
[11:45] Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence
[11:50] Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers
[11:55] Generalizing Gaussian Smoothing for Random Search
Orals 12:00-12:20
[12:00] A General Recipe for Likelihood-free Bayesian Optimization
Spotlights 12:20-12:55
[12:20] Constrained Discrete Black-Box Optimization using Mixed-Integer Programming
[12:25] Risk-Averse No-Regret Learning in Online Convex Games
[12:30] Improve Single-Point Zeroth-Order Optimization Using High-Pass and Low-Pass Filters
[12:35] Robust Multi-Objective Bayesian Optimization Under Input Noise
[12:40] Gradient-Free Method for Heavily Constrained Nonconvex Optimization
[12:45] Sequential- and Parallel- Constrained Max-value Entropy Search via Information Lower Bound
[12:50] The power of first-order smooth optimization for black-box non-smooth problems
(ends 1:00 PM)
Spotlights 11:30-11:50
[11:30] SoQal: Selective Oracle Questioning for Consistency Based Active Learning of Cardiac Signals
[11:35] Matching Structure for Dual Learning
[11:40] BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
[11:45] YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Orals 11:50-12:10
[11:50] Re-evaluating Word Mover's Distance
Spotlights 12:10-1:00
[12:10] SDQ: Stochastic Differentiable Quantization with Mixed Precision
[12:15] IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
[12:20] Inducing Causal Structure for Interpretable Neural Networks
[12:25] Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
[12:30] Robust alignment of cross-session recordings of neural population activity by behaviour via unsupervised domain adaptation
[12:35] Symmetric Machine Theory of Mind
[12:40] UNITE: Uncertainty Adjusted Pruning for Large Transformer Models
[12:45] LCANets: Lateral Competition Improves Robustness Against Corruption and Attack
[12:50] Reconstructing nonlinear dynamical systems from multi-modal time series
[12:55] Neural language models are not born equal to fit brain data, but training helps
(ends 1:00 PM)
Spotlights 11:30-12:15
[11:30] The Infinite Contextual Graph Markov Model
[11:35] RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression
[11:40] Detached Error Feedback for Distributed SGD with Random Sparsification
[11:45] Training OOD Detectors in Their Natural Habitats
[11:50] Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks
[11:55] Neural Tangent Kernel Empowered Federated Learning
[12:00] Probabilistically Robust Learning: Balancing Average- and Worst-case Performance
[12:05] A Study on the Ramanujan Graph Property of Winning Lottery Tickets
[12:10] Feature Space Particle Inference for Neural Network Ensembles
Orals 12:15-12:35
[12:15] Adversarially trained neural representations are already as robust as biological neural representations
Spotlights 12:35-1:00
[12:35] PAC-Net: A Model Pruning Approach to Inductive Transfer Learning
[12:40] EDEN: Communication-Efficient and Robust Distributed Mean Estimation for Federated Learning
[12:45] Fisher SAM: Information Geometry and Sharpness Aware Minimisation
[12:50] Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry
[12:55] Towards Understanding Sharpness-Aware Minimization
(ends 1:00 PM)
Spotlights 11:30-12:05
[11:30] Greedy based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning
[11:35] Bayesian Nonparametrics for Offline Skill Discovery
[11:40] Convergence of policy gradient for entropy regularized MDPs with neural network approximation in the mean-field regime
[11:45] Curriculum Reinforcement Learning via Constrained Optimal Transport
[11:50] Recurrent Model-Free RL can be a Strong Baseline for Many POMDPs
[11:55] Stabilizing Q-learning with Linear Architectures for Provable Efficient Learning
[12:00] Constrained Offline Policy Optimization
Orals 12:05-12:25
[12:05] Causal Dynamics Learning for Task-Independent State Abstraction
Spotlights 12:25-12:40
[12:25] Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity
[12:30] Reinforcement Learning with Action-Free Pre-Training from Videos
[12:35] Towards Adaptive Model-Based Reinforcement Learning
Orals 12:40-1:00
[12:40] Planning with Diffusion for Flexible Behavior Synthesis
(ends 1:00 PM)
Spotlights 11:30-11:55
[11:30] Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization
[11:35] Implicit Bias of Linear Equivariant Networks
[11:40] The State of Sparse Training in Deep Reinforcement Learning
[11:45] Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
[11:50] Datamodels: Understanding Predictions with Data and Data with Predictions
Orals 11:55-12:15
[11:55] Not All Poisons are Created Equal: Robust Training against Data Poisoning
Spotlights 12:15-12:30
[12:15] Deep Causal Metric Learning
[12:20] Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization
[12:25] Learning Symmetric Embeddings for Equivariant World Models
Orals 12:30-12:50
[12:30] Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization
Spotlights 12:50-1:00
[12:50] Three-stage Evolution and Fast Equilibrium for SGD with Non-degerate Critical Points
[12:55] Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training
(ends 1:00 PM)
Spotlights 11:30-1:00
[11:30] The dynamics of representation learning in shallow, non-linear autoencoders
[11:35] Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
[11:40] Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing
[11:45] Failure and success of the spectral bias prediction for Kernel Ridge Regression: the case of low-dimensional data
[11:50] Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
[11:55] Universal Joint Approximation of Manifolds and Densities by Simple Injective Flows
[12:00] Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis
[12:05] Maslow's Hammer in Catastrophic Forgetting: Node Re-Use vs. Node Activation
[12:10] The Pathway Race Reduction: Dynamics of Abstraction in Gated Networks
[12:15] Efficient Learning of CNNs using Patch Based Features
[12:20] Neural Tangent Kernel Analysis of Deep Narrow Neural Networks
[12:25] Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
[12:30] Fully-Connected Network on Noncompact Symmetric Space and Ridgelet Transform based on Helgason-Fourier Analysis
[12:35] Non-Vacuous Generalisation Bounds for Shallow Neural Networks
[12:40] An initial alignment between neural network and target is needed for gradient descent to learn
[12:45] Inductive Biases and Variable Creation in Self-Attention Mechanisms
[12:50] Topology-aware Generalization of Decentralized SGD
[12:55] Understanding Gradient Descent on the Edge of Stability in Deep Learning
(ends 1:00 PM)
Spotlights 11:30-12:15
[11:30] A New Perspective on the Effects of Spectrum in Graph Neural Networks
[11:35] Molecular Graph Representation Learning via Heterogeneous Motif Graph Construction
[11:40] Partial Label Learning via Label Influence Function
[11:45] Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees
[11:50] Understanding Robust Overfitting of Adversarial Training and Beyond
[11:55] A random matrix analysis of online learning: coping with limited memory resources
[12:00] Dual Perspective of Label-Specific Feature Learning for Multi-Label Classification
[12:05] Supervised Learning with General Risk Functionals
[12:10] Locally Sparse Neural Networks for Tabular Biomedical Data
Orals 12:15-12:35
[12:15] Hierarchical Shrinkage: Improving the accuracy and interpretability of tree-based models.
Spotlights 12:35-12:55
[12:35] Detecting Corrupted Labels Without Training a Model to Predict
[12:40] Prototype-anchored Learning for Learning with Imperfect Annotations
[12:45] Learning to Predict Graphs with Fused Gromov-Wasserstein Barycenters
[12:50] Deep Safe Incomplete Multi-view Clustering: Theorem and Algorithm
(ends 1:00 PM)
11:45 a.m.
Coffee Break
12:15 p.m.
Invited Talk:
Guido Imbens
(ends 1:15 PM)
1:15 p.m.
Coffee Break
3:30 p.m.