The empirical success of state-of-the-art machine learning (ML) techniques has outpaced their theoretical understanding. Deep learning models, for example, perform far better than classical statistical learning theory predicts, and this success has driven their widespread adoption in industry and government. At the same time, the deployment of ML systems that are not fully understood often leads to unexpected and detrimental individual-level impacts. Finally, the large-scale adoption of ML means that ML systems are now critical infrastructure on which millions rely. In the face of these challenges, there is a critical need for theory that provides rigorous performance guarantees for practical ML models; guides the responsible deployment of ML in applications of social consequence; and enables the design of reliable ML systems in large-scale, distributed environments.
For decades, information theory has provided a mathematical foundation for the systems and algorithms that fuel the current data science revolution. Recent advances in privacy, fairness, and generalization bounds demonstrate that information theory will also play a pivotal role in the next decade of ML applications: information-theoretic methods can sharpen generalization bounds for deep learning, provide rigorous guarantees for compression of neural networks, promote fairness and privacy in ML training and deployment, and shed light on the limits of learning from noisy data.
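To illustrate the first of these points, a canonical result in this line of work (Xu and Raginsky, 2017) bounds the expected generalization gap of a learning algorithm by the mutual information between its output hypothesis $W$ and the training sample $S$ of $n$ i.i.d. points: if the loss is $\sigma$-sub-Gaussian under the data distribution, then
$$\left|\mathbb{E}\bigl[\mathrm{gen}(W,S)\bigr]\right| \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(W;S)},$$
so algorithms whose outputs reveal little information about their training data are guaranteed to generalize. This bound is stated here only as an illustrative example of the information-theoretic toolkit; it is not part of the workshop program itself.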
We propose a workshop that brings together researchers and practitioners in ML and information theory to encourage knowledge transfer and collaboration between the sister fields. For information theorists, the workshop will highlight novel and socially critical research directions that promote the reliable, responsible, and rigorous development of ML. Moreover, the workshop will expose ICML attendees to emerging information-theoretic tools that may play a critical role in the next decade of ML applications.
Opening Remarks (Intro & Welcome)
Virtual Poster Session #1 (Poster Session)
Invited Talk: Maxim Raginsky (Invited Talk)
Q&A: Maxim Raginsky (Q&A)
Invited Talk: Alex Dimakis (Invited Talk)
Q&A: Alex Dimakis (Q&A)
Small Break (Break)
Invited Talk: Kamalika Chaudhuri (Invited Talk)
Q&A: Kamalika Chaudhuri (Q&A)
Invited Talk: Todd Coleman (Invited Talk)
Q&A: Todd Coleman (Q&A)
Contributed Talk #1 (Contributed Talk)
Contributed Talk #2 (Contributed Talk)
Big Break (Break)
Panel Discussion (Panel)
Invited Talk: Kush Varshney (Invited Talk)
Q&A: Kush Varshney (Q&A)
Invited Talk: Thomas Steinke (Invited Talk)
Q&A: Thomas Steinke (Q&A)
Virtual Poster Session #2 (Poster Session)
Invited Talk: Lalitha Sankar (Invited Talk)
Q&A: Lalitha Sankar (Q&A)
Contributed Talk #3 (Contributed Talk)
Contributed Talk #4 (Contributed Talk)
Invited Talk: David Tse (Invited Talk)
Q&A: David Tse (Q&A)
Concluding Remarks
Social Hour
A unified PAC-Bayesian framework for machine unlearning via information risk minimization (Poster)
Active privacy-utility trade-off against a hypothesis testing adversary (Poster)
When Optimizing f-divergence is Robust with Label Noise (Poster)
Single-Shot Compression for Hypothesis Testing (Poster)
Characterizing the Generalization Error of Gibbs Algorithm with Symmetrized KL information (Poster)
Prediction-focused Mixture Models (Poster)
Tighter Expected Generalization Error Bounds via Wasserstein Distance (Poster)
Data-Dependent PAC-Bayesian Bounds in the Random-Subset Setting with Applications to Neural Networks (Poster)
Active Sampling for Binary Gaussian Model Testing in High Dimensions (Poster)
Unsupervised Information Obfuscation for Split Inference of Neural Networks (Poster)
Entropic Causal Inference: Identifiability for Trees and Complete Graphs (Poster)
Towards a Unified Information-Theoretic Framework for Generalization (Poster)
Excess Risk Analysis of Learning Problems via Entropy Continuity (Poster)
Out-of-Distribution Robustness in Deep Learning Compression (Poster)
Coded Privacy-Preserving Computation at Edge Networks (Poster)
α-VAEs: Optimising variational inference by learning data-dependent divergence skew (Poster)
Learning under Distribution Mismatch and Model Misspecification (Poster)
Sub-population Guarantees for Importance Weights and KL-Divergence Estimation (Poster)
Soft BIBD and Product Gradient Codes: Coding Theoretic Constructions to Mitigate Stragglers in Distributed Learning (Poster)
Within-layer Diversity Reduces Generalization Gap (Poster)
Sliced Mutual Information: A Scalable Measure of Statistical Dependence (Poster)
Minimax Bounds for Generalized Pairwise Comparisons (Poster)
Information-Guided Sampling for Low-Rank Matrix Completion (Poster)
True Few-Shot Learning with Language Models (Poster)
Realizing GANs via a Tunable Loss Function (Poster)
Neural Network-based Estimation of the MMSE (Poster)