Workshop: Information-Theoretic Methods for Rigorous, Responsible, and Reliable Machine Learning (ITR3)

Sliced Mutual Information: A Scalable Measure of Statistical Dependence

Ziv Goldfeld · Kristjan Greenewald


Mutual information (MI) is a fundamental measure of statistical dependence, with myriad applications in information theory, statistics, and machine learning. While it possesses many desirable structural properties, the estimation of high-dimensional MI from samples suffers from the curse of dimensionality. Motivated by scalability to high dimensions, this paper proposes sliced MI (SMI) as a surrogate measure of dependence. SMI is defined as an average of MI terms between one-dimensional random projections. We show that it preserves many of the structural properties of classic MI, while gaining scalable computation and estimation from samples. Furthermore, and in contrast to classic MI, SMI can be created by computation, that is, processing the variables can increase it. This violation of the data processing inequality is consistent with modern feature extraction techniques that often result in post-processed representations that are more useful for inference than the raw data. Our theory is supported by a numerical study of independence testing and feature extraction, which demonstrates the potential gains SMI offers over classic MI for high-dimensional inference.
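The definition of SMI as an average of MI terms between one-dimensional random projections can be illustrated with a small Monte Carlo sketch. This is not the authors' implementation: the function names (`random_unit_vectors`, `histogram_mi`, `sliced_mi`), the number of slices, and the histogram plug-in estimator used for each one-dimensional MI term are all assumptions chosen for brevity, and the toy Gaussian data is made up for illustration.

```python
import numpy as np


def random_unit_vectors(rng, n_vectors, dim):
    """Draw directions uniformly on the unit sphere by normalizing Gaussians."""
    v = rng.standard_normal((n_vectors, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)


def histogram_mi(a, b, bins=16):
    """Plug-in MI estimate (in nats) between two scalar samples.

    Assumption: a simple 2D histogram estimator; any 1D MI estimator
    could be substituted here.
    """
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    mask = p_ab > 0                         # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))


def sliced_mi(x, y, n_slices=200, bins=16, seed=0):
    """Average the 1D MI estimates over pairs of random projection directions."""
    rng = np.random.default_rng(seed)
    thetas = random_unit_vectors(rng, n_slices, x.shape[1])
    phis = random_unit_vectors(rng, n_slices, y.shape[1])
    values = [histogram_mi(x @ t, y @ p, bins) for t, p in zip(thetas, phis)]
    return float(np.mean(values))


# Toy data (hypothetical): y_dep is a noisy copy of x, y_ind is independent of x.
rng = np.random.default_rng(1)
x = rng.standard_normal((5000, 10))
y_dep = x + 0.5 * rng.standard_normal((5000, 10))
y_ind = rng.standard_normal((5000, 10))

smi_dep = sliced_mi(x, y_dep)
smi_ind = sliced_mi(x, y_ind)
print(f"dependent pair: {smi_dep:.4f} nats, independent pair: {smi_ind:.4f} nats")
```

Each term in the average only requires a one-dimensional MI estimate, which is the source of the scalability the abstract refers to. The histogram estimator has a positive bias, so the value for the independent pair is small but not exactly zero; the comparison between the two values is what matters in this sketch.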
