Workshop: Subset Selection in Machine Learning: From Theory to Applications

Using Machine Learning to Recognise Statistical Dependence

Ubai Sandouk


Recognising statistical dependence between variables is a fundamental task in modelling data relatedness, with impact across multiple data disciplines. However, it is a particularly challenging task, often requiring manual scrutiny of the data to uncover the correct relatedness function. Inspired by this human ability, we cast the task as a machine learning problem. Solving this problem yields models that encode an abstract notion of statistical dependence. In this paper, we report significantly superior performance to known models, especially when few data points are available and/or noise is high. Such a novel view of the task is transformative for numerous data disciplines, such as unsupervised learning, feature selection and causal inference.