Workshop: Subset Selection in Machine Learning: From Theory to Applications
Using Machine Learning to Recognise Statistical Dependence
Recognising statistical dependence of variables is a fundamentally prominent task in modelling data relatedness, having impact on multiple data disciplines. However, it is a peculiarly challenging task often requiring manual scrutiny of data in order to uncover the correct relatedness function. Inspired by this human ability, we cast the named task as machine learning problem. Solving such problem results in models encoding abstract meaning of statistical dependence. In this paper, we report overwhelmingly and significantly superior performance to known models, especially when few data points are available and/or high noise is present. Such novel view of the named task is transformative for numerous data disciplines such as unsupervised learning, feature selection and causal inference.