Automatic Discovery of the Statistical Types of Variables in a Dataset
Isabel Valera · Zoubin Ghahramani

Sun Aug 06 06:24 PM -- 06:42 PM (PDT) @ C4.9 & C4.10

A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known. However, as the availability of real-world data increases, this assumption becomes too restrictive. Data are often heterogeneous, complex, and improperly or incompletely documented. Surprisingly, despite the practical importance of the problem, there is still a lack of tools to automatically discover the statistical types of, as well as appropriate likelihood (noise) models for, the variables in a dataset. In this paper, we fill this gap by proposing a Bayesian method that accurately discovers the statistical data types in both synthetic and real data.

Author Information

Isabel Valera (University of Cambridge)

Isabel Valera is a Minerva research group leader at the Max Planck Institute for Intelligent Systems. Isabel develops flexible and efficient probabilistic models and inference algorithms to fit and analyze real-world data. She is particularly interested in problems related to the unstructured and complex nature of real-world data, which are often time-dependent, heterogeneous, and noisy, and which may contain errors and missing values. Isabel obtained her PhD in 2014 and her MSc degree in 2012, both from the University Carlos III in Madrid, Spain. She has held a German Humboldt Post-Doctoral Fellowship, and she was recently awarded a Minerva fast track research group by the Max Planck Society. You can find out more about her at https://ivaleram.github.io/.

Zoubin Ghahramani (University of Cambridge & Uber)

Zoubin Ghahramani is a Professor at the University of Cambridge, and Chief Scientist at Uber. He is also Deputy Director of the Leverhulme Centre for the Future of Intelligence, was a founding Director of the Alan Turing Institute and co-founder of Geometric Intelligence (now Uber AI Labs). His research focuses on probabilistic approaches to machine learning and AI. In 2015 he was elected a Fellow of the Royal Society.

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors