A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known. However, as the availability of real-world data increases, this assumption becomes too restrictive. Data are often heterogeneous, complex, and improperly or incompletely documented. Surprisingly, despite their practical importance, there is still a lack of tools to automatically discover the statistical types of the variables in a dataset, as well as appropriate likelihood (noise) models for them. In this paper, we fill this gap by proposing a Bayesian method that accurately discovers the statistical data types in both synthetic and real data.
Author Information
Isabel Valera (University of Cambridge)
Isabel Valera is a Minerva research group leader at the Max Planck Institute for Intelligent Systems. Isabel develops flexible and efficient probabilistic models and inference algorithms to fit and analyze real-world data. She is particularly interested in problems related to the unstructured and complex nature of real-world data, which are often time-dependent, heterogeneous, and noisy, and may contain errors and missing values. Isabel obtained her PhD in 2014 and her MSc degree in 2012, both from the University Carlos III in Madrid, Spain. She has held a German Humboldt postdoctoral fellowship and was recently awarded a Minerva Fast Track research group by the Max Planck Society. You can find out more about her at https://ivaleram.github.io/.
Zoubin Ghahramani (University of Cambridge & Uber)
Zoubin Ghahramani is a Professor at the University of Cambridge, and Chief Scientist at Uber. He is also Deputy Director of the Leverhulme Centre for the Future of Intelligence, was a founding Director of the Alan Turing Institute and co-founder of Geometric Intelligence (now Uber AI Labs). His research focuses on probabilistic approaches to machine learning and AI. In 2015 he was elected a Fellow of the Royal Society.
Related Events (a corresponding poster, oral, or spotlight)
-
2017 Talk: Automatic Discovery of the Statistical Types of Variables in a Dataset »
Mon. Aug 7th 01:24 -- 01:42 AM Room C4.9 & C4.10
More from the Same Authors
-
2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · E. Kelly Buchanan · Kevin Murphy · Mark Collier · Mike Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani -
2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · Estefany Kelly Buchanan · Kevin Murphy · Mark Collier · Michael Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani -
2023 Poster: Neural Diffusion Processes »
Vincent Dutordoir · Alan Saul · Zoubin Ghahramani · Fergus Simpson -
2020 Poster: Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits »
Robert Peharz · Steven Lang · Antonio Vergari · Karl Stelzner · Alejandro Molina · Martin Trapp · Guy Van den Broeck · Kristian Kersting · Zoubin Ghahramani -
2018 Poster: Variational Bayesian dropout: pitfalls and fixes »
Jiri Hron · Alexander Matthews · Zoubin Ghahramani -
2018 Poster: The Mirage of Action-Dependent Baselines in Reinforcement Learning »
George Tucker · Surya Bhupatiraju · Shixiang Gu · Richard E Turner · Zoubin Ghahramani · Sergey Levine -
2018 Oral: Variational Bayesian dropout: pitfalls and fixes »
Jiri Hron · Alexander Matthews · Zoubin Ghahramani -
2018 Oral: The Mirage of Action-Dependent Baselines in Reinforcement Learning »
George Tucker · Surya Bhupatiraju · Shixiang Gu · Richard E Turner · Zoubin Ghahramani · Sergey Levine -
2018 Poster: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models »
Tameem Adel · Zoubin Ghahramani · Adrian Weller -
2018 Oral: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models »
Tameem Adel · Zoubin Ghahramani · Adrian Weller -
2017 Poster: Magnetic Hamiltonian Monte Carlo »
Nilesh Tripuraneni · Mark Rowland · Zoubin Ghahramani · Richard E Turner -
2017 Talk: Magnetic Hamiltonian Monte Carlo »
Nilesh Tripuraneni · Mark Rowland · Zoubin Ghahramani · Richard E Turner -
2017 Poster: Lost Relatives of the Gumbel Trick »
Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller -
2017 Poster: Bayesian inference on random simple graphs with power law degree distributions »
Juho Lee · Creighton Heaukulani · Zoubin Ghahramani · Lancelot F. James · Seungjin Choi -
2017 Talk: Lost Relatives of the Gumbel Trick »
Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller -
2017 Talk: Bayesian inference on random simple graphs with power law degree distributions »
Juho Lee · Creighton Heaukulani · Zoubin Ghahramani · Lancelot F. James · Seungjin Choi -
2017 Poster: A Birth-Death Process for Feature Allocation »
Konstantina Palla · David Knowles · Zoubin Ghahramani -
2017 Poster: Deep Bayesian Active Learning with Image Data »
Yarin Gal · Riashat Islam · Zoubin Ghahramani -
2017 Talk: A Birth-Death Process for Feature Allocation »
Konstantina Palla · David Knowles · Zoubin Ghahramani -
2017 Talk: Deep Bayesian Active Learning with Image Data »
Yarin Gal · Riashat Islam · Zoubin Ghahramani