Transformed Distribution Matching for Missing Value Imputation
He Zhao · Ke Sun · Amir Dezfouli · Edwin V. Bonilla

Wed Jul 26 05:00 PM -- 06:30 PM (PDT) @ Exhibit Hall 1 #728
Event URL: https://github.com/hezgit/TDM

We study the problem of imputing missing values in a dataset, which has important applications in many domains. The key to missing value imputation is to capture the data distribution with incomplete samples and impute the missing values accordingly. In this paper, by leveraging the fact that any two batches of data with missing values come from the same data distribution, we propose to impute the missing values of two batches of samples by transforming them into a latent space through deep invertible functions and matching them distributionally. To learn the transformations and impute the missing values simultaneously, a simple and well-motivated algorithm is proposed. Our algorithm has fewer hyperparameters to fine-tune and generates high-quality imputations regardless of how missing values are generated. Extensive experiments over a large number of datasets and competing benchmark algorithms show that our method achieves state-of-the-art performance.
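The idea in the abstract, treating missing entries as free variables and adjusting them until two batches look alike after an invertible transformation, can be illustrated with a toy sketch. This is not the authors' implementation. It makes several simplifying assumptions: the invertible map is a fixed elementwise `asinh` rather than a learned deep invertible function, the distributional discrepancy is a Gaussian-kernel MMD chosen purely for illustration, and gradients are taken by finite differences. All function names and parameters (`transform`, `mmd2`, `batch_loss`, `impute`, `step`, `eps`) are hypothetical.

```python
import math
import random


def transform(row):
    # Stand-in invertible map (elementwise asinh). ASSUMPTION: the paper
    # learns deep invertible functions; a fixed map is used here for brevity.
    return [math.asinh(v) for v in row]


def mmd2(za, zb, bandwidth=1.0):
    # Biased squared MMD with a Gaussian kernel between two sets of vectors.
    # ASSUMPTION: chosen for illustration, not taken from the paper.
    def k(u, v):
        d2 = sum((p - q) ** 2 for p, q in zip(u, v))
        return math.exp(-d2 / (2.0 * bandwidth ** 2))

    def mean_k(xs, ys):
        return sum(k(u, v) for u in xs for v in ys) / (len(xs) * len(ys))

    return mean_k(za, za) + mean_k(zb, zb) - 2.0 * mean_k(za, zb)


def batch_loss(batch_a, batch_b):
    # Discrepancy between the two batches measured in the transformed space.
    return mmd2([transform(r) for r in batch_a],
                [transform(r) for r in batch_b])


def impute(data, n_steps=50, step=0.5, eps=1e-4, seed=0):
    # `data` is a list of rows; missing entries are None. Every column is
    # assumed to have at least one observed value.
    rng = random.Random(seed)
    n_cols = len(data[0])

    # Initialise missing entries with the column mean of observed values.
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in data if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [[col_means[j] if v is None else v for j, v in enumerate(r)]
              for r in data]
    missing = [(i, j) for i, r in enumerate(data)
               for j, v in enumerate(r) if v is None]
    half = len(data) // 2

    for _ in range(n_steps):
        # Split the rows into two random batches.
        idx = list(range(len(data)))
        rng.shuffle(idx)
        a_idx, b_idx = idx[:half], idx[half:]

        def current_loss():
            return batch_loss([filled[i] for i in a_idx],
                              [filled[i] for i in b_idx])

        # Forward-difference gradient with respect to each missing entry.
        base = current_loss()
        grads = {}
        for (i, j) in missing:
            old = filled[i][j]
            filled[i][j] = old + eps
            grads[(i, j)] = (current_loss() - base) / eps
            filled[i][j] = old

        # Gradient step on the missing entries only; observed values stay fixed.
        for (i, j), g in grads.items():
            filled[i][j] -= step * g
    return filled


if __name__ == "__main__":
    toy = [[1.0, 2.0], [None, 2.5], [1.5, None],
           [0.5, 1.0], [2.0, 3.0], [1.0, None]]
    for row in impute(toy):
        print(row)
```

Only the missing entries are updated, so observed values pass through unchanged. With a fixed transformation the sketch reduces to matching batches in a warped feature space; learning the transformation jointly with the imputations, as the abstract describes, is the part this example leaves out.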

Author Information

He Zhao (CSIRO's Data61)
Ke Sun (Data61)
Amir Dezfouli (CSIRO's Data61)
Edwin V. Bonilla (CSIRO's Data61)

I am a Science Leader for Foundational Machine Learning of the Analytics and Decision Sciences Research program at CSIRO’s Data61, Australia. My expertise is in probabilistic modelling and inference algorithms for the analysis of complex data, in areas such as scalable Bayesian inference, Gaussian processes and multi-task learning. I have worked in applications such as geophysical inversions, spatio-temporal modelling, computer vision and document analysis. My current interests include Gaussian processes, Bayesian optimization, optimal design of experiments, neural differential equations and graph neural networks.
