

Poster in Workshop: DMLR Workshop: Data-centric Machine Learning Research

Ensemble Fractional Imputation for Incomplete Categorical Data with a Graphical Model

Yonghyun Kwon · Jae-kwang Kim


Abstract:

Missing data is common in practice, and standard statistical inference can be biased when missingness is related to the outcome of interest. We present a frequentist approach, based on a graphical model and fractional imputation, that handles missing data in multivariate categorical variables under the missing-at-random (MAR) assumption. To avoid the curse of dimensionality in multivariate data, we adopt the idea behind random forests: we fit multiple reduced models and then combine them using model weights. The model weights are computed by a novel method, double projection, in which the observed likelihood is projected onto the class of graphical mixture models. The performance of the proposed method is investigated through an extensive simulation study.
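
To make the idea of fractional imputation for categorical data concrete, the following is a minimal sketch, not the authors' method. It assumes a fully saturated joint model over a handful of categorical variables (rather than a graphical model), fits it with a plain EM loop, and omits the ensemble of reduced models and the double projection weights entirely. All names (`compatible_cells`, `fractional_weights`, `fit_joint`) and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_cells(rec, cells):
    """Return the joint cells that agree with the observed entries of rec.

    Missing entries are encoded as None and match any category.
    """
    return [c for c in cells
            if all(r is None or r == v for r, v in zip(rec, c))]


def fractional_weights(rec, probs, cells):
    """Fractional weights: P(cell | observed part of rec) under probs."""
    compat = compatible_cells(rec, cells)
    total = sum(probs[c] for c in compat)
    if total == 0:
        # No probability mass on any compatible cell: fall back to uniform.
        return {c: 1.0 / len(compat) for c in compat}
    return {c: probs[c] / total for c in compat}


def fit_joint(records, levels, n_iter=50):
    """EM for a saturated joint distribution using fractional weights.

    records: list of tuples, None marks a missing value (assumed MAR).
    levels:  list of category lists, one per variable.
    """
    cells = list(product(*levels))
    probs = {c: 1.0 / len(cells) for c in cells}  # uniform start
    for _ in range(n_iter):
        counts = defaultdict(float)
        for rec in records:
            # E-step: spread each record over its compatible cells.
            for c, w in fractional_weights(rec, probs, cells).items():
                counts[c] += w
        # M-step: weighted cell proportions.
        probs = {c: counts[c] / len(records) for c in cells}
    return probs, cells


# Toy data: two binary variables; the last record is missing the second one.
records = [(0, 0), (0, 0), (0, 1), (1, 1), (0, None)]
probs, cells = fit_joint(records, levels=[[0, 1], [0, 1]])
weights = fractional_weights((0, None), probs, cells)
print(weights)
```

In this toy example the incomplete record `(0, None)` is not filled in with a single value. It is split into the two completions `(0, 0)` and `(0, 1)`, whose fractional weights approach 2/3 and 1/3, the conditional proportions among the complete records with first variable equal to 0. A saturated model like this one needs a parameter for every joint cell, which is the dimensionality problem the abstract's reduced models are meant to avoid.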
