

Poster in Workshop: The Second Workshop on Spurious Correlations, Invariance and Stability

Feature Selection in the Presence of Monotone Batch Effects

Peng Dai · Sina Baharlouei · Meisam Razaviyayn · Sze-Chuan Suen


Abstract: We study the problem of feature selection in the presence of monotone batch effects, where merging datasets collected with disparate technologies and in different environments affects the underlying causal dependence among data features. We propose two novel algorithms for this task: 1) joint feature selection and batch effect correction by transforming the data batches using deep neural networks; 2) transforming the data using a batch-invariant characteristic (i.e., feature rank) to merge datasets. We assess the performance of these feature selection methods in the presence of a monotone batch effect using the $F_1$ score. Our experiments on synthetic data show that the former method, combined with Lasso, improves the $F_1$ score significantly, even with few samples per dataset. This method outperforms popular batch effect removal algorithms, including Combat-Seq, Limma, and PCA. By comparison, while the ranking method is computationally more efficient, its performance is worse due to the information loss resulting from ignoring the magnitudes of the data.
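A minimal sketch of the rank-based variant described in the abstract (not the authors' code): each batch is replaced by its within-batch feature ranks, which are invariant to any monotone batch effect, the batches are then merged, Lasso selects features, and selection quality is scored by $F_1$ against the true support. The batch sizes, the monotone distortions, and the Lasso penalty below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.linear_model import Lasso
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_per_batch, d, k = 50, 30, 5           # samples per batch, features, true support size
beta = np.zeros(d)
beta[:k] = 1.0                          # ground-truth sparse coefficients

batches_X, batches_y = [], []
for distort in (np.exp, np.cbrt, lambda z: z ** 3):   # hypothetical monotone batch effects
    X = rng.normal(size=(n_per_batch, d))
    y = X @ beta + 0.1 * rng.normal(size=n_per_batch)
    batches_X.append(distort(X))        # each batch is observed through its own distortion
    batches_y.append(y)

# Within-batch ranks are unchanged by any monotone distortion, so the merged
# design matrix is free of the batch effect (up to a common rescaling).
X_rank = np.vstack([rankdata(Xb, axis=0) / len(Xb) for Xb in batches_X])
y_all = np.concatenate(batches_y)

lasso = Lasso(alpha=0.05).fit(X_rank, y_all)
selected = np.abs(lasso.coef_) > 1e-6
print("F1 of selected support:", f1_score(beta != 0, selected))
```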
