Poster
Supervised Quantile Normalization for Low Rank Matrix Factorization
Marco Cuturi · Olivier Teboul · Jonathan Niles-Weed · Jean-Philippe Vert
Thu Jul 16 12:00 PM -- 12:45 PM & Fri Jul 17 01:00 AM -- 01:45 AM (PDT)
Low rank matrix factorization is a fundamental building block in machine learning, used for instance to summarize gene expression profile data or word-document counts. To be robust to outliers and differences in scale across features, a matrix factorization step is usually preceded by ad-hoc feature normalization steps, such as tf-idf scaling or data whitening. We propose in this work to learn these normalization operators jointly with the factorization itself. More precisely, given a $d\times n$ matrix $X$ of $d$ features measured on $n$ individuals, we propose to learn the parameters of quantile normalization operators that can operate row-wise on the values of $X$ and/or of its factorization $UV$ to improve the quality of the low-rank representation of $X$ itself. This optimization is facilitated by the introduction of differentiable quantile normalization operators derived using regularized optimal transport algorithms.
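To make the idea of a row-wise quantile normalization operator concrete, the following is a minimal NumPy sketch and not the authors' implementation. It contrasts a classical (hard) quantile normalization with a smooth relaxation built from entropy-regularized optimal transport solved by Sinkhorn iterations. The function names, the squared-distance cost, the uniform anchor grid `y`, and the values of `eps` and `n_iter` are all assumptions chosen for illustration; in the abstract the target quantiles are learned, whereas here `q` is simply passed in as a fixed vector.

```python
import numpy as np


def hard_quantile_normalize(x, q):
    """Replace the k-th smallest entry of x by the k-th smallest target value in q.

    Assumes x and q have the same length.
    """
    x = np.asarray(x, dtype=float)
    q = np.sort(np.asarray(q, dtype=float))
    ranks = np.argsort(np.argsort(x))  # 0-based rank of each entry of x
    return q[ranks]


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def soft_quantile_normalize(x, q, eps=0.05, n_iter=200):
    """Smooth relaxation of quantile normalization (illustrative sketch).

    An entropic transport plan P is computed between the entries of x and an
    increasing grid of anchors y, one anchor per target quantile. Each entry
    x[i] is then mapped to the P-weighted average of the target values q.
    """
    x = np.asarray(x, dtype=float)
    q = np.sort(np.asarray(q, dtype=float))
    n, m = x.size, q.size

    # Rescale x to [0, 1] so that eps has a comparable meaning across rows.
    span = x.max() - x.min()
    z = (x - x.min()) / (span if span > 0 else 1.0)
    y = np.linspace(0.0, 1.0, m)          # increasing anchors (assumption)
    C = (z[:, None] - y[None, :]) ** 2    # squared-distance cost (assumption)

    log_a = np.full(n, -np.log(n))        # uniform weights on the entries of x
    log_b = np.full(m, -np.log(m))        # uniform weights on the anchors
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iter):               # log-domain Sinkhorn updates
        f = eps * (log_a - _logsumexp((g[None, :] - C) / eps, axis=1))
        g = eps * (log_b - _logsumexp((f[:, None] - C) / eps, axis=0))

    P = np.exp((f[:, None] + g[None, :] - C) / eps)
    # Dividing by the row sums makes each output a convex combination of q.
    return (P @ q) / P.sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.lognormal(size=(3, 8))        # 3 features measured on 8 individuals
    q = np.linspace(-1.0, 1.0, 8)         # fixed target quantiles
    X_soft = np.stack([soft_quantile_normalize(row, q) for row in X])
    print(X_soft.round(3))
```

Because every step of the soft version is a composition of smooth operations, its output varies smoothly with both `x` and `q`, which is the property that would let an automatic differentiation framework propagate gradients to the target quantiles. As `eps` decreases, the soft output is expected to move closer to the hard one, at the price of slower Sinkhorn convergence.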
Author Information
Marco Cuturi (Google)
Olivier Teboul (Google Brain)
Jonathan Niles-Weed (NYU)
Jean-Philippe Vert (Google)
More from the Same Authors
- 2023 Poster: Perturbation Analysis of Neural Collapse
  Tom Tirer · Haoxiang Huang · Jonathan Niles-Weed
- 2023 Poster: Minimax estimation of discontinuous optimal transport maps: The semi-discrete case
  Aram-Alexandre Pooladian · Vincent Divol · Jonathan Niles-Weed
- 2022 Poster: Debiaser Beware: Pitfalls of Centering Regularized Transport Maps
  Aram-Alexandre Pooladian · Marco Cuturi · Jonathan Niles-Weed
- 2022 Spotlight: Debiaser Beware: Pitfalls of Centering Regularized Transport Maps
  Aram-Alexandre Pooladian · Marco Cuturi · Jonathan Niles-Weed
- 2022 Poster: Deep Probability Estimation
  Sheng Liu · Aakash Kaku · Weicheng Zhu · Matan Leibovich · Sreyas Mohan · Boyang Yu · Haoxiang Huang · Laure Zanna · Narges Razavian · Jonathan Niles-Weed · Carlos Fernandez-Granda
- 2022 Poster: Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs
  Meyer Scetbon · Gabriel Peyré · Marco Cuturi
- 2022 Spotlight: Deep Probability Estimation
  Sheng Liu · Aakash Kaku · Weicheng Zhu · Matan Leibovich · Sreyas Mohan · Boyang Yu · Haoxiang Huang · Laure Zanna · Narges Razavian · Jonathan Niles-Weed · Carlos Fernandez-Granda
- 2022 Spotlight: Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs
  Meyer Scetbon · Gabriel Peyré · Marco Cuturi
- 2021 Poster: Low-Rank Sinkhorn Factorization
  Meyer Scetbon · Marco Cuturi · Gabriel Peyré
- 2021 Spotlight: Low-Rank Sinkhorn Factorization
  Meyer Scetbon · Marco Cuturi · Gabriel Peyré
- 2020 Poster: Regularized Optimal Transport is Ground Cost Adversarial
  François-Pierre Paty · Marco Cuturi
- 2020 Poster: Missing Data Imputation using Optimal Transport
  Boris Muzellec · Julie Josse · Claire Boyer · Marco Cuturi
- 2020 Poster: Debiased Sinkhorn barycenters
  Hicham Janati · Marco Cuturi · Alexandre Gramfort
- 2020 Poster: Fast Differentiable Sorting and Ranking
  Mathieu Blondel · Olivier Teboul · Quentin Berthet · Josip Djolonga
- 2019 Poster: kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection
  Lotfi Slim · Clément Chatelain · Chloe-Agathe Azencott · Jean-Philippe Vert
- 2019 Poster: Subspace Robust Wasserstein Distances
  François-Pierre Paty · Marco Cuturi
- 2019 Oral: Subspace Robust Wasserstein Distances
  François-Pierre Paty · Marco Cuturi
- 2019 Oral: kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection
  Lotfi Slim · Clément Chatelain · Chloe-Agathe Azencott · Jean-Philippe Vert