Sub-population Guarantees for Importance Weights and KL-Divergence Estimation
Parikshit Gopalan · Nina Narodytska · Omer Reingold · Vatsal Sharan · Udi Wieder
The ratios between the probabilities that two distributions assign to points of the domain are called importance weights or density ratios, and they play a fundamental role in machine learning and information theory. Estimating importance weights from samples is the subject of extensive research in multiple communities. However, strong lower bounds are known for point-wise accurate estimation, and most theoretical guarantees require strong assumptions about the distributions. In this work we sidestep the lower bound by providing accuracy guarantees for the estimation of importance weights over \emph{sub-populations} belonging to a family $\mathcal{C}$ of subsets of the domain. We argue that these guarantees capture intuitive expectations about importance weights, and can be viewed as a notion of multi-group fairness. We formulate {\em sandwiching bounds} for sets: upper and lower bounds on the expected importance weight conditioned on a set. We additionally show upper and lower bounds on the expected log of the importance weight, often referred to as the Kullback-Leibler (KL) divergence between the distributions, a problem that has received considerable attention in its own right. Our techniques are inspired by the notion of a multicalibrated predictor \cite{hkrr2018}, originally introduced as a fairness guarantee for predictors. We demonstrate the effectiveness of our approach in experiments.
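A rough formalization of the quantities above, in notation of our own rather than the paper's (the symbols $p$, $q$, $w$, $\ell_C$, $u_C$ are ours): for two distributions $p$ and $q$ over a domain $\mathcal{X}$, the importance weight is
\[
w(x) \;=\; \frac{p(x)}{q(x)},
\qquad\text{and for any } C \in \mathcal{C},\quad
\mathbb{E}_{x \sim q}\bigl[\, w(x) \mid x \in C \,\bigr] \;=\; \frac{p(C)}{q(C)}.
\]
One plausible reading of a sandwiching bound for a set $C$ is a pair $(\ell_C, u_C)$, computed from samples, with
\[
\ell_C \;\le\; \mathbb{E}_{x \sim q}\bigl[\, w(x) \mid x \in C \,\bigr] \;\le\; u_C ,
\]
while the KL divergence is the expected log importance weight under $p$:
\[
\mathrm{KL}(p \,\|\, q) \;=\; \mathbb{E}_{x \sim p}\!\left[ \log \frac{p(x)}{q(x)} \right].
\]
The identity $\mathbb{E}_{x \sim q}[w(x) \mid x \in C] = p(C)/q(C)$ is the kind of intuitive expectation referred to above: on each sub-population, the average importance weight must match the ratio of the probability masses that the two distributions place on it.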
Author Information
Parikshit Gopalan (VMware Research)
Nina Narodytska (VMware)
Omer Reingold (Stanford University)
Vatsal Sharan (Stanford University)
Udi Wieder (VMware Research)
More from the Same Authors
- 2023 Poster: Omnipredictors for Constrained Optimization
  Lunjia Hu · Inbal Livni Navon · Omer Reingold · Chutong Yang
- 2021 Poster: Explanations for Monotonic Classifiers
  Joao Marques-Silva · Thomas Gerspacher · Martin Cooper · Alexey Ignatiev · Nina Narodytska
- 2021 Spotlight: Explanations for Monotonic Classifiers
  Joao Marques-Silva · Thomas Gerspacher · Martin Cooper · Alexey Ignatiev · Nina Narodytska
- 2020 Poster: Sample Amplification: Increasing Dataset Size even when Learning is Impossible
  Brian Axelrod · Shivam Garg · Vatsal Sharan · Gregory Valiant
- 2019 Poster: Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data
  Vatsal Sharan · Kai Sheng Tai · Peter Bailis · Gregory Valiant
- 2019 Oral: Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data
  Vatsal Sharan · Kai Sheng Tai · Peter Bailis · Gregory Valiant
- 2018 Poster: Multicalibration: Calibration for the (Computationally-Identifiable) Masses
  Ursula Hebert-Johnson · Michael Kim · Omer Reingold · Guy Rothblum
- 2018 Oral: Multicalibration: Calibration for the (Computationally-Identifiable) Masses
  Ursula Hebert-Johnson · Michael Kim · Omer Reingold · Guy Rothblum
- 2017 Poster: Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use
  Vatsal Sharan · Gregory Valiant
- 2017 Talk: Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use
  Vatsal Sharan · Gregory Valiant