
Workshop: New Frontiers in Adversarial Machine Learning

Fair Universal Representations using Adversarial Models

Monica Welfert · Peter Kairouz · Jiachun Liao · Chong Huang · Lalitha Sankar


We present a data-driven framework for learning fair universal representations (FUR) that guarantee statistical fairness for any downstream learning task, including tasks not known a priori. Our framework leverages recent advances in adversarial learning to allow a data holder to learn representations in which a set of sensitive attributes is decoupled from the rest of the dataset. We formulate this as a constrained minimax game between an encoder and an adversary, where the constraint ensures a measure of usefulness (utility) of the representation. For appropriately chosen adversarial loss functions, our framework precisely characterizes the optimal adversarial strategy against strong information-theoretic adversaries; it also achieves the fairness measure of demographic parity for the resulting constrained representations. We highlight our results on the UCI Adult and UTKFace datasets.
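
Demographic parity, the fairness measure named in the abstract, asks that a decision computed from the representation be statistically independent of the sensitive attribute, so the rate of positive predictions should be the same in every sensitive group. The following is a minimal sketch of how that gap could be measured for binary predictions. It is not the authors' evaluation code: the function name `demographic_parity_gap` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def demographic_parity_gap(predictions, sensitive):
    """Return (gap, rates) for binary predictions grouped by a sensitive attribute.

    rates maps each sensitive group to its fraction of positive predictions;
    gap is the largest rate minus the smallest rate (0.0 means exact parity).
    """
    if len(predictions) != len(sensitive):
        raise ValueError("predictions and sensitive must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")

    counts = defaultdict(int)     # number of samples per group
    positives = defaultdict(int)  # number of positive predictions per group
    for y_hat, s in zip(predictions, sensitive):
        counts[s] += 1
        positives[s] += int(y_hat == 1)

    rates = {s: positives[s] / counts[s] for s in counts}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


# Hypothetical toy data: eight predictions, two sensitive groups "a" and "b".
preds = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_gap(preds, groups)
print(gap, rates)
```

On this toy input, group "a" receives positive predictions at a rate of 0.75 and group "b" at 0.25, so the gap is 0.5. A representation that achieved demographic parity for this classifier would drive the gap toward zero.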
