Workshop: Responsible Decision Making in Dynamic Environments

Exposing Algorithmic Bias through Inverse Design

Carmen Mazijn · Carina Prunkl · Andres Algaba · Jan Danckaert · Vincent Ginis


Traditional group fairness notions assess a model's equality of outcome by computing statistical metrics on its outputs. We argue that these output metrics encounter fundamental obstacles and present a novel approach that aligns with equality of treatment. Through gradient-based inverse design, we generate a canonical set that shows the desired inputs for a model given a preferred output. The canonical set reveals the internal logic of the model and thereby exposes potential unethical biases. For the UCI Adult data set, we find that the biases detected by a canonical set interestingly differ from those detected by output metrics.
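The core idea of gradient-based inverse design can be sketched as follows: instead of evaluating a fixed model on given inputs, one holds the model fixed and optimizes the *inputs* so that the output approaches a preferred value; the inputs recovered from several random starts form a canonical set. The sketch below is illustrative only and assumes a toy logistic-regression model with made-up weights, not the authors' actual implementation:

```python
import numpy as np

# Illustrative sketch: a hypothetical logistic-regression "model" with
# fixed, randomly chosen weights stands in for a trained classifier.
rng = np.random.default_rng(0)
w = rng.normal(size=5)          # assumed trained weights (not real data)
b = 0.1                         # assumed trained bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def inverse_design(x0, target=1.0, lr=0.5, steps=200):
    """Gradient-based inverse design: adjust the input x so that the
    model's output approaches the preferred target (the positive class)."""
    x = x0.copy()
    for _ in range(steps):
        p = sigmoid(w @ x + b)
        # gradient of the squared error (p - target)^2 w.r.t. x
        grad = 2.0 * (p - target) * p * (1.0 - p) * w
        x -= lr * grad
    return x

# A canonical set: prototypical "desired" inputs recovered from
# several random starting points.
canonical_set = np.stack([inverse_design(rng.normal(size=5))
                          for _ in range(3)])
print(np.round(sigmoid(canonical_set @ w + b), 3))  # outputs near 1
```

Inspecting which features the optimization pushes (e.g. toward a particular protected-attribute value) is what would expose the model's internal logic; with a real model one would compute the input gradient via automatic differentiation rather than the closed form used here.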
