Workshop on Socially Responsible Machine Learning

Chaowei Xiao, Animashree Anandkumar, Mingyan Liu, Dawn Song, Raquel Urtasun, Jieyu Zhao, Xueru Zhang, Cihang Xie, Xinyun Chen, Bo Li


Machine learning (ML) systems have been increasingly used in many applications, ranging from decision-making systems to safety-critical tasks. While the hope is to improve decision-making accuracy and societal outcomes with these ML models, concerns have been raised that they can inflict harm if not developed or used with care. It has been well-documented that ML models can: (1) inherit pre-existing biases and exhibit discrimination against already-disadvantaged or marginalized social groups; (2) be vulnerable to security and privacy attacks that deceive the models and leak the training data's sensitive information; (3) make hard-to-justify predictions that lack transparency. Therefore, it is essential to build socially responsible ML models that are fair, robust, private, transparent, and interpretable.

Although extensive studies have been conducted to increase trust in ML, many of them either focus on well-defined problems that are mathematically tractable but hard to adapt to real-world systems, or they mainly focus on mitigating risks in real-world applications without providing theoretical justifications. Moreover, most work studies those issues separately; the connections among them are less well-understood. This workshop aims to build connections by bringing together both theoretical and applied researchers from various communities (e.g., machine learning, fairness & ethics, security, and privacy). We aim to synthesize promising ideas and research directions, as well as strengthen cross-community collaborations. We hope to chart out important directions for future work. We have an advisory committee and confirmed speakers whose expertise represents the diversity of the technical problems in this emerging research field.


Sat 5:45 a.m. - 5:58 a.m.
Anima Anandkumar. Opening remarks (Opening remarks)   
Chaowei Xiao
Sat 5:45 a.m. - 2:00 p.m.
Workshop on Socially Responsible Machine Learning (Poster) [ Visit Poster at Spot A0 in Virtual World ]
Sat 5:58 a.m. - 6:00 a.m.
Opening remarks   
Sat 6:00 a.m. - 6:40 a.m.
Jun Zhu. Understand and Benchmark Adversarial Robustness of Deep Learning (Invited Talk)   
Chaowei Xiao
Sat 6:40 a.m. - 7:20 a.m.
Olga Russakovsky. Revealing, Quantifying, Analyzing and Mitigating Bias in Visual Recognition (Invited Talk)   
Chaowei Xiao
Sat 7:20 a.m. -

Adversarial machine learning is often used as a tool to assess the negative impacts and failure modes of a machine learning system. In this talk, I will present model reprogramming, a new paradigm of data-efficient transfer learning motivated by studying the adversarial robustness of deep learning models.

Sat 8:10 a.m. - 8:50 a.m.
Tatsu Hashimoto. Not all uncertainty is noise: machine learning with confounders and inherent disagreements (Invited Talk)   
Sat 8:50 a.m. - 9:30 a.m.
Nicolas Papernot. What Does it Mean for ML to be Trustworthy (talk)   
Chaowei Xiao
Sat 10:30 a.m. - 10:50 a.m.
Contributed Talk-1. Machine Learning API Shift Assessments (Contributed Talk)   
Chaowei Xiao
Sat 10:50 a.m. - 11:30 a.m.

How can we quantify the accuracy and uncertainty of predictions that we make in online decision problems? Standard approaches, like asking for calibrated predictions or giving prediction intervals using conformal methods, give marginal guarantees; that is, they offer promises that are averages over the history of data points. Guarantees like this are unsatisfying when the data points correspond to people and the predictions are used in important contexts, such as personalized medicine.

In this work, we study how to give stronger than marginal ("multivalid") guarantees for estimates of means, moments, and prediction intervals. Guarantees like this are valid not just as averaged over the entire population, but also as averaged over an enormous number of potentially intersecting demographic groups. We leverage techniques from game theory to give efficient algorithms promising these guarantees even in adversarial environments.
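
The difference between a marginal guarantee and a group-wise one can be made concrete with a small sketch. The Python below is illustrative only: the function names (`coverage`, `groupwise_coverage`) and the data layout are assumptions, not part of the talk, and the code merely measures empirical coverage of given prediction intervals rather than implementing any multivalid algorithm.

```python
def coverage(intervals, outcomes):
    """Fraction of outcomes that fall inside their prediction interval."""
    hits = sum(1 for (lo, hi), y in zip(intervals, outcomes) if lo <= y <= hi)
    return hits / len(outcomes)


def groupwise_coverage(intervals, outcomes, memberships):
    """Empirical coverage restricted to each group.

    `memberships[i]` is the set of group names that point i belongs to;
    groups may intersect, so one point can count toward several groups.
    """
    groups = set().union(*memberships)
    result = {}
    for g in groups:
        idx = [i for i, m in enumerate(memberships) if g in m]
        result[g] = coverage([intervals[i] for i in idx],
                             [outcomes[i] for i in idx])
    return result
```

For instance, with four identical intervals (0, 1), outcomes 0.5, 0.5, 0.5, 2.0, and memberships {a}, {a}, {a, b}, {b}, the marginal coverage is 0.75 while group b is covered only half the time, which is the kind of gap a marginal guarantee cannot rule out.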

Sat 11:30 a.m. - 12:10 p.m.
Jun-Yan Zhu. Understanding and Rewriting GANs (Invited Talk)   
Sat 12:20 p.m. - 1:00 p.m.
Kai-Wei Chang. Societal Bias in Language Generation (Invited Talk)   
Chaowei Xiao
Sat 1:00 p.m. - 1:40 p.m.
Yulia Tsvetkov. Proactive NLP: How to Prevent Social and Ethical Problems in NLP Systems? (Invited Talk)   
Sat 1:40 p.m. - 2:00 p.m.
Contributed Talk-2. Do Humans Trust Advice More if it Comes from AI? An Analysis of Human-AI Interactions (Contributed Talk)   
Chaowei Xiao
Sat 2:00 p.m. - 2:20 p.m.
Contributed Talk-3. FERMI: Fair Empirical Risk Minimization Via Exponential Rényi Mutual Information (Contributed Talk)   
Chaowei Xiao
Sat 2:20 p.m. - 2:40 p.m.
Contributed Talk-4. Auditing AI models for Verified Deployment under Semantic Specifications (Contributed Talk)   
Chaowei Xiao
Sat 3:00 p.m. -
Poster Sessions
[ Visit Poster at Spot D4 in Virtual World ]

We propose the use of probabilistic programming techniques to tackle the malicious user identification problem in a recommendation algorithm. Probabilistic programming provides numerous advantages over other techniques, including but not limited to providing a disentangled representation of how malicious users acted under a structured model, as well as allowing for the quantification of damage caused by malicious users. We show experiments in malicious user identification using a model of regular and malicious users interacting with a simple recommendation algorithm, and provide a novel simulation-based measure for quantifying the effects of a user or group of users on its dynamics.

Andrew Gambardella, Naeemullah Khan, Phil Torr, Atilim Gunes Baydin
[ Visit Poster at Spot D3 in Virtual World ]

Model interpretability has become an important problem in machine learning (ML) due to the increased effect algorithmic decisions have on humans. Providing users with counterfactual explanations (CF) can help them understand not only why ML models make certain decisions, but also how these decisions can be changed. We extend previous work that could only be applied to differentiable models by introducing probabilistic model approximations in the optimization framework. We find that our CF examples are significantly closer to the original instances compared to other methods specifically designed for tree ensembles.

Ana Lucic, Harrie Oosterhuis, Hinda Haned, Maarten de Rijke
[ Visit Poster at Spot D2 in Virtual World ]

Benchmark datasets are used to show the performance of an algorithm, e.g., its accuracy, computational speed, or versatility. In the majority of cases, benchmark datasets currently have no external use, i.e., an improvement on the benchmark doesn't directly translate to a real-world impact. In this paper, we explore why this is the case, weigh benefits and harms, and propose ways in which benchmark datasets could make a more direct positive impact.

Marius Hobbhahn
[ Visit Poster at Spot D1 in Virtual World ]

The fast-growing machine learning as a service industry has incubated many APIs for multi-label classification tasks such as OCR and multi-object recognition. The heterogeneity in those APIs’ price and performance, however, often forces users to make a choice between accuracy and expense. In this work, we propose FrugalMCT, a principled framework that jointly maximizes accuracy and minimizes expense by adaptively selecting the APIs to use for different data. FrugalMCT combines different APIs’ predictions to improve accuracy, and selects which combination to use to respect an expense constraint. Preliminary experiments using ML APIs from Google, Microsoft, and other providers for multi-label image classification show that FrugalMCT often achieves more than 50% cost reduction while matching the accuracy of the best single API.

Lingjiao Chen, James Zou, Matei Zaharia
[ Visit Poster at Spot D0 in Virtual World ]

To interpret uncertainty estimates from differentiable probabilistic models, Antorán et al. (2021) proposed generating a single Counterfactual Latent Uncertainty Explanation (CLUE) for a given data point where the model is uncertain. Ley et al. (2021) formulated δ-CLUE, the set of CLUEs within a δ ball of the original input in latent space; however, we find that many CLUEs generated by this method are very similar and hence redundant. Here we propose DIVerse CLUEs (∇-CLUEs), a set of CLUEs which each provide a distinct explanation as to how one can decrease the uncertainty associated with an input. We further introduce GLobal AMortised CLUEs (GLAM-CLUEs), which represent amortised mappings that apply to specific groups of uncertain inputs, taking them and efficiently transforming them in a single function call into inputs that a model will be certain about. Our experiments show that ∇-CLUEs and GLAM-CLUEs both address shortcomings of CLUE and provide beneficial explanations of uncertainty estimates to practitioners.

Dan Ley, Umang Bhatt, Adrian Weller
[ Visit Poster at Spot C6 in Virtual World ]

In social domains, Machine Learning algorithms often prompt individuals to strategically modify their observable attributes to receive more favorable predictions. As a result, the distribution the predictive model is trained on may differ from the one it operates on in deployment. While such distribution shifts, in general, hinder accurate predictions, our work identifies a unique opportunity associated with shifts due to strategic responses: We show that we can use strategic responses effectively to recover causal relationships between the observable features and outcomes we wish to predict. More specifically, we study a game-theoretic model in which a principal deploys a sequence of models to predict an outcome of interest (e.g., college GPA) for a sequence of strategic agents (e.g., college applicants). In response, strategic agents invest efforts and modify their features for better predictions. In such settings, unobserved confounding variables (e.g., family educational background) can influence both an agent's observable features (e.g., high school records) and outcomes (e.g., college GPA). Therefore, standard regression methods (such as OLS) generally produce biased estimators. In order to address this issue, our work establishes a novel connection between strategic responses to machine learning models and instrumental variable (IV) regression, by observing that the sequence of deployed models can be viewed as an instrument that affects agents' observable features but does not directly influence their outcomes. Therefore, two-stage least squares (2SLS) regression can recover the causal relationships between observable features and outcomes.

Keegan Harris, Daniel Ngo, Logan Stapleton, Hoda Heidari, Steven Wu
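
The instrumental-variable argument in the abstract above can be illustrated with a minimal simulation. This is a sketch under strong simplifying assumptions, not the authors' setup: there is one scalar feature `x`, one scalar instrument `z` standing in for the deployed model, one unobserved confounder `u`, and all coefficients and function names are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_least_squares(z, x, y):
    """2SLS with a single instrument z and a single regressor x."""
    # Stage 1: predict the feature from the instrument alone.
    a, b = fit_line(z, x)
    x_hat = [a * zi + b for zi in z]
    # Stage 2: regress the outcome on the predicted feature.
    slope, _ = fit_line(x_hat, y)
    return slope


def simulate(n=5000, true_effect=2.0, seed=0):
    """Toy data: u confounds x and y; z moves x but not y directly."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)   # instrument (deployed-model signal)
        ui = rng.gauss(0, 1)   # unobserved confounder
        xi = zi + ui + rng.gauss(0, 1)
        yi = true_effect * xi + 3.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y
```

In this toy model the plain OLS slope of `y` on `x` is pulled upward by the confounder, while `two_stage_least_squares(z, x, y)` stays close to the true effect of 2.0 because `z` affects `y` only through `x`.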
[ Visit Poster at Spot C5 in Virtual World ]

Automated decision-making tools increasingly assess individuals to determine if they qualify for high-stakes opportunities. A recent line of research investigates how strategic agents may respond to such scoring tools to receive favorable assessments. While prior work has focused on the short-term strategic interactions between a decision-making institution (modeled as a principal) and individual decision-subjects (modeled as agents), we investigate interactions spanning multiple time-steps. In particular, we consider settings in which the agent's effort investment today can accumulate over time in the form of an internal state, impacting both the agent's future rewards and those of the principal. We characterize the Stackelberg equilibrium of the resulting game and provide novel algorithms for computing it. Our analysis reveals several intriguing insights about the role of multiple interactions in shaping the game's outcome: We establish that in our stateful setting, the class of all linear assessment policies remains as powerful as the larger class of all monotonic assessment policies. Our work addresses several critical gaps in the growing literature on the societal impacts of automated decision-making by focusing on longer time horizons and accounting for the compounding nature of decisions individuals receive over time.

Keegan Harris, Hoda Heidari, Steven Wu
[ Visit Poster at Spot C4 in Virtual World ]

A growing number of applications rely on machine learning (ML) prediction APIs. Model updates or retraining can change an ML API silently. This poses a key challenge to API users, who do not know whether or how the ML model has been changed. We take the first step towards the study of ML API shifts. We first evaluate the performance shifts from 2020 to 2021 of popular ML APIs from Amazon, Baidu, and Google on a variety of datasets. Interestingly, some APIs’ predictions became notably worse for a certain class and better for another. Thus, we formulate the API shift assessment problem as estimating how the API model’s confusion matrix changes over time when the data distribution is constant. Next, we propose MASA, a principled adaptive sampling algorithm to efficiently estimate confusion matrix shifts. Empirically, MASA can accurately estimate the confusion matrix shifts in commercial ML APIs with up to 77% fewer samples than random sampling. This paves the way for understanding and monitoring ML API shifts efficiently.

Lingjiao Chen, James Zou, Matei Zaharia
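
To make the quantity in the abstract above concrete, the sketch below computes a confusion matrix shift on a fully labeled sample. It is not MASA: there is no adaptive sampling here, and the function names and the normalization (each entry is a fraction of all samples) are assumptions made for illustration.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Entry [i][j] is the fraction of samples with true class i predicted as j."""
    n = len(true_labels)
    matrix = [[0.0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1.0 / n
    return matrix


def confusion_shift(true_labels, old_predictions, new_predictions, num_classes):
    """Entrywise difference (new minus old) on the same fixed data."""
    old = confusion_matrix(true_labels, old_predictions, num_classes)
    new = confusion_matrix(true_labels, new_predictions, num_classes)
    return [[new[i][j] - old[i][j] for j in range(num_classes)]
            for i in range(num_classes)]
```

With true labels `[0, 0, 1, 1]`, old predictions `[0, 0, 1, 0]`, and new predictions `[0, 1, 1, 1]`, the overall accuracy is unchanged at 75%, yet the shift matrix shows class 0 getting worse and class 1 getting better, the pattern the abstract describes.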
[ Visit Poster at Spot C3 in Virtual World ]

3D point cloud data is increasingly used in safety-critical applications such as autonomous driving. Thus, robustness of 3D deep learning models against adversarial attacks is a major consideration. In this paper, we systematically study the impact of various self-supervised learning proxy tasks on different architectures and threat models for 3D point clouds. Specifically, we study MLP-based (PointNet), convolution-based (DGCNN), and transformer-based (PCT) 3D architectures. Through comprehensive experiments, we demonstrate that appropriate self-supervisions can significantly enhance the robustness in 3D point cloud recognition, achieving considerable improvements compared to the standard adversarial training baseline. Our analysis reveals that local feature learning is desirable for adversarial robustness since it limits the adversarial propagation between the point-level input perturbations and the model's final output. It also explains the success of DGCNN and the jigsaw proxy task in achieving 3D robustness.

Jiachen Sun, Yulong Cao, Christopher Choy, Zhiding Yu, Chaowei Xiao, Anima Anandkumar, Zhuoqing Morley Mao
[ Visit Poster at Spot C2 in Virtual World ]

Missing data are ubiquitous in the era of big data and, if inadequately handled, are known to lead to biased findings and have a deleterious impact on data-driven decision-making. To mitigate this impact, many missing value imputation methods have been developed. However, the fairness of these imputation methods across sensitive groups has not been studied. In this paper, we conduct the first known research on fairness of missing data imputation. By studying the performance of imputation methods in three commonly used datasets, we demonstrate that unfairness of missing value imputation widely exists and may be associated with multiple factors. Our results suggest that, in practice, a careful investigation of related factors can provide valuable insights on mitigating unfairness associated with missing data imputation.

Yiliang Zhang, Qi Long
[ Visit Poster at Spot C1 in Virtual World ]

As the representations output by Graph Neural Networks (GNNs) are increasingly employed in real-world applications, it becomes important to ensure that these representations are fair and stable. In this work, we establish a key connection between fairness and stability and leverage it to propose a novel framework, NIFTY (uNIfying Fairness and stabiliTY), which can be used with any GNN to learn fair and stable representations. We introduce an objective function that simultaneously accounts for fairness and stability and propose layer-wise weight normalization of GNNs using the Lipschitz constant. Further, we theoretically show that our layer-wise weight normalization promotes fairness and stability in the resulting representations. We introduce three new graph datasets comprising high-stakes decisions in criminal justice and financial lending domains. Extensive experimentation with the above datasets demonstrates the efficacy of our framework.

Chirag Agarwal, Hima Lakkaraju, Marinka Zitnik
[ Visit Poster at Spot C0 in Virtual World ]

A recent line of work has focused on training machine learning (ML) models in the performative setting, i.e. when the data distribution reacts to the deployed model. The goal in this setting is to compute a model which both induces a favorable distribution and performs well on the induced distribution, thereby minimizing the test loss. Previous work on finding an optimal model assumes that the data distribution immediately adapts to the deployed model. In practice, however, this may not be the case, as the population may take time to adapt to the model. In this work, we propose an algorithm for minimizing the performative loss even in the presence of these effects.

Zachary Izzo, James Zou, Lexing Ying
[ Visit Poster at Spot B6 in Virtual World ]

Despite the success of large-scale empirical risk minimization (ERM) at achieving high accuracy across a variety of machine learning tasks, fair ERM is hindered by the incompatibility of fairness constraints with stochastic optimization. In this paper, we propose the fair empirical risk minimization via exponential Rényi mutual information (FERMI) framework. FERMI is built on a stochastic estimator for exponential Rényi mutual information (ERMI), an information divergence measuring the degree of the dependence of predictions on sensitive attributes. Theoretically, we show that ERMI upper bounds existing popular fairness violation metrics, so controlling ERMI provides guarantees on other commonly used violations, such as $L_\infty$. We derive an unbiased estimator for ERMI, which we use to derive the FERMI algorithm. We prove that FERMI converges for demographic parity, equalized odds, and equal opportunity notions of fairness in stochastic optimization. Empirically, we show that FERMI is amenable to large-scale problems with multiple (non-binary) sensitive attributes and non-binary targets. Extensive experiments show that FERMI achieves the most favorable tradeoffs between fairness violation and test accuracy across all tested setups compared with state-of-the-art baselines for demographic parity, equalized odds, and equal opportunity. These benefits are especially significant for non-binary classification with large sensitive sets and small batch sizes, showcasing the effectiveness of the FERMI objective and the developed stochastic algorithm for solving it.

Andrew Lowy, Rakesh Pavan, Sina Baharlouei, Meisam Razaviyayn, Ahmad Beirami
[ Visit Poster at Spot B5 in Virtual World ]

We consider counterfactual explanations for privacy-preserving support vector machines (SVM), where the privacy mechanism that publicly releases the classifier guarantees differential privacy. While privacy preservation is essential when dealing with sensitive data, there is a consequent degradation in the classification accuracy due to the introduced perturbations in the classifier weights. Therefore, counterfactual explanations need to be made robust against such perturbations in order to ensure, with high confidence, that the explanations are valid. In this work, we suitably model the uncertainties in the SVM weights and formulate the robust counterfactual explanation problem. Then, we study optimal and efficient suboptimal algorithms for its solution. Experimental results illustrate the connections between privacy levels, classifier accuracy, and the confidence levels that validate the counterfactual explanations.

Rami Mochaourab, Panagiotis Papapetrou
[ Visit Poster at Spot B4 in Virtual World ]

Auditing trained deep learning (DL) models prior to deployment is vital in preventing unintended consequences. One of the biggest challenges in auditing is in understanding how we can obtain human-interpretable specifications that are directly useful to the end-user. We address this challenge through a sequence of semantically-aligned unit tests, where each unit test verifies whether a predefined specification (e.g., accuracy over 95%) is satisfied with respect to controlled and semantically aligned variations in the input space (e.g., in face recognition, the angle relative to the camera). We perform these unit tests by directly verifying the semantically aligned variations in an interpretable latent space of a generative model. Our framework, AuditAI, bridges the gap between interpretable formal verification and scalability. With evaluations on four different datasets, covering images of towers, chest X-rays, human faces, and ImageNet classes, we show how AuditAI allows us to obtain controlled variations for verification and certified training while addressing the limitations of verifying using only pixel-space perturbations.

Homanga Bharadhwaj, De-An Huang, Chaowei Xiao, Anima Anandkumar, Animesh Garg
[ Visit Poster at Spot B3 in Virtual World ]

In recent years, machine learning techniques utilizing large-scale datasets have achieved remarkable performance. Differential privacy, by means of adding noise, provides strong privacy guarantees for such learning algorithms. The cost of differential privacy is often a reduced model accuracy and a lowered convergence speed. This paper investigates the impact of differential privacy on learning algorithms in terms of their carbon footprint due to either longer run-times or failed experiments. Through extensive experiments, further guidance is provided on choosing the noise levels which can strike a balance between desired privacy levels and reduced carbon emissions.

Rakshit Naidu, Harshita Diddee, Ajinkya Mulay, Vardhan Aleti, Krithika Ramesh, Ahmed Zamzam
[ Visit Poster at Spot B2 in Virtual World ]

In modern image semantic segmentation models, a large receptive field is used for better segmentation performance. Due to the inefficiency of directly using large convolution kernels, several techniques such as dilated convolution and attention have been invented to increase the receptive field of deep learning models. However, large receptive fields introduce a new attack vector for adversarial attacks on segmentation/object detection models. In this work, we demonstrate that a large receptive field exposes the models to new risks. To show its serious consequences, we propose a new attack, the remote adversarial patch attack, which is able to mislead the prediction results of the targeted object without directly accessing the targeted object or adding adversarial perturbation to it. We conduct comprehensive experiments on evaluating the attack on models with different receptive field sizes, which reduces the mIoU from 30% to 100%. In the end, we also apply our remote adversarial patch attack to the physical-world setting. We show that with the adversarial patch printed on the road, it is able to remove the target vehicle at different positions which are unknown in advance.

Yulong Cao, Jiachen Sun, Chaowei Xiao, Qi Alfred Chen, Zhuoqing Morley Mao
[ Visit Poster at Spot B1 in Virtual World ]

In many applications of AI, the algorithm’s output is framed as a suggestion to a human user. The user may ignore the advice or take it into consideration to modify his/her decisions. With the increasing prevalence of such human-AI interactions, it is important to understand how users act (or do not act) upon AI advice, and how users regard advice differently if they believe the advice comes from an “AI” versus another human. In this paper, we characterize how humans use AI suggestions relative to equivalent suggestions from a group of peer humans across several experimental settings. We find that participants’ beliefs about the human versus AI performance on a given task affect whether or not they heed the advice. When participants decide to use the advice, they do so similarly for human and AI suggestions. These results provide insights into factors that affect human-AI interactions.

Kailas Vodrahalli, James Zou
[ Visit Poster at Spot B0 in Virtual World ]

Algorithms that aid human decision-making may inadvertently discriminate against certain protected groups. We formalize direct discrimination as a direct causal effect of the protected attributes on the decisions, and induced indirect discrimination as a change in the influence of non-protected features associated with the protected attributes. The measurements of average treatment effect (ATE) and SHapley Additive exPlanations (SHAP) reveal that state-of-the-art fair learning methods can inadvertently induce indirect discrimination in synthetic and real-world datasets. To inhibit discrimination in algorithmic systems, we propose to nullify the influence of the protected attribute on the output of the system, while preserving the influence of remaining features. To achieve this objective, we introduce a risk minimization method which optimizes for the proposed fairness objective. We show that the method leverages model accuracy and disparity measures.

Aarshee Mishra, Nicholas Perello, Przemyslaw Grabowicz
[ Visit Poster at Spot A6 in Virtual World ]

We introduce Societal Norm Bias (SNoB), a subtle but consequential type of discrimination that may be exhibited by machine learning classification algorithms, even when these systems achieve group fairness objectives. This work illuminates the gap between definitions of algorithmic group fairness and concerns of harm based on adherence to societal norms. We study this issue through the lens of gender bias in occupation classification from online biographies. We quantify SNoB by measuring how an algorithm's predictions are associated with gender norms. This framework reveals that for classification tasks related to male-dominated occupations, fairness-aware classifiers favor biographies whose language aligns with masculine gender norms. We compare SNoB across fairness intervention techniques, finding that post-processing interventions do not mitigate this bias at all.

Myra Cheng, Maria De-Arteaga, Lester Mackey, Adam Tauman Kalai
[ Visit Poster at Spot A5 in Virtual World ]

The growing use of machine learning models in consequential settings has highlighted an important and seemingly irreconcilable tension between transparency and vulnerability to gaming. While this has sparked intense debate in legal literature, there has been comparatively less technical study of this contention. In this work, we propose a clean-cut formulation of this tension and a way to make the tradeoff between transparency and gaming; it need not be one or the other. We identify the source of gaming as being points in the margin of the model. We also initiate an investigation on how to provide example-based explanations that are expansive and yet consistent with a version space that is sufficiently uncertain with respect to the margin points' labels. Finally, we furnish our theoretical results with empirical investigations of this tradeoff on real-world datasets.

Tom Yan, Chicheng Zhang
[ Visit Poster at Spot A4 in Virtual World ]

The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. However, there is little work on enhancing fairness in graph algorithms. Here, we develop a simple, effective and general method, CrossWalk, that enhances fairness of various graph algorithms, including influence maximization, link prediction and node classification, applied to node embeddings. CrossWalk is applicable to any random walk based node representation learning algorithm, such as DeepWalk and Node2Vec. The key idea is to bias random walks to cross group boundaries, by upweighting edges which (1) are closer to the groups' peripheries or (2) connect different groups in the network. It pulls nodes that are near groups' peripheries towards their neighbors from other groups in the embedding space, while preserving the necessary structural information from the graph. Extensive experiments show the effectiveness of our algorithm to enhance fairness in various graph algorithms in synthetic and real networks, with only a very small decrease in performance.

Ahmad Khajehnejad, Moein Khajehnejad, Krishna Gummadi, Adrian Weller, Baharan Mirzasoleiman
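
The idea of biasing random walks to cross group boundaries, as described in the abstract above, can be sketched in a few lines. This is a deliberately simplified stand-in and not the CrossWalk weighting itself: it only applies a constant boost to edges that connect different groups and ignores proximity to group peripheries. The names `biased_transition_probs`, `biased_random_walk`, and `cross_group_boost` are hypothetical.

```python
import random


def biased_transition_probs(adjacency, group, node, cross_group_boost=2.0):
    """Transition probabilities from `node`, upweighting cross-group edges."""
    weights = {
        nb: (cross_group_boost if group[nb] != group[node] else 1.0)
        for nb in adjacency[node]
    }
    total = sum(weights.values())
    return {nb: w / total for nb, w in weights.items()}


def biased_random_walk(adjacency, group, start, length, rng=None,
                       cross_group_boost=2.0):
    """Sample a walk of `length` nodes using the boosted transition weights.

    Assumes every node reachable from `start` has at least one neighbor.
    """
    rng = rng or random.Random()
    walk = [start]
    for _ in range(length - 1):
        probs = biased_transition_probs(adjacency, group, walk[-1],
                                        cross_group_boost)
        nodes = list(probs)
        step = rng.choices(nodes, weights=[probs[n] for n in nodes])[0]
        walk.append(step)
    return walk
```

Walks sampled this way could then be fed to any random-walk-based embedding method in place of unbiased walks; setting `cross_group_boost=1.0` recovers an ordinary uniform random walk.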
[ Visit Poster at Spot A3 in Virtual World ]

Collecting annotations from human raters often results in a trade-off between the quantity of labels one wishes to gather and the quality of these labels. As such, it is only possible to gather a small number of high-quality labels. In this paper, we study how different training strategies can leverage a small dataset of human-annotated labels and a large but noisy dataset of synthetically generated labels (which exhibit bias against identity groups) for predicting toxicity of online comments. We evaluate the accuracy and fairness properties of these approaches, and whether there is a trade-off. While we find that pre-training on all of the data and fine-tuning on clean data produces the most accurate models, we could not determine a single strategy that was better across all fairness metrics considered.

Neel Nanda, Jonathan Uesato, Sven Gowal
[ Visit Poster at Spot A2 in Virtual World ]

A plug-in algorithm to estimate Bayes Optimal Classifiers for fairness-aware binary classification was proposed by Menon & Williamson (2018). However, the statistical efficacy of their approach has not been established. We prove that the plug-in algorithm is statistically consistent. We also derive finite sample guarantees associated with learning the Bayes Optimal Classifiers via the plug-in algorithm. Finally, we propose a protocol that modifies the plug-in approach, so as to simultaneously guarantee fairness and differential privacy with respect to a binary feature deemed sensitive.

Drona Khurana, Srinivasan Ravichandran, Sparsh Jain, Narayanan Edakunni
[ Visit Poster at Spot A1 in Virtual World ]

Training machine learning models with the ultimate goal of maximizing only the accuracy could result in learning biases from data, making the learned model discriminatory towards certain groups. One approach to mitigate this problem is to find a representation which is more likely to yield fair outcomes using fair representation learning. In this paper, we propose a new fair representation learning approach that leverages different levels of representation of data to tighten the fairness bounds of the learned representation. Our results show that stacking different auto encoders and enforcing fairness at different latent spaces result in an improvement of fairness compared to other existing approaches.

Patrik Joslin Kenfack, Adil Khan, Rasheed Hussain