Algorithmic decision-making systems are increasingly used in sensitive applications such as advertising, resume reviewing, employment, credit lending, policing, criminal justice, and beyond. The long-term promise of these approaches is to automate, augment, and/or eventually improve on human decisions, which can be biased or unfair, by leveraging the potential of machine learning to make decisions supported by historical data. Unfortunately, there is a growing body of evidence showing that current machine learning technology is vulnerable to privacy or security attacks, lacks interpretability, or reproduces (and even exacerbates) historical biases or discriminatory behaviors against certain social groups.
Most of the literature on building socially responsible algorithmic decision-making systems focuses on a static scenario where algorithmic decisions do not change the data distribution. However, real-world applications involve non-stationarities and feedback loops that must be taken into account to measure fairness and mitigate unfairness in the long term. These feedback loops involve the learning process, which may be biased because of insufficient exploration, or changes in the environment's dynamics due to strategic responses of the various stakeholders. From a machine learning perspective, these sequential processes are primarily studied through counterfactual analysis and reinforcement learning.
The purpose of this workshop is to bring together researchers from both industry and academia working on the full spectrum of responsible decision-making in dynamic environments, from theory to practice. In particular, we encourage submissions on the following topics: fairness, privacy and security, robustness, conservative and safe algorithms, explainability and interpretability.
Sat 6:00 a.m. - 2:30 p.m.

Please visit the workshop website for the full program.
Sat 6:00 a.m. - 6:10 a.m.

Introduction and opening remarks
(Intro)
Sat 6:10 a.m. - 6:40 a.m.

Responsible Decision-Making in Batch RL Settings
(Keynote)
Finale Doshi-Velez
Sat 6:40 a.m. - 8:00 a.m.

Poster session (in-person only, with coffee break)
(Poster session)
Sat 8:00 a.m. - 8:30 a.m.

Robust Multivalid Uncertainty Quantification
(Keynote)
Aaron Roth
Sat 8:30 a.m. - 8:45 a.m.

Individually Fair Learning with One-Sided Feedback
(Contributed talk)
Yahav Bechavod
Sat 8:45 a.m. - 9:00 a.m.

Reward Reports for Reinforcement Learning
(Contributed talk)
Nathan Lambert
Sat 11:00 a.m. - 11:30 a.m.

Dimension Reduction Tools and Their Use in Responsible Data Understanding in Dynamic Environments
(Keynote)
Cynthia Rudin
Sat 11:30 a.m. - 12:00 p.m.

Explanations in Whose Interests?
(Keynote)
Sat 12:00 p.m. - 1:00 p.m.

Poster session (in-person only, with coffee break)
(Poster session)
Sat 1:00 p.m. - 1:30 p.m.

Exposure-Aware Recommendation using Contextual Bandits
(Keynote)
Sat 1:30 p.m. - 2:00 p.m.

Modeling Recommender Ecosystems - Some Considerations
(Keynote)
Craig Boutilier
Sat 2:00 p.m. - 2:15 p.m.

Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits
(Contributed talk)
Yulian Wu
Sat 2:15 p.m. - 2:30 p.m.

A Game-Theoretic Perspective on Trust in Recommendation
(Contributed talk)
Sarah Cen


Combining Counterfactuals With Shapley Values To Explain Image Models
(Poster)
With the widespread use of sophisticated machine learning models in sensitive applications, understanding their decision-making has become an essential task. Models trained on tabular data have witnessed significant progress in explanations of their underlying decision-making processes by virtue of having a small number of discrete features. However, applying these methods to high-dimensional inputs such as images is not a trivial task. Images are composed of pixels at an atomic level and do not carry any interpretability by themselves. In this work, we seek to use annotated high-level interpretable features of images to provide explanations. We leverage the Shapley value framework from game theory, which has garnered wide acceptance in general XAI problems. By developing a pipeline to generate counterfactuals and subsequently using it to estimate Shapley values, we obtain contrastive and interpretable explanations with strong axiomatic guarantees.
Aditya Lahiri · Kamran Alipour · Ehsan Adeli · Babak Salimi


Perspectives on Incorporating Expert Feedback into Model Updates
(Poster)
Machine learning (ML) practitioners are increasingly tasked with developing models that are aligned with non-technical experts' values and goals. However, there has been insufficient consideration of how practitioners should translate domain expertise into ML updates. In this paper, we consider how to capture interactions between practitioners and experts systematically. We devise a taxonomy to match expert feedback types with practitioner updates. A practitioner may receive feedback from an expert at the observation or domain level, and convert this feedback into updates to the dataset, loss function, or parameter space. We review existing work from ML and human-computer interaction to describe this feedback-update taxonomy, and highlight the insufficient consideration given to incorporating feedback from non-technical experts. We end with open questions that naturally arise from our proposed taxonomy and subsequent survey.
Valerie Chen · Umang Bhatt · Hoda Heidari · Adrian Weller · Ameet Talwalkar


Individually Fair Learning with One-Sided Feedback
(Poster)
We consider an online learning problem with one-sided feedback, in which the learner is able to observe the true label only for positively predicted instances. On each round, k instances arrive and receive classification outcomes according to a randomized policy deployed by the learner, whose goal is to maximize accuracy while deploying individually fair policies. We first extend the framework of Bechavod et al. (2020), which relies on the existence of a human fairness auditor for detecting fairness violations, to instead incorporate feedback from dynamically selected panels of multiple, possibly inconsistent, auditors. We then construct an efficient reduction from our problem of online learning with one-sided feedback and a panel reporting fairness violations to the contextual combinatorial semi-bandit problem (Cesa-Bianchi & Lugosi, 2009; György et al., 2007). Finally, we show how to leverage the guarantees of two algorithms in the contextual combinatorial semi-bandit setting: Exp2 (Bubeck et al., 2012) and the oracle-efficient Context-Semi-Bandit-FTPL (Syrgkanis et al., 2016), to provide multi-criteria no-regret guarantees simultaneously for accuracy and fairness. Our results eliminate two potential sources of bias from prior work: the "hidden outcomes" that are not available to an algorithm operating in the full information setting, and human biases that might be present in any single human auditor, but can be mitigated by selecting a well-chosen panel.
Yahav Bechavod · Aaron Roth


Robust Reinforcement Learning with Distributional Risk-averse Formulation
(Poster)
Robust Reinforcement Learning tries to make predictions more robust to changes in the dynamics or rewards of the system. This problem is particularly important when the dynamics and rewards of the environment are estimated from the data. In this paper, we approximate the Robust Reinforcement Learning constrained with a $\Phi$-divergence using an approximate Risk-Averse formulation. We show that the classical Reinforcement Learning formulation can be robustified using standard deviation penalization of the objective. Two algorithms based on Distributional Reinforcement Learning, one for discrete and one for continuous action spaces, are proposed and tested in a classical Gym environment to demonstrate the robustness of the algorithms.
Pierre Clavier · Stephanie Allassonniere · Erwann Le Pennec
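
The idea of standard deviation penalization mentioned in the abstract above can be illustrated with a toy sketch. This is not the authors' algorithm (which builds on Distributional Reinforcement Learning); it only scores a batch of sampled returns by their mean minus a multiple of their standard deviation. The function name `penalized_objective`, the parameter `penalty`, and the sample returns are all hypothetical.

```python
from statistics import mean, pstdev


def penalized_objective(returns, penalty):
    """Mean return minus `penalty` times the (population) standard deviation.

    `penalty` is a hypothetical risk-aversion weight; penalty = 0 recovers
    the usual risk-neutral objective.
    """
    return mean(returns) - penalty * pstdev(returns)


# Hypothetical sampled returns for two candidate policies.
steady = [9.0, 10.0, 11.0]     # lower mean, low spread
volatile = [0.0, 11.0, 22.0]   # higher mean, high spread

# Risk-neutral scoring prefers the volatile policy ...
risk_neutral_prefers_volatile = (
    penalized_objective(volatile, 0.0) > penalized_objective(steady, 0.0)
)
# ... while a penalized score prefers the steady one.
penalized_prefers_steady = (
    penalized_objective(steady, 0.5) > penalized_objective(volatile, 0.5)
)
```

The choice of `penalty` trades average performance against sensitivity to spread in the returns; how it relates to the size of the divergence ball is a matter for the paper itself and is not modeled here.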


Optimal Dynamic Regret in LQR Control
(Poster)
We consider the problem of non-stochastic control with a sequence of quadratic losses, i.e., LQR control. We provide an efficient online algorithm that achieves an optimal dynamic (policy) regret of $\tilde{O}(n^{1/3} \TV(M_{1:n})^{2/3} \vee 1)$, where $\TV(M_{1:n})$ is the total variation of any oracle sequence of \emph{Disturbance Action} policies parameterized by $M_1,\ldots,M_n$ chosen in hindsight to cater to unknown non-stationarity. The rate improves the best known rate of $\tilde{O}(\sqrt{n (\TV(M_{1:n})+1)})$ for general convex losses and is information-theoretically optimal for LQR. Main technical components include the reduction of LQR to online linear regression with delayed feedback due to Foster and Simchowitz (2020), as well as a new \emph{proper} learning algorithm with an optimal $\tilde{O}(n^{1/3})$ dynamic regret on a family of ``mini-batched'' quadratic losses, which could be of independent interest.
Dheeraj Baby · Yu-Xiang Wang


Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits
(Poster)
In this paper we investigate the problem of stochastic multi-armed bandits (MAB) in the (local) differential privacy (DP/LDP) model. Unlike previous results that assume bounded/sub-Gaussian reward distributions, we focus on the setting where each arm's reward distribution only has a $(1+v)$-th moment with some $v\in (0, 1]$. In the first part, we study the problem in the central $\epsilon$-DP model. We first provide a near-optimal result by developing a private and robust Upper Confidence Bound (UCB) algorithm. Then, we improve the result via a private and robust version of the Successive Elimination (SE) algorithm. Finally, we establish the lower bound to show that the instance-dependent regret of our improved algorithm is optimal. In the second part, we study the problem in the $\epsilon$-LDP model. We propose an algorithm that can be seen as a locally private and robust version of the SE algorithm, which provably achieves (near) optimal rates for both instance-dependent and instance-independent regret. Our results reveal differences between the problem of private MAB with bounded/sub-Gaussian rewards and heavy-tailed rewards. To achieve these (near) optimal rates, we develop several new hard instances and private robust estimators as byproducts, which might be useful for other related problems.
Yulian Wu · Youming Tao · Peng Zhao · Di Wang


RISE: Robust Individualized Decision Learning with Sensitive Variables
(Poster)
This paper introduces RISE, a robust individualized decision learning framework with sensitive variables, where sensitive variables are collectible data and important to the intervention decision, but their inclusion in decision making is prohibited due to reasons such as delayed availability or fairness concerns. The convention is to ignore these sensitive variables in learning decision rules, leading to significant uncertainty and bias. To address this, we propose a decision learning framework that incorporates sensitive variables during offline training but does not include them in the input of the learned decision rule during model deployment. Specifically, from a causal perspective, the proposed framework intends to improve the worst-case outcomes of individuals caused by sensitive variables that are unavailable at the time of decision. Unlike most existing literature that uses mean-optimal objectives, we propose a robust learning framework via finding a newly defined quantile- or infimum-optimal decision rule. The reliable performance of the proposed method is demonstrated through synthetic experiments and three real-data applications.
Xiaoqing (Ellen) Tan · Zhengling Qi · Christopher Seymour · Lu Tang


Adversarial Cheap Talk
(Poster)
Adversarial attacks in reinforcement learning (RL) often assume highly privileged access to the learning agent’s parameters, environment or data. Instead, this paper proposes a novel adversarial setting called a Cheap Talk MDP in which an Adversary has a minimal range of influence over the Victim. Parameterised as a deterministic policy that only conditions on the current state, an Adversary can merely append information to a Victim’s observation. To motivate the minimum-viability, we prove that in this setting the Adversary cannot occlude the ground truth, influence the underlying dynamics of the environment, introduce non-stationarity, add stochasticity, see the Victim’s actions, or access their parameters. Additionally, we present a novel meta-learning algorithm to train the Adversary, called adversarial cheap talk (ACT). Using ACT, we demonstrate that the resulting Adversary still manages to influence the Victim’s training and test performance despite these restrictive assumptions. Affecting train-time performance reveals a new attack vector and provides insight into the success and failure modes of existing RL algorithms. More specifically, we show that an ACT Adversary is capable of harming performance by interfering with the learner’s function approximation and helping the Victim’s performance by appending useful features. Finally, we demonstrate that an ACT Adversary can append information during train-time to directly and arbitrarily control the Victim at test-time in a zero-shot manner.
Christopher Lu · Timon Willi · Alistair Letcher · Jakob Foerster


Acting Optimistically in Choosing Safe Actions
(Poster)
We investigate a natural but surprisingly unstudied approach to the multi-armed bandit problem under safety risk constraints. Each arm is associated with an unknown law on safety risks and rewards, and the learner's goal is to maximise reward whilst not playing unsafe arms, as determined by a given threshold on the mean risk. We formulate a pseudo-regret for this setting that enforces this safety constraint in a per-round way by softly penalising any violation, regardless of the gain in reward due to the same. This has practical relevance to scenarios such as clinical trials, where one must maintain safety for each round rather than in an aggregated sense. We describe doubly optimistic strategies for this scenario, which maintain optimistic indices for both safety risk and reward. We show that schema based on both frequentist and Bayesian indices satisfy tight gap-dependent logarithmic regret bounds, and further that these play unsafe arms only logarithmically many times in total. This theoretical analysis is complemented by simulation studies demonstrating the effectiveness of the proposed schema, and probing the domains in which their use is appropriate.
Tianrui Chen · Aditya Gangrade · Venkatesh Saligrama


Dynamic Positive Reinforcement For Long-Term Fairness
(Poster)
As AI-based decision-making becomes increasingly impactful on human society, the study of the influence of fairness-aware policies on the population becomes important. In this work, we propose a framework for sequential decision-making aimed at dynamically influencing long-term societal fairness, illustrated via the problem of selecting applicants from a pool consisting of two groups, one of which is under-represented. We consider a dynamic model for the composition of the applicant pool, where the admission of more applicants from a particular group positively reinforces more such candidates to participate in the selection process. Under such a model, we show the efficacy of the proposed Fair-Greedy selection policy which systematically trades greedy score maximization against fairness objectives. In addition to experimenting on synthetic data, we adapt static real-world datasets on law school candidates and credit lending to simulate the dynamics of the composition of the applicant pool.
Bhagyashree Puranik · Upamanyu Madhow · Ramtin Pedarsani


An Investigation into the Open World Survival Game Crafter
(Poster)
We share our experience with the recently released Crafter benchmark, a 2D open world survival game. Crafter allows tractable investigation of novel agents and their generalization, exploration and long-term reasoning capabilities. We evaluate agents on the original Crafter environment, as well as on a newly introduced set of generalization environments, suitable for evaluating agents' robustness to unseen objects and fast-adaptation (meta-learning) capabilities. Through several experiments we provide a couple of critical insights that are of general interest for future work on Crafter. We find that: (1) Simple agents with tuned hyper-parameters outperform all previous agents. (2) Feedforward agents can unlock almost all achievements by relying on the inventory display. (3) Recurrent agents improve on feedforward ones, also without the inventory information. (4) Baseline agents fail to generalize to OOD objects; object-centric agents improve over them. We will open-source our code.
Aleksandar Stanic · Yujin Tang · David Ha · Jürgen Schmidhuber


Equity and Equality in Fair Federated Learning
(Poster)
Federated Learning (FL) enables data owners to train a shared global model without sharing their private data. Unfortunately, FL is susceptible to an intrinsic fairness issue: due to heterogeneity in clients' data distributions, the final trained model can give disproportionate advantages across the participating clients. In this work, we present Equal and Equitable Federated Learning (E2FL) to produce fair federated learning models by preserving two main fairness properties, equity and equality, concurrently. We validate the efficiency and fairness of E2FL in different real-world FL applications, and show that E2FL outperforms existing baselines in terms of the resulting efficiency, fairness of different groups, and fairness among all individual clients.
Hamid Mozaffari · Amir Houmansadr


Certifiably Robust Multi-Agent Reinforcement Learning against Adversarial Communication
(Poster)
Communication is important in many multi-agent reinforcement learning (MARL) problems for agents to share information and make good decisions. However, when deploying trained communicative agents in a real-world application where noise and potential attackers exist, the safety of communication-based policies becomes a severe issue that is underexplored. Specifically, if communication messages are manipulated by malicious attackers, agents relying on untrustworthy communication may take unsafe actions that lead to catastrophic consequences. Therefore, it is crucial to ensure that agents will not be misled by corrupted communication, while still benefiting from benign communication. In this work, we consider an environment with $N$ agents, where the attacker may arbitrarily change the communication from any $C<\frac{N-1}{2}$ agents to a victim agent. For this strong threat model, we propose a certifiable defense by constructing a message-ensemble policy that aggregates multiple randomly ablated message sets. Theoretical analysis shows that this message-ensemble policy can utilize benign communication while being certifiably robust to adversarial communication, regardless of the attacking algorithm. Experiments in multiple environments verify that our defense significantly improves the robustness of trained policies against various types of attacks.
Yanchao Sun · Ruijie Zheng · Parisa Hassanzadeh · Yongyuan Liang · Soheil Feizi · Sumitra Ganesh · Furong Huang


Prisoners of Their Own Devices: How Models Induce Data Bias in Performative Prediction
(Poster)
The unparalleled ability of machine learning algorithms to learn patterns from data also enables them to incorporate biases embedded within. A biased model can then make decisions that disproportionately harm certain groups in society. Much work has been devoted to measuring unfairness in static ML environments, but not in dynamic, performative prediction ones, in which most real-world use cases operate. In the latter, the predictive model itself plays a pivotal role in shaping the distribution of the data. However, little attention has been paid to relating unfairness to these interactions. Thus, to further the understanding of unfairness in these settings, we propose a taxonomy to characterize bias in the data, and study cases where it is shaped by model behaviour. Using a real-world account-opening fraud detection case study as an example, we explore the dangers to both performance and fairness of two typical biases in performative prediction: distribution shifts, and the problem of selective labels.
José Maria Pombal · Pedro Saleiro · Mario Figueiredo · Pedro Bizarro


A Decision Metric for the Use of a Deep Reinforcement Learning Policy
(Poster)
Uncertainty estimation techniques such as those found in Osband et al. (2018) and Burda et al. (2019) have been shown to be useful for efficient exploration during training. This paper demonstrates that such uncertainty estimation techniques can also be used as part of a time-series-based methodology for out-of-distribution (OOD) detection for an offline model-free deep reinforcement learning policy. In particular, this paper defines a "decision metric" that can be utilized for determining when another decision-making process should be used in place of the deep reinforcement learning policy.
Christina Selby · Edward Staley


Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms
(Poster)
Learning in high-dimensional continuous tasks is challenging, mainly when the experience replay memory is very limited. We introduce a simple yet effective experience sharing mechanism for deterministic policies in continuous action domains, intended for future off-policy deep reinforcement learning applications in which the allocated memory for the experience replay buffer is limited. To overcome the extrapolation error induced by learning from other agents' experiences, we equip our algorithm with a novel off-policy correction technique that does not require any action probability estimates. We test the effectiveness of our method in challenging OpenAI Gym continuous control tasks and conclude that it can achieve safe experience sharing across multiple agents and exhibits robust performance when the replay memory is strictly limited.
Baturay Sağlam · Dogan Can Cicek · Furkan Burak Mutlu · Suleyman Kozat


Planning to Fairly Allocate: Probabilistic Fairness in the Restless Bandit Setting
(Poster)
Restless and collapsing bandits are often used to model budget-constrained resource allocation in settings where receiving the resource increases the probability that an arm will transition to, or remain in, a desirable state. However, SOTA Whittle-index-based approaches to this planning problem either do not consider fairness among arms, or incentivize fairness without guaranteeing it. We introduce ProbFair, an algorithm which finds the best (reward-maximizing) policy that: (a) satisfies the budget constraint; and (b) enforces bounds $[\ell, u]$ on the probability of being pulled at each timestep. We evaluate our algorithm on a real-world application, where interventions support continuous positive airway pressure (CPAP) therapy adherence among patients, as well as on a broader class of synthetic transition matrices. ProbFair preserves utility while providing fairness guarantees.
Christine Herlihy · Aviva Prins · Aravind Srinivasan · John P Dickerson


Exposing Algorithmic Bias through Inverse Design
(Poster)
Traditional group fairness notions assess a model’s equality of outcome by computing statistical metrics on the outputs. We argue that these output metrics encounter fundamental obstacles and present a novel approach that aligns with equality of treatment. Through gradient-based inverse design, we generate a canonical set that shows the desired inputs for a model given a preferred output. The canonical set reveals the internal logic of the model and thereby exposes potential unethical biases. For the UCI Adult data set, we find that the biases detected by a canonical set interestingly differ from those of output metrics.
Carmen Mazijn · Carina Prunkl · Andres Algaba · Jan Danckaert · Vincent Ginis


Reward Reports for Reinforcement Learning
(Poster)
The desire to build good systems in the face of complex societal effects requires a dynamic approach towards equity and access. Recent approaches to machine learning (ML) documentation have demonstrated the promise of discursive frameworks for deliberation about these complexities. However, these developments have been grounded in a static ML paradigm, leaving the role of feedback and post-deployment performance unexamined. Meanwhile, recent work in reinforcement learning design has shown that the effects of optimization objectives on the resultant system behavior can be wide-ranging and unpredictable. In this paper we sketch a framework for documenting deployed learning systems, which we call \textit{Reward Reports}.
Thomas Krendl Gilbert · Sarah Dean · Nathan Lambert · Tom Zick · Aaron Snoswell


Rashomon Capacity: Measuring Predictive Multiplicity in Probabilistic Classification
(Poster)
Predictive multiplicity occurs when classification models with nearly indistinguishable average performances assign conflicting predictions to individual samples. When used for decision-making in applications of consequence (e.g., lending, education, criminal justice), models developed without regard for predictive multiplicity may result in unjustified and arbitrary decisions for specific individuals. We introduce a new measure of predictive multiplicity in probabilistic classification called Rashomon Capacity. We show that Rashomon Capacity yields principled strategies for disclosing conflicting models to stakeholders. Our numerical experiments illustrate how Rashomon Capacity captures predictive multiplicity in various datasets and learning models, including neural networks. The tools introduced in this paper can help data scientists measure, report, and ultimately resolve predictive multiplicity prior to model deployment.
Hsiang Hsu · Flavio Calmon


Counterfactual Metrics for Auditing Black-Box Recommender Systems for Ethical Concerns
(Poster)
Recommender systems can shape people's online experience in powerful ways, which makes close scrutiny of ethical implications imperative. Most existing work in this area attempts to measure induced harm exclusively based on observed recommendations under a set policy. This neglects potential dependencies on other quantities and can lead to misleading conclusions about the behavior of the algorithm. Instead, we propose counterfactual metrics for auditing recommender systems for ethical concerns. By asking how recommendations would change if users behaved differently or if the training data were different, we are able to isolate the effects of the recommendation algorithm from components like user preference and information. We discuss the ethical context of the suggested metrics and propose directions for future work.
Nil-Jana Akpinar · Liu Leqi · Dylan Hadfield-Menell · Zachary Lipton


Adaptive Data Debiasing Through Bounded Exploration
(Poster)
Biases in existing datasets used to train algorithmic decision rules can raise ethical and economic concerns due to the resulting disparate treatment of different groups. We propose an algorithm for sequentially debiasing such datasets through adaptive and bounded exploration in a classification problem with costly and censored feedback. Our proposed algorithm includes parameters that can be used to balance between the ultimate goal of removing data biases, which will in turn lead to more accurate and fair decisions, and the exploration risks incurred to achieve this goal. We analytically show that such exploration can help debias data in certain distributions. We further investigate how fairness criteria can work in conjunction with our data debiasing algorithm. We illustrate the performance of our algorithm using experiments on synthetic and real-world datasets.
Yifan Yang · Yang Liu · Parinaz Naghizadeh


Fairness Over Utilities Via Multi-Objective Rewards
(Poster)
Group fairness definitions make assumptions about the underlying decision problem that restrict them to classification problems. Numerous bespoke interpretations of group fairness definitions exist as attempts to extend them to specific applications. In an effort to generalize group fairness definitions beyond classification, Blandin & Kash (2021) explore using utility functions to define group fairness measures. In addition to the decision-maker's utility function, they introduce a benefit function that represents the individual's utility from encountering a given decision-maker policy. Using this framework, we interpret fairness problems as a multi-objective optimization, where we aim to optimize for both the decision-maker's utility and the individual's benefit, as well as reduce the individual benefit difference across protected groups. We demonstrate our instantiation of this multi-objective approach in a reinforcement learning simulation.
Jack Blandin · Ian Kash


Defining and Characterizing Reward Gaming
(Poster)
We provide the first formal definition of \textbf{reward gaming}, a phenomenon where optimizing an imperfect \emph{proxy reward function}, $\mathcal{\tilde{R}}$, leads to poor performance according to a true reward function, $\mathcal{R}$. We say that a proxy is \emph{ungameable} if increasing the expected proxy return can never decrease the expected true return. Intuitively, it should be possible to create an ungameable proxy by overlooking fine-grained distinctions between roughly equivalent outcomes, but we show this is usually not the case. A key insight is that the linearity of reward (as a function of state-action visit counts) makes ungameability a very strong condition. In particular, for the set of all stochastic policies, two reward functions can only be ungameable if one of them is constant. We thus turn our attention to deterministic policies and finite sets of stochastic policies, where non-trivial ungameable pairs always exist, and establish necessary and sufficient conditions for the existence of simplifications, an important special case of ungameability. Our results reveal a tension between using reward functions to specify narrow tasks and aligning AI systems with human values.
Joar Skalse · Nikolaus Howe · Dmitrii Krasheninnikov · David Krueger
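
The definition of ungameability quoted in the abstract above can be checked directly on a finite set of policies. The sketch below is only an illustration under stated assumptions: the expected returns of each policy under the proxy and the true reward are taken as given lists, and the function name `is_gameable` and the example numbers are hypothetical rather than taken from the paper.

```python
from itertools import combinations


def is_gameable(proxy_returns, true_returns):
    """Return True if some pair of policies is ranked in strictly opposite
    order by the proxy and the true reward.

    proxy_returns[i] and true_returns[i] are the (assumed known) expected
    returns of policy i under the proxy and the true reward function.
    """
    if len(proxy_returns) != len(true_returns):
        raise ValueError("one return per policy is required for each reward")
    for i, j in combinations(range(len(proxy_returns)), 2):
        proxy_gap = proxy_returns[i] - proxy_returns[j]
        true_gap = true_returns[i] - true_returns[j]
        # Opposite signs: moving between these two policies increases the
        # proxy return while decreasing the true return.
        if proxy_gap * true_gap < 0:
            return True
    return False


# Hypothetical expected returns for three policies.
proxy = [1.0, 2.0, 3.0]
true_ties = [1.0, 2.0, 2.0]      # true reward is indifferent between two policies
true_reversed = [1.0, 3.0, 2.0]  # true reward reverses the proxy's last step

ties_gameable = is_gameable(proxy, true_ties)          # no strict reversal
reversed_gameable = is_gameable(proxy, true_reversed)  # strict reversal exists
```

Ties are allowed in this check: a pair only counts as gaming when the proxy strictly increases and the true return strictly decreases, matching the wording "can never decrease" in the abstract.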


End-to-end Auditing of Decision Pipelines
(Poster)
Many high-stakes policies can be modeled as a sequence of decisions along a \textit{pipeline}. We are interested in auditing such pipelines for both \textit{efficiency} and \textit{equity}. Using a dataset of over 100,000 crowdsourced resident requests for potentially hazardous tree maintenance in New York City, we observe a sequence of city government decisions about whether to inspect and work on a reported incident. At each decision in the pipeline, we define parity definitions and tests to identify inefficient, inequitable treatment. Disparities in resource allocation and scheduling across census tracts are reported as preliminary results.
Benjamin Laufer · Emma Pierson · Nikhil Garg


Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning
(Poster)
Recent studies reveal that a well-trained deep reinforcement learning (RL) policy can be particularly vulnerable to adversarial perturbations on input observations. Therefore, it is crucial to train RL agents that are robust against any attacks with a bounded budget. Existing robust training methods in deep RL either treat correlated steps separately, ignoring the robustness of long-term reward, or train the agents and RL-based attacker together, doubling the computational burden and sample complexity of the training process. In this work, we propose a strong and efficient robust training framework for RL, named Worst-case-aware Robust RL (WocaR-RL), that directly estimates and optimizes the worst-case reward of a policy under bounded attacks without requiring extra samples for learning an attacker. Experiments on multiple environments show that WocaR-RL achieves state-of-the-art performance under various strong attacks, and obtains significantly higher training efficiency than prior state-of-the-art robust training methods.
Yongyuan Liang · Yanchao Sun · Ruijie Zheng · Furong Huang


Engineering a Safer Recommender System
(
Poster
)
While recommender systems suffuse our daily life, influencing information we receive, products we purchase, and beliefs we form, few works have systematically examined the safety of these systems. This can be partly attributed to the complex feedback loops. In this work, we take a systems safety perspective and focus on a particular feedback loop in recommender systems where users react to recommendations they receive. We characterize the difficulties of designing a safe recommender within this feedback loop. Further, we connect the causes of widely covered recommender system failures to flaws of the system in treating the feedback loop. Our analysis suggests lines of future work on designing safer recommender systems and more broadly systems that interact with people psychologically. 
Liu Leqi · Sarah Dean 🔗 


RiskyZoo: A Library for Risk-Sensitive Supervised Learning
(
Poster
)
Supervised learning models are increasingly used in algorithmic decision-making. The traditional assumption on the training and testing data being independently and identically distributed is often violated in practical learning settings, due to distribution shifts. To mitigate the effects of such nonstationarities, risk-sensitive learning is proposed to train models under different (risk) functionals beyond the expected loss. For example, learning under the conditional value-at-risk of the losses is equivalent to training a model under a particular type of worst-case distribution shift. While many risk functionals and learning procedures have been proposed, their implementations are either nonexistent or in individualized repositories. With no common implementations and baseline test beds, it is difficult to decide which risk functionals and learning procedures to use. To address this, we introduce a library (RiskyZoo) for risk-sensitive supervised learning. The library contains implementations of risk-sensitive learning objectives and optimization procedures that can be used as add-ons to the PyTorch library. We also provide datasets to compare these learning methods. We demonstrate usage of our library through comparing models learned under different risk objectives, optimization performances of different methods for a single objective, and risk assessments of pretrained ImageNet models. 
William Wong · Audrey Huang · Liu Leqi · Kamyar Azizzadenesheli · Zachary Lipton 🔗 
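
As an illustration of the conditional value-at-risk mentioned in the abstract above, the following is a minimal sketch of one common empirical estimator: the mean of the worst `alpha` fraction of per-example losses. It is written in plain Python and does not use or reproduce the RiskyZoo API; the name `empirical_cvar` and the convention that `alpha` denotes the tail fraction are assumptions made for this example.

```python
import math


def empirical_cvar(losses, alpha):
    """Mean of the worst alpha-fraction of losses (hypothetical helper).

    alpha is the tail fraction in (0, 1]; alpha = 1 recovers the ordinary
    average loss, and smaller alpha focuses on the largest losses.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    # Number of tail examples to average over (at least one).
    k = max(1, math.ceil(alpha * len(losses)))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


losses = [1.0, 2.0, 3.0, 4.0]
tail_risk = empirical_cvar(losses, alpha=0.5)   # averages 4.0 and 3.0
mean_risk = empirical_cvar(losses, alpha=1.0)   # ordinary mean
```

Averaging only the tail corresponds to reweighting the sample so that the hardest examples carry all the mass, which is one way to read the worst-case distribution shift interpretation given in the abstract.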


Open Problems in (Un)fairness of the Retail Food Safety Inspection Process
(
Poster
)
The inspection of retail food establishments is an essential public health intervention. We discuss existing work on roles AI techniques can play in food inspections and resulting fairness and interpretability challenges. We also examine open problems stemming from the complex and dynamic nature of the inspections. 
Tanya Berger-Wolf · Allison Howell · Chris Kanich · Ian Kash · Barbara Kowalcyk · Gina Nicholson Kramer · Andrew Perrault · Shubham Singh 🔗 


From Soft Trees to Hard Trees: Gains and Losses
(
Poster
)
Trees are widely used as interpretable models. However, when they are greedily trained they can yield suboptimal predictive performance. Training soft trees, with probabilistic splits rather than deterministic ones, provides a way to supposedly globally optimize tree models. For interpretability purposes, a hard tree can be obtained from a soft tree by binarizing the probabilistic splits, a step called hardening. Unfortunately, the good performance of the soft model is often lost after hardening. We systematically study two factors contributing to the performance drop: first, the loss surface of the soft tree loss has many local optima (and thus the logic for using the soft tree loss becomes less clear), and second, the relative values of the soft tree loss do not correspond to relative values of the hard tree loss. We also demonstrate that simple mitigation methods in the literature do not fully mitigate the performance drop. 
Xin Zeng · Jiayu Yao · Finale Doshi-Velez · Weiwei Pan 🔗 
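
To make the hardening step described in the abstract above concrete, here is a minimal sketch of a single split on a scalar feature. It assumes a sigmoid gate for the probabilistic split and a 0.5 cutoff for binarizing it; both choices, along with the function names, are assumptions for illustration and are not claimed to be the paper's construction.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_predict(x, w, b, left_value, right_value):
    """Soft split: blend the two leaf values by the routing probability."""
    p_right = sigmoid(w * x + b)
    return (1.0 - p_right) * left_value + p_right * right_value


def hard_predict(x, w, b, left_value, right_value):
    """Hardened split: route fully right when the gate probability is at
    least 0.5, which is the same as w * x + b >= 0."""
    return right_value if w * x + b >= 0 else left_value


# Near the decision boundary the two models disagree the most.
soft_at_boundary = soft_predict(0.0, w=1.0, b=0.0, left_value=0.0, right_value=1.0)
hard_at_boundary = hard_predict(0.0, w=1.0, b=0.0, left_value=0.0, right_value=1.0)
```

Inputs close to the boundary receive a blended prediction from the soft model but a single leaf value from the hardened one, which gives one simple picture of how a low soft loss need not carry over after hardening.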


Success of Uncertainty-Aware Deep Models Depends on Data Manifold Geometry
(
Poster
)
For responsible decision making in safety-critical settings, machine learning models must effectively detect and process edge-case data. Although existing works show that predictive uncertainty is useful for these tasks, it is not evident from the literature which uncertainty-aware models are best suited for a given dataset. Thus, we compare six uncertainty-aware deep learning models on a set of edge-case tasks: robustness to adversarial attacks as well as out-of-distribution and adversarial detection. We find that the geometry of the data submanifold is an important factor in determining the success of various models. Our finding suggests an interesting direction in the study of uncertainty-aware deep learning models. 
Mark Penrod · Harrison Termotto · Varshini Reddy · Jiayu Yao · Finale Doshi-Velez · Weiwei Pan 🔗 


Long Term Fairness for Minority Groups via Performative Distributionally Robust Optimization
(
Poster
)
Fairness researchers in machine learning (ML) have coalesced around several fairness criteria which provide formal definitions of what it means for an ML model to be fair. However, these criteria have some serious limitations. We identify four key shortcomings of these formal fairness criteria, and aim to help to address them by extending performative prediction to include a distributionally robust objective. 
Liam Peet-Pare · Alona Fyshe · Nidhi Hegde 🔗 


A Game-Theoretic Perspective on Trust in Recommendation
(
Poster
)
Recommendation platforms, such as Amazon, Netflix, and Facebook, use various strategies in order to engage and retain users, from tracking their data to showing addictive content. These measures are meant to improve performance, but they can also erode their users' trust. In this work, we study the role of trust in recommendation. We show that, because recommendation platforms rely on users for data, trust is key to every platform's success. Our main contribution is a game-theoretic view of recommender systems and a corresponding formalization of trust. More precisely, if a user trusts their recommendation platform, then their optimal long-term strategy is to act greedily (and thus report their preferences truthfully) at all times. Our definition reflects the intuition that trust arises when the incentives of the user and platform are sufficiently aligned. To illustrate the implications of this definition, we explore two simple examples of trust. We show that distrust can hurt the platform and that trust can be beneficial for both the user and platform. 
Sarah Cen · Andrew Ilyas · Aleksander Madry 🔗 


Optimizing Personalized Assortment Decisions in the Presence of Platform Disengagement
(
Poster
)
We consider a problem where customers repeatedly interact with a platform. During each interaction with the platform, the customer is shown an assortment of items and selects among these items according to a Multinomial Logit choice model. The probability that a customer interacts with the platform in the next period depends on the customer's cumulative number of past purchases. The goal of the platform is to maximize the total revenue obtained from each customer over a finite time horizon. We first study a non-learning version of the problem where consumer preferences are completely known. We formulate the problem as a dynamic program and prove structural properties of the optimal policy. Next, we provide a formulation in a contextual episodic reinforcement learning setting, where the parameters governing consumer preferences and return probabilities are unknown and learned over multiple episodes. We develop an algorithm based on the principle of optimism under uncertainty for this contextual reinforcement learning problem and provide a regret bound. 
Mika Sumida · Angela Zhou 🔗 
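
For readers unfamiliar with the Multinomial Logit choice model named in the abstract above, the following is a minimal sketch of its standard single-period form. It assumes a no-purchase option with utility normalized to 0, and the item names, utilities, and prices are made up for illustration; it does not model the return probabilities or the dynamic program studied in the paper.

```python
import math


def mnl_choice_probs(utilities):
    """Choice probabilities for an offered assortment under MNL.

    utilities maps each offered item to its mean utility. The key None
    in the result stands for the no-purchase option, whose utility is
    assumed to be 0 (so its weight is exp(0) = 1).
    """
    weights = {item: math.exp(u) for item, u in utilities.items()}
    denom = 1.0 + sum(weights.values())
    probs = {item: w / denom for item, w in weights.items()}
    probs[None] = 1.0 / denom
    return probs


def expected_revenue(utilities, prices):
    """Expected one-period revenue of showing this assortment."""
    probs = mnl_choice_probs(utilities)
    return sum(prices[item] * probs[item] for item in utilities)


# Two equally attractive items plus the no-purchase option.
probs = mnl_choice_probs({"a": 0.0, "b": 0.0})
revenue = expected_revenue({"a": 0.0, "b": 0.0}, {"a": 3.0, "b": 6.0})
```

With equal utilities each of the three outcomes is chosen with probability one third, so the expected revenue here is the average of the two prices weighted by one third each.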


Machine Learning Explainability & Fairness: Insights from Consumer Lending
(
Poster
)
Stakeholders in consumer lending are debating whether lenders can responsibly use machine learning models in compliance with a range of preexisting legal and regulatory requirements. Our work evaluates certain tools designed to help lenders and other model users understand and manage a range of machine learning models relevant to credit underwriting. Here, we focus on how certain explainability tools affect lenders’ ability to manage fairness concerns related to obligations to identify less discriminatory alternatives for models used to extend consumer credit. We evaluate these tools on a “usability” criterion that assesses whether and how well these tools enable lenders to construct alternative models that are less discriminatory. Notably, we find that dropping features identified as drivers of disparities does not lead to less discriminatory alternative models, and often leads to substantial performance deterioration. In contrast, more automated tools that search for a range of less discriminatory alternative models can successfully improve fairness metrics. The findings presented here are extracted from a larger study that evaluates certain proprietary and open-source tools in the context of additional regulatory requirements. 
Sormeh Yazdi · Laura Blattner · Duncan McElfresh · PR Stark · Jann Spiess · Georgy Kalashnov 🔗 


Policy Fairness in Sequential Allocations under Bias Dynamics
(
Poster
)
This work considers a dynamic decision making framework for allocating opportunities over time to advantaged and disadvantaged individuals. Here, individuals in the disadvantaged group are assumed to experience a societal bias that limits their success probability. A policy of allocating opportunities stipulates thresholds on the success probability for the advantaged and disadvantaged group. We analyse the interplay between utility and a novel measure of fairness for different dynamics that dictate how the societal bias changes based on the current thresholds while the group sizes are fixed. Our theoretical analysis is supported by experimental results on synthetic data for the use case of college admissions. 
Meirav Segal · Anne-Marie George · Christos Dimitrakakis 🔗 


A law of adversarial risk, interpolation, and label noise
(
Poster
)
In supervised learning, it is known that label noise in the data can be interpolated without penalties on test accuracy. We show that interpolating label noise induces adversarial vulnerability, and prove the first theorem showing the dependence of label noise and adversarial risk in terms of the data distribution. Our results are almost sharp without accounting for the inductive bias of the learning algorithm. We also show that inductive bias makes the effect of label noise much stronger. 
Daniel Paleka · Amartya Sanyal 🔗 


LPI: Learned Positional Invariances for Transfer of Task Structure and Zero-shot Planning
(
Poster
)
Real-world tasks often include interactions with the environment where our actions can drastically change the available or desirable long-term outcomes. One formulation of this in the reinforcement learning setting is in terms of non-Markovian rewards. Here the reward function, and thus the available rewards, are themselves history-dependent, and dynamically change given the agent-environment interactions. An important challenge for navigating such environments is to be able to capture the structure of this dynamic reward function, in a way that is interpretable and allows for optimal planning. This structure, in conjunction with the particular task setting at hand, then determines the optimal order in which actions should be executed, or subtasks completed. Planning methods face the challenge of combinatorial explosion if all such orderings need to be evaluated; however, learning invariances inherent in the task structure can alleviate this pressure. Here we propose a solution to this problem by allowing the planning method to recognise task segments where temporal ordering is irrelevant for predicting reward outcomes downstream. To facilitate this, our agent simultaneously learns to segment a task and predict the changing reward function resulting from its actions, while also learning about the permutation invariances in its history that are relevant for this prediction. This dual approach can allow zero-shot or few-shot generalisation for complex, dynamic reinforcement learning tasks. 
Tamas Madarasz 🔗 


The Backfire Effects of Fairness Constraints
(
Poster
)
Recently, the fairness community has shifted from achieving one-shot fair decisions to striving for long-term fairness. In this work, we propose a metric to measure the long-term impact of a policy on the target variable distributions. We theoretically characterize the conditions under which threshold policies could lead to a backfire on population groups. We conduct experiments with a set of well-used fairness constraints on both synthetic and real-world datasets. 
Yi Sun · Alfredo Cuesta Infante · Kalyan Veeramachaneni 🔗 


Beyond Adult and COMPAS: Fairness in Multi-Class Prediction
(
Poster
)
We produce fair probabilistic classifiers for multi-class prediction via ``projecting'' a pretrained classifier onto the set of models that satisfy target group-fairness requirements. The new, projected model is given by post-processing the outputs of the pretrained classifier by a multiplicative factor. We provide a parallelizable iterative algorithm for computing the projected classifier, and derive both sample complexity and convergence guarantees. Comprehensive numerical comparisons with state-of-the-art benchmarks demonstrate that our approach maintains competitive performance in terms of accuracy-fairness trade-off curves, while achieving favorable runtime on large datasets. 
Wael Alghamdi · Hsiang Hsu · Haewon Jeong · Hao Wang · Peter Winston Michalak · Shahab Asoodeh · Flavio Calmon 🔗 