Artificial Intelligence (AI) systems, and Machine Learning systems in particular, often depend on information provided by multiple agents. The best-known example is federated learning, but others include sensor data, crowdsourced human computation, and human trajectory inputs for inverse reinforcement learning. However, eliciting accurate data can be costly, either due to the effort invested in obtaining it, as in crowdsourcing, or due to the need to maintain automated systems, as in distributed sensor systems. Low-quality data not only degrades the performance of AI systems, but may also pose safety concerns. It therefore becomes important to verify the correctness of data, to aggregate it intelligently, and to provide incentives that promote effort and high-quality data. During the recent workshop on Federated Learning at NeurIPS 2019, four of the six panel members mentioned incentives as the most important open issue.
This workshop aims to build an understanding of this aspect of Machine Learning, both theoretically and empirically. We particularly encourage contributions on the following questions:
- How can we collect high-quality and credible data for machine learning systems from self-interested and possibly malicious agents, taking into account the game-theoretic properties of the problem?
- How can we evaluate the quality of data supplied by self-interested and possibly malicious agents, and how can we optimally aggregate it?
- How can we use machine learning within game-theoretic mechanisms to facilitate the collection of high-quality data?
Sat 8:30 a.m. - 11:00 a.m. | Plenary session
Please refer to the detailed schedule on the linked page.
- Invited Talk: Follow the money, not the majority: Incentivizing and aggregating expert opinions with Bayesian markets
  For some questions, such as whether extraterrestrial life exists, it is uncertain if and when the answer will be known. Asking experts for their opinion yields two practical problems. First, how can truth-telling be incentivized if the correct answer is unknowable? Second, if experts disagree, who should be trusted? This paper solves both problems simultaneously. Experts decide whether to endorse a statement and trade an asset whose value depends on the endorsement rate. The respective payoffs of buyers and sellers indicate whom to trust. We demonstrate theoretically and illustrate empirically that "following the money" outperforms selecting the majority opinion.
  Aurelien Baillon
- Invited Talk: Strategic Considerations in Statistical Estimation and Learning
  Learning and estimation techniques draw insights about unknown underlying relations based on statistical properties of the observed data. Factors that can change the statistical properties of the observed data thus affect the conclusions drawn by these techniques. One such factor is the strategic nature of data holders. The strategic behavior of data holders can alter the observed data even when the underlying relations are unchanged, leading to inaccurate conclusions. The causes of this strategic behavior range from the cost of providing data to vested interests in the learning outcomes. In this talk, I will discuss a few attempts to account for the strategic behavior of data holders in statistical estimation and learning, which call for a more integrated approach to thinking about data and learning.
  Yiling Chen
- Invited Talk: What is my data worth? Towards a Principled and Practical Approach for Data Valuation
  People give massive amounts of their personal data to companies every day, and these data are used to generate tremendous business value. Some economists, politicians, and activists argue that people should be paid for their contributions, but the million-dollar question is: by how much? In this talk, I will present some recent work on data valuation. I will start by introducing a principled notion of data value and then present a suite of algorithms that we developed to compute data values efficiently. I will also discuss applications of our data valuation techniques to tasks beyond data pricing, such as detecting bad training data.
  Ruoxi Jia
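The abstract does not name the specific notion of data value. In much of the data-valuation literature the Shapley value is the canonical principled choice, so the following is a hedged illustration only: the names `monte_carlo_data_shapley` and `utility` are ours, and this is not necessarily one of the algorithms presented in the talk.

```python
import random

def monte_carlo_data_shapley(points, utility, num_permutations=200):
    """Monte Carlo estimate of per-point Shapley values (illustrative sketch).

    `points` is a list of training-point identifiers and `utility(subset)` is any
    caller-supplied function returning the performance of a model trained on
    that subset; both are placeholders, not an API from the talk.
    """
    values = {p: 0.0 for p in points}
    for _ in range(num_permutations):
        order = random.sample(points, len(points))  # random ordering of the data
        subset, prev_utility = [], utility([])
        for p in order:
            subset.append(p)
            u = utility(subset)
            values[p] += u - prev_utility  # marginal contribution of point p
            prev_utility = u
    return {p: v / num_permutations for p, v in values.items()}
```

Each point's value is its average marginal contribution to model performance over random orderings; the efficient algorithms the abstract refers to presumably aim to avoid the many model retrainings this naive estimate requires.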
- Invited Talk: Dominantly Truthful Multi-task Peer Prediction with a Constant Number of Tasks
  In the setting where participants are asked multiple similar, possibly subjective, multi-choice questions (e.g., Do you like Panda Express? Y/N; Do you like Chick-fil-A? Y/N), a series of peer prediction mechanisms have been designed to incentivize honest reports, and some of them achieve dominant truthfulness: truth-telling is a dominant strategy and strictly dominates every other "non-permutation strategy" under some mild conditions. However, a major issue hinders the practical use of those mechanisms: they require the participants to perform an infinite number of tasks. When the participants perform a finite number of tasks, these mechanisms only achieve approximate dominant truthfulness. Whether a dominantly truthful multi-task peer prediction mechanism exists that requires only a finite number of tasks remained an open question, possibly with a negative answer, even with full prior knowledge. This work answers the open question by proposing a new mechanism, the Determinant-based Mutual Information Mechanism (DMI-Mechanism), which is dominantly truthful when the number of tasks is at least 2C, where C is the number of choices for each question (C = 2 for binary-choice questions). DMI-Mechanism also pays truth-telling at least as much as any other strategy profile and strictly more than uninformative strategy profiles (informed truthfulness). In addition to these truthfulness properties, DMI-Mechanism is easy to implement: it does not require any prior knowledge (it is detail-free) and works with as few as two participants. The core of DMI-Mechanism is a novel information measure, Determinant-based Mutual Information (DMI). DMI generalizes Shannon's mutual information, and the square of DMI has a simple unbiased estimator. Besides incentivizing honest reports, DMI-Mechanism can also be turned into an information evaluation rule that identifies high-quality information without verification when there are at least three participants.
  Yuqing Kong
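The determinant-based payment at the core of the mechanism can be made concrete. Below is a minimal, non-authoritative sketch in the spirit of the published DMI-Mechanism, assuming the payment between a pair of agents is the product of determinants of report-count matrices built on two disjoint halves of the tasks; the function name and the omitted scaling constant are ours.

```python
import numpy as np

def dmi_pairwise_payment(reports_i, reports_j, num_choices):
    """Illustrative determinant-based payment between two agents.

    reports_i, reports_j: answers (0..num_choices-1) given by the two agents to
    the same tasks. The tasks are split into two disjoint halves, a C x C count
    matrix is built for each half, and the payment is the product of the two
    determinants, an unbiased estimator of a scaled DMI^2 of the reports.
    """
    assert len(reports_i) == len(reports_j) >= 2 * num_choices
    half = len(reports_i) // 2

    def count_matrix(ans_i, ans_j):
        m = np.zeros((num_choices, num_choices))
        for a, b in zip(ans_i, ans_j):
            m[a, b] += 1  # joint count of (agent i's answer, agent j's answer)
        return m

    m1 = count_matrix(reports_i[:half], reports_j[:half])
    m2 = count_matrix(reports_i[half:], reports_j[half:])
    return np.linalg.det(m1) * np.linalg.det(m2)
```

An agent's total reward would then aggregate such pairwise terms against peers; under the mechanism's conditions, truthful reporting is intended to maximize the expected value of each term.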
- Invited Talk: Thwarting Dr. Deceit's Malicious Activities in Conference Peer Review
  Peer review is an essential part of scientific research and has a considerable influence on researchers' careers. Enter Dr. Deceit, who, by various dishonest means, tries to game the peer review system (yes, this does happen in reality). Our goal is to thwart Dr. Deceit's malicious activities.
  Dr. Deceit: As a reviewer, I will manipulate the scores or rankings of the papers that I review in order to increase the chances of my own paper getting accepted. Ha ha ha!
  Us: We will use an impartial mechanism, e.g., a partition-based method, which guarantees that a reviewer cannot influence their own paper's outcome. We show via an analysis of ICLR data that such a mechanism is feasible in conference peer review, despite the complexity and constraints of the conference peer-review process.
  Dr. Deceit: But using such a mechanism reduces the efficiency of the process. So if there is no deceitful reviewer like me in the conference, the mechanism will hurt the efficiency of the peer review. Would you really want to use it then?
  Us: We can help make that decision: we design statistical tests to detect the existence of such strategic behavior in peer assessment.
  Dr. Deceit: Ok, so you will stop me from manipulating my reviews to help my own paper. But I will strike a quid pro quo deal with another potential reviewer for my paper: the reviewer will try to get to review my paper and give a positive review, and in exchange I'll do the same for them in another conference. Your impartial mechanisms can't do anything about this.
  Us: We also design randomized reviewer-assignment algorithms which optimally mitigate such arbitrary reviewer-author collusions. Evaluations on data from four conferences show their promise for use in practice.
  Dr. Deceit: Fine. I will recruit not one, but multiple such reviewers.
  Us: Hmm... then we get into computational-hardness land. But there is probably some structure to your colluders (e.g., colluding reviewers are at the same institution). Then we have optimal mitigating strategies computable in polynomial time. Keep trying in vain, Dr. Deceit!
  Throughout the talk, Dr. Deceit will also throw at us some more important challenges whose solutions are yet unknown.
  Nihar Shah
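The partition-based idea mentioned by "Us" can be sketched in a few lines. This is a deliberately simplified toy version under our own assumptions (names and edge-case handling are ours), not the exact method analyzed in the talk.

```python
import random

def partition_based_assignment(papers, reviewers, authors_of):
    """Toy sketch of a partition-based impartial mechanism: reviewers are split
    randomly into two groups, and each paper is reviewed only by a group that
    contains none of its authors, so no reviewer can influence the decision on
    their own paper.
    """
    shuffled = random.sample(reviewers, len(reviewers))
    group_a = set(shuffled[: len(shuffled) // 2])
    group_b = set(shuffled[len(shuffled) // 2:])

    assignment = {}
    for paper in papers:
        authors = set(authors_of.get(paper, []))
        if not authors & group_a:
            assignment[paper] = group_a   # safe: no author of this paper in group A
        elif not authors & group_b:
            assignment[paper] = group_b   # safe: no author of this paper in group B
        else:
            assignment[paper] = None      # authors span both groups; a real
                                          # mechanism must handle this case
    return assignment
```

Splitting reviewers into two groups roughly halves the pool available to each paper, which is exactly the efficiency cost Dr. Deceit raises in the next exchange.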
- Invited Talk: Incentive-Compatible Forecasting Competitions
  We consider the design of forecasting competitions in which multiple forecasters make predictions about one or more independent events and compete for a single prize. We have two objectives: (1) to award the prize to the most accurate forecaster, and (2) to incentivize forecasters to report truthfully, so that forecasts are informative and forecasters need not spend any cognitive effort strategizing about reports. Proper scoring rules incentivize truthful reporting if all forecasters are paid according to their scores. However, incentives become distorted if only the best-scoring forecaster wins a prize, since forecasters can often increase their probability of having the highest score by reporting extreme beliefs. In this paper, we introduce a truthful forecaster selection mechanism. We lower-bound the probability that our mechanism selects the most accurate forecaster, and give rates for how quickly this bound approaches 1 as the number of events grows. Our techniques can be generalized to the related problems of outputting a ranking over forecasters and hiring a forecaster with high accuracy on future events.
  Jens Witkowski
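To see the distortion described in the abstract, consider a single event with true probability 0.6, a truthful forecaster reporting 0.6, and a rival reporting an extreme 0.99. The toy simulation below (our own illustration, not from the paper) shows that the extreme report obtains the better quadratic score about 60% of the time, so it wins a winner-take-all prize more often even though it is worse in expected score.

```python
import random

TRUE_P = 0.6               # true probability of the event
TRUTHFUL, EXTREME = 0.6, 0.99

def quadratic_score(report, outcome):
    # Higher is better; this is the (negative) quadratic loss of the report.
    return -(report - outcome) ** 2

wins, trials = 0, 100_000
for _ in range(trials):
    outcome = 1 if random.random() < TRUE_P else 0
    if quadratic_score(EXTREME, outcome) > quadratic_score(TRUTHFUL, outcome):
        wins += 1

print(f"Extreme report outscores the truthful one in {wins / trials:.0%} of events")
# Roughly 60%: the extreme forecaster is more likely to win the single prize,
# which is the incentive distortion the selection mechanism above addresses.
```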
- Contributed Talk: Incentive-Aware PAC Learning
  Hanrui Zhang · Vincent Conitzer
- Contributed Talk: Loss Functions, Axioms, and Peer Review
  Ritesh Noothigattu · Nihar Shah · Ariel Procaccia
- Contributed Talk: Incentives for Federated Learning: a Hypothesis Elicitation Approach
  Yang Liu · Jiaheng Wei
- Contributed Talk: Linear Models are Robust Optimal Under Strategic Behavior
  Wei Tang · Chien-Ju Ho · Yang Liu
- Contributed Talk: Classification with Strategically Withheld Data
  Irving Rein · Hanrui Zhang · Vincent Conitzer
- Contributed Talk: Incentivizing Bandit Exploration: Recommendations as Instruments
  Dung Ngo · Logan Stapleton · Vasilis Syrgkanis · Steven Wu
- Contributed Talk: Causal Feature Discovery through Strategic Modification
  Yahav Bechavod · Steven Wu · Juba Ziani
- Contributed Talk: Incentivizing and Rewarding High-Quality Data via Influence Functions
  Adam Richardson · Boi Faltings
- Contributed Talk: Bridging Truthfulness and Corruption-Robustness in Multi-Armed Bandit Mechanisms
  Jacob Abernethy · Bhuvesh Kumar · Thodoris Lykouris · Yinglun Xu
- Contributed Talk: Multi-Principal Assistance Games
  Arnaud Fickinger · Stuart Russell
- Contributed Talk: Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments
  Steven Jecmen · Hanrui Zhang · Ryan Liu · Nihar Shah · Vincent Conitzer
- Contributed Talk: Debiasing Evaluations That are Biased by Evaluations
  Jingyan Wang · Nihar Shah
- Contributed Talk: Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment
  Ivan Stelmakh · Nihar Shah · Aarti Singh
- Contributed Talk: From Predictions to Decisions: Using Lookahead Regularization
  Nir Rosenfeld · Sai Srivatsa Ravindranath · David Parkes
- Contributed Talk: Classification with Few Tests through Self-Selection
  Hanrui Zhang · Yu Cheng · Vincent Conitzer
Author Information
Boi Faltings (EPFL)
Yang Liu (UC Santa Cruz)
David Parkes (Harvard University)
Goran Radanovic (Max Planck Institute for Software Systems)
Dawn Song (University of California, Berkeley)