Poster in Workshop: 2nd Workshop on Models of Human Feedback for AI Alignment (MoFA)
Selective Preference Aggregation
Shreyas Kadekodi · Hayden McTavish · Berk Ustun
Abstract:
Many applications in machine learning and decision-making rely on procedures to aggregate the preferences of individuals -- from voting, to search, to alignment. In this paper, we introduce a paradigm for selective aggregation, where we can either abstain from comparison or arbitrate dissent. Given a dataset of individual preferences, we summarize collective preferences as a selective ranking -- a partial order that only allows comparisons between items on which at least a $1 - \tau$ fraction of individuals agree. We develop fast algorithms to construct selective rankings that achieve all possible trade-offs between comparability and dissent, paired with practical guarantees to ensure safety and reliability. We conduct extensive experiments to benchmark our approach on real-world datasets for ranking and learning. Our results demonstrate how selective rankings can promote transparency, robustness, and fairness by revealing disagreement and abstaining from arbitration.
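To make the agreement threshold concrete, here is a minimal sketch of a naive pairwise filter. It is not the algorithm from the paper, which the abstract does not describe. The function name `selective_comparisons`, the input format (each individual supplies a full strict ranking of the same items, best first), and the assumption $\tau < 0.5$ are all illustrative choices. The sketch keeps a comparison between two items only when at least a $1 - \tau$ fraction of individuals order them the same way, and abstains otherwise.

```python
from itertools import combinations


def selective_comparisons(rankings, tau):
    """Return the pairs (a, b) where at least a 1 - tau fraction prefers a to b.

    Assumptions (illustrative, not from the paper):
      - rankings is a non-empty list of full strict rankings, best item first,
        all over the same set of items;
      - 0 <= tau < 0.5, so at most one direction of a pair can qualify.
    """
    n = len(rankings)
    items = sorted(rankings[0])
    # Map each item to its position in every individual's ranking.
    positions = [{item: idx for idx, item in enumerate(r)} for r in rankings]

    kept = set()
    for a, b in combinations(items, 2):
        # Fraction of individuals who rank a above b.
        share_a = sum(1 for pos in positions if pos[a] < pos[b]) / n
        if share_a >= 1 - tau:
            kept.add((a, b))
        elif 1 - share_a >= 1 - tau:
            kept.add((b, a))
        # Otherwise abstain: dissent on this pair exceeds tau.
    return kept


# Hypothetical data: four individuals ranking three items.
votes = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "A", "C"],
]
unanimous = selective_comparisons(votes, tau=0.0)
tolerant = selective_comparisons(votes, tau=0.25)
```

With `tau=0.0` only the unanimous pair `("A", "C")` survives; raising `tau` to `0.25` also admits `("A", "B")` and `("B", "C")`, which illustrates the trade-off between comparability and dissent. A pairwise filter like this is not guaranteed to be transitive in general, so on its own it does not always yield a partial order.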