Poster in Workshop: The Many Facets of Preference-Based Learning
Who to imitate: Imitating desired behavior from diverse multi-agent datasets
Tim Franzmeyer · Jakob Foerster · Edith Elkind · Phil Torr · Joao Henriques
AI agents are commonly trained on large datasets of unfiltered demonstrations of human behavior. However, not all behaviors are equally safe or desirable. We assume that the desired traits for an AI agent can be approximated by a desired value function (DVF) that assigns scores to collective outcomes in the dataset. For example, in a dataset of vehicle interactions, the DVF might measure the number of incidents that occur. We propose to first assess how well each individual agent's behavior aligns with the DVF, e.g., how likely an agent is to cause an incident, and then imitate only the agents whose behavior is desirable. To identify such agents, we introduce the concept of an agent's Exchange Value, which quantifies the expected change in collective value when the agent is substituted into a random group. This concept is similar to Shapley Values used in economics, but offers greater flexibility. We further introduce a variance-maximization objective for computing Exchange Values from incomplete observations, which effectively clusters agents by their unobserved traits. Using both human and simulated datasets, we learn aligned imitation policies that outperform relevant baselines.
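To make the Exchange Value concrete, below is a minimal Monte-Carlo sketch of how it could be estimated when the DVF can be evaluated on sampled groups. This is an illustration based only on the abstract's definition, not the paper's method: the names `value_fn`, `group_size`, and `num_samples` are hypothetical, and the paper's variance-maximization objective for handling incomplete observations is not reproduced here.

```python
import random
from statistics import mean
from typing import Callable, Optional, Sequence, Set

def exchange_value(
    agent: int,
    population: Sequence[int],
    value_fn: Callable[[Set[int]], float],  # desired value function (DVF) on a group
    group_size: int = 4,
    num_samples: int = 1000,
    rng: Optional[random.Random] = None,
) -> float:
    """Monte-Carlo estimate of an agent's Exchange Value: the expected
    change in collective value when the agent is substituted for a
    random member of a random group drawn from the population."""
    rng = rng or random.Random(0)
    others = [a for a in population if a != agent]
    diffs = []
    for _ in range(num_samples):
        group = rng.sample(others, group_size)   # random group not containing `agent`
        swapped = group.copy()
        swapped[rng.randrange(group_size)] = agent  # substitute `agent` for a random member
        diffs.append(value_fn(set(swapped)) - value_fn(set(group)))
    return mean(diffs)
```

Under this reading, agents with a positive Exchange Value tend to raise a group's collective value (e.g., fewer incidents) when swapped in, and would be the ones selected for imitation.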