People adapt to changes in their visual field all the time, such as when their vision is occluded while driving. Agents trained with RL struggle to do the same. Here, we address how to transfer knowledge acquired in one domain to another when the domains differ in their state representation. For example, a policy may have been trained in an environment where states were represented as colored images, but we would now like to deploy this agent in a domain where images appear black-and-white. We propose Tail (task-agnostic imitation learning), a framework that learns to undo these kinds of changes between domains in order to achieve transfer. This enables an agent, regardless of the task it was trained for, to adapt to perceptual distortions by first mapping the states in the new domain, such as gray-scale images, back to the original domain where they appear in color, and then acting with the same policy. Our procedure relies on an optimal transport formulation between trajectories in the two domains, shows promise in simple experimental settings, and resembles algorithms from imitation learning.
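The map-back-then-act idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it swaps in generic entropy-regularized optimal transport (Sinkhorn iterations) followed by a barycentric projection, and it assumes a hand-picked cost that compares gray-scale frames against gray-scale versions of the color frames. The names `sinkhorn_plan`, `to_gray`, and `policy` are hypothetical and exist only for this example.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.05, n_iters=200):
    """Entropy-regularized OT plan between two uniform empirical measures."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform weights on target-domain states
    b = np.full(m, 1.0 / m)  # uniform weights on source-domain states
    K = np.exp(-cost / reg)  # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]


def to_gray(color_states):
    """Assumed distortion for this toy: average the three color channels."""
    return color_states.mean(axis=-1)


def policy(color_state):
    """Hypothetical policy that needs color: thresholds the mean red channel."""
    return int(color_state[..., 0].mean() > 0.5)


rng = np.random.default_rng(0)
n = 6

# Source domain: color frames the policy was trained on.
source_states = rng.random((n, 8, 8, 3))

# Target domain: the same frames in gray-scale, shuffled so the
# correspondence between the two domains is not given by the index.
perm = rng.permutation(n)
target_states = to_gray(source_states)[perm]

# Assumed cost: squared distance between each gray target frame and the
# gray version of each source frame.
tgt = target_states.reshape(n, -1)
src_gray = to_gray(source_states).reshape(n, -1)
cost = ((tgt[:, None, :] - src_gray[None, :, :]) ** 2).sum(axis=-1)

plan = sinkhorn_plan(cost)

# Barycentric projection: send each gray frame to a plan-weighted average
# of color frames, i.e. map it back to the original domain.
weights = plan / plan.sum(axis=1, keepdims=True)
mapped_states = (weights @ source_states.reshape(n, -1)).reshape(source_states.shape)

# Act with the unchanged policy on the recovered color frames.
actions = [policy(s) for s in mapped_states]
```

In this toy the cost is built from a known distortion, which a real transfer setting would not provide; the sketch only shows how a transport plan between two sets of states can induce a map back to the domain the policy was trained in.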
Author Information
Abhi Gupta (Massachusetts Institute of Technology)
Ted Moskovitz (Gatsby Computational Neuroscience Unit)
David Alvarez-Melis (MSR / Harvard)

I am currently a senior researcher at Microsoft Research New England. Starting in 2023, I will be joining Harvard University as an assistant professor of Computer Science and Applied Mathematics.
Aldo Pacchiano (Broad Institute)
More from the Same Authors
2021 : Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 : Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection »
Matteo Papini · Andrea Tirinzoni · Aldo Pacchiano · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 : Estimating Optimal Policy Value in Linear Contextual Bandits beyond Gaussianity »
Jonathan Lee · Weihao Kong · Aldo Pacchiano · Vidya Muthukumar · Emma Brunskill -
2021 : Meta Learning MDPs with linear transition models »
Robert Müller · Aldo Pacchiano · Jack Parker-Holder -
2021 : On the Theory of Reinforcement Learning with Once-per-Episode Feedback »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett · Michael Jordan -
2023 : Experiment Planning with Function Approximation »
Aldo Pacchiano · Jonathan Lee · Emma Brunskill -
2023 : Anytime Model Selection in Linear Bandits »
Parnian Kassraie · Aldo Pacchiano · Nicolas Emmenegger · Andreas Krause -
2023 : In-Context Decision-Making from Supervised Pretraining »
Jonathan Lee · Annie Xie · Aldo Pacchiano · Yash Chandak · Chelsea Finn · Ofir Nachum · Emma Brunskill -
2023 Poster: Leveraging Offline Data in Online Reinforcement Learning »
Andrew Wagenmaker · Aldo Pacchiano -
2023 Poster: ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs »
Ted Moskovitz · Brendan O'Donoghue · Vivek Veeriah · Sebastian Flennerhag · Satinder Singh · Tom Zahavy -
2023 Poster: InfoOT: Information Maximizing Optimal Transport »
Ching-Yao Chuang · Stefanie Jegelka · David Alvarez-Melis -
2023 Affinity Workshop: LatinX in AI (LXAI) Workshop »
Laura Montoya · Jose Gallego-Posada · Pablo Rivas · Vinicius Carida · Mateo Espinosa Zarlenga · Carlos Miranda · Andres Marquez · Ramesh Doddaiah · David Alvarez-Melis · Ivan Dario Arraut Guerrero · Mateo Guaman Castro · Ana Maria Quintero-Ossa · Fabian Latorre · Julio Hurtado · Jaime David Acevedo-Viloria · Miguel Felipe Arevalo-Castiblanco -
2022 Poster: Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback »
Tianyi Lin · Aldo Pacchiano · Yaodong Yu · Michael Jordan -
2022 Spotlight: Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback »
Tianyi Lin · Aldo Pacchiano · Yaodong Yu · Michael Jordan -
2021 : Invited Talk: David Alvarez-Melis. Comparing, Transforming, and Optimizing Datasets with Optimal Transport. »
David Alvarez-Melis -
2021 Poster: Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 Poster: Dynamic Balancing for Model Selection in Bandits and RL »
Ashok Cutkosky · Christoph Dann · Abhimanyu Das · Claudio Gentile · Aldo Pacchiano · Manish Purohit -
2021 Spotlight: Dynamic Balancing for Model Selection in Bandits and RL »
Ashok Cutkosky · Christoph Dann · Abhimanyu Das · Claudio Gentile · Aldo Pacchiano · Manish Purohit -
2021 Spotlight: Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 Poster: Dataset Dynamics via Gradient Flows in Probability Space »
David Alvarez-Melis · Nicolo Fusi -
2021 Spotlight: Dataset Dynamics via Gradient Flows in Probability Space »
David Alvarez-Melis · Nicolo Fusi -
2020 : 2.5 Geometric Dataset Distances via Optimal Transport »
David Alvarez-Melis -
2020 Poster: On Thompson Sampling with Langevin Algorithms »
Eric Mazumdar · Aldo Pacchiano · Yian Ma · Michael Jordan · Peter Bartlett -
2020 Poster: Accelerated Message Passing for Entropy-Regularized MAP Inference »
Jonathan Lee · Aldo Pacchiano · Peter Bartlett · Michael Jordan -
2020 Poster: Stochastic Flows and Geometric Optimization on the Orthogonal Group »
Krzysztof Choromanski · David Cheikhi · Jared Quincy Davis · Valerii Likhosherstov · Achille Nazaret · Achraf Bahamou · Xingyou Song · Mrugank Akarte · Jack Parker-Holder · Jacob Bergquist · Yuan Gao · Aldo Pacchiano · Tamas Sarlos · Adrian Weller · Vikas Sindhwani -
2020 Poster: Learning to Score Behaviors for Guided Policy Optimization »
Aldo Pacchiano · Jack Parker-Holder · Yunhao Tang · Krzysztof Choromanski · Anna Choromanska · Michael Jordan -
2020 Poster: Ready Policy One: World Building Through Active Learning »
Philip Ball · Jack Parker-Holder · Aldo Pacchiano · Krzysztof Choromanski · Stephen Roberts -
2019 Poster: Functional Transparency for Structured Data: a Game-Theoretic Approach »
Guang-He Lee · Wengong Jin · David Alvarez-Melis · Tommi Jaakkola -
2019 Oral: Functional Transparency for Structured Data: a Game-Theoretic Approach »
Guang-He Lee · Wengong Jin · David Alvarez-Melis · Tommi Jaakkola -
2019 Poster: Learning Generative Models across Incomparable Spaces »
Charlotte Bunne · David Alvarez-Melis · Andreas Krause · Stefanie Jegelka -
2019 Oral: Learning Generative Models across Incomparable Spaces »
Charlotte Bunne · David Alvarez-Melis · Andreas Krause · Stefanie Jegelka -
2019 Poster: Online learning with kernel losses »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett -
2019 Oral: Online learning with kernel losses »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett