Autonomous agents deployed in the real world need to be robust against adversarial attacks on sensory inputs. Robustifying agent policies requires anticipating the strongest attacks possible. We demonstrate that existing observation-space attacks on reinforcement learning agents have a common weakness: while effective, they lack temporal consistency, which makes them detectable using automated means or human inspection. Detectability is undesirable to adversaries as it may trigger security escalations. We introduce perfect illusory attacks, a novel form of adversarial attack on sequential decision-makers that is both effective and provably statistically undetectable. We then propose the more versatile E-illusory attacks, which result in observation transitions that are consistent with the state-transition function of the adversary-free environment and can be learned end-to-end. Compared to existing attacks, we empirically find E-illusory attacks to be significantly harder to detect with automated methods, and a small study with human subjects (IRB approval under reference xxxxxx/xxxxx) suggests they are similarly harder to detect for humans. We propose that undetectability should be a central concern in the study of adversarial attacks on mixed-autonomy settings.
Author Information
Tim Franzmeyer (Oxford University)
Stephen Mcaleer (UC Irvine)
Joao Henriques (University of Oxford)
Jakob Foerster (Oxford University)
Jakob Foerster started as an Associate Professor at the Department of Engineering Science at the University of Oxford in the fall of 2021. During his PhD at Oxford he helped bring deep multi-agent reinforcement learning to the forefront of AI research and interned at Google Brain, OpenAI, and DeepMind. After his PhD he worked as a research scientist at Facebook AI Research in California, where he continued doing foundational work. He was the lead organizer of the first Emergent Communication workshop at NeurIPS in 2017, which he has helped organize ever since, and he was awarded a prestigious CIFAR AI chair in 2019. His past work addresses how AI agents can learn to cooperate and communicate with other agents. Most recently he has been developing and addressing the zero-shot coordination problem setting, a crucial step towards human-AI coordination.
Phil Torr (Oxford)
Adel Bibi (University of Oxford)
Christian Schroeder (University of Oxford)
Related Events (a corresponding poster, oral, or spotlight)
-
2023 : Illusory Attacks: Detectability Matters in Adversarial Attacks on Sequential Decision-Makers »
More from the Same Authors
-
2021 : Combating Adversaries with Anti-Adversaries »
Motasem Alfarra · Juan C Perez · Ali Thabet · Adel Bibi · Phil Torr · Bernard Ghanem -
2021 : Detecting and Quantifying Malicious Activity with Simulation-based Inference »
Andrew Gambardella · Naeemullah Khan · Phil Torr · Atilim Gunes Baydin -
2022 : Adversarial Cheap Talk »
Christopher Lu · Timon Willi · Alistair Letcher · Jakob Foerster -
2022 : Make Some Noise: Reliable and Efficient Single-Step Adversarial Training »
Pau de Jorge Aranda · Adel Bibi · Riccardo Volpi · Amartya Sanyal · Phil Torr · Gregory Rogez · Puneet Dokania -
2022 : Catastrophic overfitting is a bug but also a feature »
Guillermo Ortiz Jimenez · Pau de Jorge Aranda · Amartya Sanyal · Adel Bibi · Puneet Dokania · Pascal Frossard · Gregory Rogez · Phil Torr -
2022 : Illusionary Attacks on Sequential Decision Makers and Countermeasures »
Tim Franzmeyer · Joao Henriques · Jakob Foerster · Phil Torr · Adel Bibi · Christian Schroeder -
2022 : How robust are pre-trained models to distribution shift? »
Yuge Shi · Imant Daunhawer · Julia Vogt · Phil Torr · Amartya Sanyal -
2022 : Discovered Policy Optimisation »
Christopher Lu · Jakub Grudzien Kuba · Alistair Letcher · Luke Metz · Christian Schroeder · Jakob Foerster -
2023 : Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations »
Yongyuan Liang · Yanchao Sun · Ruijie Zheng · Xiangyu Liu · Tuomas Sandholm · Furong Huang · Stephen Mcaleer -
2023 : Certified Calibration: Bounding Worst-Case Calibration under Adversarial Attacks »
Cornelius Emde · Francesco Pinto · Thomas Lukasiewicz · Phil Torr · Adel Bibi -
2023 : Certifying Ensembles: A General Certification Theory with S-Lipschitzness »
Aleksandar Petrov · Francisco Eiras · Amartya Sanyal · Phil Torr · Adel Bibi -
2023 : Analyzing the Sample Complexity of Model-Free Opponent Shaping »
Kitty Fung · Qizhen Zhang · Christopher Lu · Timon Willi · Jakob Foerster -
2023 : Structured State Space Models for In-Context Reinforcement Learning »
Christopher Lu · Yannick Schroecker · Albert Gu · Emilio Parisotto · Jakob Foerster · Satinder Singh · Feryal Behbahani -
2023 : Language Models can Solve Computer Tasks »
Geunwoo Kim · Pierre Baldi · Stephen Mcaleer -
2023 : Language Model Tokenizers Introduce Unfairness Between Languages »
Aleksandar Petrov · Emanuele La Malfa · Phil Torr · Adel Bibi -
2023 : Extracting Reward Functions from Diffusion Models »
Felipe Nuti · Tim Franzmeyer · Joao Henriques -
2023 : Who to imitate: Imitating desired behavior from diverse multi-agent datasets »
Tim Franzmeyer · Jakob Foerster · Edith Elkind · Phil Torr · Joao Henriques -
2023 : Provably Correct Physics-Informed Neural Networks »
Francisco Girbal Eiras · Adel Bibi · Rudy Bunel · Krishnamurthy Dvijotham · Phil Torr · M. Pawan Kumar -
2023 Poster: MANSA: Learning Fast and Slow in Multi-Agent Systems »
David Mguni · Haojun Chen · Taher Jafferjee · Jianhong Wang · Longfei Yue · Xidong Feng · Stephen Mcaleer · Feifei Tong · Jun Wang · Yaodong Yang -
2023 Poster: Regret-Minimizing Double Oracle for Extensive-Form Games »
Xiaohang Tang · Le Cong Dinh · Stephen Mcaleer · Yaodong Yang -
2023 Poster: A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems »
Oliver Slumbers · David Mguni · Stefano Blumberg · Stephen Mcaleer · Yaodong Yang · Jun Wang -
2023 Poster: Learning Intuitive Policies Using Action Features »
Mingwei Ma · Jizhou Liu · Samuel Sokota · Max Kleiman-Weiner · Jakob Foerster -
2023 Poster: Graph Inductive Biases in Transformers without Message Passing »
Liheng Ma · Chen Lin · Derek Lim · Adriana Romero Soriano · Puneet Dokania · Mark Coates · Phil Torr · Ser Nam Lim -
2023 Poster: Certifying Ensembles: A General Certification Theory with S-Lipschitzness »
Aleksandar Petrov · Francisco Eiras · Amartya Sanyal · Phil Torr · Adel Bibi -
2023 Poster: Adversarial Cheap Talk »
Christopher Lu · Timon Willi · Alistair Letcher · Jakob Foerster -
2022 : Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS »
Christian Schroeder · Yongchao Huang · Phil Torr · Martin Strohmeier -
2022 Poster: Adversarial Masking for Self-Supervised Learning »
Yuge Shi · Siddharth N · Phil Torr · Adam Kosiorek -
2022 Poster: Evolving Curricula with Regret-Based Environment Design »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2022 Poster: COLA: Consistent Learning with Opponent-Learning Awareness »
Timon Willi · Alistair Letcher · Johannes Treutlein · Jakob Foerster -
2022 Poster: Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks »
Litian Liang · Yaosheng Xu · Stephen Mcaleer · Dailin Hu · Alexander Ihler · Pieter Abbeel · Roy Fox -
2022 Spotlight: Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks »
Litian Liang · Yaosheng Xu · Stephen Mcaleer · Dailin Hu · Alexander Ihler · Pieter Abbeel · Roy Fox -
2022 Spotlight: Evolving Curricula with Regret-Based Environment Design »
Jack Parker-Holder · Minqi Jiang · Michael Dennis · Mikayel Samvelyan · Jakob Foerster · Edward Grefenstette · Tim Rocktäschel -
2022 Spotlight: Adversarial Masking for Self-Supervised Learning »
Yuge Shi · Siddharth N · Phil Torr · Adam Kosiorek -
2022 Spotlight: COLA: Consistent Learning with Opponent-Learning Awareness »
Timon Willi · Alistair Letcher · Johannes Treutlein · Jakob Foerster -
2022 Poster: Communicating via Markov Decision Processes »
Samuel Sokota · Christian Schroeder · Maximilian Igl · Luisa Zintgraf · Phil Torr · Martin Strohmeier · Zico Kolter · Shimon Whiteson · Jakob Foerster -
2022 Spotlight: Communicating via Markov Decision Processes »
Samuel Sokota · Christian Schroeder · Maximilian Igl · Luisa Zintgraf · Phil Torr · Martin Strohmeier · Zico Kolter · Shimon Whiteson · Jakob Foerster -
2022 Poster: Model-Free Opponent Shaping »
Christopher Lu · Timon Willi · Christian Schroeder de Witt · Jakob Foerster -
2022 Poster: Mirror Learning: A Unifying Framework of Policy Optimisation »
Jakub Grudzien Kuba · Christian Schroeder de Witt · Jakob Foerster -
2022 Poster: Generalized Beliefs for Cooperative AI »
Darius Muglich · Luisa Zintgraf · Christian Schroeder de Witt · Shimon Whiteson · Jakob Foerster -
2022 Spotlight: Generalized Beliefs for Cooperative AI »
Darius Muglich · Luisa Zintgraf · Christian Schroeder de Witt · Shimon Whiteson · Jakob Foerster -
2022 Spotlight: Model-Free Opponent Shaping »
Christopher Lu · Timon Willi · Christian Schroeder de Witt · Jakob Foerster -
2022 Spotlight: Mirror Learning: A Unifying Framework of Policy Optimisation »
Jakub Grudzien Kuba · Christian Schroeder de Witt · Jakob Foerster -
2021 Poster: Off-Belief Learning »
Hengyuan Hu · Adam Lerer · Brandon Cui · Luis Pineda · Noam Brown · Jakob Foerster -
2021 Spotlight: Off-Belief Learning »
Hengyuan Hu · Adam Lerer · Brandon Cui · Luis Pineda · Noam Brown · Jakob Foerster -
2021 Poster: Trajectory Diversity for Zero-Shot Coordination »
Andrei Lupu · Brandon Cui · Hengyuan Hu · Jakob Foerster -
2021 Spotlight: Trajectory Diversity for Zero-Shot Coordination »
Andrei Lupu · Brandon Cui · Hengyuan Hu · Jakob Foerster -
2021 Poster: A New Formalism, Method and Open Issues for Zero-Shot Coordination »
Johannes Treutlein · Michael Dennis · Caspar Oesterheld · Jakob Foerster -
2021 Spotlight: A New Formalism, Method and Open Issues for Zero-Shot Coordination »
Johannes Treutlein · Michael Dennis · Caspar Oesterheld · Jakob Foerster -
2020 Poster: “Other-Play” for Zero-Shot Coordination »
Hengyuan Hu · Alexander Peysakhovich · Adam Lerer · Jakob Foerster -
2020 Poster: Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination »
Somdeb Majumdar · Shauharda Khadka · Santiago Miret · Stephen Mcaleer · Kagan Tumer -
2019 Poster: Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Francis Song · Edward Hughes · Neil Burch · Iain Dunning · Shimon Whiteson · Matthew Botvinick · Michael Bowling -
2019 Oral: Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Francis Song · Edward Hughes · Neil Burch · Iain Dunning · Shimon Whiteson · Matthew Botvinick · Michael Bowling -
2019 Poster: A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs »
Jingkai Mao · Jakob Foerster · Tim Rocktäschel · Maruan Al-Shedivat · Gregory Farquhar · Shimon Whiteson -
2019 Oral: A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs »
Jingkai Mao · Jakob Foerster · Tim Rocktäschel · Maruan Al-Shedivat · Gregory Farquhar · Shimon Whiteson -
2018 Poster: The Mechanics of n-Player Differentiable Games »
David Balduzzi · Sebastien Racaniere · James Martens · Jakob Foerster · Karl Tuyls · Thore Graepel -
2018 Poster: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning »
Tabish Rashid · Mikayel Samvelyan · Christian Schroeder · Gregory Farquhar · Jakob Foerster · Shimon Whiteson -
2018 Oral: The Mechanics of n-Player Differentiable Games »
David Balduzzi · Sebastien Racaniere · James Martens · Jakob Foerster · Karl Tuyls · Thore Graepel -
2018 Oral: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning »
Tabish Rashid · Mikayel Samvelyan · Christian Schroeder · Gregory Farquhar · Jakob Foerster · Shimon Whiteson -
2018 Poster: DiCE: The Infinitely Differentiable Monte Carlo Estimator »
Jakob Foerster · Gregory Farquhar · Maruan Al-Shedivat · Tim Rocktäschel · Eric Xing · Shimon Whiteson -
2018 Oral: DiCE: The Infinitely Differentiable Monte Carlo Estimator »
Jakob Foerster · Gregory Farquhar · Maruan Al-Shedivat · Tim Rocktäschel · Eric Xing · Shimon Whiteson -
2017 Poster: Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Nantas Nardelli · Gregory Farquhar · Triantafyllos Afouras · Phil Torr · Pushmeet Kohli · Shimon Whiteson -
2017 Talk: Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Nantas Nardelli · Gregory Farquhar · Triantafyllos Afouras · Phil Torr · Pushmeet Kohli · Shimon Whiteson -
2017 Poster: Input Switched Affine Networks: An RNN Architecture Designed for Interpretability »
Jakob Foerster · Justin Gilmer · Jan Chorowski · Jascha Sohl-Dickstein · David Sussillo -
2017 Talk: Input Switched Affine Networks: An RNN Architecture Designed for Interpretability »
Jakob Foerster · Justin Gilmer · Jan Chorowski · Jascha Sohl-Dickstein · David Sussillo -
2017 Poster: Warped Convolutions: Efficient Invariance to Spatial Transformations »
Joao Henriques · Andrea Vedaldi -
2017 Talk: Warped Convolutions: Efficient Invariance to Spatial Transformations »
Joao Henriques · Andrea Vedaldi