Humans performing tasks that involve taking a series of dependent actions over time often learn from experience by reflecting on specific cases and points in time where different actions could have led to significantly better outcomes. While recent machine learning methods to retrospectively analyze sequential decision-making processes promise to aid decision makers in identifying such cases, they have focused on environments with finitely many discrete states. However, in many practical applications, the state of the environment is inherently continuous. In this paper, we aim to fill this gap. We start by formally characterizing a sequence of discrete actions and continuous states using finite-horizon Markov decision processes and a broad class of bijective structural causal models. Building upon this characterization, we formalize the problem of finding counterfactually optimal action sequences and show that, in general, we cannot expect to solve it in polynomial time. Then, we develop a search method based on the A* algorithm that, under a natural form of Lipschitz continuity of the environment’s dynamics, is guaranteed to return the optimal solution to the problem. Experiments on real clinical data show that our method is very efficient in practice, and it has the potential to offer interesting insights for sequential decision-making tasks.
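As a rough illustration of the kind of search the abstract describes, the following is a minimal sketch of A*-style best-first search over action sequences in a toy finite-horizon problem with a continuous (scalar) state. It is not the authors' method: the toy dynamics `step`, the reward `reward`, the fixed `noise` values, the limit `k` on how many observed actions may be changed, and the function name `astar_best_sequence` are all assumptions made for illustration. In particular, the heuristic here is the crude upper bound "remaining steps times `r_max`", whereas the paper derives its guarantee from Lipschitz continuity of the dynamics.

```python
import heapq
from itertools import count


def astar_best_sequence(s0, observed, actions, step, reward, r_max, k):
    """Best-first (A*-style) search for a reward-maximizing action sequence
    that differs from `observed` in at most k positions.

    Assumes reward(s, a) <= r_max for all s, a, so that the heuristic
    (remaining steps * r_max) never underestimates the achievable reward.
    """
    T = len(observed)
    tie = count()  # unique tie-breaker so heap entries never compare states
    # Heap entries: (-(g + h), tie, g, t, state, changes, sequence)
    frontier = [(-(T * r_max), next(tie), 0.0, 0, s0, 0, ())]
    while frontier:
        _, _, g, t, s, changes, seq = heapq.heappop(frontier)
        if t == T:
            # First complete sequence popped is optimal under the bound above.
            return g, list(seq)
        for a in actions:
            c = changes + (a != observed[t])
            if c > k:
                continue  # too many deviations from the observed actions
            g2 = g + reward(s, a)
            h2 = (T - t - 1) * r_max  # optimistic bound on remaining reward
            heapq.heappush(
                frontier,
                (-(g2 + h2), next(tie), g2, t + 1, step(s, a, t), c, seq + (a,)),
            )
    return None


# Hypothetical toy environment: the noise is held fixed, standing in for
# noise values inferred from an observed trajectory.
noise = [0.1, -0.2, 0.05, 0.0]


def step(s, a, t):
    return 0.9 * s + a + noise[t]


def reward(s, a):
    return -abs(s - 1.0) - 0.1 * a  # always <= 0, so r_max = 0.0 is valid


observed = [1, 1, 1, 1]
best = astar_best_sequence(0.0, observed, [0, 1], step, reward, r_max=0.0, k=2)
print(best)
```

Holding the noise fixed makes each alternative action sequence lead to a single deterministic trajectory, which is what allows a search over sequences in this sketch. A tighter heuristic would prune more of the search tree, at the cost of more work per node.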
Author Information
Stratis Tsirtsis (Max Planck Institute for Software Systems)
Stratis Tsirtsis is a Ph.D. candidate at the Max Planck Institute for Software Systems. He is interested in building machine learning systems to inform decisions about individuals who exhibit strategic behavior.
Manuel Gomez-Rodriguez (MPI-SWS)

Manuel Gomez Rodriguez is a faculty member at the Max Planck Institute for Software Systems. Manuel develops human-centric machine learning models and algorithms for the analysis, modeling and control of social, information and networked systems. He has received several recognitions for his research, including an outstanding paper award at NeurIPS’13 and best research paper honorable mentions at KDD’10 and WWW’17. He has served as track chair for FAT* 2020 and as area chair for every major conference in machine learning, data mining and the Web. Manuel has co-authored over 50 publications in top-tier conferences (NeurIPS, ICML, WWW, KDD, WSDM, AAAI) and journals (PNAS, Nature Communications, JMLR, PLOS Computational Biology). Manuel holds a BS in Electrical Engineering from Carlos III University and an MS and PhD in Electrical Engineering from Stanford University, and has received postdoctoral training at the Max Planck Institute for Intelligent Systems.
More from the Same Authors
- 2021: Learning to Switch Among Agents in a Team
  Manuel Gomez-Rodriguez · Vahid Balazadeh Meresht
- 2021: Counterfactual Explanations in Sequential Decision Making Under Uncertainty
  Stratis Tsirtsis · Abir De · Manuel Gomez-Rodriguez
- 2021: Differentiable Learning Under Triage
  Nastaran Okati · Abir De · Manuel Gomez-Rodriguez
- 2023: Designing Decision Support Systems Using Counterfactual Prediction Sets
  Eleni Straitouri · Manuel Gomez-Rodriguez
- 2023: Human-Aligned Calibration for AI-Assisted Decision Making
  Nina Corvelo Benz · Manuel Gomez-Rodriguez
- 2023 Workshop: “Could it have been different?” Counterfactuals in Minds and Machines
  Nina Corvelo Benz · Ricardo Dominguez-Olmedo · Manuel Gomez-Rodriguez · Thorsten Joachims · Amir-Hossein Karimi · Stratis Tsirtsis · Isabel Valera · Sarah A Wu
- 2023 Poster: Improving Expert Predictions with Conformal Prediction
  Eleni Straitouri · Luke Lequn Wang · Nastaran Okati · Manuel Gomez-Rodriguez
- 2023 Poster: On the Within-Group Fairness of Screening Classifiers
  Nastaran Okati · Stratis Tsirtsis · Manuel Gomez-Rodriguez
- 2022 Poster: Improving Screening Processes via Calibrated Subset Selection
  Luke Lequn Wang · Thorsten Joachims · Manuel Gomez-Rodriguez
- 2022 Spotlight: Improving Screening Processes via Calibrated Subset Selection
  Luke Lequn Wang · Thorsten Joachims · Manuel Gomez-Rodriguez
- 2021: Differentiable learning Under Algorithmic Triage
  Manuel Gomez-Rodriguez
- 2021 Workshop: ICML Workshop on Algorithmic Recourse
  Stratis Tsirtsis · Amir-Hossein Karimi · Ana Lucic · Manuel Gomez-Rodriguez · Isabel Valera · Hima Lakkaraju
- 2018 Tutorial: Learning with Temporal Point Processes
  Manuel Gomez-Rodriguez · Isabel Valera