Task-Optimal Exploration in Linear Dynamical Systems
Andrew Wagenmaker · Max Simchowitz · Kevin Jamieson

Wed Jul 21 06:00 PM -- 06:20 PM (PDT)

Exploration in unknown environments is a fundamental problem in reinforcement learning and control. In this work, we study task-guided exploration and determine what precisely an agent must learn about its environment in order to complete a particular task. Formally, we study a broad class of decision-making problems in the setting of linear dynamical systems, a class that includes the linear quadratic regulator problem. We provide instance- and task-dependent lower bounds which explicitly quantify the difficulty of completing a task of interest. Motivated by our lower bound, we propose a computationally efficient experiment-design based exploration algorithm. We show that it optimally explores the environment, collecting precisely the information needed to complete the task, and provide finite-time bounds guaranteeing that it achieves the instance- and task-optimal sample complexity, up to constant factors. Through several examples of the linear quadratic regulator problem, we show that performing task-guided exploration provably improves on exploration schemes that do not take the task of interest into account. Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal. We conclude with several experiments illustrating the effectiveness of our approach in practice.
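To make the phrase "certainty equivalence decision making" concrete, the following is a minimal sketch for a scalar linear quadratic regulator problem: estimate the dynamics by least squares, then design the controller as if the estimate were exact. It is not the authors' algorithm. In particular, it explores with i.i.d. Gaussian inputs rather than the task-optimal experiment design described above, and the system parameters, costs, horizon, and all function names are assumptions chosen for illustration.

```python
import random


def simulate(a, b, horizon, noise_std, rng):
    """Roll out x_{t+1} = a*x_t + b*u_t + w_t under i.i.d. Gaussian inputs.

    Assumption: naive random excitation, NOT the task-guided design.
    """
    xs, us = [0.0], []
    for _ in range(horizon):
        u = rng.gauss(0.0, 1.0)
        w = rng.gauss(0.0, noise_std)
        us.append(u)
        xs.append(a * xs[-1] + b * u + w)
    return xs, us


def least_squares(xs, us):
    """Estimate (a, b) by solving the 2x2 normal equations with Cramer's rule."""
    prev, nxt = xs[:-1], xs[1:]
    sxx = sum(x * x for x in prev)
    sxu = sum(x * u for x, u in zip(prev, us))
    suu = sum(u * u for u in us)
    sxy = sum(x * y for x, y in zip(prev, nxt))
    suy = sum(u * y for u, y in zip(us, nxt))
    det = sxx * suu - sxu * sxu
    a_hat = (sxy * suu - sxu * suy) / det
    b_hat = (sxx * suy - sxu * sxy) / det
    return a_hat, b_hat


def lqr_gain(a, b, q, r, iters=500):
    """Iterate the scalar discrete-time Riccati recursion; return k with u = -k*x."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


# Hypothetical ground-truth system and costs, chosen only for this example.
a_true, b_true, q, r = 0.9, 0.5, 1.0, 1.0
rng = random.Random(0)

xs, us = simulate(a_true, b_true, horizon=2000, noise_std=0.1, rng=rng)
a_hat, b_hat = least_squares(xs, us)

# Certainty equivalence: plug the estimates into the design as if exact.
k_hat = lqr_gain(a_hat, b_hat, q, r)
k_true = lqr_gain(a_true, b_true, q, r)

print(f"estimated (a, b) = ({a_hat:.3f}, {b_hat:.3f})")
print(f"certainty-equivalent gain = {k_hat:.3f}, optimal gain = {k_true:.3f}")
```

In this sketch the quality of the resulting controller depends only on how well the data identifies the parameters that matter for the gain, which is the intuition behind choosing exploration inputs with the task in mind.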

Author Information

Andrew Wagenmaker (University of Washington)
Max Simchowitz (UC Berkeley)
Kevin Jamieson (University of Washington)
