

Poster

Finite Time Logarithmic Regret Bounds for Self-Tuning Regulation

Rahul Singh · Akshay Mete · Avik Kar · P. R. Kumar

Hall C 4-9 #1805
Wed 24 Jul 4:30 a.m. PDT — 6 a.m. PDT

Abstract: We establish the first finite-time logarithmic regret bounds for the self-tuning regulation problem. We introduce a modified version of the certainty equivalence algorithm, which we call PIECE, that clips inputs in addition to using probing inputs for exploration. We show that its regret after $T$ time-steps is upper bounded by $C \log T$ for bounded noise and by $C \log^3 T$ for sub-Gaussian noise, in contrast to the LQ problem, where logarithmic regret has been shown to be impossible. The PIECE algorithm is also designed to address the critical challenge of poor initial transient performance of reinforcement learning algorithms for linear systems. Comparative simulation results illustrate the improved performance of PIECE.
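
To make the ingredients named in the abstract concrete, the following is a minimal sketch of a certainty-equivalence regulator with input clipping and a probing signal. It is not the PIECE algorithm as specified in the paper. The scalar system $y_{t+1} = a y_t + b u_t + w_{t+1}$, the recursive least-squares estimator, the $1/\sqrt{t}$ probing schedule, the choice to clip after adding the probe, and all names (`run_clipped_ce`, `clip_bound`, `probe_scale`) are assumptions made purely for illustration.

```python
import numpy as np


def clip_input(u, bound):
    """Saturate a control input to the interval [-bound, bound]."""
    return float(np.clip(u, -bound, bound))


def run_clipped_ce(a=0.9, b=1.0, horizon=2000, clip_bound=5.0,
                   probe_scale=1.0, seed=0):
    """Illustrative clipped certainty-equivalence loop (hypothetical, not PIECE).

    Simulates y[t+1] = a*y[t] + b*u[t] + w[t+1] with bounded noise and
    returns the final estimate of (a, b), the accumulated excess cost,
    and the applied inputs.
    """
    rng = np.random.default_rng(seed)
    theta = np.array([0.0, 0.5])   # assumed initial guess for (a, b)
    P = np.eye(2) * 100.0          # recursive least-squares covariance
    y = 0.0
    regret = 0.0
    inputs = []

    for t in range(1, horizon + 1):
        a_hat, b_hat = theta
        # Keep the estimated gain away from zero before dividing by it.
        if abs(b_hat) < 1e-3:
            b_hat = 1e-3 if b_hat >= 0 else -1e-3
        # Certainty equivalence: pick u so the predicted next output is zero.
        u_ce = -a_hat * y / b_hat
        # Probing input for exploration (assumed decaying schedule).
        probe = probe_scale * rng.uniform(-1.0, 1.0) / np.sqrt(t)
        # Clip the total input (assumed ordering: probe first, then clip).
        u = clip_input(u_ce + probe, clip_bound)

        w = rng.uniform(-0.5, 0.5)  # bounded noise
        y_next = a * y + b * u + w
        # Excess of the squared output over the noise-only cost.
        regret += y_next ** 2 - w ** 2

        # Recursive least-squares update with regressor phi = (y, u).
        phi = np.array([y, u])
        gain = P @ phi / (1.0 + phi @ P @ phi)
        theta = theta + gain * (y_next - phi @ theta)
        P = P - np.outer(gain, phi @ P)

        y = y_next
        inputs.append(u)

    return theta, regret, np.array(inputs)


if __name__ == "__main__":
    theta, regret, inputs = run_clipped_ce()
    print("estimate of (a, b):", theta)
    print("accumulated excess cost:", regret)
    print("largest |input| applied:", np.abs(inputs).max())
```

In this sketch the clipping step only guarantees that no applied input exceeds `clip_bound`, which limits the effect of a poor early estimate on the output. It makes no claim about reproducing the regret rates stated in the abstract.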
