

Poster in Workshop: Foundations of Reinforcement Learning and Control: Connections and Perspectives

Bridging Distributional and Risk-Sensitive Reinforcement Learning: Balancing Statistical, Computational, and Risk Considerations

Hao Liang


Abstract:

High-stakes applications such as finance and healthcare require risk-sensitive methods that maximize a risk measure of the return distribution. Existing risk-sensitive reinforcement learning (RSRL) methods face computational and statistical challenges because risk measures are non-linear. This paper addresses these challenges by proposing computationally efficient distributional reinforcement learning (DRL) algorithms with regret guarantees. In particular, we introduce two variants of the principled DRL algorithm \texttt{RODI} \cite{liang2022bridging} that use a novel distribution representation and projection method, preserving the regret bound while remaining computationally efficient. Our algorithms, \texttt{RODI-Rep}, achieve improved regret compared with traditional non-distributional RL methods, as shown through theoretical analysis and empirical validation.
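
To make the non-linearity mentioned in the abstract concrete, here is a minimal sketch in Python. It assumes the entropic risk measure as the example; the abstract does not say which risk measure the paper uses, and the names `entropic_risk` and `mix` are hypothetical helpers introduced only for illustration. The sketch evaluates the risk of a discrete return distribution and compares the risk of a mixture of two distributions with the average of their individual risks.

```python
import math

def entropic_risk(values, probs, beta):
    """Entropic risk (1/beta) * log E[exp(beta * X)] of a discrete return X.

    Assumption: this risk measure is chosen for illustration only.
    beta < 0 is risk-averse, beta > 0 is risk-seeking, beta == 0 is the mean.
    """
    if beta == 0:
        return sum(p * v for v, p in zip(values, probs))
    # Shift by the largest exponent (log-sum-exp) for numerical stability.
    m = max(beta * v for v in values)
    s = sum(p * math.exp(beta * v - m) for v, p in zip(values, probs))
    return (m + math.log(s)) / beta

def mix(probs_a, probs_b, w):
    """Probabilities of the mixture w * A + (1 - w) * B on a shared support."""
    return [w * a + (1 - w) * b for a, b in zip(probs_a, probs_b)]

values = [0.0, 1.0]   # shared support of possible returns
safe = [0.0, 1.0]     # always yields return 1
risky = [1.0, 0.0]    # always yields return 0
beta = -1.0           # risk-averse setting (hypothetical choice)

risk_of_mixture = entropic_risk(values, mix(safe, risky, 0.5), beta)
mixture_of_risks = 0.5 * entropic_risk(values, safe, beta) \
    + 0.5 * entropic_risk(values, risky, beta)

print(risk_of_mixture, mixture_of_risks)
```

With this risk-averse setting, the risk of the 50/50 mixture comes out below the average of the two individual risks, which equals 0.5. An expectation would make the two quantities equal, so value-based updates that rely on this linearity do not carry over directly, which is one motivation for tracking the full return distribution instead.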
