Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

Yue Wu · Dongruo Zhou · Quanquan Gu

Abstract
