Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback

Zhirui Chen · Vincent Tan
