Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback

Ishaan Shah · David Halpern · Michael L. Littman · Kavosh Asadi

Abstract
