Evaluating the Performance of Reinforcement Learning Algorithms
Scott Jordan · Yash Chandak · Daniel Cohen · Mengxue Zhang · Philip Thomas

Tue Jul 14 07:00 AM -- 07:45 AM & Tue Jul 14 06:00 PM -- 06:45 PM (PDT) @ Virtual

Performance evaluations are critical for quantifying algorithmic advances in reinforcement learning. Recent reproducibility analyses have shown that reported performance results are often inconsistent and difficult to replicate. In this work, we argue that the inconsistency of performance stems from the use of flawed evaluation metrics. Taking a step towards ensuring that reported results are consistent, we propose a new comprehensive evaluation methodology for reinforcement learning algorithms that produces reliable measurements of performance both on a single environment and when aggregated across environments. We demonstrate this method by evaluating a broad class of reinforcement learning algorithms on standard benchmark tasks.

Author Information

Scott Jordan (University of Massachusetts)
Yash Chandak (University of Massachusetts Amherst)
Daniel Cohen (University of Massachusetts Amherst)
Mengxue Zhang (University of Massachusetts Amherst)
Philip Thomas (University of Massachusetts Amherst)