Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO

Chen · Xiaopeng Li · Ziniu Li · Xi Chen · Tianyi Lin

Abstract
