

Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning

Kai Ye ⋅ Hongyi Zhou ⋅ Jin Zhu ⋅ Francesco Quinzan ⋅ Chengchun Shi

Abstract
