Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning

Kai Ye · Hongyi Zhou · Jin Zhu · Francesco Quinzan · Chengchun Shi

Abstract
