Search All 2024 Events
 

38 Results

Poster
Tue 4:30 Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello · Zhaohan Guo · Remi Munos · Mark Rowland · Yunhao Tang · Bernardo Avila Pires · Pierre Richemond · Charline Le Lan · Michal Valko · Tianqi Liu · Rishabh Joshi · Zeyu Zheng · Bilal Piot
Poster
Tue 2:30 Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Andi Nika · Debmalya Mandal · Parameswaran Kamalaruban · Georgios Tzannetos · Goran Radanovic · Adish Singla