Search All 2024 Events
 

21 Results

Page 1 of 2
Workshop
Fri 2:00 MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences
Souradip Chakraborty · Jiahao Qiu · Hui Yuan · Alec Koppel · Furong Huang · Dinesh Manocha · Amrit Singh Bedi · Mengdi Wang
Poster
Wed 4:30 MaxMin-RLHF: Alignment with Diverse Human Preferences
Souradip Chakraborty · Jiahao Qiu · Hui Yuan · Alec Koppel · Dinesh Manocha · Furong Huang · Amrit Singh Bedi · Mengdi Wang
Workshop
Fri 8:00 MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences
Souradip Chakraborty · Jiahao Qiu · Hui Yuan · Alec Koppel · Furong Huang · Dinesh Manocha · Amrit Singh Bedi · Mengdi Wang
Poster
Tue 2:30 Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization
Yihan Du · Anna Winnicki · Gal Dalal · Shie Mannor · R Srikant
Poster
Thu 2:30 Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
Wei Xiong · Hanze Dong · Chenlu Ye · Ziqi Wang · Han Zhong · Heng Ji · Nan Jiang · Tong Zhang
Poster
Tue 4:30 ODIN: Disentangled Reward Mitigates Hacking in RLHF
Lichang Chen · Chen Zhu · Jiuhai Chen · Davit Soselia · Tianyi Zhou · Tom Goldstein · Heng Huang · Mohammad Shoeybi · Bryan Catanzaro
Workshop
Fri 8:00 RLHF and IIA: Perverse Incentives
Wanqiao Xu · Shi Dong · Xiuyuan Lu · Grace Lam · Zheng Wen · Benjamin Van Roy
Workshop
Fri 1:00 RLHF and IIA: Perverse Incentives
Wanqiao Xu · Shi Dong · Xiuyuan Lu · Grace Lam · Zheng Wen · Benjamin Van Roy
Workshop
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
TaiMing Lu · Lingfeng Shen · Xinyu Yang · Weiting Tan · Beidi Chen · Huaxiu Yao
Workshop
A Critical Look At Tokenwise Reward-Guided Text Generation
Ahmad Rashid · Ruotian Wu · Julia Grosse · Agustinus Kristiadi · Pascal Poupart
Workshop
Active Preference Optimization for Sample Efficient RLHF
Nirjhar Das · Souradip Chakraborty · Aldo Pacchiano · Sayak Ray Chowdhury
Workshop
RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation
Chanwoo Park · Mingyang Liu · Dingwen Kong · Kaiqing Zhang · Asuman Ozdaglar