38 Results
Type | Time | Title | Authors
Workshop | | Hummer: Towards Limited Competitive Preference Dataset | Li Jiang · Yusen Wu · Junwu Xiong · Jingqing Ruan · Qingpei Guo · zujie wen · JUN ZHOU · Xiaotie Deng
Poster | Thu 2:30 | Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint | Wei Xiong · Hanze Dong · Chenlu Ye · Ziqi Wang · Han Zhong · Heng Ji · Nan Jiang · Tong Zhang
Poster | Wed 4:30 | Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback | songyang gao · Qiming Ge · Wei Shen · Shihan Dou · Junjie Ye · Xiao Wang · Rui Zheng · Yicheng Zou · Zhi Chen · Hang Yan · Qi Zhang · Dahua Lin
Poster | Tue 4:30 | RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences | Jie Cheng · Gang Xiong · Xingyuan Dai · Qinghai Miao · Yisheng Lv · Fei-Yue Wang
Poster | Tue 2:30 | Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input | Andi Peng · Yuying Sun · Tianmin Shu · David Abel
Workshop | Fri 8:00 | Preference Learning Algorithms Do Not Learn Preference Rankings | Angelica Chen · Sadhika Malladi · Lily Zhang · Xinyi Chen · Richard Zhang · Rajesh Ranganath · Kyunghyun Cho
Workshop | | Preference Learning Algorithms Do Not Learn Preference Rankings | Angelica Chen · Sadhika Malladi · Lily Zhang · Xinyi Chen · Richard Zhang · Rajesh Ranganath · Kyunghyun Cho
Workshop | Fri 4:45 | Preference Learning Algorithms Do Not Learn Preference Rankings | Angelica Chen · Sadhika Malladi · Lily Zhang · Xinyi Chen · Richard Zhang · Rajesh Ranganath · Kyunghyun Cho
Poster | Tue 4:30 | Listwise Reward Estimation for Offline Preference-based Reinforcement Learning | Heewoong Choi · Sangwon Jung · Hongjoon Ahn · Taesup Moon
Workshop | Fri 8:00 | Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback | Zhirui Chen · Vincent Tan
Workshop | | DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | Jianxiong Li · Jinliang Zheng · Yinan Zheng · Liyuan Mao · Xiao Hu · Sijie Cheng · Haoyi Niu · Jihao Liu · Yu Liu · Jingjing Liu · Ya-Qin Zhang · Xianyuan Zhan
Poster | Wed 2:30 | PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation | Runze Liu · Yali Du · Fengshuo Bai · Jiafei Lyu · Xiu Li