Poster

Factored Causal Representation Learning for Robust Reward Modeling in RLHF

Yupei Yang ⋅ Lin Yang ⋅ Wanxi Deng ⋅ Lin Qu ⋅ Fan Feng ⋅ Biwei Huang ⋅ Shikui Tu ⋅ Lei Xu

Abstract
