Poster

AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization

Junkang Wu · Xue Wang · Zhengyi Yang · Jiancan Wu · Jinyang Gao · Bolin Ding · Xiang Wang · Xiangnan He
2025 Poster

