Oral
Tue Jul 23 01:30 AM -- 01:45 AM (PDT) @ Hall C 1-3
Debating with More Persuasive LLMs Leads to More Truthful Answers
Oral
Tue Jul 23 01:45 AM -- 02:00 AM (PDT) @ Hall C 1-3
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Oral
Tue Jul 23 02:00 AM -- 02:15 AM (PDT) @ Hall C 1-3
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
Oral
Tue Jul 23 02:15 AM -- 02:30 AM (PDT) @ Hall C 1-3
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study