

Poster

How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?

Ryan Liu ⋅ Theodore R Sumers ⋅ Ishita Dasgupta ⋅ Thomas Griffiths
2024 Poster

Abstract
