Poster in Workshop: Next Generation of AI Safety

One-Shot Safety Alignment for Large Language Models via Optimal Dualization

Xinmeng Huang ⋅ Shuo Li ⋅ Edgar Dobriban ⋅ Osbert Bastani ⋅ Hamed Hassani ⋅ Dongsheng Ding
