Poster in Workshop: Next Generation of AI Safety

Rule Based Rewards for fine-grained LLM Safety

Tong Mu ⋅ Alec Helyar ⋅ Johannes Heidecke ⋅ Joshua Achiam ⋅ Andrea Vallone ⋅ Ian Kivlichan ⋅ Molly Lin ⋅ Alex Beutel ⋅ John Schulman ⋅ Lilian Weng
