

Poster

Safety Recovery in Reasoning Models Is Only a Few Early Steering Steps Away

Soumya Suvra Ghosal ⋅ Souradip Chakraborty ⋅ Vaibhav Singh ⋅ Furong Huang ⋅ Dinesh Manocha ⋅ Amrit Singh Bedi

Abstract
