Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets

Lei Hsiung · Tianyu Pang · Yung-Chen Tang · Linyue Song · Tsung-Yi Ho · Pin-Yu Chen · Yaoqing Yang
