

Poster
in
Workshop: ICML 2024 Workshop on Foundation Models in the Wild

Towards Safe Large Language Models for Medicine

Tessa Han · Aounon Kumar · Chirag Agarwal · Himabindu Lakkaraju

Keywords: [ Trustworthy ML ] [ medical LLM ] [ LLM safety ]


Abstract:

As large language models (LLMs) develop ever-improving capabilities and are applied in real-world settings, their safety is critical. While initial steps have been taken to evaluate the safety of general-knowledge LLMs, exposing some weaknesses, the safety of medical LLMs has not been evaluated, despite the high risks they pose to personal health and safety, public health and safety, patient rights, and human rights. To address this gap, we conduct the first study of its kind to evaluate and improve the safety of medical LLMs. We find that (1) current medical LLMs do not meet standards of general or medical safety, as they readily comply with harmful requests, and (2) fine-tuning medical LLMs on safety demonstrations significantly improves their safety. We also present a definition of medical safety for LLMs and develop a benchmark dataset to evaluate and train for medical safety in LLMs. This work sheds light on the status quo of medical LLM safety and motivates future work on mitigating the risks of harm posed by LLMs in medicine.
