Poster in Workshop: Challenges in Deployable Generative AI

Seeing Seeds Beyond Weeds: Green Teaming Generative AI for Beneficial Uses

Logan Stapleton · Jordan Taylor · Sarah Fox · Sherry Tongshuang Wu · Haiyi Zhu

Keywords: [ Generative AI ] [ suicide ] [ values ] [ queer ] [ harms ] [ education ] [ red team ] [ large language models ] [ content filter ]


Abstract:

Large generative AI models (GMs) like GPT and DALL-E are trained to generate content for general, wide-ranging purposes. GM content filters are generalized to filter out content that carries a risk of harm in many cases, e.g., hate speech. However, prohibited content is not always harmful: there are instances where generating prohibited content can be beneficial. So, when GMs filter out content, they preclude beneficial use cases along with harmful ones. Which use cases are precluded reflects the values embedded in GM content filtering. Recent work on red teaming proposes methods to bypass GM content filters to generate harmful content. We coin the term green teaming to describe methods of bypassing GM content filters to design for beneficial use cases. We showcase green teaming by: 1) using ChatGPT as a virtual patient to simulate a person experiencing suicidal ideation, for suicide support training; 2) using Codex to intentionally generate buggy solutions to train students in debugging; and 3) examining an Instagram page that uses Midjourney to generate images of anti-LGBTQ+ politicians in drag. Finally, we discuss how our use cases demonstrate green teaming as both a practical design method and a mode of critique, one that problematizes and subverts current understandings of harms and values in generative AI.
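The second use case can be sketched in code. The sketch below is a minimal illustration only, not the authors' actual method: the paper used Codex, which has since been deprecated, so this version assumes the current OpenAI Python SDK and a chat model instead, and the prompt wording, model choice, and function name are hypothetical stand-ins.

```python
# Minimal sketch of the "intentionally buggy solutions" use case:
# prompt a code-capable model to produce flawed code for debugging practice.
# Assumes the OpenAI Python SDK (v1) with OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical system prompt; the paper's actual prompts are not reproduced here.
SYSTEM_PROMPT = (
    "You are a programming tutor. When asked for a solution, produce code "
    "that contains exactly one subtle bug, so a student can practice "
    "debugging. Do not reveal or explain the bug."
)

def buggy_solution(task: str) -> str:
    """Ask the model for an intentionally flawed solution to `task`."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for the deprecated Codex models
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(buggy_solution("Write a Python function that reverses a linked list."))
```

Note the inversion of the usual objective: where red teaming bypasses filters to surface harms, here the prompt steers the model away from its default behavior (producing correct code) toward output that is nominally "bad" but pedagogically useful.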
