

Poster in Workshop: Challenges in Deployable Generative AI

Concept Bottleneck Generative Models

Aya Ismail · Julius Adebayo · Hector Corrada Bravo · Stephen Ra · Kyunghyun Cho

Keywords: [ Interpretability ] [ Generative Models ]


Abstract:

Despite their increasing prevalence, generative models remain opaque and difficult to steer reliably. To address these challenges, we present concept bottleneck (CB) generative models, a type of generative model in which one internal layer, a concept bottleneck (CB) layer, is constrained to encode human-understandable features. While concept bottleneck layers have been used to improve interpretability in supervised learning tasks, here we extend them to generative models. The concept bottleneck layer partitions the generative model into three parts: the pre-concept bottleneck portion, the CB layer, and the post-concept bottleneck portion. To train CB generative models, we complement the traditional task-based loss function for training generative models with three additional loss terms: a concept loss, an orthogonality loss, and a concept sensitivity loss. The CB layer and these corresponding loss terms are model agnostic, which we demonstrate by applying them to three different families of generative models: generative adversarial networks, variational autoencoders, and diffusion models. On real-world datasets, across these three types of generative models, steering a generative model with the CB layer outperforms several baselines.
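As a rough illustration of the training objective described above, the sketch below combines a task loss with concept and orthogonality terms. All function names, loss forms (e.g. squared error for the concept loss, an off-diagonal Gram-matrix penalty for orthogonality), and weighting coefficients are assumptions for illustration only; the paper's actual formulations may differ, and the concept sensitivity term is left as a placeholder.

```python
import numpy as np

def concept_loss(pred_concepts, true_concepts):
    # Supervise the CB layer to encode known concepts. Squared error is an
    # illustrative choice, not necessarily the loss used in the paper.
    return float(np.mean((pred_concepts - true_concepts) ** 2))

def orthogonality_loss(concept_acts):
    # Encourage concept dimensions to be decorrelated: off-diagonal entries
    # of the normalized Gram matrix of (centered) activations should be ~0.
    acts = concept_acts - concept_acts.mean(axis=0, keepdims=True)
    norms = np.linalg.norm(acts, axis=0, keepdims=True) + 1e-8
    gram = (acts / norms).T @ (acts / norms)
    off_diag = gram - np.diag(np.diag(gram))
    return float(np.mean(off_diag ** 2))

def total_loss(task_loss, pred_c, true_c, concept_acts,
               lam_concept=1.0, lam_ortho=0.1, lam_sens=0.1,
               sensitivity_term=0.0):
    # Task-based generative loss plus the three auxiliary terms from the
    # abstract. lam_* weights and the sensitivity placeholder are hypothetical.
    return (task_loss
            + lam_concept * concept_loss(pred_c, true_c)
            + lam_ortho * orthogonality_loss(concept_acts)
            + lam_sens * sensitivity_term)
```

Because each term operates only on the CB layer's activations and predicted concepts, a composite objective of this shape can in principle be attached to a GAN, VAE, or diffusion model without changing the surrounding architecture, which matches the model-agnostic claim in the abstract.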
