Skip to yearly menu bar Skip to main content


Poster
in
Workshop: Models of Human Feedback for AI Alignment

Stochastic Concept Bottleneck Models

Moritz Vandenhirtz · Sonia Laguna · Ričards Marcinkevičs · Julia Vogt

[ ] [ Project Page ]
Fri 26 Jul 8 a.m. PDT — 8 a.m. PDT

Abstract:

Concept Bottleneck Models (CBMs) have emerged as a promising interpretable method whose final prediction is based on intermediate, human-understandable concepts rather than the raw input. Through time-consuming manual interventions, a user can correct wrongly predicted concept values to enhance the model's downstream performance. We propose Stochastic Concept Bottleneck Models (SCBMs), a novel approach that models concept dependencies. In SCBMs, a single-concept intervention affects all correlated concepts. Leveraging the parameterization, we derive an effective intervention strategy based on the confidence region. We show empirically on synthetic tabular and natural image datasets that our approach improves intervention effectiveness significantly. Notably, we showcase the versatility and usability of SCBMs by examining a setting with CLIP-inferred concepts, alleviating the need for manual concept annotations.

Chat is not available.