Poster in Workshop: Next Generation of AI Safety
ContextCite: Attributing Model Generation to Context
Benjamin Cohen-Wang · Harshay Shah · Kristian Georgiev · Aleksander Madry
Keywords: [ Generative Models ] [ large language models ] [ citation ] [ attribution ]
How do language models actually use information provided as context when generating a response? Can we infer whether a particular generated statement is actually grounded in the context, a misinterpretation, or fabricated? To help answer these questions, we introduce the problem of context attribution: pinpointing the parts of the context (if any) that led a model to generate a particular statement. We then present ContextCite, a simple and scalable method for context attribution that can be applied on top of any existing language model.