Position: Generative Distributional Integrity against Backdoor Attacks
Abstract
Foundation models, such as Diffusion Models (DMs) and Large Language Models (LLMs), are now widely integrated into digital systems. This widespread adoption introduces a specific security risk: generative backdoors. Unlike backdoors in traditional models, which cause simple classification errors, generative backdoors hide within the model’s output distribution, making them difficult to detect with standard pattern-based methods. This paper argues that current defensive strategies are insufficient for generative AI. \textbf{We propose Distributional Integrity, a framework that focuses on maintaining the stability and accuracy of the model’s output distribution.} We identify two primary threats: backdoors embedded in the model supply chain and the contamination of synthetic data pipelines. To address these, we advocate a shift toward cross-modal certification and parameter-level verification. These methods aim to secure the AI-generated content (AIGC) ecosystem against inherited vulnerabilities.