Data Cartography for Detecting Memorization Hotspots and Guiding Data Interventions in Generative Models
Abstract
Modern generative models risk overfitting and unintentionally memorizing rare training examples, which can be extracted by adversaries or inflate benchmark performance. We propose Generative Data Cartography (GenDataCarto), a data-centric framework that assigns each pretraining sample a difficulty score (early-epoch loss) and a memorization score (frequency of “forget events”), then partitions examples into four quadrants to guide targeted pruning and up-/down-weighting. We prove that our memorization score lower-bounds classical influence under smoothness assumptions and that down-weighting high-memorization hotspots provably decreases the generalization gap via uniform stability bounds. Empirically, GenDataCarto reduces synthetic canary extraction success by over 40% at just 10% data pruning, while increasing validation perplexity by less than 0.5%. These results demonstrate that principled data interventions can dramatically mitigate leakage with minimal cost to generative performance.
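As an illustration of the two scores and the quadrant partition described above, the following is a minimal sketch and not the paper's implementation. It assumes per-sample training losses have been logged once per epoch. It also assumes that a "forget event" is an epoch-to-epoch transition in which a sample's loss rises from at or below a threshold to above it. The function names, the `early_epochs` and `forget_threshold` parameters, the cut-offs, and the quadrant numbering are all hypothetical choices made for this example.

```python
import numpy as np


def cartography_scores(losses, early_epochs=2, forget_threshold=1.0):
    """Compute per-sample difficulty and memorization scores.

    losses: array of shape (n_epochs, n_samples) with per-sample training loss.
    early_epochs and forget_threshold are illustrative assumptions.
    """
    losses = np.asarray(losses, dtype=float)

    # Difficulty: mean loss over the first `early_epochs` epochs.
    difficulty = losses[:early_epochs].mean(axis=0)

    # Assumed forget event: loss was at or below the threshold at epoch t-1
    # and is above it at epoch t.
    learned = losses <= forget_threshold
    forget_events = learned[:-1] & ~learned[1:]

    # Memorization: fraction of epoch transitions that are forget events.
    n_transitions = max(losses.shape[0] - 1, 1)
    memorization = forget_events.sum(axis=0) / n_transitions
    return difficulty, memorization


def assign_quadrants(difficulty, memorization, d_cut, m_cut):
    """Map each sample to one of four quadrants (numbering is hypothetical).

    0: low difficulty,  low memorization
    1: high difficulty, low memorization
    2: low difficulty,  high memorization
    3: high difficulty, high memorization
    """
    hard = np.asarray(difficulty) > d_cut
    memorized = np.asarray(memorization) > m_cut
    return hard.astype(int) + 2 * memorized.astype(int)
```

Under these assumptions, samples in quadrants 2 and 3 would be the high-memorization candidates for pruning or down-weighting. In practice, the cut-offs `d_cut` and `m_cut` could be set from quantiles of the score distributions rather than fixed constants.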