Poster in Workshop: Next Generation of AI Safety

Is My Data Safe? Predicting Membership Inference Success for Individual Instances

Tobias Leemann · Bardh Prenkaj · Gjergji Kasneci

Keywords: [ Privacy ] [ membership inference ]


Abstract:

We perform an extensive empirical investigation of three recent membership inference (MI) attacks on vision and language models. Our investigation includes the newly proposed Gradient Likelihood Ratio (GLiR) attack, a white-box attack with theoretical optimality guarantees. Prior research has suggested that white-box attacks cannot outperform black-box MI attacks. In this work, we challenge this hypothesis by running and evaluating this attack on real-world models with up to 53M parameters for the first time. We find that this white-box attack does indeed have the potential to outperform other attacks. We subsequently focus on the problem of MI susceptibility prediction, which is concerned with efficiently identifying the individuals who are most at risk of attack a priori. In doing so, we uncover which characteristics make instances susceptible to MI, and whether the targeted instances are the same across attacks. We implement and study over 20 predictors of attack success. We find that GLiR mostly targets the same points as loss-based attacks and that the vulnerable instances can be efficiently predicted.
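For readers unfamiliar with the loss-based attacks mentioned above, the core idea can be sketched in a few lines: a model tends to assign lower loss to its training members, so thresholding the per-example loss yields a simple membership predictor. The sketch below is illustrative only (the loss values and threshold are hypothetical, not from the paper), assuming per-example losses have already been computed.

```python
import numpy as np

def loss_based_mi_attack(losses: np.ndarray, threshold: float) -> np.ndarray:
    """Predict membership for each example: loss below the threshold
    is taken as evidence the example was in the training set."""
    return losses < threshold

# Hypothetical per-example losses: members typically score lower.
member_losses = np.array([0.05, 0.10, 0.02])
nonmember_losses = np.array([0.80, 1.20, 0.60])
losses = np.concatenate([member_losses, nonmember_losses])

predictions = loss_based_mi_attack(losses, threshold=0.5)
# predictions: [True, True, True, False, False, False]
```

The white-box GLiR attack studied in the paper uses gradient information rather than the scalar loss, but as the abstract notes, it mostly targets the same points as this loss-based baseline.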
