Skip to yearly menu bar Skip to main content


Oral
in
Workshop: 2nd ICML Workshop on New Frontiers in Adversarial Machine Learning

On the Limitations of Model Stealing with Uncertainty Quantification Models

Keywords: [ model extraction ] [ model stealing ] [ uncertainty quantification ]


Abstract:

Model stealing aims at inferring a victim model's functionality at a fraction of the original training cost.While the goal is clear, in practice the model's architecture, weight dimension, and original training data can not be determined exactly, leading to mutual uncertainty during stealing.In this work, we explicitly tackle this uncertainty by generating multiple possible networks and combining their predictions to improve the quality of the stolen model.For this, we compare five popular uncertainty quantification models in a model stealing task.Surprisingly, our results indicate that the considered models only lead to marginal improvements in terms of label agreement (i.e., fidelity) to the stolen model.To find the cause of this, we inspect the diversity of the model's prediction by looking at the prediction variance as a function of training iterations. We realize that during training, the models tend to have similar predictions, indicating that the network diversity we wanted to leverage using uncertainty quantification models is not (high) enough for improvements on the model stealing task.

Chat is not available.