Neural network models trained on text data have been found to encode undesired linguistic or sensitive attributes in their representations. Removing such attributes is non-trivial because of the complex relationship between the attribute, the text input, and the learnt representation. Recent work has proposed post-hoc and adversarial methods to remove such unwanted attributes from a model's representation. Through an extensive theoretical and empirical analysis, we show that these methods can be counter-productive: they are unable to remove the attributes entirely, and in the worst case may end up destroying all task-relevant features. The reason is the methods' reliance on a probing classifier as a proxy for the attribute, which we prove is difficult to train correctly in the presence of spurious correlations.
Author Information
Abhinav Kumar (Microsoft Research)
Chenhao Tan (University of Chicago)
Amit Sharma (Microsoft Research)
More from the Same Authors
- 2021: DoWhy: Addressing Challenges in Expressing and Validating Causal Assumptions
  Amit Sharma · Vasilis Syrgkanis · Cheng Zhang · Emre Kiciman
- 2022: Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
  Jivat Neet Kaur · Emre Kiciman · Amit Sharma
- 2023: Towards Modular Machine Learning Pipelines
  Aditya Modi · Jivat Neet Kaur · Maggie Makar · Pavan Mallapragada · Amit Sharma · Emre Kiciman · Adith Swaminathan
- 2023: Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning
  Xiaoming Shi · Siqiao Xue · Kangrui Wang · Fan Zhou · James Zhang · Jun Zhou · Chenhao Tan · Hongyuan Mei
- 2022 Poster: Matching Learned Causal Effects of Neural Networks with Domain Priors
  Sai Srinivas Kancheti · Gowtham Reddy Abbavaram · Vineeth N Balasubramanian · Amit Sharma
- 2022 Spotlight: Matching Learned Causal Effects of Neural Networks with Domain Priors
  Sai Srinivas Kancheti · Gowtham Reddy Abbavaram · Vineeth N Balasubramanian · Amit Sharma
- 2021 Poster: Domain Generalization using Causal Matching
  Divyat Mahajan · Shruti Tople · Amit Sharma
- 2021 Oral: Domain Generalization using Causal Matching
  Divyat Mahajan · Shruti Tople · Amit Sharma
- 2020 Poster: Alleviating Privacy Attacks via Causal Learning
  Shruti Tople · Amit Sharma · Aditya Nori