Poster in Workshop: Topology, Algebra, and Geometry in Machine Learning
A Geometrical Approach to Finding Difficult Examples in Language
Debo Datta · Shashwat Kumar · Laura Barnes · P. Thomas Fletcher
A growing body of evidence suggests that metrics like accuracy overestimate a classifier's generalization ability. Several state-of-the-art NLP classifiers, such as BERT and LSTMs, rely on superficial cue words (e.g., if a movie review contains the word “romantic”, the review tends to be positive) or unnecessary words (e.g., learning a proper noun in order to classify a movie review as positive or negative). One approach to testing NLP classifiers for such fragilities is analogous to how teachers discover gaps in a student's understanding: by finding problems where small perturbations confuse the student. While several perturbation strategies, such as contrast sets or random word substitutions, have been proposed, they are typically based on heuristics and/or require expensive human involvement. In this work, using tools from information geometry, we propose a principled way to quantify the fragility of an example for an NLP classifier. By discovering such fragile examples for several state-of-the-art NLP models, including BERT, LSTMs, and CNNs, we demonstrate their susceptibility to innocuous perturbations such as noun or synonym substitution, which cause their accuracy to drop to 20 percent in some cases. Our approach is simple, architecture-agnostic, and can be used to study the fragility of text classification models. All the code used will be made publicly available, including a tool to explore the fragile examples for multiple datasets and models.
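To make the information-geometric idea concrete, the sketch below shows one plausible way such a fragility score could be computed; it is not the authors' exact method. It assumes fragility is measured by the top eigenvalue of the Fisher information matrix of the classifier's predictive distribution with respect to the input embeddings, and the `model` interface (embeddings in, class logits out) and the `fragility_score` helper are hypothetical.

```python
import torch
import torch.nn.functional as F

def fragility_score(model, embeddings):
    """Top eigenvalue of the Fisher information matrix (FIM) of the
    model's predictive distribution with respect to the input embeddings.
    A large value means a small input perturbation can sharply change the
    predicted class distribution, i.e. the example is fragile.

    Assumed (hypothetical) interface: `model(embeddings)` maps a
    (seq_len, emb_dim) embedding tensor to a (num_classes,) logit vector.
    """
    x = embeddings.detach().requires_grad_(True)
    log_probs = F.log_softmax(model(x), dim=-1)   # (num_classes,)
    probs = log_probs.exp()

    # Rows of J are the gradients of log p(y = c | x), one per class c.
    grads = []
    for c in range(log_probs.numel()):
        g, = torch.autograd.grad(log_probs[c], x, retain_graph=True)
        grads.append(g.flatten())
    J = torch.stack(grads)                        # (num_classes, D)

    # FIM = sum_c p_c * grad_c grad_c^T is D x D, but its nonzero
    # eigenvalues equal those of the much smaller matrix A A^T with
    # A = diag(sqrt(p)) J, so we diagonalize that instead.
    A = probs.sqrt().unsqueeze(1) * J             # (num_classes, D)
    return torch.linalg.eigvalsh(A @ A.T)[-1].item()
```

Under these assumptions, a dataset's examples could be ranked by this score and the highest-scoring ones inspected or perturbed (e.g., via the noun or synonym substitutions mentioned above) to probe where the classifier is most brittle.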