Poster in DMLR Workshop: Data-centric Machine Learning Research
Do Machine Learning Models Learn Statistical Rules Inferred from Data?
Aaditya Naik · Yinjun Wu · Mayur Naik · Eric Wong
Machine learning models can make basic errors that are easily hidden within vast amounts of data. Such errors often run counter to rules based on human intuition. However, rules based on human knowledge are challenging to scale or even to formalize. We therefore seek to infer statistical rules from the data and quantify the extent to which a model has learned them. We propose SQRL, a framework that integrates logic-based methods with statistical inference to derive these rules from a model’s training data without supervision. We further show how to adapt models at test time to reduce rule violations and produce more coherent predictions. On an object detection task, SQRL generates 252 rules without human supervision and uncovers up to 8.1k violations of those rules by state-of-the-art object detection models. Test-time adaptation reduces these violations by up to 31.4% without impacting overall model accuracy.
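To make the idea concrete, the following is a minimal sketch, not the authors' SQRL implementation, of inferring one statistical rule from training data and counting how often a model's test-time predictions violate it. All names, data, and the quantile-based rule form are illustrative assumptions.

```python
# Hypothetical illustration of statistical rule inference and violation counting.
# This is NOT the SQRL codebase; data, thresholds, and the rule form are assumed.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "training data": e.g., aspect ratios of ground-truth boxes for one class.
train_aspect_ratios = rng.normal(loc=0.4, scale=0.1, size=10_000)

# Rule inferred without supervision: values should fall within the central 99%
# interval observed in the training data.
low, high = np.quantile(train_aspect_ratios, [0.005, 0.995])

def rule(x: np.ndarray) -> np.ndarray:
    """Return a boolean mask: True where the rule is satisfied."""
    return (x >= low) & (x <= high)

# Stand-in "model predictions" at test time: mostly plausible, plus a few outliers.
pred_aspect_ratios = np.concatenate([
    rng.normal(loc=0.4, scale=0.1, size=1_000),  # plausible predictions
    rng.normal(loc=3.0, scale=0.5, size=20),     # implausible outliers
])

# Quantify how well the model has "learned" the rule by counting violations.
violations = np.count_nonzero(~rule(pred_aspect_ratios))
print(f"Rule: aspect ratio in [{low:.2f}, {high:.2f}]")
print(f"Violations: {violations} / {pred_aspect_ratios.size} predictions")
```

In the paper's setting, such violation counts are what test-time adaptation would then try to reduce without hurting overall accuracy.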