There has been recent interest in improving the performance of simple models for multiple reasons, such as interpretability, robust learning from small data, deployment in memory-constrained settings, and environmental considerations. In this paper, we propose a novel method, SRatio, that uses information from high-performing complex models (viz. deep neural networks, boosted trees, random forests) to reweight a training dataset for a potentially low-performing simple model of much lower complexity, such as a decision tree or a shallow network, thereby enhancing its performance. Our method also leverages the per-sample hardness estimate of the simple model. Prior works do not do this, as they primarily consider the complex model's confidences/predictions, so the approach is conceptually novel. Moreover, we generalize and formalize the concept of attaching probes to intermediate layers of a neural network so that it applies to other commonly used classifiers, and we incorporate this into our method. The benefit of these contributions is seen in the experiments: on 6 UCI datasets and CIFAR-10, we outperform competitors in a majority (16 out of 27) of the cases and tie for best performance in the remaining cases. In fact, in a couple of cases, we even approach the complex model's performance. We also conduct further experiments to validate our assertions and to understand intuitively why our method works. Theoretically, we motivate our approach by showing that the weighted loss minimized by simple models using our weighting upper bounds the loss of the complex model.
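The abstract does not state the weighting formula, so the following is only a minimal sketch of one plausible reading of the idea: each training sample is weighted by the ratio of the complex model's confidence in the true label to the simple model's confidence in the true label, with the ratio capped. The function names `ratio_weights` and `weighted_log_loss`, the `clip` cap, and the `eps` guard are all assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import List, Sequence


def ratio_weights(
    complex_conf: Sequence[float],
    simple_conf: Sequence[float],
    clip: float = 10.0,   # hypothetical cap on any single weight
    eps: float = 1e-12,   # hypothetical guard against division by zero
) -> List[float]:
    """Weight each sample by complex-model confidence over simple-model
    confidence in the true label (an assumed form, not the paper's definition)."""
    if len(complex_conf) != len(simple_conf):
        raise ValueError("confidence lists must have the same length")
    weights = []
    for c, s in zip(complex_conf, simple_conf):
        ratio = c / max(s, eps)
        weights.append(min(ratio, clip))  # cap very large ratios
    return weights


def weighted_log_loss(
    simple_conf: Sequence[float],
    weights: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Average weighted negative log-likelihood of the simple model."""
    total = sum(w * -math.log(max(s, eps)) for s, w in zip(simple_conf, weights))
    return total / len(weights)


# Toy confidences in the true label for three training samples.
complex_conf = [0.9, 0.8, 0.6]
simple_conf = [0.9, 0.4, 0.05]
weights = ratio_weights(complex_conf, simple_conf, clip=5.0)
loss = weighted_log_loss(simple_conf, weights)
```

In this toy example, samples that the simple model finds hard but the complex model handles well receive larger weights. Weights of this kind could then be passed to any learner that accepts per-sample weights, for instance through the `sample_weight` argument that scikit-learn estimators such as decision trees accept in `fit`.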
Author Information
Amit Dhurandhar (IBM Research)
Karthikeyan Shanmugam (IBM Research NY)
I have been a Research Staff Member with the IBM Research AI group, NY, since 2017. Previously, I was a Herman Goldstine Postdoctoral Fellow in the Math Sciences Division at IBM Research, NY. I obtained my Ph.D. in Electrical and Computer Engineering from UT Austin in summer 2016, where my advisor was Alex Dimakis. I obtained my MS degree in Electrical Engineering (2010-2012) from the University of Southern California, and my B.Tech and M.Tech degrees in Electrical Engineering from IIT Madras in 2010. My research interests broadly lie in graph algorithms, machine learning, optimization, coding theory, and information theory. In machine learning, my recent focus is on graphical model learning, causal inference, and explainability. I also work on problems relating to information flow, storage, and caching over networks.
Ronny Luss (IBM Research)
More from the Same Authors
- 2021 : Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
  Zaiwei Chen · Siva Maguluri · Sanjay Shakkottai · Karthikeyan Shanmugam
- 2021 : Under-exploring in Bandits with Confounded Data
  Nihal Sharma · Soumya Basu · Karthikeyan Shanmugam · Sanjay Shakkottai
- 2023 Poster: Reprogramming Pretrained Language Models for Antibody Sequence Infilling
  Igor Melnyk · Vijil Chenthamarakshan · Pin-Yu Chen · Payel Das · Amit Dhurandhar · Inkit Padhi · Devleena Das
- 2023 Poster: PAC Generalization via Invariant Representations
  Advait Parulekar · Karthikeyan Shanmugam · Sanjay Shakkottai
- 2020 Workshop: 5th ICML Workshop on Human Interpretability in Machine Learning (WHI)
  Adrian Weller · Alice Xiang · Amit Dhurandhar · Been Kim · Dennis Wei · Kush Varshney · Umang Bhatt
- 2020 Poster: Invariant Risk Minimization Games
  Kartik Ahuja · Karthikeyan Shanmugam · Kush Varshney · Amit Dhurandhar
- 2019 Poster: Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
  Anna Choromanska · Benjamin Cowen · Sadhana Kumaravel · Ronny Luss · Mattia Rigotti · Irina Rish · Paolo DiAchille · Viatcheslav Gurev · Brian Kingsbury · Ravi Tejwani · Djallel Bouneffouf
- 2019 Oral: Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
  Anna Choromanska · Benjamin Cowen · Sadhana Kumaravel · Ronny Luss · Mattia Rigotti · Irina Rish · Paolo DiAchille · Viatcheslav Gurev · Brian Kingsbury · Ravi Tejwani · Djallel Bouneffouf
- 2017 : A. Dhurandhar, V. Iyengar, R. Luss, and K. Shanmugam, "A Formal Framework to Characterize Interpretability of Procedures"
  Karthikeyan Shanmugam
- 2017 Poster: Identifying Best Interventions through Online Importance Sampling
  Rajat Sen · Karthikeyan Shanmugam · Alexandros Dimakis · Sanjay Shakkottai
- 2017 Talk: Identifying Best Interventions through Online Importance Sampling
  Rajat Sen · Karthikeyan Shanmugam · Alexandros Dimakis · Sanjay Shakkottai