Poster in Workshop: Subset Selection in Machine Learning: From Theory to Applications

A Data Subset Selection Framework for Efficient Hyper-Parameter Tuning and Automatic Machine Learning

Savan Amitbhai Visalpara · Krishnateja Killamsetty · Rishabh Iyer


Abstract:

In recent years, deep learning models have found great success in tasks such as object detection, speech recognition, and translation, making people's everyday lives easier. Despite this success, training a deep learning model is often challenging because its performance depends largely on the hyperparameters used. Moreover, finding the best hyperparameter configuration is often time-consuming, even with state-of-the-art (SOTA) hyperparameter optimization algorithms, because they require multiple training runs over the entire dataset for different candidate hyperparameter configurations. Our main insight is that performing the training runs involved in hyperparameter optimization on a subset that represents the entire dataset allows us to find the optimal hyperparameter configuration significantly faster. In this work, we explore using data subsets selected by existing supervised-learning-based data subset selection methods, namely CRAIG, GLISTER, and GRAD-MATCH, for the model training runs involved in hyperparameter optimization. Further, we empirically demonstrate through several experiments on real-world datasets that using data subsets for hyperparameter optimization yields significantly faster turnaround times for hyperparameter selection while achieving performance comparable to that of the hyperparameters found using the entire dataset.
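The workflow described in the abstract can be illustrated with a minimal sketch: tune on a small subset, then train once on the full data with the chosen configuration. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic one-dimensional regression task, the helper names (`make_data`, `train`, `select_subset`, `tune`), the grid of learning rates, and the 10% subset size. In particular, `select_subset` uses uniform random sampling as a stand-in and does not implement CRAIG, GLISTER, or GRAD-MATCH.

```python
import random

def make_data(n, rng):
    """Synthetic 1-D regression data, y = 3x + noise (illustrative only)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + rng.gauss(0.0, 0.1)) for x in xs]

def train(data, lr, epochs=20):
    """Fit y = w * x with plain SGD on squared error; returns the weight."""
    w = 0.0
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2.0 * (w * x - y) * x
    return w

def mse(data, w):
    """Mean squared error of the model y = w * x on `data`."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def select_subset(data, fraction, rng):
    """Hypothetical stand-in selector: uniform random sampling.

    The methods named in the abstract choose subsets far more carefully;
    this placeholder only marks where such a method would plug in.
    """
    k = max(1, int(len(data) * fraction))
    return rng.sample(data, k)

def tune(train_data, val_data, candidates):
    """Grid search: train once per candidate, keep the lowest validation loss."""
    best = None
    for lr in candidates:
        loss = mse(val_data, train(train_data, lr))
        if best is None or loss < best[1]:
            best = (lr, loss)
    return best

rng = random.Random(0)
full = make_data(2000, rng)
val = make_data(500, rng)

# Every tuning run sees only 10% of the training data.
subset = select_subset(full, 0.1, rng)
candidates = [1e-4, 1e-3, 1e-2, 1e-1]
best_lr_subset, _ = tune(subset, val, candidates)

# A single full-data training run with the configuration found on the subset.
final_w = train(full, best_lr_subset)
print("learning rate chosen on subset:", best_lr_subset)
print("weight after full-data training:", round(final_w, 3))
```

In this toy setup each tuning run processes a tenth of the training examples, which is where the time saving comes from; how well the selected configuration transfers to the full dataset depends on how representative the subset is, which is the question the abstract's subset selection methods address.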