Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization
David Eriksson · Pierce Chuang · Samuel Daulton · Peng Xia · Akshat Shrivastava · Arun Babu · Shicong Zhao · Ahmed A Aly · Ganesh Venkatesh · Maximilian Balandat

When tuning the architecture and hyperparameters of large machine learning models for on-device deployment, it is desirable to understand the optimal trade-offs between on-device latency and model accuracy. In this work, we leverage recent methodological advances in Bayesian optimization over high-dimensional search spaces and multi-objective Bayesian optimization to efficiently explore these trade-offs for a production-scale on-device natural language understanding model at Facebook.
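To make the setup concrete, below is a minimal sketch of a multi-objective Bayesian optimization loop using BoTorch's qNoisyExpectedHypervolumeImprovement (qNEHVI) acquisition function to explore the trade-off between two maximized objectives, accuracy and negated latency. This is not the authors' production pipeline: the 4-dimensional search space, the toy evaluate_architecture objective, the reference point, and all numeric settings are illustrative assumptions, and the high-dimensional BO advances mentioned in the abstract are not shown here.

```python
# Sketch: multi-objective BO over (accuracy, -latency) with BoTorch's qNEHVI.
# evaluate_architecture is a hypothetical stand-in for training a candidate
# model configuration and measuring its on-device latency.
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.optim import optimize_acqf
from botorch.acquisition.multi_objective.monte_carlo import (
    qNoisyExpectedHypervolumeImprovement,
)
from botorch.utils.transforms import normalize, unnormalize
from gpytorch.mlls import ExactMarginalLogLikelihood

torch.set_default_dtype(torch.double)  # GP fitting is more stable in float64


def evaluate_architecture(x: torch.Tensor) -> torch.Tensor:
    """Toy objective: returns (accuracy, -latency) so both are maximized."""
    accuracy = 1.0 - ((x - 0.5) ** 2).sum(dim=-1)
    latency = x.sum(dim=-1)  # pretend larger configurations run slower
    return torch.stack([accuracy, -latency], dim=-1)


bounds = torch.tensor([[0.0] * 4, [1.0] * 4])  # 4 tunable hyperparameters
train_x = torch.rand(10, 4)  # initial random design
train_y = evaluate_architecture(train_x)

for _ in range(20):  # sequential BO iterations
    # Fit a multi-output GP surrogate on normalized inputs.
    model = SingleTaskGP(normalize(train_x, bounds), train_y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
    # qNEHVI scores candidates by their expected improvement of the
    # Pareto-front hypervolume above the reference point.
    acqf = qNoisyExpectedHypervolumeImprovement(
        model=model,
        ref_point=[0.0, -4.0],  # worst acceptable (accuracy, -latency)
        X_baseline=normalize(train_x, bounds),
    )
    candidate, _ = optimize_acqf(
        acq_function=acqf,
        bounds=torch.stack([torch.zeros(4), torch.ones(4)]),
        q=1,
        num_restarts=10,
        raw_samples=128,
    )
    new_x = unnormalize(candidate, bounds)
    train_x = torch.cat([train_x, new_x])
    train_y = torch.cat([train_y, evaluate_architecture(new_x)])
```

After the loop, the non-dominated rows of train_y approximate the accuracy/latency Pareto frontier; botorch.utils.multi_objective.pareto.is_non_dominated can be used to extract them.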

Author Information

David Eriksson (Facebook)
Pierce Chuang
Samuel Daulton (Facebook)

I am a research scientist at Meta on the Core Data Science team, a PhD candidate in machine learning at the University of Oxford, and a co-creator of BoTorch, an open-source library for Bayesian optimization research. Within Core Data Science, I work in the Adaptive Experimentation research group, and at Oxford I am a member of the Machine Learning Research Group. During my PhD, I have been working with Michael Osborne (Oxford), Eytan Bakshy (Meta), and Max Balandat (Meta). My research focuses on methods for principled, sample-efficient optimization, including Bayesian optimization and transfer learning. I am particularly interested in practical methods for principled exploration (using probabilistic models) that are robust across applied problems and depend on few, if any, hyperparameters. Furthermore, I aim to democratize such methods by open-sourcing reproducible code. Prior to joining Meta, I worked with Finale Doshi-Velez at Harvard University on efficient and robust methods for transfer learning.

Peng Xia
Akshat Shrivastava
Arun Babu
Shicong Zhao
Ahmed A Aly (Facebook)
Ganesh Venkatesh
Maximilian Balandat (Facebook)
