Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on MobileNets and ResNet. To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.4% top-1 / 97.1% top-5 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on inference than the best existing ConvNet (Huang et al., 2018). Our EfficientNets also transfer well and achieve state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets, with an order of magnitude fewer parameters.
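The abstract does not spell out the scaling rule, so the following is only a minimal sketch of how a single compound coefficient could drive depth, width, and resolution together. It assumes the multipliers take the form alpha^phi, beta^phi, and gamma^phi, and the default constants (1.2, 1.1, 1.15) are illustrative assumptions rather than values stated on this page. The function name `compound_scale` is hypothetical.

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return depth/width/resolution multipliers for compound coefficient phi.

    Assumption: each dimension grows geometrically in phi. The default
    constants are illustrative and are not taken from the abstract.
    """
    depth_mult = alpha ** phi       # scales the number of layers
    width_mult = beta ** phi        # scales the number of channels
    resolution_mult = gamma ** phi  # scales the input image side length

    # For a plain convolutional network, cost grows roughly linearly with
    # depth and quadratically with width and resolution.
    approx_flops_mult = depth_mult * width_mult ** 2 * resolution_mult ** 2

    return {
        "depth": depth_mult,
        "width": width_mult,
        "resolution": resolution_mult,
        "approx_flops": approx_flops_mult,
    }


if __name__ == "__main__":
    for phi in range(4):
        m = compound_scale(phi)
        print(
            f"phi={phi}: depth x{m['depth']:.2f}, width x{m['width']:.2f}, "
            f"resolution x{m['resolution']:.2f}, "
            f"approx FLOPs x{m['approx_flops']:.2f}"
        )
```

With these assumed constants, each unit increase in phi roughly doubles the approximate compute cost, which is one way to make phi read as a resource budget knob. Real model families may round the resulting layer counts, channel counts, and image sizes, or pick them by hand.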
Author Information
Mingxing Tan (Google Brain)
Quoc Le (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Oral: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks »
Wed. Jun 12th 12:00 -- 12:05 AM Room Seaside Ballroom
More from the Same Authors
-
2023 Poster: The Flan Collection: Designing Data and Methods for Effective Instruction Tuning »
Shayne Longpre · Le Hou · Tu Vu · Albert Webson · Hyung Won Chung · Yi Tay · Denny Zhou · Quoc Le · Barret Zoph · Jason Wei · Adam Roberts -
2023 Poster: Brainformers: Trading Simplicity for Efficiency »
Yanqi Zhou · Nan Du · Yanping Huang · Daiyi Peng · Chang Lan · Da Huang · Siamak Shakeri · David So · Andrew Dai · Yifeng Lu · Zhifeng Chen · Quoc Le · Claire Cui · James Laudon · Jeff Dean -
2022 Poster: Transformer Quality in Linear Time »
Weizhe Hua · Zihang Dai · Hanxiao Liu · Quoc Le -
2022 Poster: GLaM: Efficient Scaling of Language Models with Mixture-of-Experts »
Nan Du · Yanping Huang · Andrew Dai · Simon Tong · Dmitry Lepikhin · Yuanzhong Xu · Maxim Krikun · Yanqi Zhou · Adams Wei Yu · Orhan Firat · Barret Zoph · William Fedus · Maarten Bosma · Zongwei Zhou · Tao Wang · Emma Wang · Kellie Webster · Marie Pellat · Kevin Robinson · Kathleen Meier-Hellstern · Toju Duke · Lucas Dixon · Kun Zhang · Quoc Le · Yonghui Wu · Zhifeng Chen · Claire Cui -
2022 Spotlight: GLaM: Efficient Scaling of Language Models with Mixture-of-Experts »
Nan Du · Yanping Huang · Andrew Dai · Simon Tong · Dmitry Lepikhin · Yuanzhong Xu · Maxim Krikun · Yanqi Zhou · Adams Wei Yu · Orhan Firat · Barret Zoph · William Fedus · Maarten Bosma · Zongwei Zhou · Tao Wang · Emma Wang · Kellie Webster · Marie Pellat · Kevin Robinson · Kathleen Meier-Hellstern · Toju Duke · Lucas Dixon · Kun Zhang · Quoc Le · Yonghui Wu · Zhifeng Chen · Claire Cui -
2022 Spotlight: Transformer Quality in Linear Time »
Weizhe Hua · Zihang Dai · Hanxiao Liu · Quoc Le -
2021 Poster: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision »
Chao Jia · Yinfei Yang · Ye Xia · Yi-Ting Chen · Zarana Parekh · Hieu Pham · Quoc Le · Yun-Hsuan Sung · Zhen Li · Tom Duerig -
2021 Oral: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision »
Chao Jia · Yinfei Yang · Ye Xia · Yi-Ting Chen · Zarana Parekh · Hieu Pham · Quoc Le · Yun-Hsuan Sung · Zhen Li · Tom Duerig -
2021 Poster: EfficientNetV2: Smaller Models and Faster Training »
Mingxing Tan · Quoc Le -
2021 Poster: Towards Domain-Agnostic Contrastive Learning »
Vikas Verma · Thang Luong · Kenji Kawaguchi · Hieu Pham · Quoc Le -
2021 Spotlight: EfficientNetV2: Smaller Models and Faster Training »
Mingxing Tan · Quoc Le -
2021 Spotlight: Towards Domain-Agnostic Contrastive Learning »
Vikas Verma · Thang Luong · Kenji Kawaguchi · Hieu Pham · Quoc Le -
2020 Poster: Go Wide, Then Narrow: Efficient Training of Deep Thin Networks »
Denny Zhou · Mao Ye · Chen Chen · Tianjian Meng · Mingxing Tan · Xiaodan Song · Quoc Le · Qiang Liu · Dale Schuurmans -
2020 Poster: AutoML-Zero: Evolving Machine Learning Algorithms From Scratch »
Esteban Real · Chen Liang · David So · Quoc Le -
2019 : Poster Session 1 (all papers) »
Matilde Gargiani · Yochai Zur · Chaim Baskin · Evgenii Zheltonozhskii · Liam Li · Ameet Talwalkar · Xuedong Shang · Harkirat Singh Behl · Atilim Gunes Baydin · Ivo Couckuyt · Tom Dhaene · Chieh Lin · Wei Wei · Min Sun · Orchid Majumder · Michele Donini · Yoshihiko Ozaki · Ryan P. Adams · Christian Geißler · Ping Luo · Zhanglin Peng · Ruimao Zhang · John Langford · Rich Caruana · Debadeepta Dey · Charles Weill · Xavi Gonzalvo · Scott Yang · Scott Yak · Eugen Hotaj · Vladimir Macko · Mehryar Mohri · Corinna Cortes · Stefan Webb · Jonathan Chen · Martin Jankowiak · Noah Goodman · Aaron Klein · Frank Hutter · Mojan Javaheripi · Mohammad Samragh · Sungbin Lim · Taesup Kim · Sungwoong Kim · Michael Volpp · Iddo Drori · Yamuna Krishnamurthy · Kyunghyun Cho · Stanislaw Jastrzebski · Quentin de Laroussilhe · Mingxing Tan · Xiao Ma · Neil Houlsby · Andrea Gesmundo · Zalán Borsos · Krzysztof Maziarz · Felipe Petroski Such · Joel Lehman · Kenneth Stanley · Jeff Clune · Pieter Gijsbers · Joaquin Vanschoren · Felix Mohr · Eyke Hüllermeier · Zheng Xiong · Wenpeng Zhang · Wenwu Zhu · Weijia Shao · Aleksandra Faust · Michal Valko · Michael Y Li · Hugo Jair Escalante · Marcel Wever · Andrey Khorlin · Tara Javidi · Anthony Francis · Saurajit Mukherjee · Jungtaek Kim · Michael McCourt · Saehoon Kim · Tackgeun You · Seungjin Choi · Nicolas Knudde · Alexander Tornede · Ghassen Jerfel -
2019 Poster: The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study »
Daniel Park · Jascha Sohl-Dickstein · Quoc Le · Samuel Smith -
2019 Poster: The Evolved Transformer »
David So · Quoc Le · Chen Liang -
2019 Oral: The Evolved Transformer »
David So · Quoc Le · Chen Liang -
2019 Oral: The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study »
Daniel Park · Jascha Sohl-Dickstein · Quoc Le · Samuel Smith -
2018 Poster: Understanding and Simplifying One-Shot Architecture Search »
Gabriel Bender · Pieter-Jan Kindermans · Barret Zoph · Vijay Vasudevan · Quoc Le -
2018 Poster: Learning Longer-term Dependencies in RNNs with Auxiliary Losses »
Trieu H Trinh · Andrew Dai · Thang Luong · Quoc Le -
2018 Oral: Learning Longer-term Dependencies in RNNs with Auxiliary Losses »
Trieu H Trinh · Andrew Dai · Thang Luong · Quoc Le -
2018 Oral: Understanding and Simplifying One-Shot Architecture Search »
Gabriel Bender · Pieter-Jan Kindermans · Barret Zoph · Vijay Vasudevan · Quoc Le -
2018 Poster: Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? »
Maithra Raghu · Alexander Irpan · Jacob Andreas · Bobby Kleinberg · Quoc Le · Jon Kleinberg -
2018 Oral: Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? »
Maithra Raghu · Alexander Irpan · Jacob Andreas · Bobby Kleinberg · Quoc Le · Jon Kleinberg -
2018 Poster: Efficient Neural Architecture Search via Parameters Sharing »
Hieu Pham · Melody Guan · Barret Zoph · Quoc Le · Jeff Dean -
2018 Oral: Efficient Neural Architecture Search via Parameters Sharing »
Hieu Pham · Melody Guan · Barret Zoph · Quoc Le · Jeff Dean -
2017 Poster: Large-Scale Evolution of Image Classifiers »
Esteban Real · Sherry Moore · Andrew Selle · Saurabh Saxena · Yutaka Leon Suematsu · Jie Tan · Quoc Le · Alexey Kurakin -
2017 Poster: Neural Optimizer Search using Reinforcement Learning »
Irwan Bello · Barret Zoph · Vijay Vasudevan · Quoc Le -
2017 Poster: Device Placement Optimization with Reinforcement Learning »
Azalia Mirhoseini · Hieu Pham · Quoc Le · Benoit Steiner · Mohammad Norouzi · Rasmus Larsen · Yuefeng Zhou · Naveen Kumar · Samy Bengio · Jeff Dean -
2017 Talk: Neural Optimizer Search using Reinforcement Learning »
Irwan Bello · Barret Zoph · Vijay Vasudevan · Quoc Le -
2017 Talk: Large-Scale Evolution of Image Classifiers »
Esteban Real · Sherry Moore · Andrew Selle · Saurabh Saxena · Yutaka Leon Suematsu · Jie Tan · Quoc Le · Alexey Kurakin -
2017 Talk: Device Placement Optimization with Reinforcement Learning »
Azalia Mirhoseini · Hieu Pham · Quoc Le · Benoit Steiner · Mohammad Norouzi · Rasmus Larsen · Yuefeng Zhou · Naveen Kumar · Samy Bengio · Jeff Dean