We present a new method for blackbox optimization via gradient approximation using structured random orthogonal matrices, providing more accurate estimators than baseline approaches, with provable theoretical guarantees. We show that this algorithm can be successfully applied to learn better-quality compact policies than those obtained with standard gradient estimation techniques. The compact policies we learn have several advantages over unstructured ones, including faster training and faster inference. These benefits are important when the policy is deployed on real hardware with limited resources. Further, compact policies provide more scalable architectures for derivative-free optimization (DFO) in high-dimensional spaces. We show that most robotics tasks from the OpenAI Gym can be solved using neural networks with fewer than 300 parameters, with almost linear time complexity of the inference phase, and with up to 13x fewer parameters than policies trained with the Evolution Strategies (ES) algorithm of Salimans et al. (2017). We do not need heuristics such as fitness shaping to learn good-quality policies, resulting in a simple and theoretically motivated training mechanism.
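At the heart of this line of work is the standard antithetic evolution-strategies gradient estimate, grad F(theta) ~= 1/(2*sigma*N) * sum_i [F(theta + sigma*g_i) - F(theta - sigma*g_i)] * g_i, with the sensing directions g_i constrained to be exactly orthogonal rather than i.i.d. Gaussian. The sketch below is only a minimal NumPy illustration of that orthogonal-sensing idea under simplifying assumptions: it orthogonalizes Gaussian directions with a plain QR factorization instead of the fast structured orthogonal matrices developed in the paper, and the names orthogonal_directions, es_gradient, n_directions and sigma are illustrative choices of ours, not the authors' code.

import numpy as np

def orthogonal_directions(n, d, rng):
    # Sample n <= d pairwise-orthogonal directions in R^d whose norms match
    # those of i.i.d. Gaussian vectors. The QR factorization here is an
    # unstructured stand-in for the structured orthogonal matrices in the paper.
    q, _ = np.linalg.qr(rng.standard_normal((d, n)))       # orthonormal columns
    norms = np.linalg.norm(rng.standard_normal((n, d)), axis=1)
    return (q * norms).T                                    # shape (n, d)

def es_gradient(f, theta, n_directions=8, sigma=0.1, rng=None):
    # Antithetic evolution-strategies estimate of the gradient of a blackbox
    # function f (e.g., the expected return of a policy) at parameters theta.
    rng = np.random.default_rng() if rng is None else rng
    d = theta.shape[0]
    directions = orthogonal_directions(min(n_directions, d), d, rng)
    grad = np.zeros(d)
    for g in directions:
        grad += (f(theta + sigma * g) - f(theta - sigma * g)) * g
    return grad / (2.0 * sigma * len(directions))

# Toy usage: gradient ascent on a quadratic "reward" standing in for an RL return.
f = lambda x: -np.sum(x ** 2)
theta = np.ones(10)
for _ in range(200):
    theta += 0.05 * es_gradient(f, theta, n_directions=10, sigma=0.1)
print(f(theta))   # close to 0

Coupling the perturbations so that they are mutually orthogonal removes redundancy between them; this is the mechanism behind the abstract's claim of more accurate estimators than the i.i.d. Gaussian baseline.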
Author Information
Krzysztof Choromanski (Google Brain Robotics)
Mark Rowland (University of Cambridge)
Vikas Sindhwani (Google)
Richard E Turner (University of Cambridge)
Richard Turner holds a Lectureship (equivalent to US Assistant Professor) in Computer Vision and Machine Learning in the Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, UK. He is a Fellow of Christ's College Cambridge. Previously, he held an EPSRC Postdoctoral research fellowship which he spent at both the University of Cambridge and the Laboratory for Computational Vision, NYU, USA. He has a PhD degree in Computational Neuroscience and Machine Learning from the Gatsby Computational Neuroscience Unit, UCL, UK, and an M.Sci. degree in Natural Sciences (specialism Physics) from the University of Cambridge, UK. His research interests include machine learning, signal processing and developing probabilistic models of perception.
Adrian Weller (University of Cambridge, Alan Turing Institute)

Adrian Weller is Programme Director for AI at The Alan Turing Institute, the UK national institute for data science and AI, and is a Turing AI Fellow leading work on trustworthy Machine Learning (ML). He is a Principal Research Fellow in ML at the University of Cambridge, and at the Leverhulme Centre for the Future of Intelligence where he is Programme Director for Trust and Society. His interests span AI, its commercial applications and helping to ensure beneficial outcomes for society. Previously, Adrian held senior roles in finance. He received a PhD in computer science from Columbia University, and an undergraduate degree in mathematics from Trinity College, Cambridge.
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: Structured Evolution with Compact Architectures for Scalable Policy Optimization
  Wed. Jul 11th, 11:50 AM -- 12:00 PM, Room A1
More from the Same Authors
- 2021 : Diverse and Amortised Counterfactual Explanations for Uncertainty Estimates
  Dan Ley · Umang Bhatt · Adrian Weller
- 2021 : Attacking Few-Shot Classifiers with Adversarial Support Poisoning
  Elre Oldewage · John Bronskill · Richard E Turner
- 2021 : On the Fairness of Causal Algorithmic Recourse
  Julius von Kügelgen · Amir-Hossein Karimi · Umang Bhatt · Isabel Valera · Adrian Weller · Bernhard Schölkopf
- 2021 : Towards Principled Disentanglement for Domain Generalization
  Hanlin Zhang · Yi-Fan Zhang · Weiyang Liu · Adrian Weller · Bernhard Schölkopf · Eric Xing
- 2021 : CrossWalk: Fairness-enhanced Node Representation Learning
  Ahmad Khajehnejad · Moein Khajehnejad · Krishna Gummadi · Adrian Weller · Baharan Mirzasoleiman
- 2022 : Perspectives on Incorporating Expert Feedback into Model Updates
  Valerie Chen · Umang Bhatt · Hoda Heidari · Adrian Weller · Ameet Talwalkar
- 2023 : Algorithms for Optimal Adaptation of Diffusion Models to Reward Functions
  Krishnamurthy Dvijotham · Shayegan Omidshafiei · Kimin Lee · Katie Collins · Deepak Ramachandran · Adrian Weller · Mohammad Ghavamzadeh · Milad Nasresfahani · Ying Fan · Jeremiah Liu
- 2023 : Beyond Intuition, a Framework for Applying GPs to Real-World Data
  Kenza Tazi · Jihao Andreas Lin · ST John · Hong Ge · Richard E Turner · Ross Viljoen · Alex Gardner
- 2023 : Modeling Accurate Long Rollouts with Temporal Neural PDE Solvers
  Phillip Lippe · Bastiaan Veeling · Paris Perdikaris · Richard E Turner · Johannes Brandstetter
- 2023 : The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling probabilistic social inferences from linguistic inputs
  Lance Ying · Katie Collins · Megan Wei · Cedegao Zhang · Tan Zhi-Xuan · Adrian Weller · Josh Tenenbaum · Catherine Wong
- 2023 Oral: Simplex Random Features
  Isaac Reid · Krzysztof Choromanski · Valerii Likhosherstov · Adrian Weller
- 2023 Poster: Efficient Graph Field Integrators Meet Point Clouds
  Krzysztof Choromanski · Arijit Sehanobish · Han Lin · YUNFAN ZHAO · Eli Berger · Tetiana Parshakova · Qingkai Pan · David Watkins · Tianyi Zhang · Valerii Likhosherstov · Somnath Basu Roy Chowdhury · Kumar Avinava Dubey · Deepali Jain · Tamas Sarlos · Snigdha Chaturvedi · Adrian Weller
- 2023 Poster: Simplex Random Features
  Isaac Reid · Krzysztof Choromanski · Valerii Likhosherstov · Adrian Weller
- 2023 Poster: Is Learning Summary Statistics Necessary for Likelihood-free Inference?
  Yanzhi Chen · Michael Gutmann · Adrian Weller
- 2022 : Spotlight Presentations
  Adrian Weller · Osbert Bastani · Jake Snell · Tal Schuster · Stephen Bates · Zhendong Wang · Margaux Zaffran · Danielle Rasooly · Varun Babbar
- 2022 Workshop: Workshop on Human-Machine Collaboration and Teaming
  Umang Bhatt · Katie Collins · Maria De-Arteaga · Bradley Love · Adrian Weller
- 2022 Poster: From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers
  Krzysztof Choromanski · Han Lin · Haoxian Chen · Tianyi Zhang · Arijit Sehanobish · Valerii Likhosherstov · Jack Parker-Holder · Tamas Sarlos · Adrian Weller · Thomas Weingarten
- 2022 Poster: Measuring Representational Robustness of Neural Networks Through Shared Invariances
  Vedant Nanda · Till Speicher · Camila Kolling · John P Dickerson · Krishna Gummadi · Adrian Weller
- 2022 Oral: Measuring Representational Robustness of Neural Networks Through Shared Invariances
  Vedant Nanda · Till Speicher · Camila Kolling · John P Dickerson · Krishna Gummadi · Adrian Weller
- 2022 Spotlight: From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers
  Krzysztof Choromanski · Han Lin · Haoxian Chen · Tianyi Zhang · Arijit Sehanobish · Valerii Likhosherstov · Jack Parker-Holder · Tamas Sarlos · Adrian Weller · Thomas Weingarten
- 2021 Poster: Debiasing a First-order Heuristic for Approximate Bi-level Optimization
  Valerii Likhosherstov · Xingyou Song · Krzysztof Choromanski · Jared Quincy Davis · Adrian Weller
- 2021 Spotlight: Debiasing a First-order Heuristic for Approximate Bi-level Optimization
  Valerii Likhosherstov · Xingyou Song · Krzysztof Choromanski · Jared Quincy Davis · Adrian Weller
- 2021 Poster: Catformer: Designing Stable Transformers via Sensitivity Analysis
  Jared Quincy Davis · Albert Gu · Krzysztof Choromanski · Tri Dao · Christopher Re · Chelsea Finn · Percy Liang
- 2021 Spotlight: Catformer: Designing Stable Transformers via Sensitivity Analysis
  Jared Quincy Davis · Albert Gu · Krzysztof Choromanski · Tri Dao · Christopher Re · Chelsea Finn · Percy Liang
- 2020 Workshop: 5th ICML Workshop on Human Interpretability in Machine Learning (WHI)
  Adrian Weller · Alice Xiang · Amit Dhurandhar · Been Kim · Dennis Wei · Kush Varshney · Umang Bhatt
- 2020 Poster: Stochastic Flows and Geometric Optimization on the Orthogonal Group
  Krzysztof Choromanski · David Cheikhi · Jared Quincy Davis · Valerii Likhosherstov · Achille Nazaret · Achraf Bahamou · Xingyou Song · Mrugank Akarte · Jack Parker-Holder · Jacob Bergquist · Yuan Gao · Aldo Pacchiano · Tamas Sarlos · Adrian Weller · Vikas Sindhwani
- 2020 Poster: Scalable Exact Inference in Multi-Output Gaussian Processes
  Wessel Bruinsma · Eric Perim Martins · William Tebbutt · Scott Hosking · Arno Solin · Richard E Turner
- 2020 Poster: TaskNorm: Rethinking Batch Normalization for Meta-Learning
  John Bronskill · Jonathan Gordon · James Requeima · Sebastian Nowozin · Richard E Turner
- 2019 Workshop: Human In the Loop Learning (HILL)
  Xin Wang · Xin Wang · Fisher Yu · Shanghang Zhang · Joseph Gonzalez · Yangqing Jia · Sarah Bird · Kush Varshney · Been Kim · Adrian Weller
- 2019 Poster: Unifying Orthogonal Monte Carlo Methods
  Krzysztof Choromanski · Mark Rowland · Wenyu Chen · Adrian Weller
- 2019 Oral: Unifying Orthogonal Monte Carlo Methods
  Krzysztof Choromanski · Mark Rowland · Wenyu Chen · Adrian Weller
- 2019 Poster: TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning
  Tameem Adel · Adrian Weller
- 2019 Oral: TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning
  Tameem Adel · Adrian Weller
- 2018 Poster: Blind Justice: Fairness with Encrypted Sensitive Attributes
  Niki Kilbertus · Adria Gascon · Matt Kusner · Michael Veale · Krishna Gummadi · Adrian Weller
- 2018 Oral: Blind Justice: Fairness with Encrypted Sensitive Attributes
  Niki Kilbertus · Adria Gascon · Matt Kusner · Michael Veale · Krishna Gummadi · Adrian Weller
- 2018 Poster: Bucket Renormalization for Approximate Inference
  Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin
- 2018 Poster: The Mirage of Action-Dependent Baselines in Reinforcement Learning
  George Tucker · Surya Bhupatiraju · Shixiang Gu · Richard E Turner · Zoubin Ghahramani · Sergey Levine
- 2018 Oral: Bucket Renormalization for Approximate Inference
  Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin
- 2018 Oral: The Mirage of Action-Dependent Baselines in Reinforcement Learning
  George Tucker · Surya Bhupatiraju · Shixiang Gu · Richard E Turner · Zoubin Ghahramani · Sergey Levine
- 2018 Poster: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models
  Tameem Adel · Zoubin Ghahramani · Adrian Weller
- 2018 Oral: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models
  Tameem Adel · Zoubin Ghahramani · Adrian Weller
- 2017 Workshop: Reliable Machine Learning in the Wild
  Dylan Hadfield-Menell · Jacob Steinhardt · Adrian Weller · Smitha Milli
- 2017 : A. Weller, "Challenges for Transparency"
  Adrian Weller
- 2017 Workshop: Workshop on Human Interpretability in Machine Learning (WHI)
  Kush Varshney · Adrian Weller · Been Kim · Dmitry Malioutov
- 2017 Poster: Magnetic Hamiltonian Monte Carlo
  Nilesh Tripuraneni · Mark Rowland · Zoubin Ghahramani · Richard E Turner
- 2017 Talk: Magnetic Hamiltonian Monte Carlo
  Nilesh Tripuraneni · Mark Rowland · Zoubin Ghahramani · Richard E Turner
- 2017 Poster: Lost Relatives of the Gumbel Trick
  Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller
- 2017 Poster: Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
  Natasha Jaques · Shixiang Gu · Dzmitry Bahdanau · Jose Miguel Hernandez-Lobato · Richard E Turner · Douglas Eck
- 2017 Talk: Lost Relatives of the Gumbel Trick
  Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller
- 2017 Talk: Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
  Natasha Jaques · Shixiang Gu · Dzmitry Bahdanau · Jose Miguel Hernandez-Lobato · Richard E Turner · Douglas Eck