Poster
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing
Aditya Desai · Keren Zhou · Anshumali Shrivastava
Advancements in deep learning are often associated with increasing model sizes. Training and deploying large models require sophisticated hardware and incur significantly higher costs, so model compression is a widely explored approach to the problem. However, SOTA techniques fall short in one or more desirable aspects of compression: for instance, pruning does not reduce memory during training, quantization can provide at most 32$\times$ compression, and HashedNet is cache-inefficient. This paper proposes a model-agnostic, cache-friendly, and hardware-aware model compression approach: Random Operation Access Specific Tile (ROAST) hashing. ROAST collapses the parameters by clubbing them through a lightweight mapping. While clubbing these parameters, ROAST exploits cache hierarchies by aligning the memory access pattern with the parameter access pattern. ROAST is up to ${\sim}25\times$ faster to train and ${\sim}50\times$ faster to infer than the popular parameter-sharing method HashedNet. Additionally, ROAST introduces global weight sharing, which is empirically and theoretically superior to the local weight sharing in HashedNet and may be of independent interest. With ROAST, we can efficiently train and deploy models with a much smaller memory footprint ($\sim 10 - 100\times$ smaller) on text and image classification tasks. The ROAST-MM kernel implementation is open-source (https://github.com/apd10/RzLinear/tree/stable).
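For intuition, below is a minimal sketch of the tile-based parameter sharing the abstract describes: a virtual weight matrix is materialized from a small shared ("clubbed") parameter array by hashing tile ids to contiguous offsets, so each lookup is a sequential, cache-friendly read rather than HashedNet's per-weight random access. All names here (TILE, tile_offsets, recover_weight) and the stand-in hash are illustrative assumptions, not the paper's open-source ROAST-MM kernel.

# Sketch of ROAST-style tile hashing (illustrative, not the authors' code).
import torch

TILE = 64                      # tile width; each lookup reads this many
                               # contiguous entries from the shared array
N_SHARED = 2**16               # size of the shared ("clubbed") parameter array
shared = torch.randn(N_SHARED, requires_grad=True)

def tile_offsets(n_tiles: int, seed: int = 0) -> torch.Tensor:
    # Lightweight-hash stand-in: map each tile id to a start offset in the
    # shared array such that a full tile fits before the end.
    g = torch.Generator().manual_seed(seed)
    return torch.randint(0, N_SHARED - TILE, (n_tiles,), generator=g)

def recover_weight(out_dim: int, in_dim: int, seed: int = 0) -> torch.Tensor:
    # Materialize a virtual (out_dim x in_dim) weight matrix, one contiguous
    # tile of `shared` per tile id; the sequential reads within each tile are
    # what make the access pattern cache-friendly.
    assert (out_dim * in_dim) % TILE == 0
    n_tiles = out_dim * in_dim // TILE
    offs = tile_offsets(n_tiles, seed)                  # (n_tiles,)
    idx = offs[:, None] + torch.arange(TILE)[None, :]   # (n_tiles, TILE)
    return shared[idx].reshape(out_dim, in_dim)

x = torch.randn(8, 256)
w = recover_weight(128, 256)   # virtual weight, backed by `shared`
y = x @ w.t()                  # gradients flow back into `shared`

In this sketch the model's memory footprint is N_SHARED regardless of how many virtual weights are recovered from it, which is the sense in which the parameters are "clubbed" through a lightweight mapping.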
Author Information
Aditya Desai (Rice University)
Keren Zhou (Rice University)
Anshumali Shrivastava (Rice University)
Anshumali Shrivastava is an associate professor in the computer science department at Rice University. His broad research interests include randomized algorithms for large-scale machine learning. In 2018, Science News named him one of the top 10 scientists under 40 to watch. He is a recipient of the National Science Foundation CAREER Award, a Young Investigator Award from the Air Force Office of Scientific Research, and a machine learning research award from Amazon. His research on hashing inner products won the Best Paper Award at NIPS 2014, and his work on representing graphs won the Best Paper Award at IEEE/ACM ASONAM 2014. Anshumali finished his Ph.D. in 2015 at Cornell University.
More from the Same Authors
- 2023 Oral: Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
  Zichang Liu · Jue Wang · Tri Dao · Tianyi Zhou · Binhang Yuan · Zhao Song · Anshumali Shrivastava · Ce Zhang · Yuandong Tian · Christopher Re · Beidi Chen
- 2023 Poster: Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
  Zichang Liu · Jue Wang · Tri Dao · Tianyi Zhou · Binhang Yuan · Zhao Song · Anshumali Shrivastava · Ce Zhang · Yuandong Tian · Christopher Re · Beidi Chen
- 2022 Poster: One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams
  Benjamin Coleman · Benito Geordie · Li Chou · R. A. Leo Elworth · Todd Treangen · Anshumali Shrivastava
- 2022 Spotlight: One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams
  Benjamin Coleman · Benito Geordie · Li Chou · R. A. Leo Elworth · Todd Treangen · Anshumali Shrivastava
- 2022 Poster: DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks
  Zhuang Wang · Zhaozhuo Xu · Xinyu Wu · Anshumali Shrivastava · T. S. Eugene Ng
- 2022 Spotlight: DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks
  Zhuang Wang · Zhaozhuo Xu · Xinyu Wu · Anshumali Shrivastava · T. S. Eugene Ng
- 2021 Poster: A Tale of Two Efficient and Informative Negative Sampling Distributions
  Shabnam Daghaghi · Tharun Medini · Nicholas Meisburger · Beidi Chen · Mengnan Zhao · Anshumali Shrivastava
- 2021 Oral: A Tale of Two Efficient and Informative Negative Sampling Distributions
  Shabnam Daghaghi · Tharun Medini · Nicholas Meisburger · Beidi Chen · Mengnan Zhao · Anshumali Shrivastava
- 2020 Poster: Sub-linear Memory Sketches for Near Neighbor Search on Streaming Data
  Benjamin Coleman · Richard Baraniuk · Anshumali Shrivastava
- 2020 Poster: Angular Visual Hardness
  Beidi Chen · Weiyang Liu · Zhiding Yu · Jan Kautz · Anshumali Shrivastava · Animesh Garg · Anima Anandkumar
- 2019 Poster: Compressing Gradient Optimizers via Count-Sketches
  Ryan Spring · Anastasios Kyrillidis · Vijai Mohan · Anshumali Shrivastava
- 2019 Oral: Compressing Gradient Optimizers via Count-Sketches
  Ryan Spring · Anastasios Kyrillidis · Vijai Mohan · Anshumali Shrivastava
- 2018 Poster: Ultra Large-Scale Feature Selection using Count-Sketches
  Amirali Aghazadeh · Ryan Spring · Daniel LeJeune · Gautam Dasarathy · Anshumali Shrivastava · Richard Baraniuk
- 2018 Oral: Ultra Large-Scale Feature Selection using Count-Sketches
  Amirali Aghazadeh · Ryan Spring · Daniel LeJeune · Gautam Dasarathy · Anshumali Shrivastava · Richard Baraniuk
- 2017 Poster: Optimal Densification for Fast and Accurate Minwise Hashing
  Anshumali Shrivastava
- 2017 Talk: Optimal Densification for Fast and Accurate Minwise Hashing
  Anshumali Shrivastava