Timezone: »
Poster
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Zichang Liu · Jue Wang · Tri Dao · Tianyi Zhou · Binhang Yuan · Zhao Song · Anshumali Shrivastava · Ce Zhang · Yuandong Tian · Christopher Re · Beidi Chen
Large language models (LLMs) with hundreds of billions of parameters have sparked a new wave of exciting AI applications. However, they are computationally expensive at inference time. Sparsity is a natural approach to reduce this cost, but existing methods either require costly retraining, have to forgo LLM's in-context learning ability, or do not yield wall-clock time speedup on modern hardware. We hypothesize that contextual sparsity, which are small, input-dependent sets of attention heads and MLP parameters that yield approximately the same output as the dense model for a given input, can address these issues. We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising LLM's quality or in-context learning ability. Based on these insights, we propose DejaVu, a system that uses a low-cost algorithm to predict contextual sparsity on the fly given inputs to each layer, along with an asynchronous and hardware-aware implementation that speeds up LLM inference. We validate that DejaVu can reduce the inference latency of OPT-175B by over 2$\times$ compared to the state-of-the-art FasterTransformer, and over 6$\times$ compared to the widely used Hugging Face implementation, without compromising model quality. The code is available at https://github.com/FMInference/DejaVu.
Author Information
Zichang Liu (Rice University)
Jue Wang (ETH Zürich)
Tri Dao (Stanford)
Tianyi Zhou
Binhang Yuan (Swiss Federal Institute of Technology)
Zhao Song (Adobe Research)
Anshumali Shrivastava (Rice University)
Anshumali Shrivastava is an associate professor in the computer science department at Rice University. His broad research interests include randomized algorithms for large-scale machine learning. In 2018, Science news named him one of the Top-10 scientists under 40 to watch. He is a recipient of National Science Foundation CAREER Award, a Young Investigator Award from Air Force Office of Scientific Research, and machine learning research award from Amazon. His research on hashing inner products has won Best Paper Award at NIPS 2014 while his work on representing graphs got the Best Paper Award at IEEE/ACM ASONAM 2014. Anshumali finished his Ph.D. in 2015 from Cornell University.
Ce Zhang (ETH Zurich)
Yuandong Tian (Facebook AI Research)
Christopher Re (Stanford University)
Beidi Chen (CMU / FAIR)
Related Events (a corresponding poster, oral, or spotlight)
-
2023 Oral: Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time »
Fri. Jul 28th 01:48 -- 01:56 AM Room Ballroom A
More from the Same Authors
-
2021 : A Standardized Data Collection Toolkit for Model Benchmarking »
Avanika Narayan · Piero Molino · Karan Goel · Christopher Re -
2022 : BARACK: Partially Supervised Group Robustness With Guarantees »
Nimit Sohoni · Maziar Sanjabi · Nicolas Ballas · Aditya Grover · Shaoliang Nie · Hamed Firooz · Christopher Re -
2022 : Contrastive Adapters for Foundation Model Group Robustness »
Michael Zhang · Christopher Re -
2022 : The Importance of Background Information for Out of Distribution Generalization »
Jupinder Parmar · Khaled Saab · Brian Pogatchnik · Daniel Rubin · Christopher Ré -
2022 : Transform Once: Efficient Operator Learning in Frequency Domain »
Michael Poli · Stefano Massaroli · Federico Berto · Jinkyoo Park · Tri Dao · Christopher Re · Stefano Ermon -
2023 : Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer »
Yuandong Tian · Yiping Wang · Beidi Chen · Simon Du -
2023 : Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models »
Mayee Chen · Nicholas Roberts · Kush Bhatia · Jue Wang · Ce Zhang · Frederic Sala · Christopher Ré -
2023 : Prospectors: Leveraging Short Contexts to Mine Salient Objects in High-dimensional Imagery »
Gautam Machiraju · Arjun Desai · James Zou · Christopher Re · Parag Mallick -
2023 : Accelerating LLM Inference with Staged Speculative Decoding »
Benjamin F Spector · Christopher Re -
2023 : Towards Structured Sparsity in Transformers for Efficient Inference »
Harry Dong · Beidi Chen · Yuejie Chi -
2023 : H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models »
Zhenyu Zhang · Ying Sheng · Tianyi Zhou · Tianlong Chen · Lianmin Zheng · Ruisi Cai · Zhao Song · Yuandong Tian · Christopher Re · Clark Barrett · Zhangyang “Atlas” Wang · Beidi Chen -
2023 : GPT-Zip: Deep Compression of Finetuned Large Language Models »
Berivan Isik · Hermann Kumbong · Wanyi Ning · Xiaozhe Yao · Sanmi Koyejo · Ce Zhang -
2023 : Incremental Low-Rank Learning »
Jiawei Zhao · Yifei Zhang · Beidi Chen · Florian Schaefer · Anima Anandkumar -
2023 : Landscape Surrogate: Learning Decision Losses for Mathematical Optimization Under Partial Information »
Arman Zharmagambetov · Brandon Amos · Aaron Ferber · Taoan Huang · Bistra Dilkina · Yuandong Tian -
2023 : SurCo: Learning Linear SURrogates for COmbinatorial Nonlinear Optimization Problems »
Aaron Ferber · Taoan Huang · Daochen Zha · Martin Schubert · Benoit Steiner · Bistra Dilkina · Yuandong Tian -
2023 : Landscape Surrogate: Learning Decision Losses for Mathematical Optimization Under Partial Information »
Arman Zharmagambetov · Brandon Amos · Aaron Ferber · Taoan Huang · Bistra Dilkina · Yuandong Tian -
2023 : Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning »
Taoan Huang · Aaron Ferber · Yuandong Tian · Bistra Dilkina · Benoit Steiner -
2023 : Announcement and open discussion on DMLR (Selected members of DMLR Advisory Board) »
Ce Zhang -
2023 Workshop: DMLR Workshop: Data-centric Machine Learning Research »
Ce Zhang · Praveen Paritosh · Newsha Ardalani · Nezihe Merve Gürel · William Gaviria Rojas · Yang Liu · Rotem Dror · Manil Maskey · Lilith Bat-Leah · Tzu-Sheng Kuo · Luis Oala · Max Bartolo · Ludwig Schmidt · Alicia Parrish · Daniel Kondermann · Najoung Kim -
2023 Workshop: ES-FoMo: Efficient Systems for Foundation Models »
Julien Launay · Daniel Y Fu · Tri Dao · Daniel Hesslow · Beidi Chen · Azalia Mirhoseini · Percy Liang -
2023 : Contributed talks 2 »
Simon Du · Wei Huang · Yuandong Tian -
2023 Oral: Hyena Hierarchy: Towards Larger Convolutional Language Models »
Michael Poli · Stefano Massaroli · Eric Nguyen · Daniel Y Fu · Tri Dao · Stephen Baccus · Yoshua Bengio · Stefano Ermon · Christopher Re -
2023 Poster: SurCo: Learning Linear SURrogates for COmbinatorial Nonlinear Optimization Problems »
Aaron Ferber · Taoan Huang · Daochen Zha · Martin Schubert · Benoit Steiner · Bistra Dilkina · Yuandong Tian -
2023 Poster: Federated Adversarial Learning: A Framework with Convergence Analysis »
Xiaoxiao Li · Zhao Song · Jiaming Yang -
2023 Poster: Simple Hardware-Efficient Long Convolutions for Sequence Modeling »
Daniel Y Fu · Elliot L Epstein · Eric Nguyen · Armin Thomas · Michael Zhang · Tri Dao · Atri Rudra · Christopher Re -
2023 Poster: Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability »
Zhao Song · Yitan Wang · Zheng Yu · Lichen Zhang -
2023 Poster: FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU »
Ying Sheng · Lianmin Zheng · Binhang Yuan · Zhuohan Li · Max Ryabinin · Beidi Chen · Percy Liang · Christopher Re · Ion Stoica · Ce Zhang -
2023 Oral: FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU »
Ying Sheng · Lianmin Zheng · Binhang Yuan · Zhuohan Li · Max Ryabinin · Beidi Chen · Percy Liang · Christopher Re · Ion Stoica · Ce Zhang -
2023 Poster: Hyena Hierarchy: Towards Larger Convolutional Language Models »
Michael Poli · Stefano Massaroli · Eric Nguyen · Daniel Y Fu · Tri Dao · Stephen Baccus · Yoshua Bengio · Stefano Ermon · Christopher Re -
2023 Poster: CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks »
Jue Wang · Yucheng Lu · Binhang Yuan · Beidi Chen · Percy Liang · Chris De Sa · Christopher Re · Ce Zhang -
2023 Poster: Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing »
Aditya Desai · Keren Zhou · Anshumali Shrivastava -
2023 Poster: A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee »
Zhao Song · Mingquan Ye · Junze Yin · Lichen Zhang -
2023 Poster: Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning »
Taoan Huang · Aaron Ferber · Yuandong Tian · Bistra Dilkina · Benoit Steiner -
2023 Poster: Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning »
Yuxin Tang · Zhimin Ding · Dimitrije Jankov · Binhang Yuan · Daniel Bourgeois · Chris Jermaine -
2023 Poster: Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance »
Zhao Song · Xin Yang · Yuanyuan Yang · Lichen Zhang -
2023 Poster: Learning Compiler Pass Orders using Coreset and Normalized Value Prediction »
Youwei Liang · Kevin Stone · Ali Shameli · Chris Cummins · Mostafa Elhoushi · Jiadong Guo · Benoit Steiner · Xiaomeng Yang · Pengtao Xie · Hugh Leather · Yuandong Tian -
2023 Poster: FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization »
Zhen WANG · Weirui Kuang · Ce Zhang · Bolin Ding · Yaliang Li -
2022 : FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness »
Tri Dao · Daniel Y Fu · Stefano Ermon · Atri Rudra · Christopher Re -
2022 Poster: It’s Raw! Audio Generation with State-Space Models »
Karan Goel · Albert Gu · Chris Donahue · Christopher Re -
2022 Oral: It’s Raw! Audio Generation with State-Space Models »
Karan Goel · Albert Gu · Chris Donahue · Christopher Re -
2022 Poster: One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams »
Benjamin Coleman · Benito Geordie · Li Chou · R. A. Leo Elworth · Todd Treangen · Anshumali Shrivastava -
2022 Poster: Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning »
Mayee Chen · Daniel Y Fu · Avanika Narayan · Michael Zhang · Zhao Song · Kayvon Fatahalian · Christopher Re -
2022 Spotlight: Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning »
Mayee Chen · Daniel Y Fu · Avanika Narayan · Michael Zhang · Zhao Song · Kayvon Fatahalian · Christopher Re -
2022 Spotlight: One-Pass Diversified Sampling with Application to Terabyte-Scale Genomic Sequence Streams »
Benjamin Coleman · Benito Geordie · Li Chou · R. A. Leo Elworth · Todd Treangen · Anshumali Shrivastava -
2022 Poster: DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks »
Zhuang Wang · Zhaozhuo Xu · Xinyu Wu · Anshumali Shrivastava · T. S. Eugene Ng -
2022 Poster: ButterflyFlow: Building Invertible Layers with Butterfly Matrices »
Chenlin Meng · Linqi Zhou · Kristy Choi · Tri Dao · Stefano Ermon -
2022 Poster: Certifying Out-of-Domain Generalization for Blackbox Functions »
Maurice Weber · Linyi Li · Boxin Wang · Zhikuan Zhao · Bo Li · Ce Zhang -
2022 Poster: Monarch: Expressive Structured Matrices for Efficient and Accurate Training »
Tri Dao · Beidi Chen · Nimit Sohoni · Arjun Desai · Michael Poli · Jessica Grogan · Alexander Liu · Aniruddh Rao · Atri Rudra · Christopher Re -
2022 Poster: Correct-N-Contrast: a Contrastive Approach for Improving Robustness to Spurious Correlations »
Michael Zhang · Nimit Sohoni · Hongyang Zhang · Chelsea Finn · Christopher Re -
2022 Oral: Correct-N-Contrast: a Contrastive Approach for Improving Robustness to Spurious Correlations »
Michael Zhang · Nimit Sohoni · Hongyang Zhang · Chelsea Finn · Christopher Re -
2022 Oral: Monarch: Expressive Structured Matrices for Efficient and Accurate Training »
Tri Dao · Beidi Chen · Nimit Sohoni · Arjun Desai · Michael Poli · Jessica Grogan · Alexander Liu · Aniruddh Rao · Atri Rudra · Christopher Re -
2022 Spotlight: DRAGONN: Distributed Randomized Approximate Gradients of Neural Networks »
Zhuang Wang · Zhaozhuo Xu · Xinyu Wu · Anshumali Shrivastava · T. S. Eugene Ng -
2022 Spotlight: Certifying Out-of-Domain Generalization for Blackbox Functions »
Maurice Weber · Linyi Li · Boxin Wang · Zhikuan Zhao · Bo Li · Ce Zhang -
2022 Spotlight: ButterflyFlow: Building Invertible Layers with Butterfly Matrices »
Chenlin Meng · Linqi Zhou · Kristy Choi · Tri Dao · Stefano Ermon -
2021 Poster: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Poster: HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections »
Ines Chami · Albert Gu · Dat P Nguyen · Christopher Re -
2021 Spotlight: HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections »
Ines Chami · Albert Gu · Dat P Nguyen · Christopher Re -
2021 Spotlight: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Poster: Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks »
Nezihe Merve Gürel · Xiangyu Qi · Luka Rimanic · Ce Zhang · Bo Li -
2021 Spotlight: Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks »
Nezihe Merve Gürel · Xiangyu Qi · Luka Rimanic · Ce Zhang · Bo Li -
2021 Poster: Mandoline: Model Evaluation under Distribution Shift »
Mayee Chen · Karan Goel · Nimit Sohoni · Fait Poms · Kayvon Fatahalian · Christopher Re -
2021 Poster: FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Analysis »
Baihe Huang · Xiaoxiao Li · Zhao Song · Xin Yang -
2021 Spotlight: Mandoline: Model Evaluation under Distribution Shift »
Mayee Chen · Karan Goel · Nimit Sohoni · Fait Poms · Kayvon Fatahalian · Christopher Re -
2021 Spotlight: FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Analysis »
Baihe Huang · Xiaoxiao Li · Zhao Song · Xin Yang -
2021 Poster: Catformer: Designing Stable Transformers via Sensitivity Analysis »
Jared Quincy Davis · Albert Gu · Krzysztof Choromanski · Tri Dao · Christopher Re · Chelsea Finn · Percy Liang -
2021 Poster: A Tale of Two Efficient and Informative Negative Sampling Distributions »
Shabnam Daghaghi · Tharun Medini · Nicholas Meisburger · Beidi Chen · Mengnan Zhao · Anshumali Shrivastava -
2021 Poster: 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed »
Hanlin Tang · Shaoduo Gan · Ammar Ahmad Awan · Samyam Rajbhandari · Conglong Li · Xiangru Lian · Ji Liu · Ce Zhang · Yuxiong He -
2021 Poster: Oblivious Sketching-based Central Path Method for Linear Programming »
Zhao Song · Zheng Yu -
2021 Spotlight: 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed »
Hanlin Tang · Shaoduo Gan · Ammar Ahmad Awan · Samyam Rajbhandari · Conglong Li · Xiangru Lian · Ji Liu · Ce Zhang · Yuxiong He -
2021 Spotlight: Catformer: Designing Stable Transformers via Sensitivity Analysis »
Jared Quincy Davis · Albert Gu · Krzysztof Choromanski · Tri Dao · Christopher Re · Chelsea Finn · Percy Liang -
2021 Spotlight: Oblivious Sketching-based Central Path Method for Linear Programming »
Zhao Song · Zheng Yu -
2021 Oral: A Tale of Two Efficient and Informative Negative Sampling Distributions »
Shabnam Daghaghi · Tharun Medini · Nicholas Meisburger · Beidi Chen · Mengnan Zhao · Anshumali Shrivastava -
2021 Poster: Evolving Attention with Residual Convolutions »
Yujing Wang · Yaming Yang · Jiangang Bai · Mingliang Zhang · Jing Bai · JING YU · Ce Zhang · Gao Huang · Yunhai Tong -
2021 Spotlight: Evolving Attention with Residual Convolutions »
Yujing Wang · Yaming Yang · Jiangang Bai · Mingliang Zhang · Jing Bai · JING YU · Ce Zhang · Gao Huang · Yunhai Tong -
2020 Poster: Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods »
Daniel Y Fu · Mayee Chen · Frederic Sala · Sarah Hooper · Kayvon Fatahalian · Christopher Re -
2020 Poster: Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript »
Fangcheng Fu · Yuzheng Hu · Yihan He · Jiawei Jiang · Yingxia Shao · Ce Zhang · Bin Cui -
2020 Poster: Sub-linear Memory Sketches for Near Neighbor Search on Streaming Data »
Benjamin Coleman · Richard Baraniuk · Anshumali Shrivastava -
2020 Poster: Angular Visual Hardness »
Beidi Chen · Weiyang Liu · Zhiding Yu · Jan Kautz · Anshumali Shrivastava · Animesh Garg · Anima Anandkumar -
2020 Poster: Meta-learning for Mixed Linear Regression »
Weihao Kong · Raghav Somani · Zhao Song · Sham Kakade · Sewoong Oh -
2020 Poster: On the Generalization Effects of Linear Transformations in Data Augmentation »
Sen Wu · Hongyang Zhang · Gregory Valiant · Christopher Re -
2019 : Networking Lunch (provided) + Poster Session »
Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki -
2019 Poster: Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations »
Tri Dao · Albert Gu · Matthew Eichhorn · Atri Rudra · Christopher Re -
2019 Poster: Learning Dependency Structures for Weak Supervision Models »
Paroma Varma · Frederic Sala · Ann He · Alexander J Ratner · Christopher Re -
2019 Poster: Distributed Learning over Unreliable Networks »
Chen Yu · Hanlin Tang · Cedric Renggli · Simon Kassing · Ankit Singla · Dan Alistarh · Ce Zhang · Ji Liu -
2019 Oral: Learning Dependency Structures for Weak Supervision Models »
Paroma Varma · Frederic Sala · Ann He · Alexander J Ratner · Christopher Re -
2019 Oral: Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations »
Tri Dao · Albert Gu · Matthew Eichhorn · Atri Rudra · Christopher Re -
2019 Oral: Distributed Learning over Unreliable Networks »
Chen Yu · Hanlin Tang · Cedric Renggli · Simon Kassing · Ankit Singla · Dan Alistarh · Ce Zhang · Ji Liu -
2019 Poster: A Kernel Theory of Modern Data Augmentation »
Tri Dao · Albert Gu · Alexander J Ratner · Virginia Smith · Christopher De Sa · Christopher Re -
2019 Poster: Compressing Gradient Optimizers via Count-Sketches »
Ryan Spring · Anastasios Kyrillidis · Vijai Mohan · Anshumali Shrivastava -
2019 Poster: DL2: Training and Querying Neural Networks with Logic »
Marc Fischer · Mislav Balunovic · Dana Drachsler-Cohen · Timon Gehr · Ce Zhang · Martin Vechev -
2019 Oral: DL2: Training and Querying Neural Networks with Logic »
Marc Fischer · Mislav Balunovic · Dana Drachsler-Cohen · Timon Gehr · Ce Zhang · Martin Vechev -
2019 Oral: Compressing Gradient Optimizers via Count-Sketches »
Ryan Spring · Anastasios Kyrillidis · Vijai Mohan · Anshumali Shrivastava -
2019 Oral: A Kernel Theory of Modern Data Augmentation »
Tri Dao · Albert Gu · Alexander J Ratner · Virginia Smith · Christopher De Sa · Christopher Re -
2018 Poster: Ultra Large-Scale Feature Selection using Count-Sketches »
Amirali Aghazadeh · Ryan Spring · Daniel LeJeune · Gautam Dasarathy · Anshumali Shrivastava · Richard Baraniuk -
2018 Poster: Representation Tradeoffs for Hyperbolic Embeddings »
Frederic Sala · Christopher De Sa · Albert Gu · Christopher Re -
2018 Oral: Representation Tradeoffs for Hyperbolic Embeddings »
Frederic Sala · Christopher De Sa · Albert Gu · Christopher Re -
2018 Oral: Ultra Large-Scale Feature Selection using Count-Sketches »
Amirali Aghazadeh · Ryan Spring · Daniel LeJeune · Gautam Dasarathy · Anshumali Shrivastava · Richard Baraniuk -
2018 Poster: Asynchronous Decentralized Parallel Stochastic Gradient Descent »
Xiangru Lian · Wei Zhang · Ce Zhang · Ji Liu -
2018 Poster: $D^2$: Decentralized Training over Decentralized Data »
Hanlin Tang · Xiangru Lian · Ming Yan · Ce Zhang · Ji Liu -
2018 Oral: $D^2$: Decentralized Training over Decentralized Data »
Hanlin Tang · Xiangru Lian · Ming Yan · Ce Zhang · Ji Liu -
2018 Oral: Asynchronous Decentralized Parallel Stochastic Gradient Descent »
Xiangru Lian · Wei Zhang · Ce Zhang · Ji Liu -
2017 Poster: Optimal Densification for Fast and Accurate Minwise Hashing »
Anshumali Shrivastava -
2017 Poster: ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning »
Hantian Zhang · Jerry Li · Kaan Kara · Dan Alistarh · Ji Liu · Ce Zhang -
2017 Talk: ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning »
Hantian Zhang · Jerry Li · Kaan Kara · Dan Alistarh · Ji Liu · Ce Zhang -
2017 Talk: Optimal Densification for Fast and Accurate Minwise Hashing »
Anshumali Shrivastava -
2017 Poster: Learning the Structure of Generative Models without Labeled Data »
Stephen Bach · Bryan He · Alexander J Ratner · Christopher Re -
2017 Talk: Learning the Structure of Generative Models without Labeled Data »
Stephen Bach · Bryan He · Alexander J Ratner · Christopher Re