Timezone: »
APP: Anytime Progressive Pruning
Diganta Misra · Bharat Runwal · Tianlong Chen · Zhangyang “Atlas” Wang · Irina Rish
With the latest advances in deep learning, several methods have been investigated for optimal learning settings in scenarios where the data stream is continuous over time. However, sparse networks training in such settings have often been overlooked. In this paper, we explore the problem of training a neural network with a target sparsity in a particular case of online learning: the anytime learning at macroscale paradigm (ALMA). We propose a novel way of progressive pruning, referred to as \textit{Anytime Progressive Pruning} (APP); the proposed approach significantly outperforms the baseline dense and Anytime OSP models across multiple architectures and datasets under short, moderate, and long-sequence training. Our method, for example, shows an improvement in accuracy of $\approx 7\%$ and a reduction in the generalisation gap by $\approx 22\%$, while being $\approx 1/3$ rd the size of the dense baseline model in few-shot restricted imagenet training. The code and experiment dashboards can be accessed at \url{https://github.com/landskape-ai/Progressive-Pruning} and \url{https://wandb.ai/landskape/APP}, respectively.
Author Information
Diganta Misra (Mila)

I am a Research Masters student at MILA, Montreal with Prof Irina Rish working on continual learning and robustness. I also serve as a student researcher at Morgan Stanley where I work on information theoretic approaches on time series modality. In addition I am the founder of a non-profit research organisation called Landskape where I work on sparsity and reprogramming in active collaboration with IBM NY, VITA UT-Austin and Google Research.
Bharat Runwal (Indian Institute Of Technology(IIT) Delhi)
Tianlong Chen (University of Texas at Austin)
Zhangyang “Atlas” Wang (University of Texas at Austin)
Irina Rish (Mila/UdeM)
More from the Same Authors
-
2023 : H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models »
Zhenyu Zhang · Ying Sheng · Tianyi Zhou · Tianlong Chen · Lianmin Zheng · Ruisi Cai · Zhao Song · Yuandong Tian · Christopher Re · Clark Barrett · Zhangyang “Atlas” Wang · Beidi Chen -
2023 : Panel: How will new technologies such as foundation models/generative models/LLMs change the way we do scientific discoveries? »
Peter Melchior · Yashar Hezaveh · Megan Ansdell · Yuan-Sen Ting · David W. Hogg · Irina Rish -
2023 : Irina Rish: Backpropagation Alternatives and Scalable AI »
Irina Rish -
2023 : Atlas Wang »
Zhangyang “Atlas” Wang -
2023 Poster: Learning to Optimize Differentiable Games »
Xuxi Chen · Nelson Vadori · Tianlong Chen · Zhangyang “Atlas” Wang -
2023 Poster: Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights? »
Ruisi Cai · Zhenyu Zhang · Zhangyang “Atlas” Wang -
2023 Poster: Data Efficient Neural Scaling Law via Model Reusing »
Peihao Wang · Rameswar Panda · Zhangyang “Atlas” Wang -
2023 Oral: Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models »
Ajay Jaiswal · Shiwei Liu · Tianlong Chen · Ding · Zhangyang “Atlas” Wang -
2023 Poster: Are Large Kernels Better Teachers than Transformers for ConvNets? »
Tianjin Huang · Lu Yin · Zhenyu Zhang · Li Shen · Meng Fang · Mykola Pechenizkiy · Zhangyang “Atlas” Wang · Shiwei Liu -
2023 Poster: Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation »
Wenqing Zheng · S P Sharan · Ajay Jaiswal · Kevin Wang · Yihan Xi · Dejia Xu · Zhangyang “Atlas” Wang -
2023 Poster: Towards Constituting Mathematical Structures for Learning to Optimize »
Jialin Liu · Xiaohan Chen · Zhangyang “Atlas” Wang · Wotao Yin · HanQin Cai -
2023 Poster: Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication »
Ajay Jaiswal · Shiwei Liu · Tianlong Chen · Ding · Zhangyang “Atlas” Wang -
2023 Poster: Lowering the Pre-training Tax for Gradient-based Subset Training: A Lightweight Distributed Pre-Training Toolkit »
Yeonju Ro · Zhangyang “Atlas” Wang · Vijay Chidambaram · Aditya Akella -
2023 Poster: Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models »
Ajay Jaiswal · Shiwei Liu · Tianlong Chen · Ding · Zhangyang “Atlas” Wang -
2022 : Invited talk #8 Atlas Wang. Title: “Free Knowledge” in Chest X-rays: Contrastive Learning of Images and Their Radiomics »
Zhangyang “Atlas” Wang -
2022 Poster: Data-Efficient Double-Win Lottery Tickets from Robust Pre-training »
Tianlong Chen · Zhenyu Zhang · Sijia Liu · Yang Zhang · Shiyu Chang · Zhangyang “Atlas” Wang -
2022 Poster: Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness »
Tianlong Chen · Huan Zhang · Zhenyu Zhang · Shiyu Chang · Sijia Liu · Pin-Yu Chen · Zhangyang “Atlas” Wang -
2022 Spotlight: Data-Efficient Double-Win Lottery Tickets from Robust Pre-training »
Tianlong Chen · Zhenyu Zhang · Sijia Liu · Yang Zhang · Shiyu Chang · Zhangyang “Atlas” Wang -
2022 Spotlight: Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness »
Tianlong Chen · Huan Zhang · Zhenyu Zhang · Shiyu Chang · Sijia Liu · Pin-Yu Chen · Zhangyang “Atlas” Wang -
2022 Poster: Universality of Winning Tickets: A Renormalization Group Perspective »
William T. Redman · Tianlong Chen · Zhangyang “Atlas” Wang · Akshunna S. Dogra -
2022 Poster: VariGrow: Variational Architecture Growing for Task-Agnostic Continual Learning based on Bayesian Novelty »
Randy Ardywibowo · Zepeng Huo · Zhangyang “Atlas” Wang · Bobak Mortazavi · Shuai Huang · Xiaoning Qian -
2022 Poster: Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition »
Haotao Wang · Aston Zhang · Yi Zhu · Shuai Zheng · Mu Li · Alex Smola · Zhangyang “Atlas” Wang -
2022 Poster: Training Your Sparse Neural Network Better with Any Mask »
Ajay Jaiswal · Haoyu Ma · Tianlong Chen · Ying Ding · Zhangyang “Atlas” Wang -
2022 Oral: Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition »
Haotao Wang · Aston Zhang · Yi Zhu · Shuai Zheng · Mu Li · Alex Smola · Zhangyang “Atlas” Wang -
2022 Spotlight: Universality of Winning Tickets: A Renormalization Group Perspective »
William T. Redman · Tianlong Chen · Zhangyang “Atlas” Wang · Akshunna S. Dogra -
2022 Spotlight: Training Your Sparse Neural Network Better with Any Mask »
Ajay Jaiswal · Haoyu Ma · Tianlong Chen · Ying Ding · Zhangyang “Atlas” Wang -
2022 Spotlight: VariGrow: Variational Architecture Growing for Task-Agnostic Continual Learning based on Bayesian Novelty »
Randy Ardywibowo · Zepeng Huo · Zhangyang “Atlas” Wang · Bobak Mortazavi · Shuai Huang · Xiaoning Qian -
2022 Poster: Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets »
Tianlong Chen · Xuxi Chen · Xiaolong Ma · Yanzhi Wang · Zhangyang “Atlas” Wang -
2022 Poster: Removing Batch Normalization Boosts Adversarial Training »
Haotao Wang · Aston Zhang · Shuai Zheng · Xingjian Shi · Mu Li · Zhangyang “Atlas” Wang -
2022 Poster: Neural Implicit Dictionary Learning via Mixture-of-Expert Training »
Peihao Wang · Zhiwen Fan · Tianlong Chen · Zhangyang “Atlas” Wang -
2022 Spotlight: Removing Batch Normalization Boosts Adversarial Training »
Haotao Wang · Aston Zhang · Shuai Zheng · Xingjian Shi · Mu Li · Zhangyang “Atlas” Wang -
2022 Spotlight: Neural Implicit Dictionary Learning via Mixture-of-Expert Training »
Peihao Wang · Zhiwen Fan · Tianlong Chen · Zhangyang “Atlas” Wang -
2022 Spotlight: Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets »
Tianlong Chen · Xuxi Chen · Xiaolong Ma · Yanzhi Wang · Zhangyang “Atlas” Wang -
2021 Poster: Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm »
Mingkang Zhu · Tianlong Chen · Zhangyang “Atlas” Wang -
2021 Oral: Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm »
Mingkang Zhu · Tianlong Chen · Zhangyang “Atlas” Wang -
2021 Poster: Graph Contrastive Learning Automated »
Yuning You · Tianlong Chen · Yang Shen · Zhangyang “Atlas” Wang -
2021 Poster: Self-Damaging Contrastive Learning »
Ziyu Jiang · Tianlong Chen · Bobak Mortazavi · Zhangyang “Atlas” Wang -
2021 Oral: Graph Contrastive Learning Automated »
Yuning You · Tianlong Chen · Yang Shen · Zhangyang “Atlas” Wang -
2021 Spotlight: Self-Damaging Contrastive Learning »
Ziyu Jiang · Tianlong Chen · Bobak Mortazavi · Zhangyang “Atlas” Wang -
2021 Poster: A Unified Lottery Ticket Hypothesis for Graph Neural Networks »
Tianlong Chen · Yongduo Sui · Xuxi Chen · Aston Zhang · Zhangyang “Atlas” Wang -
2021 Poster: Efficient Lottery Ticket Finding: Less Data is More »
Zhenyu Zhang · Xuxi Chen · Tianlong Chen · Zhangyang “Atlas” Wang -
2021 Spotlight: Efficient Lottery Ticket Finding: Less Data is More »
Zhenyu Zhang · Xuxi Chen · Tianlong Chen · Zhangyang “Atlas” Wang -
2021 Spotlight: A Unified Lottery Ticket Hypothesis for Graph Neural Networks »
Tianlong Chen · Yongduo Sui · Xuxi Chen · Aston Zhang · Zhangyang “Atlas” Wang -
2021 Tutorial: Continual Learning with Deep Architectures »
Vincenzo Lomonaco · Irina Rish -
2021 : Introduction and State-of-the-art »
Irina Rish -
2020 Poster: Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training »
Xuxi Chen · Wuyang Chen · Tianlong Chen · Ye Yuan · Chen Gong · Kewei Chen · Zhangyang “Atlas” Wang -
2020 Poster: When Does Self-Supervision Help Graph Convolutional Networks? »
Yuning You · Tianlong Chen · Zhangyang “Atlas” Wang · Yang Shen -
2020 Poster: Automated Synthetic-to-Real Generalization »
Wuyang Chen · Zhiding Yu · Zhangyang “Atlas” Wang · Anima Anandkumar -
2020 Poster: Eliminating the Invariance on the Loss Landscape of Linear Autoencoders »
Reza Oftadeh · Jiayi Shen · Zhangyang “Atlas” Wang · Dylan Shell -
2020 Poster: NADS: Neural Architecture Distribution Search for Uncertainty Awareness »
Randy Ardywibowo · Shahin Boluki · Xinyu Gong · Zhangyang “Atlas” Wang · Xiaoning Qian -
2020 Poster: AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks »
Yonggan Fu · Wuyang Chen · Haotao Wang · Haoran Li · Yingyan Lin · Zhangyang “Atlas” Wang