The great success of machine learning with massive amounts of data comes at the price of huge computation and storage costs for training and tuning. Recent studies on dataset condensation attempt to reduce the dependence on such massive data by synthesizing a compact training dataset. However, existing approaches have fundamental limitations in optimization due to the limited representability of synthetic datasets, as they do not account for any data regularity characteristics. To this end, we propose a novel condensation framework that generates multiple synthetic data points within a limited storage budget via efficient parameterization that exploits data regularity. We further analyze the shortcomings of existing gradient matching-based condensation methods and develop an effective optimization technique for improving the condensation of training data information. We propose a unified algorithm that drastically improves the quality of condensed data over the current state of the art on CIFAR-10, ImageNet, and Speech Commands.
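The gradient matching idea referenced in the abstract can be illustrated with a toy sketch: optimize a synthetic example so that the training gradient it induces mimics the gradient of the full real dataset across a range of model weights. The following pure-Python example does this for a one-parameter linear model with a mean-squared-error loss. It is an illustrative assumption, not the paper's actual method (which condenses images with a multi-formation parameterization); all function names, hyperparameters, and the finite-difference optimizer are choices made for this sketch.

```python
def mse_grad(w, xs, ys):
    """Gradient of the MSE loss (1/n) * sum (w*x - y)^2 w.r.t. the scalar weight w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def match_loss(w_samples, xs, ys, xs_syn, ys_syn):
    """Squared distance between real-data and synthetic-data gradients,
    averaged over a set of sampled weights."""
    return sum((mse_grad(w, xs, ys) - mse_grad(w, xs_syn, ys_syn)) ** 2
               for w in w_samples) / len(w_samples)

def condense(xs, ys, steps=5000, lr=0.001, eps=1e-5):
    """Learn a single synthetic (x, y) pair whose gradient matches the
    real-data gradient, via finite-difference gradient descent."""
    w_samples = [-1.0, 0.0, 1.0, 2.0]   # weights at which gradients are matched
    x_syn, y_syn = 0.5, 0.0             # synthetic example, arbitrary init
    for _ in range(steps):
        base = match_loss(w_samples, xs, ys, [x_syn], [y_syn])
        # one-sided finite-difference gradients w.r.t. the synthetic data
        gx = (match_loss(w_samples, xs, ys, [x_syn + eps], [y_syn]) - base) / eps
        gy = (match_loss(w_samples, xs, ys, [x_syn], [y_syn + eps]) - base) / eps
        x_syn -= lr * gx
        y_syn -= lr * gy
    return x_syn, y_syn

# Real data lies on y = 2x; for this linear model a single well-placed
# synthetic point can reproduce the real dataset's gradient signal.
xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
x_syn, y_syn = condense(xs, ys)
```

After optimization, the learned synthetic point falls on the same line as the real data (its label-to-input ratio approaches 2), and the gradient-matching loss is driven close to zero. The paper's contribution can be read as addressing what this toy setup sidesteps: how to parameterize many synthetic examples under a storage budget, and how to optimize the matching objective effectively on real networks.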
Author Information
Jang-Hyun Kim (Seoul National University)
Jinuk Kim (Seoul National University)
Seong Joon Oh (AI Lab, Naver)
Sangdoo Yun (Clova AI Research, NAVER Corp.)
Hwanjun Song (NAVER AI Lab)
Joonhyun Jeong (Clova Image Vision, NAVER Corp.)
Jung-Woo Ha (NAVER AI Lab)

Jung-Woo Ha received his BS and PhD degrees in computer science from Seoul National University in 2004 and 2015, respectively. He received the Fall 2014 Outstanding PhD Dissertation Award from the Department of Computer Science at Seoul National University. He worked as a research scientist and tech lead at NAVER LABS and as research head of NAVER CLOVA. Currently, he is the head of NAVER AI Lab at NAVER Cloud. He has contributed to the AI research community as Datasets and Benchmarks Co-chair for NeurIPS and as Social Co-chair for ICML 2023 and NeurIPS 2022. He has also served on senior technical program committees, including as Area Chair for NeurIPS 2022 and 2023, Area Chair for ICML 2023, and Senior Area Chair for COLING. His research interests include large language models, generative models, multimodal representation learning, and their practical applications to real-world problems. In particular, he has mainly focused on practical task definitions and evaluation protocols for continual learning in various domains.
Hyun Oh Song (Seoul National University)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Dataset Condensation via Efficient Synthetic-Data Parameterization »
  Thu, Jul 21st through Fri, Jul 22nd, Room Hall E #225
More from the Same Authors
- 2022 : SelecMix: Debiased Learning by Mixing up Contradicting Pairs »
  Inwoo Hwang · Sangjun Lee · Yunhyeok Kwak · Seong Joon Oh · Damien Teney · Jin-Hwa Kim · Byoung-Tak Zhang
- 2023 Poster: Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs »
  Michael Kirchhof · Enkelejda Kasneci · Seong Joon Oh
- 2023 Poster: Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming »
  Jinuk Kim · Yeonwoo Jeong · Deokjae Lee · Hyun Oh Song
- 2022 : Neural Architecture Search with Loss Flatness-aware Measure »
  Joonhyun Jeong · Joonsang Yu · Dongyoon Han · YoungJoon Yoo
- 2022 Poster: Dataset Condensation with Contrastive Signals »
  Saehyung Lee · Sanghyuk Chun · Sangwon Jung · Sangdoo Yun · Sungroh Yoon
- 2022 Spotlight: Dataset Condensation with Contrastive Signals »
  Saehyung Lee · Sanghyuk Chun · Sangwon Jung · Sangdoo Yun · Sungroh Yoon
- 2022 Poster: Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization »
  Deokjae Lee · Seungyong Moon · Junhyeok Lee · Hyun Oh Song
- 2022 Spotlight: Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization »
  Deokjae Lee · Seungyong Moon · Junhyeok Lee · Hyun Oh Song
- 2022 Poster: Time Is MattEr: Temporal Self-supervision for Video Transformers »
  Sukmin Yun · Jaehyung Kim · Dongyoon Han · Hwanjun Song · Jung-Woo Ha · Jinwoo Shin
- 2022 Spotlight: Time Is MattEr: Temporal Self-supervision for Video Transformers »
  Sukmin Yun · Jaehyung Kim · Dongyoon Han · Hwanjun Song · Jung-Woo Ha · Jinwoo Shin
- 2020 Poster: Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup »
  Jang-Hyun Kim · Wonho Choo · Hyun Oh Song
- 2020 Poster: Learning De-biased Representations with Biased Representations »
  Hyojin Bahng · Sanghyuk Chun · Sangdoo Yun · Jaegul Choo · Seong Joon Oh
- 2019 Poster: Learning Discrete and Continuous Factors of Data via Alternating Disentanglement »
  Yeonwoo Jeong · Hyun Oh Song
- 2019 Oral: Learning Discrete and Continuous Factors of Data via Alternating Disentanglement »
  Yeonwoo Jeong · Hyun Oh Song
- 2019 Poster: Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization »
  Seungyong Moon · Gaon An · Hyun Oh Song
- 2019 Oral: Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization »
  Seungyong Moon · Gaon An · Hyun Oh Song
- 2019 Poster: EMI: Exploration with Mutual Information »
  Hyoungseok Kim · Jaekyeom Kim · Yeonwoo Jeong · Sergey Levine · Hyun Oh Song
- 2019 Oral: EMI: Exploration with Mutual Information »
  Hyoungseok Kim · Jaekyeom Kim · Yeonwoo Jeong · Sergey Levine · Hyun Oh Song
- 2018 Poster: Efficient end-to-end learning for quantizable representations »
  Yeonwoo Jeong · Hyun Oh Song
- 2018 Oral: Efficient end-to-end learning for quantizable representations »
  Yeonwoo Jeong · Hyun Oh Song