Despite the impressive capability of large language models (LLMs) in solving different downstream tasks, new concerns about proper performance evaluation have been raised, especially about test-data leakage caused by accidentally including test data during pretraining or by indirectly exposing it through API calls for evaluation. Motivated by these concerns, we propose a new evaluation workflow that generates steerable synthetic language datasets and proxy tasks for benchmarking the performance of pretrained LLMs on sentence classification tasks. This approach allows for a better joint characterization of the robustness and accuracy of LLMs without risking sensitive information leakage. Verified on various pretrained LLMs, the proposed approach demonstrates a promisingly high correlation with real downstream performance.
Author Information
Ching-Yun (Irene) Ko (MIT)
Pin-Yu Chen (IBM Research AI)
Payel Das (IBM Research AI)
Yung-Sung Chuang (MIT CSAIL)
Hi! I'm a second-year PhD student in Electrical Engineering and Computer Science at Massachusetts Institute of Technology, where I work with Jim Glass. My research interests broadly cover deep learning techniques for natural language processing and speech processing. In particular, I aim to use the ability of machines to help people grasp large amounts of information in text or audio form efficiently. Previously, I was an undergraduate student in Electrical Engineering at National Taiwan University, where I joined the Speech Processing Lab, supervised by Hung-Yi Lee and Lin-shan Lee, and the Machine Intelligence Understanding Lab, supervised by Yun-Nung (Vivian) Chen. I received the NTU Presidential Award for top 5% students four times in 2018-2020 and the Irving T. Ho Memorial Scholarship in 2018 and 2019.
Luca Daniel (Massachusetts Institute of Technology)
More from the Same Authors
-
2022 : Fast Convergence for Unstable Reinforcement Learning Problems by Logarithmic Mapping »
Wang Zhang · Lam Nguyen · Subhro Das · Alexandre Megretsky · Luca Daniel · Tsui-Wei Weng -
2022 : Protein Representation Learning by Geometric Structure Pretraining »
Zuobai Zhang · Minghao Xu · Arian Jamasb · Vijil Chenthamarakshan · Aurelie Lozano · Payel Das · Jian Tang -
2023 : Which Features are Learned by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression »
Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman -
2023 : On Robustness-Accuracy Characterization of Large Language Models using Synthetic Datasets »
Ching-Yun (Irene) Ko · Pin-Yu Chen · Payel Das · Yung-Sung Chuang · Luca Daniel -
2023 Workshop: 2nd ICML Workshop on New Frontiers in Adversarial Machine Learning »
Sijia Liu · Pin-Yu Chen · Dongxiao Zhu · Eric Wong · Kathrin Grosse · Baharan Mirzasoleiman · Sanmi Koyejo -
2023 Poster: Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction »
Minghao Guo · Veronika Thost · Samuel Song · Adithya Balachandran · Payel Das · Jie Chen · Wojciech Matusik -
2023 Poster: ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction »
Wang Zhang · Lily Weng · Subhro Das · Alexandre Megretsky · Luca Daniel · Lam Nguyen -
2023 Poster: Reprogramming Pretrained Language Models for Antibody Sequence Infilling »
Igor Melnyk · Vijil Chenthamarakshan · Pin-Yu Chen · Payel Das · Amit Dhurandhar · Inkit Padhi · Devleena Das -
2022 Workshop: New Frontiers in Adversarial Machine Learning »
Sijia Liu · Pin-Yu Chen · Dongxiao Zhu · Eric Wong · Kathrin Grosse · Hima Lakkaraju · Sanmi Koyejo -
2022 Poster: Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning »
Momin Abbas · Quan Xiao · Lisha Chen · Pin-Yu Chen · Tianyi Chen -
2022 Poster: Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness »
Tianlong Chen · Huan Zhang · Zhenyu Zhang · Shiyu Chang · Sijia Liu · Pin-Yu Chen · Zhangyang “Atlas” Wang -
2022 Spotlight: Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning »
Momin Abbas · Quan Xiao · Lisha Chen · Pin-Yu Chen · Tianyi Chen -
2022 Spotlight: Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness »
Tianlong Chen · Huan Zhang · Zhenyu Zhang · Shiyu Chang · Sijia Liu · Pin-Yu Chen · Zhangyang “Atlas” Wang -
2022 Poster: Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling »
Hongkang Li · Meng Wang · Sijia Liu · Pin-Yu Chen · Jinjun Xiong -
2022 Poster: Biological Sequence Design with GFlowNets »
Moksh Jain · Emmanuel Bengio · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Bonaventure Dossou · Chanakya Ekbote · Jie Fu · Tianyu Zhang · Michael Kilgour · Dinghuai Zhang · Lena Simine · Payel Das · Yoshua Bengio -
2022 Spotlight: Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling »
Hongkang Li · Meng Wang · Sijia Liu · Pin-Yu Chen · Jinjun Xiong -
2022 Spotlight: Biological Sequence Design with GFlowNets »
Moksh Jain · Emmanuel Bengio · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Bonaventure Dossou · Chanakya Ekbote · Jie Fu · Tianyu Zhang · Michael Kilgour · Dinghuai Zhang · Lena Simine · Payel Das · Yoshua Bengio -
2022 Poster: Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework »
Ching-Yun (Irene) Ko · Jeet Mohapatra · Sijia Liu · Pin-Yu Chen · Luca Daniel · Lily Weng -
2022 Spotlight: Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework »
Ching-Yun (Irene) Ko · Jeet Mohapatra · Sijia Liu · Pin-Yu Chen · Luca Daniel · Lily Weng -
2021 Poster: CRFL: Certifiably Robust Federated Learning against Backdoor Attacks »
Chulin Xie · Minghao Chen · Pin-Yu Chen · Bo Li -
2021 Spotlight: CRFL: Certifiably Robust Federated Learning against Backdoor Attacks »
Chulin Xie · Minghao Chen · Pin-Yu Chen · Bo Li -
2021 Poster: Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design »
yue cao · Payel Das · Vijil Chenthamarakshan · Pin-Yu Chen · Igor Melnyk · Yang Shen -
2021 Spotlight: Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design »
yue cao · Payel Das · Vijil Chenthamarakshan · Pin-Yu Chen · Igor Melnyk · Yang Shen -
2021 Poster: Voice2Series: Reprogramming Acoustic Models for Time Series Classification »
Huck Yang · Yun-Yun Tsai · Pin-Yu Chen -
2021 Spotlight: Voice2Series: Reprogramming Acoustic Models for Time Series Classification »
Huck Yang · Yun-Yun Tsai · Pin-Yu Chen -
2020 Poster: Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing »
Sanghamitra Dutta · Dennis Wei · Hazar Yueksel · Pin-Yu Chen · Sijia Liu · Kush Varshney -
2020 Poster: Proper Network Interpretability Helps Adversarial Robustness in Classification »
Akhilan Boopathy · Sijia Liu · Gaoyuan Zhang · Cynthia Liu · Pin-Yu Chen · Shiyu Chang · Luca Daniel -
2020 Poster: Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources »
Yun Yun Tsai · Pin-Yu Chen · Tsung-Yi Ho -
2020 Poster: Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case »
shuai zhang · Meng Wang · Sijia Liu · Pin-Yu Chen · Jinjun Xiong -
2019 Poster: Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications »
Pin-Yu Chen · Lingfei Wu · Sijia Liu · Indika Rajapakse -
2019 Poster: POPQORN: Quantifying Robustness of Recurrent Neural Networks »
CHING-YUN KO · Zhaoyang Lyu · Tsui-Wei Weng · Luca Daniel · Ngai Wong · Dahua Lin -
2019 Poster: PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach »
Tsui-Wei Weng · Pin-Yu Chen · Lam Nguyen · Mark Squillante · Akhilan Boopathy · Ivan Oseledets · Luca Daniel -
2019 Oral: Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications »
Pin-Yu Chen · Lingfei Wu · Sijia Liu · Indika Rajapakse -
2019 Oral: PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach »
Tsui-Wei Weng · Pin-Yu Chen · Lam Nguyen · Mark Squillante · Akhilan Boopathy · Ivan Oseledets · Luca Daniel -
2019 Oral: POPQORN: Quantifying Robustness of Recurrent Neural Networks »
CHING-YUN KO · Zhaoyang Lyu · Tsui-Wei Weng · Luca Daniel · Ngai Wong · Dahua Lin