Despite the impressive capability of large language models (LLMs) in solving a range of downstream tasks, new concerns have been raised about proper performance evaluation, especially test-data leakage caused by accidentally including test data during pretraining or by indirectly exposing it through API calls made for evaluation. Motivated by these concerns, we propose a new evaluation workflow that generates steerable synthetic language datasets and proxy tasks for benchmarking the performance of pretrained LLMs on sentence classification tasks. This approach allows for a better joint characterization of the robustness and accuracy of LLMs without risking sensitive information leakage. Verified on various pretrained LLMs, the proposed approach shows a promisingly high correlation with real downstream performance.
Author Information
Ching-Yun (Irene) Ko (MIT)
Pin-Yu Chen (IBM Research)
Payel Das (IBM Research AI)
Yung-Sung Chuang (MIT CSAIL)
Hi! I'm a second-year PhD student in Electrical Engineering and Computer Science at Massachusetts Institute of Technology, where I work with Jim Glass. My research interests broadly cover deep learning techniques for natural language processing and speech processing. In particular, I aim to use machines to help people efficiently grasp large amounts of information in text and audio form. Previously, I was an undergraduate student in Electrical Engineering at National Taiwan University, where I joined the Speech Processing Lab supervised by Hung-Yi Lee and Lin-shan Lee, and the Machine Intelligence Understanding Lab supervised by Yun-Nung (Vivian) Chen. I received the NTU Presidential Award for top 5% students four times in 2018-2020, and the Irving T. Ho Memorial Scholarship in 2018 and 2019.
Luca Daniel (Massachusetts Institute of Technology)
More from the Same Authors
-
2021 : Generalizing Adversarial Training to Composite Semantic Perturbations »
Yun-Yun Tsai · Lei Hsiung · Pin-Yu Chen · Tsung-Yi Ho -
2021 : On the Effectiveness of Poisoning against Unsupervised Domain Adaptation »
Akshay Mehra · Bhavya Kailkhura · Pin-Yu Chen · Jihun Hamm -
2022 : Fast Convergence for Unstable Reinforcement Learning Problems by Logarithmic Mapping »
Wang Zhang · Lam Nguyen · Subhro Das · Alexandre Megretsky · Luca Daniel · Tsui-Wei Weng -
2022 : Protein Representation Learning by Geometric Structure Pretraining »
Zuobai Zhang · Minghao Xu · Arian Jamasb · Vijil Chenthamarakshan · Aurelie Lozano · Payel Das · Jian Tang -
2023 : Which Features are Learned by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression »
Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman -
2023 : On Robustness-Accuracy Characterization of Large Language Models using Synthetic Datasets »
Ching-Yun (Irene) Ko · Pin-Yu Chen · Payel Das · Yung-Sung Chuang · Luca Daniel -
2023 Oral: Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks »
Mohammed Nowaz Rabbani Chowdhury · Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen -
2023 Poster: MultiRobustBench: Benchmarking Robustness Against Multiple Attacks »
Sophie Dai · Saeed Mahloujifar · Chong Xiang · Vikash Sehwag · Pin-Yu Chen · Prateek Mittal -
2023 Poster: Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression »
Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman -
2023 Poster: Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data »
Yonggui Yan · Jie Chen · Pin-Yu Chen · Xiaodong Cui · Songtao Lu · Yangyang Xu -
2023 Poster: Identification of the Adversary from a Single Adversarial Example »
Minhao Cheng · Rui Min · Haochen Sun · Pin-Yu Chen -
2023 Poster: Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction »
Minghao Guo · Veronika Thost · Samuel Song · Adithya Balachandran · Payel Das · Jie Chen · Wojciech Matusik -
2023 Poster: ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction »
Wang Zhang · Lily Weng · Subhro Das · Alexandre Megretsky · Luca Daniel · Lam Nguyen -
2023 Oral: Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression »
Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman -
2023 Poster: Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks »
Mohammed Nowaz Rabbani Chowdhury · Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen -
2023 Poster: Reprogramming Pretrained Language Models for Antibody Sequence Infilling »
Igor Melnyk · Vijil Chenthamarakshan · Pin-Yu Chen · Payel Das · Amit Dhurandhar · Inkit Padhi · Devleena Das -
2022 Poster: Biological Sequence Design with GFlowNets »
Moksh Jain · Emmanuel Bengio · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Bonaventure Dossou · Chanakya Ekbote · Jie Fu · Tianyu Zhang · Michael Kilgour · Dinghuai Zhang · Lena Simine · Payel Das · Yoshua Bengio -
2022 Spotlight: Biological Sequence Design with GFlowNets »
Moksh Jain · Emmanuel Bengio · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Bonaventure Dossou · Chanakya Ekbote · Jie Fu · Tianyu Zhang · Michael Kilgour · Dinghuai Zhang · Lena Simine · Payel Das · Yoshua Bengio -
2022 Poster: Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework »
Ching-Yun (Irene) Ko · Jeet Mohapatra · Sijia Liu · Pin-Yu Chen · Luca Daniel · Lily Weng -
2022 Spotlight: Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework »
Ching-Yun (Irene) Ko · Jeet Mohapatra · Sijia Liu · Pin-Yu Chen · Luca Daniel · Lily Weng -
2021 Poster: Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design »
Yue Cao · Payel Das · Vijil Chenthamarakshan · Pin-Yu Chen · Igor Melnyk · Yang Shen -
2021 Spotlight: Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design »
Yue Cao · Payel Das · Vijil Chenthamarakshan · Pin-Yu Chen · Igor Melnyk · Yang Shen -
2020 Poster: Proper Network Interpretability Helps Adversarial Robustness in Classification »
Akhilan Boopathy · Sijia Liu · Gaoyuan Zhang · Cynthia Liu · Pin-Yu Chen · Shiyu Chang · Luca Daniel -
2019 Poster: POPQORN: Quantifying Robustness of Recurrent Neural Networks »
Ching-Yun Ko · Zhaoyang Lyu · Tsui-Wei Weng · Luca Daniel · Ngai Wong · Dahua Lin -
2019 Poster: PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach »
Tsui-Wei Weng · Pin-Yu Chen · Lam Nguyen · Mark Squillante · Akhilan Boopathy · Ivan Oseledets · Luca Daniel -
2019 Oral: PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach »
Tsui-Wei Weng · Pin-Yu Chen · Lam Nguyen · Mark Squillante · Akhilan Boopathy · Ivan Oseledets · Luca Daniel -
2019 Oral: POPQORN: Quantifying Robustness of Recurrent Neural Networks »
Ching-Yun Ko · Zhaoyang Lyu · Tsui-Wei Weng · Luca Daniel · Ngai Wong · Dahua Lin