Poster in Workshop: ICML 2024 Workshop on Foundation Models in the Wild
GROD: Enhancing Generalization of Transformer with Out-of-Distribution Detection
Yijin Zhou · Yu Guang Wang
Keywords: [ out-of-distribution detection ] [ Transformer networks ]
Transformer networks face challenges in generalizing to Out-of-Distribution (OOD) datasets, that is, data whose distribution differs from that seen during training. Building on an OOD detection framework grounded in Probably Approximately Correct (PAC) theory, we propose the Generate Rounded OOD Data (GROD) algorithm, a novel approach to enhancing the generalization of transformer networks across a range of natural language processing (NLP) and computer vision (CV) datasets. GROD improves a transformer's ability to delineate the decision boundary of in-distribution (ID) data and to detect outliers effectively. By generating synthetic outliers and penalizing OOD misclassification within the loss function, GROD refines the model parameters and ensures robust performance. Empirical evaluations show that GROD achieves state-of-the-art (SOTA) results on NLP and CV tasks: on image classification, it reduces the SOTA FPR@95 from 21.97% to 0.12% and improves AUROC from 93.62% to 99.98%; in detecting semantic text outliers, it reduces the SOTA FPR@95 by 12.89% and improves AUROC by 2.27%. The code is available at https://anonymous.4open.science/r/GROD-OOD-Detection-with-transformers-B70F.
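The two metrics reported above, FPR@95 and AUROC, can be illustrated with a minimal sketch. It follows the common convention in OOD detection, assumed here and not stated in the abstract, that each sample receives a scalar score where a higher value means "more ID-like". The function names `fpr_at_95_tpr` and `auroc` are hypothetical, and this is not the authors' evaluation code.

```python
import numpy as np

def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID at the threshold that
    keeps about 95% of ID samples (assumes higher score = more ID-like)."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # The 5th percentile of ID scores leaves about 95% of ID samples above it.
    threshold = np.percentile(id_scores, 5)
    return float(np.mean(ood_scores >= threshold))

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores higher than a random
    OOD sample, counting ties as one half (pairwise definition of AUROC)."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Compare every ID score with every OOD score.
    diff = id_scores[:, None] - ood_scores[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

# Toy scores, made up for illustration only.
id_scores = [0.90, 0.80, 0.95]
ood_scores = [0.10, 0.20]
print(fpr_at_95_tpr(id_scores, ood_scores), auroc(id_scores, ood_scores))
```

A lower FPR@95 and a higher AUROC both indicate better separation of ID and OOD scores. The pairwise AUROC above uses memory proportional to the product of the two sample counts, so it suits small examples rather than full benchmarks.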