Abstract:
In this paper, we present two theorems relevant to machine learning practitioners who evaluate binary classification models. The first theorem provides a formula for the maximum and minimum area under the precision-recall curve (AUCPR) attainable at a fixed area under the receiver operating characteristic curve (AUCROC), demonstrating that AUCPR can still vary even when AUCROC is high. This is particularly relevant for imbalanced datasets, where a good AUCROC does not necessarily imply a high AUCPR. The second theorem gives the converse: bounds on AUCROC for a fixed AUCPR. Our findings indicate that in certain situations, especially with imbalanced datasets, AUCPR is the more informative metric and should be prioritized over AUCROC. Additionally, we introduce a method for determining when a higher AUCROC for one model than for another implies a higher AUCPR as well, and vice versa, which streamlines model evaluation.
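As a small illustration of the first claim, the sketch below builds two hypothetical rankings of the same imbalanced dataset (2 positives, 98 negatives) that share an identical AUCROC but differ sharply in AUCPR. It is not the paper's construction or its bounds. The function names and the toy rankings are assumptions made for this example, and AUCPR is approximated here by average precision, which may differ from the estimator the paper uses.

```python
def auc_roc(labels):
    """AUCROC for a ranking given as labels sorted by descending score (no ties).

    Equals the fraction of (positive, negative) pairs in which the
    positive is ranked above the negative.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    negatives_seen = 0
    misordered_pairs = 0
    for y in labels:
        if y == 1:
            # Every negative already seen outranks this positive.
            misordered_pairs += negatives_seen
        else:
            negatives_seen += 1
    return 1.0 - misordered_pairs / (n_pos * n_neg)


def average_precision(labels):
    """Average precision, used here as an assumed stand-in for AUCPR."""
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, y in enumerate(labels, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


def ranking(positive_ranks, n):
    """Hypothetical helper: label list with positives at the given 1-based ranks."""
    return [1 if r in positive_ranks else 0 for r in range(1, n + 1)]


# Two toy rankings of 100 examples with 2 positives each.
ranking_a = ranking({1, 22}, 100)   # one positive at the top, one further down
ranking_b = ranking({11, 12}, 100)  # both positives just outside the top ten

for name, r in [("A", ranking_a), ("B", ranking_b)]:
    print(name, round(auc_roc(r), 3), round(average_precision(r), 3))
```

In both rankings, 20 of the 196 positive-negative pairs are misordered, so the AUCROC is about 0.898 in each case, while the average precision is about 0.545 for ranking A and about 0.129 for ranking B.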