Abstract:
We present two theorems with direct implications for machine learning practitioners working with binary classification models. The first gives a formula for the maximum and minimum Precision-Recall AUC ($AUC_{PR}$) attainable at a fixed Receiver Operating Characteristic AUC ($AUC_{ROC}$), showing that $AUC_{PR}$ can vary widely even when $AUC_{ROC}$ is high. This matters most for imbalanced datasets, where a good $AUC_{ROC}$ does not necessarily imply a high $AUC_{PR}$. The second theorem establishes the converse: bounds on $AUC_{ROC}$ for a fixed $AUC_{PR}$. Our findings indicate that in certain situations, especially for imbalanced datasets, $AUC_{PR}$ is the more informative metric to prioritize. We also introduce a method for determining when a higher $AUC_{ROC}$ for one model than for another implies a higher $AUC_{PR}$ as well, and vice versa, which streamlines model evaluation.
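To make the gap between the two metrics concrete, here is a minimal sketch on a hand-built imbalanced dataset. It is an illustration only and does not reproduce the theorems. The data (1000 negatives, 10 positives), the helper names `roc_auc` and `average_precision`, and the use of average precision as a stand-in estimator for $AUC_{PR}$ are all assumptions made for this example; the paper's own definition of $AUC_{PR}$ may differ.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct pair
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive.

    Used here as one common estimator of PR AUC (assumes no tied scores).
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical data: 1000 negatives with scores 0..999, and 10 positives
# placed just above score 949, so each positive outranks 950 negatives
# but sits below the 50 highest-scoring negatives.
neg_scores = [float(i) for i in range(1000)]
pos_scores = [949.0 + (j + 0.5) / 10 for j in range(10)]

labels = [0] * len(neg_scores) + [1] * len(pos_scores)
scores = neg_scores + pos_scores

roc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC AUC = {roc:.3f}, average precision = {ap:.3f}")
```

On this constructed example the ROC AUC is 0.95 while the average precision is just under 0.10, because the 50 negatives ranked above every positive dominate precision when positives make up only about 1% of the data.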