Are Explanations Faithful Across Skin Tones? Assessing Foundation-Model Dermatology Classifiers
Abstract
Foundation models are increasingly used in dermatology, where saliency methods such as Grad-CAM are used to explain predictions to clinicians. Although performance disparities across skin tones are well documented, less attention has been paid to whether model explanations are equally faithful across demographic groups. We introduce the Explanation Faithfulness Gap (EFG) to quantify such disparities and operationalize it via Deletion and Insertion AUC on a diverse dermatology benchmark. Evaluating two vision foundation models (DINOv3 and PanDerm) with Grad-CAM and Grad-CAM++, we find that explanation fairness is highly sensitive to both model architecture and fidelity metric. DINOv3 exhibits a significant explanation bias whose direction reverses depending on the metric used, while PanDerm shows no statistically significant gap in explanation faithfulness, alongside smaller accuracy disparities. These results suggest that standard accuracy metrics alone may mask underlying inequities in model reasoning, and that rigorously auditing explanation faithfulness is a necessary step toward trustworthy dermatology AI systems.
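To make the measurement concrete, the following is a minimal sketch of a Deletion AUC computation and a between-group gap. It is illustrative only. The abstract does not give the formal definition of EFG, so the signed difference of group means used here is an assumption, and the names `deletion_auc`, `faithfulness_gap`, `predict_fn`, `steps`, and `baseline` are hypothetical. Under Deletion, pixels are removed in order of decreasing saliency, so a more faithful explanation should produce a faster confidence drop and a lower AUC.

```python
import numpy as np


def deletion_auc(image, saliency, predict_fn, steps=10, baseline=0.0):
    """Area under the confidence curve as the most salient pixels are removed.

    image:      array of shape (H, W) or (H, W, C)
    saliency:   array of shape (H, W), higher means more important
    predict_fn: hypothetical callable returning the target-class confidence
    """
    h, w = saliency.shape
    order = np.argsort(saliency.ravel())[::-1]  # most salient pixels first
    perturbed = image.copy()
    flat = perturbed.reshape(h * w, -1)  # view onto `perturbed`: (pixels, channels)
    scores = [predict_fn(perturbed)]
    chunk = int(np.ceil(h * w / steps))
    for start in range(0, h * w, chunk):
        flat[order[start:start + chunk]] = baseline  # delete the next chunk
        scores.append(predict_fn(perturbed))
    scores = np.asarray(scores, dtype=float)
    # Trapezoidal rule on a unit-length x-axis, treating steps as evenly spaced
    # (exact when H * W is divisible by the number of steps).
    return float(np.mean((scores[:-1] + scores[1:]) / 2))


def faithfulness_gap(scores, groups, group_a, group_b):
    """Assumed stand-in for EFG: mean score of group_a minus mean of group_b."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    return float(scores[groups == group_a].mean() - scores[groups == group_b].mean())


# Toy check: a "model" whose confidence depends only on the top-left pixel.
image = np.ones((4, 4, 1))
toy_predict = lambda x: float(x[0, 0, 0])
faithful = np.zeros((4, 4))
faithful[0, 0] = 1.0      # highlights the pixel the toy model actually uses
unfaithful = np.ones((4, 4))
unfaithful[0, 0] = 0.0    # ranks that pixel last
auc_faithful = deletion_auc(image, faithful, toy_predict, steps=4)
auc_unfaithful = deletion_auc(image, unfaithful, toy_predict, steps=4)
```

In this toy setting the saliency map that highlights the pixel the model relies on yields the lower Deletion AUC. An Insertion AUC would follow the same pattern in reverse, starting from the baseline image and restoring pixels in saliency order, with higher values indicating greater faithfulness. Because the two metrics are oriented in opposite directions, the sign of any gap has to be interpreted per metric.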