Bio: Dr. Jihun Hamm has been an Associate Professor of Computer Science at Tulane University since 2019. He received his PhD from the University of Pennsylvania in 2008 under the supervision of Dr. Daniel Lee. Dr. Hamm's research interest is in machine learning, from theory to applications. He has worked on the theory and practice of robust learning, adversarial learning, privacy and security, optimization, and deep learning. Dr. Hamm also has a background in biomedical engineering and has worked on machine learning applications in medical data analysis. His work in machine learning has been published in top venues such as ICML, NeurIPS, CVPR, JMLR, and IEEE-TPAMI. His work has also been published in medical research venues such as MICCAI, MedIA, and IEEE-TMI. Among other awards, he has received the Best Paper Award from MedIA and a Google Faculty Research Award, and was a finalist for the MICCAI Young Scientist Publication Impact Award.
Title: Analyzing Transfer Learning Bounds through Distributional Robustness
Abstract: The success of transfer learning at improving performance, especially with the use of large pre-trained models, has made it an essential tool in the machine learning toolbox. However, the conditions under which performance transfers to downstream tasks are not well understood. In this talk, I will present several approaches to bounding the target-domain classification loss in terms of the distribution shift between the source and target domains. For domain adaptation/generalization problems, where the source and target tasks are the same, distribution shift as measured by the Wasserstein distance is sufficient to predict the loss bound. Furthermore, distributional robustness improves predictability (i.e., gives a smaller loss bound), which may come at the price of decreased performance. For transfer learning, where the source and target tasks are different, the distributions cannot be compared directly. We therefore propose a simple approach that transforms the source distribution (and classifier) by changing the class prior, the label space, and the feature space. This allows us to relate the loss of the downstream task (i.e., transferability) to that of the source task. The Wasserstein distance again plays an important role in the bound. I will show empirical results using state-of-the-art pre-trained models and demonstrate how factors such as task relatedness, pretraining method, and model architecture affect transferability.
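As a rough illustration of why the Wasserstein distance can control a loss gap (this is not the bound presented in the talk), the sketch below uses a standard fact: if a loss is 1-Lipschitz in its input, the difference between its expected value under two distributions is at most their 1-Wasserstein distance. Everything here is an assumption made for illustration: the scalar "margin" feature, the Gaussian source and target samples, and the helper names `wasserstein_1d`, `mean_loss`, and `hinge_loss`.

```python
import random


def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches sorted samples, so the
    distance is the mean absolute difference of the order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def mean_loss(samples, loss):
    """Average of a pointwise loss over a sample."""
    return sum(loss(s) for s in samples) / len(samples)


def hinge_loss(margin):
    """Hinge loss on a scalar margin; it is 1-Lipschitz in the margin."""
    return max(0.0, 1.0 - margin)


# Hypothetical source and target margin distributions (illustrative only).
rng = random.Random(0)
source = [rng.gauss(1.0, 1.0) for _ in range(1000)]
target = [rng.gauss(0.5, 1.2) for _ in range(1000)]

w1 = wasserstein_1d(source, target)
gap = abs(mean_loss(target, hinge_loss) - mean_loss(source, hinge_loss))

# For a 1-Lipschitz loss, the loss gap cannot exceed the distance.
print(f"W1 = {w1:.3f}, loss gap = {gap:.3f}, bound holds: {gap <= w1}")
```

The inequality holds exactly for these empirical samples because the sorted matching is a valid coupling and the hinge loss changes by at most the change in its input. Bounds for real classifiers would need additional assumptions, such as Lipschitz constants of the model and a distance defined on the joint feature-label space.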