Incorporating Importance Weighting in Optimal Transport Based Domain Alignment
Okan Koç ⋅ Alexander Soen ⋅ Shanglin Li ⋅ Masashi Sugiyama
Abstract
Domain adaptation theory studies upper bounds on the target risk in order to mitigate the performance loss that machine learning models suffer under distribution shift. In this paper, we take a closer look at the optimization of one such bound, based on optimal transport (OT), and propose several strategies that improve its optimization in practice. We first introduce *gradual shift* and *probabilistic margin* assumptions to control the incomputable entanglement term that appears in the bounds. We prove that, under these assumptions, better optimization of the computable part of the bound can translate to better target accuracy. Motivated by this fact, we tighten the bound via importance weighting of the source (output) distribution to obtain the *weighted* Wasserstein regularized risk ($\mathrm{W}^2\mathrm{R}^{2}$), which is often easier to minimize than the original bound. $\mathrm{W}^2\mathrm{R}^{2}$ is shown to be equivalent to an unbalanced OT problem, which in the limit converges to a nearest-neighbor-based alignment strategy. We highlight the trade-offs of such an approach and show that a suitably regularized $\mathrm{W}^2\mathrm{R}^{2}$ improves over the state of the art and is robust to multiple distribution shifts under different models, which further supports the validity of our assumptions.
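The link between importance weighting and nearest-neighbor alignment can be illustrated with a small sketch. It is not the authors' implementation: it assumes a squared Euclidean cost, a fixed target marginal, and a completely unconstrained source marginal, which is one simple reading of the limiting case mentioned in the abstract. The function name `semi_relaxed_ot` and the toy data are hypothetical. Under these assumptions the transport problem decouples across target points, so each target point sends all of its mass to its nearest source point, and the row sums of the plan act as importance weights on the source samples.

```python
import numpy as np

def semi_relaxed_ot(source, target, target_mass):
    """Minimize <C, P> over P >= 0 with column sums fixed to target_mass.

    The source marginal (row sums) is left free, an assumption made here
    to mimic a fully relaxed, importance-weighted source distribution.
    """
    # Squared Euclidean cost between every source and every target point.
    cost = ((source[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1)
    # With free row sums, each column is solved independently:
    # all mass of a target point goes to its cheapest (nearest) source point.
    nearest = cost.argmin(axis=0)
    plan = np.zeros_like(cost)
    plan[nearest, np.arange(target.shape[0])] = target_mass
    # Row sums are the implied importance weights on the source samples.
    weights = plan.sum(axis=1)
    return plan, weights, float((plan * cost).sum())

# Toy data (hypothetical): 5 source points, 8 shifted target points in 2-D.
rng = np.random.default_rng(0)
source = rng.normal(size=(5, 2))
target = rng.normal(loc=0.5, size=(8, 2))
target_mass = np.full(8, 1 / 8)

plan, weights, value = semi_relaxed_ot(source, target, target_mass)
```

Source points that are nearest to no target point receive weight zero in this extreme case, which hints at why some regularization of the weights is desirable in practice.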