Poster in Workshop: Spurious correlations, Invariance, and Stability (SCIS)

Unsupervised Learning under Latent Label Shift

Pranav Mani · Manley Roberts · Saurabh Garg · Zachary Lipton

Keywords: [ label shift ] [ topic modeling ] [ mixture proportion estimation ] [ anchor word ] [ unsupervised structure discovery ] [ deep learning ] [ unsupervised learning ] [ domain adaptation ]


Abstract: What sorts of structure might enable a learner to discover classes from unlabeled data? Traditional unsupervised learning approaches risk recovering incorrect classes based on spurious data-space similarity. In this paper, we introduce unsupervised learning under Latent Label Shift (LLS), where the label marginals $p_d(y)$ shift but the class conditionals $p(\mathbf{x}|y)$ do not. This setting suggests a new principle for identifying classes: elements that shift together across domains belong to the same true class. For finite input spaces, we establish an isomorphism between LLS and topic modeling; for continuous data, we show that if each label's support contains a separable region, analogous to an anchor word, oracle access to $p(d|\mathbf{x})$ suffices to identify $p_d(y)$ and $p_d(y|\mathbf{x})$ up to permutation. Thus motivated, we introduce a practical algorithm that leverages domain-discriminative models as follows: (i) push examples through domain discriminator $p(d|\mathbf{x})$; (ii) discretize the data by clustering examples in $p(d|\mathbf{x})$ space; (iii) perform non-negative matrix factorization on the discrete data; (iv) combine recovered $p(y|d)$ with discriminator outputs $p(d|\mathbf{x})$ to compute $p_d(y|\mathbf{x}) \; \forall d$. In semi-synthetic experiments, we show that our algorithm can use domain information to overcome a failure mode of standard unsupervised classification in which data-space similarity does not indicate true groupings.
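The sketch below illustrates one way steps (i)-(iv) could be wired together, assuming a trained domain discriminator whose outputs $p(d|\mathbf{x})$ are already computed for each example. It is not the authors' implementation: the function names (`lls_estimate`, `posterior`), the hyperparameters (`n_clusters=50`, the `nndsvda` NMF initialization), and the use of an example's cluster as a discrete surrogate for $\mathbf{x}$ in step (iv) are all illustrative assumptions.

```python
# Hypothetical sketch of the LLS pipeline, steps (ii)-(iv); not the paper's code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF

def lls_estimate(probs, domains, n_domains, n_classes, n_clusters=50, seed=0):
    """probs: (n, n_domains) discriminator outputs p(d|x); domains: (n,) domain ids."""
    # (ii) Discretize: cluster examples in p(d|x) space.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(probs)
    clusters = km.labels_

    # Cluster-by-domain counts: clusters play the role of words and domains
    # the role of documents in the topic-model analogy.
    counts = np.zeros((n_clusters, n_domains))
    np.add.at(counts, (clusters, domains), 1.0)
    p_cluster_given_d = counts / counts.sum(axis=0, keepdims=True)

    # (iii) NMF: p(cluster|d) ~ W @ H, with W ~ p(cluster|y) and H ~ p(y|d)
    # up to column scaling and label permutation.
    nmf = NMF(n_components=n_classes, init="nndsvda", max_iter=500, random_state=seed)
    W = nmf.fit_transform(p_cluster_given_d)   # (n_clusters, n_classes)
    H = nmf.components_                        # (n_classes, n_domains)
    col = W.sum(axis=0, keepdims=True)
    W, H = W / col, H * col.T                  # make columns of W distributions
    H = H / H.sum(axis=0, keepdims=True)       # H[y, d] ~ p(y|d)

    # (iv) Posterior for an example with discriminator output x_probs in
    # domain d, treating its cluster k as a discrete surrogate for x:
    # p_d(y|x) proportional to p(k|y) * p(y|d).
    def posterior(x_probs, d):
        k = km.predict(np.atleast_2d(x_probs))[0]
        unnorm = W[k] * H[:, d]
        return unnorm / unnorm.sum()

    return H, posterior
```

Discretizing via clusters in $p(d|\mathbf{x})$ space turns the continuous problem into the finite one where the topic-modeling isomorphism applies; under the anchor-word-style separability condition in the abstract, the NMF factors recover $p(y|d)$ up to permutation.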
