Poster
Optimal Estimator for Unlabeled Linear Regression
Hang Zhang · Ping Li
Keywords: [ Combinatorial Optimization ] [ Information Theory and Estimation ] [ Learning Theory ]
Abstract:
Unlabeled linear regression, or ``linear regression with an
unknown permutation'', has attracted increasing
attentions due to its applications in (e.g.,) linkage record and
de-anonymization.
However, the computation of unlabeled linear regression proves to be cumbersome and
existing algorithms typically require considerable time, especially in the
high dimensional regime. In this paper, we propose a one-step estimator which
is optimal from both the computational and the statistical aspects.
From the computational perspective, our estimator
exhibits the same order of computational complexity
as that of the oracle case (which means the regression coefficients
are known in advance and only the permutation
needs recovery).
From the statistical
perspective, when comparing with the necessary conditions for permutation recovery, our requirement on the {\it signal-to-noise ratio}
($\mathsf{SNR}$) agrees up to merely $\Omega\left(\log \log n\right)$ difference
when the stable rank of the regression coefficients $\ensuremath{\mathbf{B}}^{\natural}$ is much less than $\log n/\log \log n$. %i.e.,
Numerical experiments are also provided to
corroborate the theoretical claims.
Chat is not available.