Abstract:
We show that DBSCAN can estimate the connected components of the $\lambda$-density level set $\{ x : f(x) \ge \lambda\}$ given $n$ i.i.d. samples from an unknown density $f$. We characterize the regularity of the level set boundaries using parameter $\beta > 0$ and analyze the estimation error under the Hausdorff metric. When the data lies in $\mathbb{R}^D$ we obtain a rate of $\widetilde{O}(n^{-1/(2\beta + D)})$, which matches known lower bounds up to logarithmic factors. When the data lies on an embedded unknown $d$-dimensional manifold in $\mathbb{R}^D$, then we obtain a rate of $\widetilde{O}(n^{-1/(2\beta + d\cdot \max\{1, \beta \})})$. Finally, we provide adaptive parameter tuning in order to attain these rates with no a priori knowledge of the intrinsic dimension, density, or $\beta$.
Live content is unavailable. Log in and register to view live content