Session: Spectral Methods 1
On the Spectrum of Random Features Maps of High Dimensional Data
Zhenyu Liao · Romain Couillet
Random feature maps are ubiquitous in modern statistical machine learning, where they generalize random projections by means of powerful, yet often difficult to analyze, nonlinear operators. In this paper we leverage the "concentration" phenomenon induced by random matrix theory to perform a spectral analysis of the Gram matrix of these random feature maps, here for Gaussian mixture models of simultaneously large dimension and size. Our results are instrumental to a deeper understanding of the interplay between the nonlinearity and the statistics of the data, thereby allowing for a better tuning of random feature-based techniques.
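As a hedged illustration of the objects studied here (not the authors' code), the sketch below builds ReLU random features of a two-class Gaussian mixture and inspects the eigenvalues of the resulting Gram matrix; the sizes, the ReLU choice, and the mixture means are illustrative assumptions.

```python
# A minimal sketch (not the authors' code): eigenvalue spectrum of the Gram
# matrix of ReLU random features on a two-class Gaussian mixture. The sizes,
# the ReLU nonlinearity, and the mixture means are illustrative assumptions.
import numpy as np

p, n, N = 256, 512, 1024               # data dimension, sample size, feature count
rng = np.random.default_rng(0)

# Two-class Gaussian mixture with opposite means; columns are samples.
mu = np.ones((p, 1))                   # mean of norm sqrt(p), order 1 after scaling
X = np.concatenate([+mu + rng.standard_normal((p, n // 2)),
                    -mu + rng.standard_normal((p, n // 2))], axis=1) / np.sqrt(p)

W = rng.standard_normal((N, p))        # random projection directions
Sigma = np.maximum(W @ X, 0.0)         # nonlinear random feature map (N x n)
G = Sigma.T @ Sigma / N                # Gram matrix of the features (n x n)

eigvals = np.linalg.eigvalsh(G)        # the kind of spectrum the paper studies
print(eigvals[-5:])                    # isolated eigenvalues reflect the mixture
```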
SMAC: Simultaneous Mapping and Clustering Using Spectral Decompositions
Chandrajit Bajaj · Tingran Gao · Zihang He · Qixing Huang · Zhenxiao Liang
We introduce a principled approach for simultaneous mapping and clustering (SMAC), which establishes consistent maps across heterogeneous object collections (e.g., 2D images or 3D shapes). Our approach takes as input a heterogeneous object collection and a set of maps computed between some pairs of objects, and outputs a homogeneous object clustering together with a new set of maps possessing optimal intra- and inter-cluster consistency. Our approach is based on the spectral decomposition of a data matrix storing all pairwise maps in its blocks. We additionally provide tight theoretical guarantees on the exactness of SMAC under established noise models. We also demonstrate the usefulness of the approach on synthetic and real datasets.
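A minimal sketch of the ingredients, not the authors' implementation: it assembles a data matrix whose d x d blocks are pairwise permutation maps, exact inside clusters and random across them (an assumed noise model), and inspects the leading eigenvalues of that matrix.

```python
# A minimal sketch, not the authors' implementation: spectral decomposition of a
# data matrix whose d x d blocks are pairwise permutation maps. The noise model
# (exact maps inside clusters, random maps across clusters) is assumed here.
import numpy as np

rng = np.random.default_rng(1)
d, n_per, k = 4, 10, 2                 # map size, objects per cluster, #clusters
n = n_per * k
labels = np.repeat(np.arange(k), n_per)

def random_perm(d):
    P = np.zeros((d, d))
    P[np.arange(d), rng.permutation(d)] = 1.0
    return P

M = np.zeros((n * d, n * d))           # all pairwise maps stored in blocks
for i in range(n):
    M[i*d:(i+1)*d, i*d:(i+1)*d] = np.eye(d)
    for j in range(i + 1, n):
        block = np.eye(d) if labels[i] == labels[j] else random_perm(d)
        M[i*d:(i+1)*d, j*d:(j+1)*d] = block
        M[j*d:(j+1)*d, i*d:(i+1)*d] = block.T   # inverse map = transpose

eigvals = np.linalg.eigvalsh(M)
print(np.round(eigvals[-(k * d):], 2)) # leading eigenvalues encode the clustering
```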
Spectrally Approximating Large Graphs with Smaller Graphs
Andreas Loukas · Pierre Vandergheynst
How does coarsening affect the spectrum of a general graph? We provide conditions under which the principal eigenvalues and eigenspaces of the coarsened and original graph Laplacian matrices are close. The achieved approximation is shown to depend on standard graph-theoretic properties, such as the degree and eigenvalue distributions, as well as on the ratio between the coarsened and actual graph sizes. Our results carry implications for learning methods that utilize coarsening. For the particular case of spectral clustering, they imply that coarse eigenvectors can be used to derive good-quality assignments even without refinement; this phenomenon was previously observed but lacked formal justification.
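An illustrative sketch, not the paper's exact construction: it coarsens a planted two-cluster graph by merging vertex pairs, forms the coarsened Laplacian $L_c = C L C^\top$ with an orthonormal coarsening matrix $C$, and compares the principal eigenvalues of the two graphs; the graph sizes and edge probabilities are arbitrary choices.

```python
# An illustrative sketch, not the paper's exact construction: coarsen a planted
# two-cluster graph by merging vertex pairs, form Lc = C L C^T with an
# orthonormal coarsening matrix C, and compare principal Laplacian eigenvalues.
import numpy as np

rng = np.random.default_rng(2)
m = 32                                  # vertices per cluster
n = 2 * m
same = np.equal.outer(np.arange(n) < m, np.arange(n) < m)
P = np.where(same, 0.4, 0.02)           # dense inside clusters, sparse across
A = np.triu((rng.random((n, n)) < P).astype(float), 1)
A = A + A.T
L = np.diag(A.sum(1)) - A               # combinatorial Laplacian

kc = n // 2                             # merge consecutive vertex pairs
C = np.zeros((kc, n))
for a in range(kc):
    C[a, 2 * a] = C[a, 2 * a + 1] = 1 / np.sqrt(2)
Lc = C @ L @ C.T                        # coarsened Laplacian

print(np.round(np.sort(np.linalg.eigvalsh(L))[:4], 3))
print(np.round(np.sort(np.linalg.eigvalsh(Lc))[:4], 3))  # nearby principal values
```

Because the planted cluster indicator is constant on each merged pair, its Rayleigh quotient is preserved by this coarsening, which is why the Fiedler values of the two graphs stay close in this example.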
Submodular Hypergraphs: p-Laplacians, Cheeger Inequalities and Spectral Clustering
Pan Li · Olgica Milenkovic
We introduce submodular hypergraphs, a family of hypergraphs that have different submodular weights associated with different cuts of hyperedges. Submodular hypergraphs arise in clustering applications in which higher-order structures carry relevant information. For such hypergraphs, we define the notion of p-Laplacians and derive corresponding nodal domain theorems and k-way Cheeger inequalities. We conclude with the description of algorithms for computing the spectra of 1- and 2-Laplacians that constitute the basis of new spectral hypergraph clustering methods.
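To make the generalized cut notion concrete, here is a small worked example (hypothetical weights, not the paper's code): each hyperedge $e$ carries a submodular function $w_e$ assigning a possibly different weight to each way of splitting $e$, and the classical all-or-nothing hypergraph cut is recovered as one particular submodular choice.

```python
# A small worked example of the generalized cut (hypothetical weights, not the
# paper's code): each hyperedge e carries a submodular function w_e that assigns
# a possibly different weight to each way of splitting e.
hyperedges = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4}), frozenset({0, 4})]

def all_or_nothing(e, A):
    # classical hypergraph cut: weight 1 whenever the hyperedge is split at all
    return float(0 < len(A) < len(e))

def concave_card(e, A):
    # a submodular alternative: min(|A|, |e \ A|) charges balanced splits more
    return float(min(len(A), len(e) - len(A)))

def hypergraph_cut(S, w):
    # total cut of vertex set S, summing each hyperedge's weight on S & e
    return sum(w(e, S & e) for e in hyperedges)

S = frozenset({0, 1})
print(hypergraph_cut(S, all_or_nothing))  # 2.0
print(hypergraph_cut(S, concave_card))    # 3.0: the two cut notions differ
```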
Rates of Convergence of Spectral Methods for Graphon Estimation
Jiaming Xu
This paper studies the problem of estimating the graphon function, a generative mechanism for a class of random graphs that are useful approximations to real networks. Specifically, a graph on $n$ vertices is generated such that each pair of vertices $i$ and $j$ is connected independently with probability $\rho_n \times f(x_i, x_j)$, where $x_i$ is the unknown $d$-dimensional label of vertex $i$, $f$ is an unknown symmetric function, and $\rho_n$, assumed to be $\Omega(\log n / n)$, is a scaling parameter characterizing the graph sparsity. The task is to estimate the graphon $f$ given the graph. Recent studies have identified the minimax optimal estimation error rate for $d = 1$. However, a wide gap remains between the known error rates of polynomial-time estimators and the minimax optimal error rate. We improve on the previously known error rates of polynomial-time estimators by analyzing a spectral method, namely the universal singular value thresholding (USVT) algorithm. When $f$ belongs to either a H\"{o}lder or a Sobolev space with smoothness index $\alpha$, we show that the error rate of USVT is at most $(n\rho_n)^{-2\alpha/(2\alpha+d)}$. This error rate approaches the minimax optimal error rate $\log(n\rho_n)/(n\rho_n)$ proved in prior work for $d = 1$ as $\alpha$ increases, i.e., as $f$ becomes smoother. Furthermore, when $f$ is analytic and hence infinitely differentiable, we show that the error rate of USVT is at most $\log^d(n\rho_n)/(n\rho_n)$. When $f$ is a step function, which corresponds to the stochastic block model with $k$ blocks for some $k$, the error rate of USVT is at most $k/(n\rho_n)$, which is larger than the minimax optimal error rate by at most a multiplicative factor of $k/\log k$. This coincides with the computational gap observed in community detection. A key ingredient of our analysis is to derive the eigenvalue decay rate of the edge probability matrix using piecewise polynomial approximations of the graphon function $f$.
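Since the analysis centers on USVT, here is a hedged sketch of a standard form of the algorithm (the threshold constant and the two-block test case are illustrative assumptions, not the paper's experiments): keep only the singular values of the adjacency matrix above roughly $\sqrt{n\rho_n}$ and clip the low-rank reconstruction to $[0, 1]$.

```python
# A minimal sketch of universal singular value thresholding (USVT); the
# threshold constant and the SBM test case below are illustrative choices.
import numpy as np

def usvt(A, eta=0.01):
    n = A.shape[0]
    U, s, Vt = np.linalg.svd(A)
    tau = (2 + eta) * np.sqrt(n * A.mean())   # threshold on the order of sqrt(n*rho)
    keep = s >= tau
    M = (U[:, keep] * s[keep]) @ Vt[keep]     # low-rank reconstruction
    return np.clip(M, 0.0, 1.0)               # edge probabilities live in [0, 1]

# A step-function graphon: two-block stochastic block model.
rng = np.random.default_rng(3)
n = 400
z = rng.integers(0, 2, n)
P = np.where(np.equal.outer(z, z), 0.6, 0.2)  # edge probability matrix
A = np.triu((rng.random((n, n)) < P).astype(float), 1)
A = A + A.T
print(np.sqrt(np.mean((usvt(A) - P) ** 2)))   # RMSE of the USVT estimate
```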