JOURNAL OF MULTIVARIATE ANALYSIS | Volume: 173
Good (K-means) clusterings are unique (up to small perturbations)
Article
Meila, Marina [1]
[1] Univ Washington, Dept Stat, Box 345322, Seattle, WA 98195 USA
Keywords: K-means clustering; Spectral clustering; Cluster validation; Model free; Clusterability
DOI: 10.1016/j.jmva.2018.12.008
Source: Elsevier
【 Abstract 】
If we have found a good clustering C of a data set, can we prove that C is not far from the (unknown) best clustering C^opt of these data? Perhaps surprisingly, the answer to this question is sometimes yes. This paper gives spectral bounds on the distance d(C, C^opt) for the case when goodness is measured by a quadratic cost, such as the squared distortion of K-means clustering or the Normalized Cut criterion of spectral clustering. The bounds exist only if the data admit a good, low-cost clustering. The results in this paper are non-asymptotic and model-free, in the sense that no assumptions are made on the data-generating process. The bounds do not depend on undefined constants, and can be computed tractably from the data. (C) 2019 Elsevier Inc. All rights reserved.
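The two quantities the abstract refers to can be illustrated with a minimal sketch. The abstract does not define the distance d, so the misclassification-error distance used here (the fraction of points labelled differently after the best relabeling of clusters) is an assumption chosen for illustration, and the function names and toy data are hypothetical. The sketch does not compute the paper's spectral bounds.

```python
from itertools import permutations


def squared_distortion(points, labels):
    """K-means cost: sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for p, k in zip(points, labels):
        clusters.setdefault(k, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        total += sum((p[j] - centroid[j]) ** 2 for p in members for j in range(dim))
    return total


def misclassification_distance(labels_a, labels_b, k):
    """Assumed distance d: fraction of points that disagree under the best
    one-to-one relabeling of clusters (labels must be 0..k-1).
    Brute force over permutations, so only suitable for small k."""
    n = len(labels_a)
    best_agreement = 0
    for perm in permutations(range(k)):
        agreement = sum(1 for a, b in zip(labels_a, labels_b) if a == perm[b])
        best_agreement = max(best_agreement, agreement)
    return 1.0 - best_agreement / n


# Toy data: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = [0, 0, 1, 1]    # the natural two-cluster split
other = [1, 1, 0, 1]   # differs from `good` on one point, up to relabeling

print(squared_distortion(points, good))             # 1.0
print(misclassification_distance(good, other, 2))   # 0.25
```

On this toy data the natural split has distortion 1.0, and the second labeling is at distance 0.25 from it, since one of the four points changes cluster once labels are matched.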
【 License 】
Free