Generalizations and Unification of Centroid-based Clustering Methods
k-means;data mining;cluster analysis
Canas, Daniel Alberto ; Dr. Robert Funderlic, Committee Chair ; Dr. Jon Doyle, Committee Member ; Dr. Steffen Heber, Committee Member
There are many clustering methods that are referred to as k-means-like. We give the minimal necessary and sufficient components for the mechanism of the k-means (iterative and partitional) clustering method of a finite set of objects, X. Thus k-means is generalized and the methods that mimic k-means are unified. We name these k-center clustering methods. The fundamental mechanism of k-center methods exposes the usual misconceptions about k-means, such as (a) "distance" satisfies some of the properties of a mathematical metric, (b) there is a need to measure "distance" between objects in X, and (c) the centers of clusters have the same nature as the objects of X. Moreover, k-center methods have a common formula to choose or calculate centers of clusters. We characterize the convergent common objective function by expressing it in terms of (a) a distance measure for closeness between center objects and the objects in X and (b) the coherence of clusters. We give a three-object example to demonstrate the components of the formal mechanism of a k-center method. We then give examples of various known methods that belong to the class of k-center methods. We exhibit an extensive and thorough comparison of the qualitative k-modes and the numerical spherical k-means. Included are paradigm applications, a matrix environment, an understanding of the duality of a dissimilarity and similarity measure, and an understanding of normalized X and the normalized centers of subsets of X.
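As an informal illustration of the iterative, partitional mechanism described above, the following Python sketch alternates between assigning each object to its closest center and recomputing each center from its cluster. It is not the thesis's formal definition: the names `k_center`, `dist`, and `make_center`, the stopping rule (assignments stop changing), and the handling of empty clusters are all assumptions made for this example. Note that `dist` is only ever evaluated between a center and an object, never between two objects of X. The usage shown is k-modes-like, with a mismatch count as the dissimilarity and the per-attribute mode as the center.

```python
from collections import Counter


def k_center(objects, centers, dist, make_center, max_iter=100):
    """Generic assign/update loop (illustrative sketch, not the thesis's algorithm).

    dist(center, obj) measures closeness of an object to a center.
    make_center(members) builds a center from a non-empty list of objects.
    """
    centers = list(centers)  # work on a copy of the initial centers
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the closest center for every object.
        new_assignment = [
            min(range(len(centers)), key=lambda j: dist(centers[j], x))
            for x in objects
        ]
        if new_assignment == assignment:
            break  # assumed stopping rule: the partition no longer changes
        assignment = new_assignment
        # Update step: recompute each center from its current members.
        for j in range(len(centers)):
            members = [x for x, a in zip(objects, assignment) if a == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = make_center(members)
    return assignment, centers


def mismatches(center, obj):
    """Number of attributes on which the object differs from the center."""
    return sum(c != o for c, o in zip(center, obj))


def mode_center(members):
    """Per-attribute most frequent value among the cluster members."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))


# Hypothetical categorical data, chosen only for illustration.
objects = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]
assignment, centers = k_center(
    objects, [objects[0], objects[3]], mismatches, mode_center
)
print(assignment)
print(centers)
```

Swapping in a different `dist` and `make_center` pair (for example, a cosine-based dissimilarity with a normalized mean vector) would give a spherical-k-means-like method under the same loop, which is the kind of unification the abstract describes.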