Thesis details
A Theoretical Study of Clusterability and Clustering Quality
Ackerman, Margareta
University of Waterloo
Keywords: clustering; clustering quality; clusterability; data mining; Computer Science
Others  :  https://uwspace.uwaterloo.ca/bitstream/10012/3478/1/thesis.pdf
Switzerland|English
Source: UWSPACE Waterloo Institutional Repository
【 Abstract 】

Clustering is a widely used technique, with applications ranging from data mining, bioinformatics and image analysis to marketing, psychology, and city planning. Despite the practical importance of clustering, there is very limited theoretical analysis of the topic. We make a step towards building theoretical foundations for clustering by carrying out an abstract analysis of two central concepts in clustering: clusterability and clustering quality. We compare a number of notions of clusterability found in the literature. While all these notions attempt to measure the same property, and all appear to be reasonable, we show that they are pairwise inconsistent. In addition, we give the first computational complexity analysis of a few notions of clusterability. In the second part of the thesis, we discuss how the quality of a given clustering can be defined (and measured). Users often need to compare the quality of clusterings obtained by different methods. Perhaps more importantly, users need to determine whether a given clustering is sufficiently good for being used in further data mining analysis. We analyze what a measure of clustering quality should look like. We do that by introducing a set of requirements ("axioms") of clustering quality measures. We propose a number of clustering quality measures that satisfy these requirements.
