Journal Article Details
Entropy
On Clustering Histograms with k-Means by Using Mixed α-Divergences
Frank Nielsen [2], Richard Nock [1]
[1] NICTA and The Australian National University, Locked Bag 9013, Alexandria NSW 1435, Australia
[2] Sony Computer Science Laboratories, Inc., Tokyo 141-0022, Japan
Keywords: bag-of-X; α-divergence; Jeffreys divergence; centroid; k-means clustering; k-means seeding
DOI: 10.3390/e16063273
Source: MDPI
【 Abstract 】

Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences.
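
As an illustration of the distortion measures named in the abstract, the following minimal sketch evaluates an α-divergence between two positive histograms and a mixed divergence that weights the two sides. The formula is the standard α-divergence for positive arrays with α ≠ ±1. The function names, the argument order of the mixed divergence, and the weight `lam` are assumptions made for this example and should be checked against the paper's own definitions.

```python
import math


def alpha_divergence(p, q, alpha):
    """Standard alpha-divergence D_alpha(p : q) for positive histograms.

    Assumes alpha != +/-1; those limit cases (Kullback-Leibler type
    divergences) are not handled in this sketch.
    """
    if abs(alpha) == 1:
        raise ValueError("alpha = +/-1 is a limit case not covered here")
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    scale = 4.0 / (1.0 - alpha * alpha)
    a = (1.0 - alpha) / 2.0
    b = (1.0 + alpha) / 2.0
    return scale * sum(
        a * pi + b * qi - (pi ** a) * (qi ** b) for pi, qi in zip(p, q)
    )


def mixed_alpha_divergence(left, x, right, lam, alpha):
    """Mixed divergence: lam * D(left : x) + (1 - lam) * D(x : right).

    The argument order is an assumption of this sketch. With
    left == right and lam == 0.5 it reduces to the symmetrized
    alpha-divergence between x and that single center.
    """
    return (lam * alpha_divergence(left, x, alpha)
            + (1.0 - lam) * alpha_divergence(x, right, alpha))


def nearest_cluster(x, centers, lam, alpha):
    """Index of the (left, right) center pair with the smallest mixed divergence to x."""
    costs = [mixed_alpha_divergence(l, x, r, lam, alpha) for l, r in centers]
    return costs.index(min(costs))


# Hypothetical two-bin histograms, used only to exercise the functions.
x = [0.25, 0.75]
centers = [([0.3, 0.7], [0.3, 0.7]), ([0.8, 0.2], [0.8, 0.2])]
label = nearest_cluster(x, centers, lam=0.5, alpha=0.0)
```

Here each cluster is represented by a pair of centers, one for each side of the mixed divergence, and `nearest_cluster` performs only the assignment step of a k-means iteration. The center update and the k-means++ style seeding described in the abstract are not covered by this sketch.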

【 License 】

CC BY   
© 2014 by the authors; licensee MDPI, Basel, Switzerland
