Journal Article Details
BMC Bioinformatics
Interactive visual exploration and refinement of cluster assignments
Software
Nils Gehlenborg [1], Chris R. Johnson [2], Alexander Lex [2], Michael Kern [3]
[1] Department of Biomedical Informatics, Harvard Medical School, 02115, Boston, USA; [2] Scientific Computing and Imaging Institute, University of Utah, 72 South Central Campus Drive, 84112, Salt Lake City, USA; [3] Department of Informatics, Technical University of Munich, 85747, Garching bei München, Germany
Keywords: Cluster analysis; Visualization; Biology visualization; Omics data
DOI: 10.1186/s12859-017-1813-7
Received 2017-04-07, accepted 2017-08-29, published 2017
Source: Springer
【 Abstract 】

Background: With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don’t properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data.

Results: In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results, and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in the context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes.

Conclusions: Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.

【 License 】

CC BY   
© The Author(s) 2017
