Journal Article Details
BMC Genomics
A jackknife-like method for classification and uncertainty assessment of multi-category tumor samples using gene expression information
Methodology Article
Kelly Robbins [1], Yupeng Wang [1], Keith Bertrand [1], Wensheng Zhang [1], Romdhane Rekaya [2]
[1] Department of Animal and Dairy Science, University of Georgia, 30602, Athens, GA, USA; Department of Animal and Dairy Science, University of Georgia, 30602, Athens, GA, USA; Department of Statistics, University of Georgia, 30602, Athens, GA, USA; Institute of Bioinformatics, University of Georgia, 30602, Athens, GA, USA
Keywords: Support Vector Machine; Feature Selection; Tumor Type; Prediction Accuracy; Training Sample
DOI: 10.1186/1471-2164-11-273
Received 2009-03-10, accepted 2010-04-29, published 2010
Source: Springer
【 Abstract 】

Background: The use of gene expression profiling for the classification of human cancer tumors has been widely investigated. Previous studies were successful in distinguishing several tumor types in binary problems. As there are over a hundred types of cancer, and potentially even more subtypes, it is essential to develop multi-category methodologies for molecular classification for any meaningful practical application.

Results: A jackknife-based supervised learning method called the paired-samples test algorithm (PST), coupled with a binary classification model based on linear regression, was proposed and applied to two well-known and challenging datasets consisting of 14 (GCM dataset) and 9 (NCI60 dataset) tumor types. The results showed that the proposed method improved the prediction accuracy of the test samples for the GCM dataset, especially when the t-statistic was used in the primary feature selection. For the NCI60 dataset, the application of PST improved prediction accuracy when the number of genes used was relatively small (100 or 200). These improvements made the binary classification method more robust to the gene selection mechanism and to the number of genes used. The overall prediction accuracies were competitive with the most accurate results obtained by several previous studies on the same datasets using other methods. Furthermore, the relative confidence R(T) provided a unique insight into the sources of the uncertainty shown in the statistical classification and into the potential variants within the same tumor type.

Conclusion: We proposed a novel bagging method for the classification and uncertainty assessment of multi-category tumor samples using gene expression information. Its strengths were demonstrated in the application to two benchmark datasets.

【 License 】

© Zhang et al; licensee BioMed Central Ltd. 2010. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
