Journal Article Details
PLoS One
Identification of Single- and Multiple-Class Specific Signature Genes from Gene Expression Profiles by Group Marker Index
Kripamoy Aguan [1], Nikhil R. Pal [2], Yu-Shuen Tsai [3], I-Fang Chung [3]
[1] Department of Biotechnology & Bioinformatics, North Eastern Hill University, Shillong, India
[2] Electronics & Communication Sciences Unit, Indian Statistical Institute, Calcutta, India
[3] Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
Keywords: Gene expression; Lung and intrathoracic tumors; Leukemias; Biomarkers; Marker genes; Central nervous system; Ewing sarcoma; Microarrays
DOI: 10.1371/journal.pone.0024259
Subject classification: Medicine (general)
Source: Public Library of Science
【 Abstract 】

Informative genes from microarray data can be used to construct prediction models and to investigate biological mechanisms. Differentially expressed genes, the main targets of most gene selection methods, can be classified as single-class or multiple-class specific signature genes. Here, we present a novel gene selection algorithm based on a Group Marker Index (GMI), which is intuitive, has low computational complexity, and efficiently identifies both types of genes. Most gene selection methods identify only single-class specific signature genes and cannot easily identify multiple-class specific signature genes. Our algorithm can detect, de novo, certain conditions of multiple-class specificity of a gene, and it uses a novel non-parametric indicator to assess the discrimination ability between classes. Our method is effective even when the sample size is small and when the class sizes differ significantly. To compare effectiveness and robustness, we formulate an intuitive template-based method and use four well-known datasets. We demonstrate that our algorithm outperforms the template-based method in difficult cases with unbalanced class distributions. Moreover, the multiple-class specific genes are good biomarkers and play important roles in biological pathways. Our literature survey indicates that the proposed method identifies unique multiple-class specific marker genes (not previously reported to be related to cancer) in the Central Nervous System data. It also discovers unique biomarkers that indicate the intrinsic difference between subtypes of lung cancer. We also associate pathway information with the multiple-class specific signature genes and cross-reference it with published studies. In the leukemia data, we find that the identified genes participate in pathways directly involved in cancer development.
Our method offers a promising way to find genes that may be involved in the pathways of multiple diseases, and hence opens up the possibility of using an existing drug for other diseases as well as designing a single drug for multiple diseases.

【 License 】

CC BY   
