Thesis Details
Text Mining Biomedical Literature for Genomic Knowledge Discovery
Author: Liu, Ying
University: Georgia Institute of Technology
Department: Computing
Keywords: Text mining; Biomedical literature; Gene function; Clustering; Genomic knowledge discovery; Microarray
Others: https://smartech.gatech.edu/bitstream/1853/7242/1/liu_ying_200508_phd.pdf
United States | English
Source: SMARTech Repository
【 Abstract 】

The last decade has been marked by unprecedented growth in both the production of biomedical data and the amount of published literature discussing it. Almost every known or postulated piece of information pertaining to genes, proteins, and their roles in biological processes is reported somewhere in this vast body of published biomedical literature. We believe the ability to rapidly survey and analyze this literature and extract pertinent information constitutes a necessary step toward both the design and the interpretation of any large-scale experiment. Moreover, automated literature mining offers an as yet untapped opportunity to integrate the many fragments of information gathered by researchers from multiple fields of expertise into a complete picture exposing the interrelated roles of various genes, proteins, and chemical reactions in cells and organisms. In this thesis, we show that functional keywords in biomedical literature, particularly Medline, represent very valuable information and can be used to discover new genomic knowledge. To validate our claim, we present an investigation into text mining biomedical literature to assist microarray data analysis, yeast gene function classification, and biomedical literature categorization. We conduct the following studies:
1. We test sets of genes to discover common functional keywords among them and use these keywords to cluster the genes into groups.
2. We show that it is possible to link genes to diseases through expert human interpretation of the functional keywords for the genes; none of these diseases are as yet mentioned in public databases.
3. By clustering genes based on commonality of functional keywords, it is possible to group genes into meaningful clusters that reveal more information about their functions, links to diseases, and roles in metabolic pathways.
4. Using extracted functional keywords, we demonstrate that, for yeast genes, we can produce a better functional grouping of genes than available public microarray and phylogenetic databases provide.
5. We show an application of our approach to literature classification: using functional keywords as features, we are able to extract epidemiological abstracts automatically from Medline with higher sensitivity and accuracy than a human expert.
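The idea of grouping genes by shared functional keywords can be illustrated with a minimal sketch. It is not the method used in the thesis, whose details this record does not give. It assumes each gene already has a list of keywords extracted from its literature, weights them with a smoothed TF-IDF, and links genes whose cosine similarity reaches a threshold. The gene names, keywords, function names, and the 0.3 threshold are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: gene -> functional keywords (made up for illustration).
TOY_GENE_KEYWORDS = {
    "geneA": ["kinase", "phosphorylation", "cell cycle"],
    "geneB": ["kinase", "cell cycle", "mitosis"],
    "geneC": ["transporter", "membrane", "glucose"],
    "geneD": ["glucose", "membrane", "uptake"],
}

def tfidf_vectors(gene_keywords):
    """Return {gene: {keyword: weight}} using smoothed TF-IDF weights."""
    n_genes = len(gene_keywords)
    doc_freq = Counter()
    for keywords in gene_keywords.values():
        doc_freq.update(set(keywords))  # count each keyword once per gene
    vectors = {}
    for gene, keywords in gene_keywords.items():
        counts = Counter(keywords)
        vectors[gene] = {
            kw: tf * (math.log((1 + n_genes) / (1 + doc_freq[kw])) + 1.0)
            for kw, tf in counts.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {keyword: weight} vectors."""
    dot = sum(w * v[kw] for kw, w in u.items() if kw in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def cluster_genes(gene_keywords, threshold=0.3):
    """Single-link grouping: genes joined by a chain of similarities >= threshold."""
    vectors = tfidf_vectors(gene_keywords)
    genes = sorted(vectors)
    unassigned = set(genes)
    clusters = []
    for seed in genes:
        if seed not in unassigned:
            continue
        unassigned.discard(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            linked = [g for g in unassigned
                      if cosine(vectors[current], vectors[g]) >= threshold]
            for g in linked:
                unassigned.discard(g)
            cluster.extend(linked)
            frontier.extend(linked)
        clusters.append(sorted(cluster))
    return clusters

print(cluster_genes(TOY_GENE_KEYWORDS))
```

On the toy data, genes that share keywords such as "kinase" and "cell cycle" fall into one group, while genes with no keyword in common stay apart. A real pipeline would also need keyword extraction from Medline abstracts, which this sketch leaves out.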

【 Preview 】
Attachments
Files | Size | Format | View
Text Mining Biomedical Literature for Genomic Knowledge Discovery | 3132 KB | PDF | download
  Document metrics
  Downloads: 23   Views: 26