BioData Mining
An automated framework for hypotheses generation using literature
Vida Abedi [2], Ramin Zand [1], Mohammed Yeasin [2], Fazle Elahi Faisal [2]
[1] Department of Neurology, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
[2] College of Arts and Sciences, Bioinformatics Program, Memphis University, Memphis, TN, 38152, USA
Keywords: MeSH ontology; Knowledge discovery; Hypothesis generation; Biological literature-mining; Disease model; Disease network
DOI: 10.1186/1756-0381-5-13
Received 2012-03-30, accepted 2012-07-13, published 2012
【 Abstract 】

Background

In biomedicine, exploratory studies and hypothesis generation often begin with a review of the existing literature to identify a set of factors and their associations with diseases, phenotypes, or biological processes. Many scientists are overwhelmed by the sheer volume of literature on a disease when they plan to generate a new hypothesis or study a biological phenomenon. The situation is even worse for junior investigators, who often find it difficult to formulate new hypotheses or, more importantly, to verify whether their hypothesis is consistent with the existing literature. It is a daunting task to keep abreast of so much published work and also to remember all combinations of direct and indirect associations. Fortunately, there is a growing trend of using literature mining and knowledge discovery tools in biomedical research. However, there is still a large gap between the huge amount of effort and resources invested in disease research and the little effort invested in harvesting the published knowledge. The proposed hypothesis generation framework (HGF) finds “crisp semantic associations” among entities of interest, which is a step towards bridging such gaps.

Methodology

The proposed HGF shares similar end goals with SWAN but is more holistic in nature; it was designed and implemented using scalable and efficient computational models of disease-disease interaction. The integration of mapping ontologies with latent semantic analysis is critical in capturing domain-specific direct and indirect “crisp” associations, and in making assertions about entities (for example, that disease X is associated with a set of factors Z).
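To illustrate how latent semantic analysis can surface an indirect association, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The toy corpus, the term names, the rank of 2, and the helper functions are all hypothetical, and the sketch omits the ontology mapping and any term weighting that a real pipeline would need. In the toy data, `disease_x` and `factor_c` never appear in the same abstract, but both co-occur with `factor_b`.

```python
import numpy as np

# Hypothetical toy corpus: rows are terms, columns are abstracts (1 = term occurs).
# disease_x and factor_c never co-occur, but both co-occur with factor_b.
terms = ["disease_x", "factor_b", "factor_c", "unrelated_d", "unrelated_e"]
term_doc = np.array([
    [1, 1, 0, 0, 0, 0],  # disease_x
    [1, 1, 1, 1, 0, 0],  # factor_b
    [0, 0, 1, 1, 0, 0],  # factor_c
    [0, 0, 0, 0, 1, 1],  # unrelated_d
    [0, 0, 0, 0, 1, 1],  # unrelated_e
], dtype=float)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def latent_term_vectors(matrix, rank):
    """Project terms into a rank-limited latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]


def association(vectors, a, b):
    """Cosine similarity between the rows that represent terms a and b."""
    return cosine(vectors[terms.index(a)], vectors[terms.index(b)])


# Rank 2 is an assumption chosen for this toy matrix, not a recommended setting.
latent = latent_term_vectors(term_doc, rank=2)

raw_score = association(term_doc, "disease_x", "factor_c")
latent_score = association(latent, "disease_x", "factor_c")
unrelated_score = association(latent, "disease_x", "unrelated_d")

print(f"raw co-occurrence space: {raw_score:.2f}")
print(f"latent space:            {latent_score:.2f}")
print(f"latent, unrelated term:  {unrelated_score:.2f}")
```

In the raw term-document space the two terms are orthogonal. After the rank reduction they fall along the same latent dimension, while the unrelated term stays apart. This is the sense in which the abstract speaks of indirect associations. In practice the choice of rank and weighting scheme strongly affects which associations appear.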

Results

Pilot studies were performed using two diseases. A comparative analysis of the computed “associations” and “assertions” with curated expert knowledge was performed to validate the results. It was observed that the HGF is able to capture “crisp” direct and indirect associations, and provide knowledge discovery on demand.

Conclusions

The proposed framework is fast, efficient, and robust in generating new hypotheses to identify factors associated with a disease. A fully integrated Web service application is being developed for wide dissemination of the HGF. A large-scale study by domain experts and associated researchers is underway to validate the associations and assertions computed by the HGF.

【 License 】

   
2012 Abedi et al.; licensee BioMed Central Ltd.
