Journal Article Details
BMC Bioinformatics
Predicting the functions of a protein from its ability to associate with other molecules
Research Article
Kamal Taha [1], Paul D. Yoo [2]
[1] Department of Electrical and Computer Engineering, Khalifa University, Abu Dhabi, United Arab Emirates; [2] Faculty of Science and Technology, Bournemouth University, Bournemouth, UK
Keywords: Gene Ontology; Semantic Relationship; Test Protein; Semantic Rule; PubMed Abstract
DOI: 10.1186/s12859-016-0882-3
Received 2015-05-12, accepted 2016-01-05, published 2016
Source: Springer
【 Abstract 】

Background: All proteins associate with other molecules, and these associated molecules are highly predictive of a protein’s potential functions. The association between a protein and a molecule can be determined from their co-occurrences in biomedical abstracts: extensive, semantically related co-occurrences of a protein’s name and a molecule’s name in the sentences of biomedical abstracts can be considered indicative of an association between the two. Dependency parsers extract textual relations from a text by determining the grammatical relations between words in a sentence, and they can be used to determine the textual relations between proteins and molecules. Despite their success, they may extract textual relations with low precision, because they do not consider the semantic relationships between terms in a sentence (i.e., they consider only the structural relationships between the terms). Moreover, they may not be well suited to complex sentences or to long-distance textual relations.

Results: We introduce an information extraction system called PPFBM that predicts the functions of unannotated proteins from the molecules that associate with these proteins. PPFBM represents each protein by the other molecules that associate with it in the abstracts referenced in the protein’s entries in reliable biological databases. It automatically extracts each co-occurrence of a protein-molecule pair that represents a semantic relationship between the pair. To this end, we present novel semantic rules that identify the semantic relationship in each co-occurrence of a protein-molecule pair using the syntactic structures of sentences and linguistic theories. PPFBM determines the functions of an unannotated protein p as follows. First, it determines the set Sr of annotated proteins that are semantically similar to p by matching the molecules representing p against those representing the annotated proteins. Then, it assigns p the functional category FC if the significance of the frequency of occurrences of Sr in abstracts associated with proteins annotated with FC differs to a statistically significant degree from the significance of the frequency of occurrences of Sr in abstracts associated with proteins annotated with all other functional categories. We evaluated the quality of PPFBM by comparing it experimentally with two other systems. The results showed marked improvement.

Conclusions: The experimental results demonstrated that PPFBM outperforms other systems that predict protein function from the textual information found within biomedical abstracts. This is because these systems do not consider the semantic relationships between terms in a sentence (i.e., they consider only the structural relationships between the terms). PPFBM’s advantage over these systems increases steadily as the number of training proteins increases. That is, PPFBM’s predictions become steadily more accurate as the set of training proteins grows, because each new set of test proteins is added to the current set of training proteins. A demo of PPFBM that annotates each input yeast protein (from SGD, the Saccharomyces Genome Database, available at http://www.yeastgenome.org/download-data/curation) with the functions of Gene Ontology terms is available at http://ecesrvr.kustar.ac.ae:8080/PPFBM/ (see the Appendix for more details about the demo).
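The two-step prediction procedure described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation. It assumes Jaccard overlap as the measure for matching molecule sets, and it replaces the statistical significance test over abstract frequencies with a plain majority vote among the similar proteins. The function names, the threshold value, and the toy proteins are all hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard overlap of two molecule-name sets (an assumed similarity measure)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similar_annotated(p_molecules, annotated, threshold=0.5):
    """Step 1: build the set Sr of annotated proteins whose molecule sets
    overlap the query protein's molecule set by at least `threshold`."""
    return {
        name
        for name, (molecules, _categories) in annotated.items()
        if jaccard(p_molecules, molecules) >= threshold
    }


def predict_category(p_molecules, annotated, threshold=0.5):
    """Step 2 (simplified): pick the functional category most common in Sr.
    The paper uses a statistical significance test here; a majority vote
    with an alphabetical tie-break stands in for it in this sketch."""
    s_r = similar_annotated(p_molecules, annotated, threshold)
    votes = Counter(cat for name in s_r for cat in annotated[name][1])
    if not votes:
        return None
    return min(votes, key=lambda cat: (-votes[cat], cat))


# Hypothetical toy data: protein name -> (associated molecules, functional categories)
toy_annotated = {
    "A": ({"ATP", "Mg2+", "glucose"}, {"kinase activity"}),
    "B": ({"ATP", "Mg2+", "ADP"}, {"kinase activity"}),
    "C": ({"DNA", "zinc"}, {"DNA binding"}),
}
toy_query = {"ATP", "Mg2+", "glucose", "ADP"}

print(predict_category(toy_query, toy_annotated))
```

In this toy setting, each protein is reduced to the set of molecule names it co-occurs with, so the quality of the prediction depends entirely on how those sets were extracted, which is the part the semantic rules in the paper address.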

【 License 】

CC BY   
© Taha and Yoo. 2016
