Journal Article Details
Life
A Manual Curation Strategy to Improve Genome Annotation: Application to a Set of Haloarchael Genomes
Friedhelm Pfeiffer [1], Dieter Oesterhelt [2], Hans-Peter Klenk [2], Michael W. W. Adams [2]
[1] Department of Membrane Biochemistry, Max-Planck-Institute of Biochemistry, Am Klopferspitz 18, Martinsried 82152, Germany; E-Mail
Keywords: genome annotation; Gold Standard Protein; Halobacteria; halophilic archaea; manual curation
DOI  :  10.3390/life5021427
Source: MDPI
【 Abstract 】

Genome annotation errors are a persistent problem that impedes research in the biosciences. A manual curation effort is described that attempts to produce high-quality genome annotations for a set of haloarchaeal genomes (Halobacterium salinarum and Hbt. hubeiense, Haloferax volcanii and Hfx. mediterranei, Natronomonas pharaonis and Nmn. moolapensis, Haloquadratum walsbyi strains HBSQ001 and C23, Natrialba magadii, Haloarcula marismortui and Har. hispanica, and Halohasta litchfieldiae). Genomes are checked for missing genes, start codon misassignments, and disrupted genes. Assignments of a specific function are preferably based on experimentally characterized homologs (Gold Standard Proteins). To avoid overannotation, which is a major source of database errors, we restrict annotation to only general function assignments when support for a specific substrate assignment is insufficient. This strategy results in annotations that are resistant to the plethora of errors that compromise public databases. Annotation consistency is rigorously validated for ortholog pairs from the genomes surveyed. The annotation is regularly crosschecked against the UniProt database to further improve annotations and increase the level of standardization. Enhanced genome annotations are submitted to public databases (EMBL/GenBank, UniProt), to the benefit of the scientific community. The enhanced annotations are also publicly available via HaloLex.

【 License 】

CC BY   
© 2015 by the authors; licensee MDPI, Basel, Switzerland.
