Journal Article Details
Bioscience of Microbiota, Food and Health
DFAST and DAGA: web-based integrated genome annotation tools and resources
Yasukazu NAKAMURA [2], Takatomo FUJISAWA [2], Yasuhiro TANIZAWA [1], Masanori ARITA [2], Eli KAMINUMA [2]
[1]Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8561, Japan
[2]Center for Information Biology, National Institute of Genetics, Shizuoka 411-8540, Japan
Keywords: lactic acid bacteria; genome; annotation; database; Lactobacillus; Pediococcus
DOI: 10.12938/bmfh.16-003
Subject classification: Biological sciences (general)
Source: Nihon Bifizusukin Senta / Japan Bifidus Foundation
【 Abstract 】
Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have long been a problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission can be completed seamlessly online. The online workspace should be especially useful for users unfamiliar with bioinformatics. In addition, we have developed a genome repository, the DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and the Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess taxonomic position from genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power in determining whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii; the variation between these subgroups exceeds that permitted by the widely accepted 95% ANI threshold used to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
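
The ANI-based check described in the abstract can be illustrated with a minimal Python sketch. It is not DFAST's implementation: the function names, the data layout, and every ANI value below are hypothetical and were invented for illustration. Only the 95% species threshold comes from the abstract. The sketch assumes that pairwise ANI values against reference genomes have already been computed by some external tool.

```python
from typing import Dict, List, Tuple

# Species-level cutoff in percent; the 95% figure is the one cited in the abstract.
ANI_SPECIES_THRESHOLD = 95.0


def same_species(ani_percent: float, threshold: float = ANI_SPECIES_THRESHOLD) -> bool:
    """Return True when a pairwise ANI value meets the species threshold."""
    return ani_percent >= threshold


def flag_mislabeled(
    labels: Dict[str, str],
    ani_to_reference: Dict[Tuple[str, str], float],
) -> List[Tuple[str, str, str]]:
    """List genomes whose best-matching reference species differs from their label.

    labels: genome id -> species name recorded in the database (hypothetical layout)
    ani_to_reference: (genome id, reference species) -> ANI in percent
    Returns (genome id, labeled species, suggested species) tuples.
    """
    # Keep only the highest-ANI reference for each genome.
    best: Dict[str, Tuple[str, float]] = {}
    for (genome, species), ani in ani_to_reference.items():
        if genome not in best or ani > best[genome][1]:
            best[genome] = (species, ani)

    flagged = []
    for genome, (species, ani) in sorted(best.items()):
        # Flag only when the best hit clears the threshold and contradicts the label.
        if same_species(ani) and labels.get(genome) != species:
            flagged.append((genome, labels.get(genome, "unlabeled"), species))
    return flagged


# Placeholder genomes and ANI values, invented purely for illustration.
labels = {"g1": "species A", "g2": "species A"}
ani = {
    ("g1", "species A"): 98.7,
    ("g1", "species B"): 78.2,
    ("g2", "species A"): 80.1,
    ("g2", "species B"): 97.9,
}
print(flag_mislabeled(labels, ani))
```

In this made-up input, genome g2 is labeled species A but its best reference match is species B at an ANI above the threshold, so it is the only genome reported. A genome whose best match falls below 95% is left unflagged, since the sketch makes no claim about genomes that match no reference at species level.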