Journal article details
PeerJ
AnnotationBustR: an R package to extract subsequences from GenBank annotations
article
Samuel R. Borstein [1], Brian C. O’Meara [1]
[1] Department of Ecology & Evolutionary Biology, University of Tennessee
Keywords: Sequence data; GenBank; ACNUC; R Package; Subsequences; DNA barcodes; Phylogenetics; mtDNA; cpDNA
DOI: 10.7717/peerj.5179
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Inra
【 Abstract 】

Background: DNA sequences are pivotal for a wide array of research in biology. Large sequence databases, like GenBank, provide an amazing resource for using DNA sequences in large-scale analyses. However, many sequence records on GenBank contain more than one gene or are portions of genomes. Inconsistencies in the way genes are annotated, and the numerous synonyms under which a single gene may be listed, pose major challenges for extracting large numbers of subsequences for comparative analysis across taxa. At present, there is no easy way to extract portions from many GenBank accessions based on annotations where gene names may vary extensively.

Results: The R package AnnotationBustR allows users to extract sequences based on GenBank annotations through the ACNUC retrieval system, given search terms of gene synonyms and accession numbers. AnnotationBustR extracts the subsequences of interest and then writes them to a FASTA file for users to employ in their research.

Conclusion: The FASTA files of extracted subsequences and the accession tables generated by AnnotationBustR allow users to quickly find and extract subsequences from GenBank accessions. These sequences can then be incorporated into various analyses, such as the construction of phylogenies to test a wide range of ecological and evolutionary hypotheses.
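The idea of extracting subsequences by matching annotation names against gene synonyms can be illustrated with a minimal sketch. This is not the AnnotationBustR API (the package is written in R and retrieves records through ACNUC); it is a Python illustration under stated assumptions. The accessions, sequences, coordinates, synonym table, and function names below are all hypothetical toy values, and no database is queried.

```python
from typing import Dict, List, Tuple

# Hypothetical toy records: accession -> (sequence, [(annotated name, start, end)]).
# Coordinates are 1-based and inclusive, as in GenBank feature tables.
Feature = Tuple[str, int, int]
RECORDS: Dict[str, Tuple[str, List[Feature]]] = {
    "ACC0001": ("ATGGCCTTAGGCATTTAA", [("COI", 1, 9), ("cytb", 10, 18)]),
    "ACC0002": ("TTGACCGGTATGCCA", [("cox1", 4, 12)]),
}

# Hypothetical synonym table: canonical locus -> names that may appear in annotations.
SYNONYMS: Dict[str, List[str]] = {
    "COI": ["COI", "cox1", "CO1"],
    "CYTB": ["cytb", "cob"],
}


def extract_subsequences(records, synonyms):
    """Group annotated subsequences by canonical locus, matching names case-insensitively."""
    lookup = {name.lower(): locus for locus, names in synonyms.items() for name in names}
    found = {locus: [] for locus in synonyms}
    for accession, (sequence, features) in records.items():
        for name, start, end in features:
            locus = lookup.get(name.lower())
            if locus is not None:
                # Convert 1-based inclusive coordinates to a Python slice.
                found[locus].append((accession, sequence[start - 1:end]))
    return found


def to_fasta(entries):
    """Format (accession, sequence) pairs as FASTA text."""
    return "".join(f">{accession}\n{sequence}\n" for accession, sequence in entries)


extracted = extract_subsequences(RECORDS, SYNONYMS)
coi_fasta = to_fasta(extracted["COI"])
print(coi_fasta, end="")
```

Here the two toy accessions annotate the same locus under different names ("COI" and "cox1"), and both end up in the same FASTA output because the lookup maps each synonym to one canonical locus.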

【 License 】

CC BY   
