BMC Research Notes
CooVar: Co-occurring variant analyzer
Nansheng Chen [1], Christian Frech [1], Ismael A Vergara [2]
[1] Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, B.C., V5A 1S6, Canada
[2] GenomeDx Biosciences Inc, 1595 West 3rd Avenue, Vancouver, BC, V6J 1J8, Canada
Keywords: Deletion; Insertion; SNV; Indel; Protein-coding transcript; Sequence analysis; Genomic variation; Variant annotation; Variant effect prediction
DOI: 10.1186/1756-0500-5-615
Received 2012-07-20, accepted 2012-10-26, published 2012
【 Abstract 】

Background

Evaluating the impact of genomic variations (GV) on protein-coding transcripts is an important step in identifying variants of functional significance. Currently available programs for variant annotation depend on external databases or annotate multiple variants affecting the same transcript independently, which limits program use to organisms available in these databases or results in potentially incorrect or incomplete annotations.

Findings

We have developed CooVar (Co-occurring Variant Analyzer), a database-independent program for assessing the impact of GVs on protein-coding transcripts. CooVar takes GVs, a reference genome sequence, and protein-coding exons as input and produces annotated GVs and transcripts as output. Unlike similar programs, CooVar considers the combined impact of all GVs affecting the same transcript, which yields biologically more accurate annotations. CooVar is operated from the command line and supports the standard file formats VCF, GFF/GTF, and GVF, which makes it easy to integrate into existing computational pipelines. We have extensively tested CooVar on worm and human data sets and demonstrate that it generates correct annotations in a short amount of time.
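
To illustrate why the combined impact of co-occurring variants can differ from their independent impact, here is a minimal Python sketch. It is not CooVar's implementation: the toy coding sequence, the two SNVs, and the helper names `translate` and `apply_snvs` are all hypothetical, positions are 0-based on the forward strand, and only single-nucleotide substitutions are handled.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def apply_snvs(cds, snvs):
    """Apply (position, ref, alt) substitutions to a coding sequence."""
    seq = list(cds)
    for pos, ref, alt in snvs:
        if cds[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        seq[pos] = alt
    return "".join(seq)


# Hypothetical transcript: ATG TCA GGA TAA -> M S G *
REF_CDS = "ATGTCAGGATAA"
SNV_A = (3, "T", "C")  # alone: TCA -> CCA (Ser -> Pro)
SNV_B = (4, "C", "A")  # alone: TCA -> TAA (premature stop)

reference = translate(REF_CDS)
only_a = translate(apply_snvs(REF_CDS, [SNV_A]))
only_b = translate(apply_snvs(REF_CDS, [SNV_B]))
combined = translate(apply_snvs(REF_CDS, [SNV_A, SNV_B]))

print("reference:", reference)
print("SNV A only:", only_a)
print("SNV B only:", only_b)
print("both SNVs:", combined)
```

In this toy case, annotating the second SNV on its own would report a premature stop codon, whereas applying both SNVs to the same codon yields CAA (Gln), an amino acid substitution. A transcript-level annotator has to apply all variants before translating in order to catch such cases.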

Conclusions

CooVar is an easy-to-use and lightweight variant annotation tool that considers the combined impact of GVs on protein-coding transcripts. CooVar is freely available at http://genome.sfu.ca/projects/coovar/.

【 License 】

2012 Vergara et al.; licensee BioMed Central Ltd.
