Journal Article Details
BMC Genomics
Hybrid assembly with long and short reads improves discovery of gene family expansions
Methodology Article
James Gurtowski [1], W. Richard McCombie [1], Robert M. Stupar [2], Junqi Liu [2], Li Song [3], Peter Tiffin [4], Nevin D. Young [4], Peng Zhou [4], Roxanne Denny [5], Michael C. Schatz [6], Jason R. Miller [7], Kevin A. T. Silverstein [8], Thiruvarangan Ramaraj [9], Joann Mudge [9], Brian P. Walenz [1,10], Lyza G. Maron [1,11], Namrata Singh [1,11], Susan R. McCouch [1,11], Hayan Lee [1,12]
[1] Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
[2] Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, USA
[3] Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
[4] Department of Plant Biology, University of Minnesota, Saint Paul, MN, USA
[5] Department of Plant Pathology, University of Minnesota, St. Paul, MN, USA
[6] Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
[7] J. Craig Venter Institute, 9714 Medical Center Drive, Rockville, MD 20850, USA
[8] Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
[9] National Center for Genome Resources, Santa Fe, NM, USA
[10] National Human Genome Research Institute, Bethesda, MD, USA
[11] School of Integrative Plant Sciences, Plant Breeding and Genetics Section, Cornell University, Ithaca, NY 14850, USA
[12] Stanford School of Medicine, Stanford, CA, USA
Keywords: Genome assembly; Hybrid assembly pipeline; Tandem repeats; Medicago truncatula
DOI: 10.1186/s12864-017-3927-8
Received 2016-11-09; accepted 2017-07-06; published 2017
Source: Springer
【 Abstract 】

Background: Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation.

Methods: We developed a hybrid assembly pipeline called "Alpaca" that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation.

Results: Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies.

Conclusion: Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations.

【 License 】

CC BY
© The Author(s). 2017
