Journal article details
BMC Genomics
Hybrid assembly with long and short reads improves discovery of gene family expansions
James Gurtowski [1]; W. Richard McCombie [1]; Robert M. Stupar [2]; Junqi Liu [2]; Li Song [3]; Nevin D. Young [4]; Peter Tiffin [4]; Peng Zhou [4]; Roxanne Denny [5]; Michael C. Schatz [6]; Jason R. Miller [7]; Kevin A. T. Silverstein [8]; Joann Mudge [9]; Thiruvarangan Ramaraj [9]; Brian P. Walenz [1,10]; Susan R. McCouch [1,11]; Namrata Singh [1,11]; Lyza G. Maron [1,11]; Hayan Lee [1,12]
[1] Cold Spring Harbor Laboratory; [2] Department of Agronomy and Plant Genetics, University of Minnesota; [3] Department of Computer Science, Johns Hopkins University; [4] Department of Plant Biology, University of Minnesota; [5] Department of Plant Pathology, University of Minnesota; [6] Departments of Computer Science and Biology, Johns Hopkins University; [7] J. Craig Venter Institute; [8] Minnesota Supercomputing Institute, University of Minnesota; [9] National Center for Genome Resources; [10] National Human Genome Research Institute; [11] School of Integrative Plant Sciences, Plant Breeding and Genetics section, Cornell University; [12] Stanford School of Medicine
Keywords: Genome assembly; Hybrid assembly pipeline; Tandem repeats; Medicago truncatula
DOI: 10.1186/s12864-017-3927-8
Source: DOAJ
【 Abstract 】

Background: Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation.

Methods: We developed a hybrid assembly pipeline called “Alpaca” that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation.

Results: Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies.

Conclusion: Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations.

【 License 】

Unknown   
