| BMC Genomics | |
| What’s in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual | |
| Research Article | |
| Steven G. Schroeder1  Tad S. Sonstegard2  Juan F. Medrano3  Jeremy F. Taylor4  JaeWoo Kim4  Polyana C. Tizioto5  Robert D. Schnabel6  Lynsey K. Whitacre6  Jared E. Decker6  Leeson J. Alexander7  | |
| [1] Animal Genomics and Improvement Laboratory, USDA-ARS, 20705, Beltsville, MD, USA; Animal Genomics and Improvement Laboratory, USDA-ARS, 20705, Beltsville, MD, USA; Recombinetics Inc., 1246 University Ave W #301, 55104, St Paul, MN, USA; Department of Animal Science, University of California-Davis, 95616, Davis, CA, USA; Division of Animal Sciences, University of Missouri, 65211, Columbia, MO, USA; Division of Animal Sciences, University of Missouri, 65211, Columbia, MO, USA; Embrapa Southeast Livestock, 13560-970, São Carlos, São Paulo, Brazil; Informatics Institute, University of Missouri, 65211, Columbia, MO, USA; Division of Animal Sciences, University of Missouri, 65211, Columbia, MO, USA; USDA-ARS (retired), LARRL, Fort Keogh Miles City, 59301, Montana, USA | |
| Keywords: DNA sequencing; RNA sequencing; Unmapped reads | |
| DOI : 10.1186/s12864-015-2313-7 | |
| Received 2015-08-26, accepted 2015-12-15, published 2015 | |
| Source: Springer | |
【 Abstract 】
Background: Next-generation sequencing projects commonly commence by aligning reads to a reference genome assembly. While improvements in alignment algorithms and computational hardware have greatly enhanced the efficiency and accuracy of alignments, a significant percentage of reads often remain unmapped.

Results: We generated de novo assemblies of unmapped reads from the DNA and RNA sequencing of the Bos taurus reference individual and identified the closest matching sequence to each contig by alignment to the NCBI non-redundant nucleotide database using BLAST. As expected, many of these contigs represent vertebrate sequence that is absent, incomplete, or misassembled in the UMD3.1 reference assembly. However, numerous additional contigs represent invertebrate species. Most prominent were several species of Spirurid nematodes and a blood-borne parasite, Babesia bigemina. These species are either not present in the US or are not known to infect taurine cattle and the reference animal appears to have been host to unsequenced sister species.

Conclusions: We demonstrate the importance of exploring unmapped reads to ascertain sequences that are either absent or misassembled in the reference assembly and for detecting sequences indicative of parasitic or commensal organisms.
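The first step implied by the abstract, separating reads that failed to align from those that did, can be illustrated with a minimal sketch. It assumes the alignments are available as SAM text, where bit 0x4 of the FLAG field marks an unmapped segment. The abstract does not name the tools the authors used, so this is not their pipeline; the function names `unmapped_reads` and `to_fasta` and the toy records are hypothetical and exist only for illustration.

```python
from typing import Iterable, Iterator, Tuple

# SAM FLAG bit 0x4 means "segment unmapped" (from the SAM specification).
UNMAPPED_FLAG = 0x4


def unmapped_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read name, sequence) for each unmapped record in SAM text."""
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment record has 11 mandatory tab-separated fields.
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        if flag & UNMAPPED_FLAG:
            yield fields[0], fields[9]  # QNAME, SEQ


def to_fasta(reads: Iterable[Tuple[str, str]]) -> str:
    """Format (name, sequence) pairs as FASTA text, e.g. as assembler input."""
    return "".join(f">{name}\n{seq}\n" for name, seq in reads)


# Toy records (hypothetical data, not taken from the study).
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCAA\tIIIIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGGCCCAA\tIIIIIIII",
]

if __name__ == "__main__":
    print(to_fasta(unmapped_reads(example_sam)), end="")
```

In this toy input, `read1` is aligned and is skipped, while `read2` and `read3` carry the unmapped bit and would be passed on to de novo assembly and a subsequent database search.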
【 License 】
CC BY
© Whitacre et al. 2015