| BMC Genomics | |
| Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools | |
| Research Article | |
| Teresa Lettieri1  Veljo Kisand2  | |
| [1] European Commission, Joint Research Centre, Institute for Environment and Sustainability Rural, Water and Ecosystem Resources Unit, TP 270, Via E. Fermi, 2749, 21027, Ispra, VA, Italy; Institute of Technology, Tartu University, Nooruse 1, 50411, Tartu, Estonia | |
| Keywords: Reference mapping; De novo; Automated annotation; Marine bacteria | |
| DOI : 10.1186/1471-2164-14-211 | |
| Received 2012-08-20, accepted 2013-03-22, published 2013 | |
| Source: Springer | |
【 Abstract 】
Background: De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (<450 bp), which are presumed to aid in the analysis of uncharacterized genomes. The array of tools for processing NGS data is increasingly free and open source, and these tools are often adopted for both their high quality and their role in promoting academic freedom.

Results: The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite high coverage (~30-fold), the reads did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defects of the reference mapping were the introduction of artificial indels into contigs through lower than 100% consensus and the disruption of gene calling by artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference-mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes.

Conclusions: Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not a high priority, and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium-short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base calling and the conflicting base calls between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize unknown bacteria with modest effort.
【 License 】
CC BY
© Kisand and Lettieri; licensee BioMed Central Ltd. 2013