| BMC Bioinformatics | |
| BAMQL: a query language for extracting reads from BAM files | |
| Software | |
| Michael Fraser1  Robert G. Bristow2  Christopher M. Lalansingh3  Andre P. Masella3  Pragash Sivasundaram3  Paul C. Boutros4  | |
| [1] Ontario Cancer Institute, Princess Margaret Cancer Centre/University Health Network, Toronto, Canada;Ontario Cancer Institute, Princess Margaret Cancer Centre/University Health Network, Toronto, Canada;Department of Medical Biophysics, University of Toronto, Toronto, Canada;Department of Radiation Oncology, University of Toronto, Toronto, Canada;Ontario Institute for Cancer Research, Suite 510, 661 University Avenue, M5G 0A3, Toronto, Canada;Ontario Institute for Cancer Research, Suite 510, 661 University Avenue, M5G 0A3, Toronto, Canada;Department of Pharmacology & Toxicology, University of Toronto, Toronto, Canada;Department of Medical Biophysics, University of Toronto, Toronto, Canada; | |
| Keywords: BAMQL; Query language; BAM-format; | |
| DOI : 10.1186/s12859-016-1162-y | |
| Received 2016-04-08, accepted 2016-07-21, published 2016 | |
| Source: Springer | |
【 Abstract 】
Background: It is extremely common to need to select a subset of reads from a BAM file based on their specific properties. Typically, a user unpacks the BAM file to a text stream using SAMtools, parses and filters the lines using AWK, then repacks them using SAMtools. This process is tedious and error-prone. In particular, when working with many columns of data, mix-ups are common and the bit field containing the flags is unintuitive. There are several libraries for reading BAM files, such as Bio-SamTools for Perl and pysam for Python. Both allow access to the BAM’s read information and can filter reads, but require substantial boilerplate code; this is high overhead for mostly ad hoc filtering.

Results: We have created a query language that gathers reads using a collection of predicates and common logical connectives. Queries run faster than equivalents and can be compiled to native code for embedding in larger programs.

Conclusions: BAMQL provides a user-friendly, powerful and performant way to extract subsets of BAM files for ad hoc analyses or integration into applications. The query language provides a collection of predicates beyond those in SAMtools, and more flexible connectives.
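To make the Background concrete, the following is a minimal sketch of the manual filtering step that the abstract calls tedious: selecting reads from SAM text by column position and flag bits. It is not BAMQL and not the authors' code. The flag bit values and column order follow the SAM format specification; the function names (`keep_read`, `filter_sam`) and the example records are hypothetical and invented for illustration.

```python
# SAM flag bits, as defined in the SAM format specification.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_DUPLICATE = 0x400


def keep_read(sam_line, chromosome):
    """Return True for a paired, mapped, non-duplicate read on `chromosome`."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    rname = fields[2]      # column 3: reference sequence name
    return (
        rname == chromosome
        and (flag & FLAG_PAIRED) != 0
        and (flag & FLAG_UNMAPPED) == 0
        and (flag & FLAG_DUPLICATE) == 0
    )


def filter_sam(lines, chromosome):
    """Pass header lines through and keep only matching alignment lines."""
    return [
        line for line in lines
        if line.startswith("@") or keep_read(line, chromosome)
    ]


# Hypothetical SAM records, invented for illustration only.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\t7\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",    # paired, mapped
    "r2\t1123\t7\t150\t60\t4M\t=\t250\t104\tACGT\tIIII",  # marked duplicate
    "r3\t0\t7\t300\t60\t4M\t*\t0\t0\tACGT\tIIII",         # unpaired
    "r4\t99\t12\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",   # other chromosome
]
kept = filter_sam(example, "7")
```

Even in this small case the filter depends on remembering which column holds which field and which bit means what, which is the kind of bookkeeping a dedicated query language is meant to hide.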
【 License 】
CC BY
© The Author(s) 2016