BMC Bioinformatics | |
libFLASM: a software library for fixed-length approximate string matching | |
Software | |
Ahmad Retha1  Lorraine A. K. Ayad1  Solon P. Pissis1  | |
[1] Department of Informatics, King’s College London, The Strand, WC2R 2LS, London, UK; | |
关键词: Approximate string matching; Fixed-length approximate string matching; Dynamic programming; Software library; | |
DOI : 10.1186/s12859-016-1320-2 | |
received in 2016-07-08, accepted in 2016-11-03, 发布年份 2016 | |
来源: Springer | |
【 摘 要 】
BackgroundApproximate string matching is the problem of finding all factors of a given text that are at a distance at most k from a given pattern. Fixed-length approximate string matching is the problem of finding all factors of a text of length n that are at a distance at most k from any factor of length ℓ of a pattern of length m. There exist bit-vector techniques to solve the fixed-length approximate string matching problem in time O(m⌈ℓ/w⌉n)\documentclass[12pt]{minimal}\usepackage{amsmath}\usepackage{wasysym}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{mathrsfs}\usepackage{upgreek}\setlength{\oddsidemargin}{-69pt}\begin{document}$\mathcal {O}(m\lceil \ell /w \rceil n)$\end{document} and space O(m⌈ℓ/w⌉)\documentclass[12pt]{minimal}\usepackage{amsmath}\usepackage{wasysym}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{mathrsfs}\usepackage{upgreek}\setlength{\oddsidemargin}{-69pt}\begin{document}$\mathcal {O}(m\lceil \ell /w\rceil)$\end{document} under the edit and Hamming distance models, where w is the size of the computer word; as such these techniques are independent of the distance threshold k or the alphabet size. Fixed-length approximate string matching is a generalisation of approximate string matching and, hence, has numerous direct applications in computational molecular biology and elsewhere.ResultsWe present and make available libFLASM, a free open-source C++ software library for solving fixed-length approximate string matching under both the edit and the Hamming distance models. Moreover we describe how fixed-length approximate string matching is applied to solve real problems by incorporating libFLASM into established applications for multiple circular sequence alignment as well as single and structured motif extraction. Specifically, we describe how it can be used to improve the accuracy of multiple circular sequence alignment in terms of the inferred likelihood-based phylogenies; and we also describe how it is used to efficiently find motifs in molecular sequences representing regulatory or functional regions. The comparison of the performance of the library to other algorithms show how it is competitive, especially with increasing distance thresholds.ConclusionsFixed-length approximate string matching is a generalisation of the classic approximate string matching problem. We present libFLASM, a free open-source C++ software library for solving fixed-length approximate string matching. The extensive experimental results presented here suggest that other applications could benefit from using libFLASM, and thus further maintenance and development of libFLASM is desirable.
【 授权许可】
CC BY
© The Author(s) 2016
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
RO202311096922018ZK.pdf | 687KB | download | |
12864_2017_3669_Article_IEq4.gif | 1KB | Image | download |
12864_2016_2789_Article_IEq16.gif | 1KB | Image | download |
12888_2016_877_Article_IEq4.gif | 1KB | Image | download |
12864_2015_2192_Article_IEq10.gif | 1KB | Image | download |
12864_2017_3669_Article_IEq5.gif | 1KB | Image | download |
12864_2016_3440_Article_IEq55.gif | 1KB | Image | download |
12864_2017_3733_Article_IEq13.gif | 1KB | Image | download |
12864_2016_3440_Article_IEq56.gif | 1KB | Image | download |
12864_2017_4166_Article_IEq1.gif | 1KB | Image | download |
12864_2016_2821_Article_IEq55.gif | 1KB | Image | download |
12864_2017_4004_Article_IEq10.gif | 2KB | Image | download |
12864_2017_3771_Article_IEq2.gif | 1KB | Image | download |
12864_2017_3771_Article_IEq3.gif | 1KB | Image | download |
12864_2016_2821_Article_IEq60.gif | 1KB | Image | download |
12864_2015_1945_Article_IEq4.gif | 1KB | Image | download |
12864_2017_4186_Article_IEq31.gif | 1KB | Image | download |
12864_2017_4132_Article_IEq5.gif | 1KB | Image | download |
12864_2017_4186_Article_IEq32.gif | 1KB | Image | download |
12864_2016_2756_Article_IEq2.gif | 1KB | Image | download |
12888_2016_877_Article_IEq7.gif | 1KB | Image | download |
12864_2017_3503_Article_IEq1.gif | 1KB | Image | download |
12888_2016_877_Article_IEq8.gif | 1KB | Image | download |
12864_2015_2192_Article_IEq13.gif | 1KB | Image | download |
12870_2017_1059_Article_IEq7.gif | 1KB | Image | download |
12864_2017_3487_Article_IEq15.gif | 1KB | Image | download |
12864_2017_3733_Article_IEq17.gif | 1KB | Image | download |
12864_2015_2300_Article_IEq24.gif | 1KB | Image | download |
12864_2017_4004_Article_IEq12.gif | 1KB | Image | download |
12864_2015_2300_Article_IEq26.gif | 1KB | Image | download |
12864_2017_3521_Article_IEq1.gif | 1KB | Image | download |
12864_2016_2392_Article_IEq1.gif | 1KB | Image | download |
12864_2017_4030_Article_IEq26.gif | 1KB | Image | download |
【 图 表 】
12864_2017_4030_Article_IEq26.gif
12864_2016_2392_Article_IEq1.gif
12864_2017_3521_Article_IEq1.gif
12864_2015_2300_Article_IEq26.gif
12864_2017_4004_Article_IEq12.gif
12864_2015_2300_Article_IEq24.gif
12864_2017_3733_Article_IEq17.gif
12864_2017_3487_Article_IEq15.gif
12870_2017_1059_Article_IEq7.gif
12864_2015_2192_Article_IEq13.gif
12888_2016_877_Article_IEq8.gif
12864_2017_3503_Article_IEq1.gif
12888_2016_877_Article_IEq7.gif
12864_2016_2756_Article_IEq2.gif
12864_2017_4186_Article_IEq32.gif
12864_2017_4132_Article_IEq5.gif
12864_2017_4186_Article_IEq31.gif
12864_2015_1945_Article_IEq4.gif
12864_2016_2821_Article_IEq60.gif
12864_2017_3771_Article_IEq3.gif
12864_2017_3771_Article_IEq2.gif
12864_2017_4004_Article_IEq10.gif
12864_2016_2821_Article_IEq55.gif
12864_2017_4166_Article_IEq1.gif
12864_2016_3440_Article_IEq56.gif
12864_2017_3733_Article_IEq13.gif
12864_2016_3440_Article_IEq55.gif
12864_2017_3669_Article_IEq5.gif
12864_2015_2192_Article_IEq10.gif
12888_2016_877_Article_IEq4.gif
12864_2016_2789_Article_IEq16.gif
12864_2017_3669_Article_IEq4.gif
【 参考文献 】
- [1]
- [2]
- [3]
- [4]
- [5]
- [6]
- [7]
- [8]
- [9]
- [10]
- [11]
- [12]
- [13]
- [14]
- [15]
- [16]
- [17]
- [18]
- [19]
- [20]
- [21]
- [22]
- [23]
- [24]
- [25]
- [26]
- [27]
- [28]
- [29]
- [30]
- [31]
- [32]
- [33]
- [34]
- [35]
- [36]
- [37]
- [38]
- [39]
- [40]
- [41]
- [42]
- [43]
- [44]
- [45]
- [46]
- [47]
- [48]
- [49]
- [50]
- [51]
- [52]
- [53]