期刊论文详细信息
BMC Bioinformatics
libFLASM: a software library for fixed-length approximate string matching
Software
Ahmad Retha1  Lorraine A. K. Ayad1  Solon P. Pissis1 
[1] Department of Informatics, King’s College London, The Strand, WC2R 2LS, London, UK;
关键词: Approximate string matching;    Fixed-length approximate string matching;    Dynamic programming;    Software library;   
DOI  :  10.1186/s12859-016-1320-2
 received in 2016-07-08, accepted in 2016-11-03,  发布年份 2016
来源: Springer
PDF
【 摘 要 】

BackgroundApproximate string matching is the problem of finding all factors of a given text that are at a distance at most k from a given pattern. Fixed-length approximate string matching is the problem of finding all factors of a text of length n that are at a distance at most k from any factor of length ℓ of a pattern of length m. There exist bit-vector techniques to solve the fixed-length approximate string matching problem in time O(m⌈ℓ/w⌉n)\documentclass[12pt]{minimal}\usepackage{amsmath}\usepackage{wasysym}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{mathrsfs}\usepackage{upgreek}\setlength{\oddsidemargin}{-69pt}\begin{document}$\mathcal {O}(m\lceil \ell /w \rceil n)$\end{document} and space O(m⌈ℓ/w⌉)\documentclass[12pt]{minimal}\usepackage{amsmath}\usepackage{wasysym}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{mathrsfs}\usepackage{upgreek}\setlength{\oddsidemargin}{-69pt}\begin{document}$\mathcal {O}(m\lceil \ell /w\rceil)$\end{document} under the edit and Hamming distance models, where w is the size of the computer word; as such these techniques are independent of the distance threshold k or the alphabet size. Fixed-length approximate string matching is a generalisation of approximate string matching and, hence, has numerous direct applications in computational molecular biology and elsewhere.ResultsWe present and make available libFLASM, a free open-source C++ software library for solving fixed-length approximate string matching under both the edit and the Hamming distance models. Moreover we describe how fixed-length approximate string matching is applied to solve real problems by incorporating libFLASM into established applications for multiple circular sequence alignment as well as single and structured motif extraction. Specifically, we describe how it can be used to improve the accuracy of multiple circular sequence alignment in terms of the inferred likelihood-based phylogenies; and we also describe how it is used to efficiently find motifs in molecular sequences representing regulatory or functional regions. The comparison of the performance of the library to other algorithms show how it is competitive, especially with increasing distance thresholds.ConclusionsFixed-length approximate string matching is a generalisation of the classic approximate string matching problem. We present libFLASM, a free open-source C++ software library for solving fixed-length approximate string matching. The extensive experimental results presented here suggest that other applications could benefit from using libFLASM, and thus further maintenance and development of libFLASM is desirable.

【 授权许可】

CC BY   
© The Author(s) 2016

【 预 览 】
附件列表
Files Size Format View
RO202311108200687ZK.pdf 687KB PDF download
Fig. 3 453KB Image download
MediaObjects/13046_2023_2865_MOESM2_ESM.docx 22KB Other download
Fig. 5 2311KB Image download
729KB Image download
12936_2016_1315_Article_IEq8.gif 1KB Image download
Fig. 2 279KB Image download
Fig. 4 756KB Image download
Fig. 5 40KB Image download
MediaObjects/12944_2023_1921_MOESM1_ESM.pdf 34KB PDF download
Fig. 1 806KB Image download
Fig. 3 447KB Image download
MediaObjects/13046_2023_2865_MOESM4_ESM.tif 8864KB Other download
12944_2023_1932_Article_IEq1.gif 1KB Image download
Fig. 5 471KB Image download
Fig. 1 2453KB Image download
12944_2023_1932_Article_IEq4.gif 2KB Image download
Fig. 4 557KB Image download
MediaObjects/12888_2023_5232_MOESM1_ESM.docx 2566KB Other download
MediaObjects/12974_2023_2918_MOESM1_ESM.jpg 395KB Other download
Fig. 3 825KB Image download
12936_2016_1315_Article_IEq9.gif 1KB Image download
Fig. 5 466KB Image download
MediaObjects/12936_2023_4475_MOESM1_ESM.doc 84KB Other download
MediaObjects/12974_2023_2918_MOESM2_ESM.jpg 726KB Other download
12936_2015_966_Article_IEq12.gif 1KB Image download
Fig. 3 3082KB Image download
Fig. 3 606KB Image download
12936_2017_2014_Article_IEq61.gif 1KB Image download
Fig. 5 121KB Image download
MediaObjects/13046_2022_2359_MOESM2_ESM.docx 15KB Other download
Fig. 12 729KB Image download
Fig. 1 247KB Image download
【 图 表 】

Fig. 1

Fig. 12

Fig. 5

12936_2017_2014_Article_IEq61.gif

Fig. 3

Fig. 3

12936_2015_966_Article_IEq12.gif

Fig. 5

12936_2016_1315_Article_IEq9.gif

Fig. 3

Fig. 4

12944_2023_1932_Article_IEq4.gif

Fig. 1

Fig. 5

12944_2023_1932_Article_IEq1.gif

Fig. 3

Fig. 1

Fig. 5

Fig. 4

Fig. 2

12936_2016_1315_Article_IEq8.gif

Fig. 5

Fig. 3

【 参考文献 】
  • [1]
  • [2]
  • [3]
  • [4]
  • [5]
  • [6]
  • [7]
  • [8]
  • [9]
  • [10]
  • [11]
  • [12]
  • [13]
  • [14]
  • [15]
  • [16]
  • [17]
  • [18]
  • [19]
  • [20]
  • [21]
  • [22]
  • [23]
  • [24]
  • [25]
  • [26]
  • [27]
  • [28]
  • [29]
  • [30]
  • [31]
  • [32]
  • [33]
  • [34]
  • [35]
  • [36]
  • [37]
  • [38]
  • [39]
  • [40]
  • [41]
  • [42]
  • [43]
  • [44]
  • [45]
  • [46]
  • [47]
  • [48]
  • [49]
  • [50]
  • [51]
  • [52]
  • [53]
  文献评价指标  
  下载次数:0次 浏览次数:0次