Journal Article Details
BMC Bioinformatics
Compression of next-generation sequencing quality scores using memetic algorithm
Proceedings
Jiarui Zhou [1], Shan He [2], Zhen Ji [3], Zexuan Zhu [3]
[1] College of Biomedical Engineering and Instrument Science, Zhejiang University, 310027, Hangzhou, China; Shenzhen City Key Laboratory of Embedded System Design, College of Computer Science and Software Engineering, Shenzhen University, 518060, Shenzhen, China; School of Computer Science, University of Birmingham, B15 2TT, Birmingham, UK; Shenzhen City Key Laboratory of Embedded System Design, College of Computer Science and Software Engineering, Shenzhen University, 518060, Shenzhen, China
Keywords: Quality Score; Compression Ratio; Memetic Algorithm; Compression Algorithm; Fitness Evaluation
DOI: 10.1186/1471-2105-15-S15-S10
Source: Springer
【 Abstract 】

Background: The exponential growth of next-generation sequencing (NGS) derived DNA data poses great challenges to data storage and transmission. Although many compression algorithms have been proposed for DNA reads in NGS data, few methods are designed specifically to handle the quality scores.

Results: In this paper we present a memetic algorithm (MA) based NGS quality score data compressor, namely MMQSC. The algorithm extracts raw quality score sequences from FASTQ formatted files and designs a compression codebook using MA based multimodal optimization. The input data is then compressed in a substitutional manner. Experimental results on five representative NGS data sets show that MMQSC obtains a higher compression ratio than other state-of-the-art methods. In particular, MMQSC is a lossless, reference-free compression algorithm, yet it obtains an average compression ratio of 22.82% on the experimental data sets.

Conclusions: The proposed MMQSC compresses NGS quality score data effectively. It can be utilized to improve the overall compression ratio on FASTQ formatted files.
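
The pipeline outlined in the abstract (extract the quality score sequences, build a codebook, substitute codewords) can be illustrated with a minimal Python sketch. This is not the MMQSC implementation. It assumes strict four-line FASTQ records. The helper `build_codebook` is a hypothetical stand-in that simply picks the most frequent fixed-length substrings, where the paper uses memetic-algorithm-based multimodal optimization. All function names and the parameters `k` and `size` are illustrative.

```python
from collections import Counter

def extract_quality_scores(fastq_text):
    """Return the quality line (4th line) of each record.
    Assumes every FASTQ record spans exactly four lines."""
    lines = fastq_text.strip().splitlines()
    return [lines[i + 3] for i in range(0, len(lines) - 3, 4)]

def build_codebook(quals, k=3, size=4):
    """Hypothetical stand-in for codebook design: the `size` most
    frequent k-mers. The paper uses a memetic algorithm instead."""
    counts = Counter(q[i:i + k] for q in quals for i in range(len(q) - k + 1))
    return [kmer for kmer, _ in counts.most_common(size)]

def encode(qual, codebook):
    """Greedy left-to-right substitution: emit a codebook index (int)
    when a codeword matches, otherwise the literal character (str)."""
    tokens, i = [], 0
    while i < len(qual):
        for idx, word in enumerate(codebook):
            if qual.startswith(word, i):
                tokens.append(idx)
                i += len(word)
                break
        else:
            tokens.append(qual[i])
            i += 1
    return tokens

def decode(tokens, codebook):
    """Invert encode(): expand indices back into codewords."""
    return "".join(codebook[t] if isinstance(t, int) else t for t in tokens)

sample = "@read1\nACGTACGT\n+\nIIIIHHHH\n@read2\nACGTACGA\n+\nIIIIHHGG\n"
quals = extract_quality_scores(sample)
codebook = build_codebook(quals)
for q in quals:
    tokens = encode(q, codebook)
    assert decode(tokens, codebook) == q  # substitution is lossless
```

The round-trip assertion reflects the lossless property stated in the abstract. A real compressor would additionally serialize the tokens and the codebook into a compact binary form, which this sketch omits.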

【 License 】

Unknown   
© Zhou et al.; licensee BioMed Central Ltd. 2014. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
