Journal Article Details
Journal of computational biology: A journal of computational molecular cell biology
Massively Parallel Implementation of Sequence Alignment with Basic Local Alignment Search Tool Using Parallel Computing in Java Library
Marek Nowicki^1,2, Piotr Bała^4, Davit Bzhalava^3
^1 Address correspondence to: Dr. Marek Nowicki, Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Chopina 12/18, 87-100 Toruń, Poland
^2 Faculty of Mathematics and Computer Science, Nicolaus Copernicus University in Toruń, Poland
^3 Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
^4 Interdisciplinary Center for Mathematical and Computational Modeling, University of Warsaw, Warsaw, Poland
Keywords: BLAST; Java; next-generation sequencing; PCJ; sequence alignment
DOI: 10.1089/cmb.2018.0079
Subject classification: Biological sciences (general)
Source: Mary Ann Liebert, Inc. Publishers
【 Abstract 】

Basic Local Alignment Search Tool (BLAST) is an essential algorithm that researchers use for sequence alignment analysis. The National Center for Biotechnology Information (NCBI)-BLAST application is the most popular implementation of the BLAST algorithm. It can run on a single multithreaded node. However, the volume of nucleotide and protein data is growing fast, which makes a single node insufficient. It is therefore increasingly important to develop high-performance computing solutions that help researchers analyze genetic data in a fast and scalable way. This article presents the execution of the BLAST algorithm on high-performance computing (HPC) clusters and supercomputers in a massively parallel manner using thousands of processors. The Parallel Computing in Java (PCJ) library is used to implement the optimal splitting of the input queries, the work distribution, and the search management. It works with the unmodified NCBI-BLAST package, which is an additional advantage for users. The resulting application, PCJ-BLAST, reads the sequences to be compared, splits them up, and starts multiple NCBI-BLAST executables. Because I/O performance can limit sequence analysis performance, the article also investigates this problem. The results show that, using Java and the PCJ library, sequence analysis can be performed on hundreds of nodes in parallel. We achieved excellent performance and efficiency and significantly reduced the time required for sequence analysis. Our work also shows that the PCJ library can be an effective tool for the fast development of scalable applications.
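
The split-and-launch idea described in the abstract can be illustrated with a minimal Java sketch. This is not the PCJ-BLAST source and it does not use the PCJ API. The class name `QuerySplitSketch`, the round-robin distribution (a simple stand-in for the "optimal splitting" the abstract mentions), the file names, and the database name `nt` are all assumptions made for illustration. Only the `blastn` options `-query`, `-db`, `-out`, and `-num_threads` are taken from the standard NCBI BLAST+ command line.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, plain Java, no PCJ calls.
public class QuerySplitSketch {

    /** Splits FASTA text into records; each record starts with a '>' header line. */
    public static List<String> parseRecords(String fasta) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        for (String line : fasta.split("\\R")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                if (current != null) {
                    records.add(current.toString());
                }
                current = new StringBuilder();
            }
            if (current != null) {
                current.append(line).append('\n');
            }
        }
        if (current != null) {
            records.add(current.toString());
        }
        return records;
    }

    /** Assigns records to workers round-robin (an assumed, simple strategy). */
    public static List<List<String>> distribute(List<String> records, int workers) {
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            chunks.add(new ArrayList<>());
        }
        for (int i = 0; i < records.size(); i++) {
            chunks.get(i % workers).add(records.get(i));
        }
        return chunks;
    }

    /** Builds the command line for one unmodified NCBI-BLAST process. */
    public static List<String> blastCommand(String queryFile, String db,
                                            String outFile, int threads) {
        return List.of("blastn", "-query", queryFile, "-db", db,
                       "-out", outFile, "-num_threads", String.valueOf(threads));
    }

    public static void main(String[] args) {
        String fasta = ">seq1\nACGT\nGGCC\n>seq2\nTTAA\n>seq3\nCCGG\n";
        List<List<String>> chunks = distribute(parseRecords(fasta), 2);
        for (int w = 0; w < chunks.size(); w++) {
            // Hypothetical file names; a real run would write each chunk to disk first.
            List<String> cmd = blastCommand("chunk" + w + ".fasta", "nt",
                                            "out" + w + ".txt", 4);
            System.out.println("worker " + w + ": " + chunks.get(w).size()
                    + " record(s) -> " + String.join(" ", cmd));
        }
    }
}
```

In an actual run, each worker would write its chunk to a file and start the command with `new ProcessBuilder(cmd).inheritIO().start()`. The sketch only prints the command lines, so it runs without BLAST installed.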

【 License 】

Unknown   
