Journal of Computational Biology: A Journal of Computational Molecular Cell Biology
A Fast Parallel K-Modes Algorithm for Clustering Nucleotide Sequences to Predict Translation Initiation Sites
Luis Enrique Zárate^1, Guilherme Torres Castro^1
^1 Department of Computer Science, Pontifícia Universidade Católica de Minas Gerais, Belo Horizonte, Brazil
^2 Address correspondence to: Dr. Henrique C. Freitas, Department of Computer Science, Pontifícia Universidade Católica de Minas Gerais, Av. Dom Jose Gaspar 500, Belo Horizonte 30535-901, Minas Gerais, Brazil
Keywords: clustering; K-means; K-modes; nucleotide sequences; parallel computing; translation initiation site
DOI: 10.1089/cmb.2018.0245
Subject classification: Biological sciences (general)
Source: Mary Ann Liebert, Inc. Publishers
【 Abstract 】
Predicting the location of translation initiation sites (TIS) is an important problem in molecular biology. In this field, the computational cost of balancing non-TIS sequences is substantial and demands high-performance computing. In this article, we present an optimized version of the K-modes algorithm for clustering TIS sequences and compare it with standard K-means clustering. The adapted algorithm uses simple instructions and fewer computational resources to deliver a significant speedup without compromising the sequence clustering results. We also implemented two optimized parallel versions of the algorithm, one for graphics processing units (GPUs) and the other for general-purpose multicore processors. In our experiments, the GPU K-modes implementation was up to 203 times faster than the corresponding sequential version when processing Arabidopsis thaliana sequences.
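To make the clustering step concrete, the following is a minimal sequential sketch of textbook K-modes applied to equal-length nucleotide strings. It is not the authors' optimized or parallel implementation. The function names (`hamming`, `column_modes`, `k_modes`), the deterministic seeding from the first `k` distinct sequences, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(seqs):
    """Build a mode string from the most frequent symbol at each position.

    Ties go to the symbol seen first in the column (an illustrative choice).
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def k_modes(seqs, k, max_iter=100):
    """Textbook K-modes: assign by Hamming distance, update by column mode."""
    # Assumption: seed with the first k distinct sequences (deterministic).
    modes = list(dict.fromkeys(seqs))[:k]
    if len(modes) < k:
        raise ValueError("need at least k distinct sequences")
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sequence joins its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(s, modes[j])) for s in seqs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update step: recompute each mode from its current members.
        for j in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = column_modes(members)
    return labels, modes
```

Both steps need only symbol comparisons and counts, whereas K-means requires a numeric encoding of the sequences and floating-point centroid arithmetic. The assignment step is also independent per sequence, which is the kind of work that maps naturally onto GPU threads or multicore workers.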
【 License 】
Unknown