Genetics and Molecular Biology
A parallel algorithm for finding small sets of genes that are enough to distinguish two biological states
Martha Torres [1], Junior Barrera [1]
[1] Universidade de São Paulo, Instituto de Matemática e Estatística, São Paulo, SP, Brazil
Keywords: gene expression; classification; parallel processing
DOI: 10.1590/S1415-47572004000400034
Source: SciELO
【 Abstract 】
GCLASS is an algorithm that explores small samples from two distinct biological states to find small sets of genes that form a feature vector sufficient to separate the two states. A typical sample is a set of 60 microarrays, 30 for each biological state, with several thousand genes. The technique consists of three parts: a spreading model defined in the space of the small gene sets studied and centered on each feature vector considered; the design of optimal linear classifiers under this spreading model; and the ranking of the designed classifiers by their error and their robustness relative to the spreading. The feature vectors used in the best classifiers are considered the best feature vectors. Because the number of potential feature sets is very large, a parallel implementation is a good option for reducing execution time. This paper presents a parallel implementation of GCLASS and reports performance results. The experiments show that the proposed solution achieves near-linear speedup over the sequential implementation. For example, with 60 genes as the complete feature space and 6 genes as the small feature space, the parallel version running on 11 processors is approximately 10.98 times faster than the sequential version.
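To give a sense of the scale behind the closing example, the following Python sketch counts the 6-gene subsets that can be drawn from 60 genes and computes the parallel efficiency implied by a speedup of 10.98 on 11 processors. The function names and the round-robin split of candidate subsets across workers are illustrative assumptions; the abstract does not state how GCLASS distributes the work, nor whether it evaluates every subset.

```python
from itertools import combinations, islice
from math import comb


def count_feature_sets(n_genes, subset_size):
    """Number of distinct gene subsets of the given size."""
    return comb(n_genes, subset_size)


def parallel_efficiency(speedup, n_processors):
    """Speedup divided by processor count (1.0 would be ideal linear scaling)."""
    return speedup / n_processors


def worker_share(n_genes, subset_size, worker, n_workers):
    """Hypothetical round-robin split: worker k takes subsets k, k + n_workers, ...

    This is an assumed partitioning scheme for illustration only, not the
    scheme used by the paper.
    """
    all_subsets = combinations(range(n_genes), subset_size)
    return islice(all_subsets, worker, None, n_workers)


if __name__ == "__main__":
    # Figures taken from the abstract: 60 genes, subsets of 6, 11 processors.
    print(count_feature_sets(60, 6))                 # 50063860
    print(round(parallel_efficiency(10.98, 11), 3))  # 0.998
    # Tiny case so the split is easy to inspect: 6 genes, pairs, 3 workers.
    for k in range(3):
        print(k, list(worker_share(6, 2, k, 3)))
```

With roughly 50 million candidate subsets for even this reduced feature space, each requiring its own classifier design, an efficiency of about 99.8% is consistent with a workload in which the candidates can be evaluated independently of one another.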
【 License 】
CC BY
All contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.