| PATTERN RECOGNITION | Volume: 39 |
| Exploiting homogeneity in protein sequence clusters for construction of protein family hierarchies | |
| Article | |
| Chen, Chien-Yu ; Chung, Wen-Chin ; Su, Chung-Tsai | |
| Keywords: protein sequence clustering; family analysis; twilight zone; hierarchical algorithm | |
| DOI: 10.1016/j.patcog.2005.12.008 | |
| Source: Elsevier | |
【 Abstract 】
In proteomics, protein hierarchies based on sequence analysis have been applied extensively to automate the annotation of new proteins and to facilitate the discovery and analysis of protein families. However, ambiguous similarities in large databases make it harder to deliver protein family hierarchies with favorable sensitivity and specificity. This work develops the HomoClust algorithm, which exploits the homogeneity of protein sequences in generating protein family hierarchies. HomoClust improves the clustering quality of traditional hierarchical clustering algorithms by adopting different clustering mechanisms for different levels of sequence similarity. By incorporating homogeneity detection into the clustering process, HomoClust increases the sensitivity of protein clusters without sacrificing their high specificity. © 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
【 License 】
Free
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| 10_1016_j_patcog_2005_12_008.pdf | 507KB | PDF | |