| Computational and Structural Biotechnology Journal | |
| A novel numerical representation for proteins: Three-dimensional Chaos Game Representation and its Extended Natural Vector | |
| Rong Lucy He [1]; Zeju Sun [2]; Shaojun Pei [2]; Stephen S.-T. Yau [2] | |
| [1] Department of Biological Sciences, Chicago State University, Chicago, IL 60628, USA; [2] Department of Mathematical Sciences, Tsinghua University, Beijing, PR China | |
| Keywords: Chaos Game Representation; Three-dimensional CGR; Extended Natural Vector; Protein classification | |
| DOI : | |
| Source: DOAJ | |
【 Abstract 】
Chaos Game Representation (CGR) was first proposed as an image representation of DNA and has since been extended to other biological macromolecules. For DNA, CGR converts a sequence into a series of points in the unit square. By comparison, existing CGR images of proteins are geometrically less elegant, and the distribution of points in them is harder to interpret. In this study, we naturally place the twenty amino acids on the twenty vertices of a regular dodecahedron and use the CGR method to introduce a novel three-dimensional image representation of protein sequences. To analyze the information contained in each CGR image, we also associate it with a vector in a high-dimensional Euclidean space, called the extended natural vector (ENV). Results from protein classification and phylogenetic analysis suggest that our method can serve as a precise tool for discovering biological relationships between proteins.
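The dodecahedron-based CGR described in the abstract can be illustrated with a minimal sketch. The abstract does not specify several details, so the following are assumptions made only for illustration: the alphabetical assignment of amino acids to vertices, the contraction ratio of 0.5 (the value used in classical DNA CGR), the starting point at the origin, and the names `dodecahedron_vertices`, `VERTEX_OF`, and `cgr_3d`. The extended natural vector is not reproduced here because its definition is not given in the abstract.

```python
from itertools import product
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # golden ratio

# The 20 standard amino acids, one-letter codes in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dodecahedron_vertices():
    """Return the 20 vertices of a regular dodecahedron centred at the origin."""
    # Eight cube vertices (+-1, +-1, +-1).
    verts = [tuple(float(c) for c in v) for v in product((-1, 1), repeat=3)]
    # Twelve vertices from cyclic permutations of (0, +-1/phi, +-phi).
    for a, b in product((-1, 1), repeat=2):
        verts.append((0.0, a / PHI, b * PHI))
        verts.append((a / PHI, b * PHI, 0.0))
        verts.append((a * PHI, 0.0, b / PHI))
    return verts


# ASSUMPTION: alphabetical vertex assignment, chosen only for illustration;
# the paper's own assignment of amino acids to vertices may differ.
VERTEX_OF = dict(zip(AMINO_ACIDS, dodecahedron_vertices()))


def cgr_3d(sequence, ratio=0.5):
    """Map a protein sequence to a list of 3D CGR points.

    Starting from the origin, each residue moves the current point a
    fraction `ratio` of the way towards that residue's vertex.
    ASSUMPTION: ratio=0.5 mirrors classical DNA CGR.
    """
    point = (0.0, 0.0, 0.0)
    points = []
    for residue in sequence:
        vertex = VERTEX_OF[residue]  # raises KeyError for non-standard residues
        point = tuple(p + ratio * (v - p) for p, v in zip(point, vertex))
        points.append(point)
    return points
```

Calling `cgr_3d("MKV")` returns three points, one per residue. A sequence of length n therefore yields a cloud of n points inside the dodecahedron, and a descriptor such as the ENV would then be computed from that cloud.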
【 License 】
Unknown