BMC Bioinformatics
Partitioning of functional gene expression data using principal points
Research Article
Jaehee Kim [1], Haseong Kim [2]
[1] Department of Statistics, Duksung Women’s University, Seoul, South Korea; [2] Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, South Korea
Keywords: Fourier coefficients; Legendre polynomials; Escherichia coli; K-means clustering; Principal points; Silhouette; Yeast cell-cycle data
DOI: 10.1186/s12859-017-1860-0
Received 2016-04-19, accepted 2017-10-02, published 2017
Source: Springer
【 Abstract 】
Background: DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process in which the expression level of a gene with a characterized function varies over a period of time. Temporal gene expression curves can be treated as functional data, since they are regarded as independent realizations of a stochastic process. Appropriate models are required to identify patterns of gene function in such a process. Partitioning the functional data can reveal homogeneous subgroups among the large number of genes within the inherent biological networks, and is therefore a useful technique for the analysis of time-course gene expression data. We propose a new self-consistent partitioning method that operates on the functional coefficients of individual expression profiles with respect to an orthonormal basis system.

Results: A principal-points-based functional partitioning method is proposed for time-course gene expression data. The method explores the relationship between genes using Legendre coefficients as principal points to extract the features of gene functions. On simulated data, the proposed method yields clusters with high connectivity and finds significant subsets of genes with increased connectivity. Its comparative advantages are that it uses fewer coefficients from the functional data and that the principal points used for partitioning are self-consistent. In real-data applications, we partition genes using the expression profiles in budding yeast data and Escherichia coli data.

Conclusions: The proposed method benefits from the use of principal points, dimension reduction, and the choice of an orthogonal basis system, and it yields appropriately connected genes in the resulting subsets. We illustrate the method by applying it to sets of cell-cycle-regulated time-course yeast genes and E. coli genes. The proposed method is able to identify highly connected genes and to explore the complex dynamics of biological systems in functional genomics.
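
The general idea summarized in the abstract (represent each expression profile by a few Legendre coefficients, then partition the coefficient vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `legendre_coefficients`, `nearest_center`, and `kmeans`, the polynomial degree, the number of clusters, and the synthetic profiles are all assumptions made for illustration. Plain k-means is used here as a stand-in because its cluster means are a common empirical estimate of principal points; each center is the mean of the points assigned to it, which mirrors the self-consistency property.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_coefficients(times, profiles, degree):
    """Least-squares Legendre coefficients for each row of `profiles`."""
    # Rescale time to [-1, 1], the interval on which Legendre polynomials are orthogonal.
    t = 2.0 * (times - times.min()) / (times.max() - times.min()) - 1.0
    # legfit fits all columns of a 2-D y at once, so pass the profiles transposed.
    coefs = legendre.legfit(t, profiles.T, degree)  # shape (degree + 1, n_genes)
    return coefs.T


def nearest_center(points, centers):
    """Index of the closest center (Euclidean distance) for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_center(points, centers)
        # Each center becomes the mean of its assigned points (kept as-is if empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, nearest_center(points, centers)


# Synthetic (hypothetical) data: 20 periodic profiles and 20 linear-trend profiles.
rng = np.random.default_rng(1)
times = np.linspace(0.0, 1.0, 15)
periodic = np.sin(2 * np.pi * times) + 0.1 * rng.standard_normal((20, 15))
trend = (2 * times - 1) + 0.1 * rng.standard_normal((20, 15))
profiles = np.vstack([periodic, trend])

# Each 15-point profile is reduced to 4 coefficients before partitioning.
coefs = legendre_coefficients(times, profiles, degree=3)
centers, labels = kmeans(coefs, k=2)
```

Working in coefficient space reduces each 15-point profile to four numbers here, which reflects the dimension-reduction advantage the abstract mentions; the degree and the number of clusters would need to be chosen for real data, for example with a cluster-validity index such as the silhouette listed in the keywords.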
【 License 】
CC BY
© The Author(s). 2017