Journal Article Details
BMC Bioinformatics
SW1PerS: Sliding windows and 1-persistence scoring; discovering periodicity in gene expression time series data
Jose A. Perea [4], Anastasia Deckard [2], Steve B. Haase [3], John Harer [1]
[1] Department of Computer Science and Department of Electrical and Computer Engineering, Duke University, Science Dr, Durham 27708, NC, USA
[2] Program in Computational Biology and Bioinformatics, Duke University, Durham 27708, NC, USA
[3] Department of Biology, Duke University, Durham 27708, NC, USA
[4] Institute for Mathematics and its Applications (IMA), University of Minnesota, Minneapolis, MN, USA
Keywords: Persistent homology; Sliding windows; Time series; Gene expression; Periodicity
Others  :  1229845
DOI  :  10.1186/s12859-015-0645-6
Received 2014-12-21, accepted 2015-06-10, published 2015
【 Abstract 】

Background

Identifying periodically expressed genes across different processes (e.g. the cell and metabolic cycles, circadian rhythms, etc.) is a central problem in computational biology. Biological time series may contain (multiple) unknown signal shapes of systemic relevance, imperfections like noise, damping, and trending, or limited sampling density. While there exist methods for detecting periodicity, their design biases (e.g. toward a specific signal shape) can limit their applicability in one or more of these situations.

Methods

We present in this paper a novel method, SW1PerS, for quantifying periodicity in time series in a shape-agnostic manner and with resistance to damping. The measurement is performed directly, without presupposing a particular pattern, by evaluating the circularity of a high-dimensional representation of the signal. SW1PerS is compared to other algorithms using synthetic data and performance is quantified under varying noise models, noise levels, sampling densities, and signal shapes. Results on biological data are also analyzed and compared.
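
As a rough illustration of the "high-dimensional representation of the signal" mentioned above, the following sketch builds a sliding-window point cloud from a time series using NumPy. It is not the authors' implementation: the function names, the parameters `dim` and `tau`, and the centering and normalization step are assumptions made for illustration, and the subsequent 1-persistence scoring of the cloud's circularity is omitted.

```python
import numpy as np

def sliding_window_embedding(signal, dim, tau):
    """Map a 1-D series x to the points (x[t], x[t+tau], ..., x[t+dim*tau]).

    `dim` and `tau` are hypothetical parameter names: each window holds
    dim + 1 samples spaced tau steps apart.
    """
    x = np.asarray(signal, dtype=float)
    n_points = len(x) - dim * tau
    if n_points <= 0:
        raise ValueError("window is longer than the series")
    # Row t of idx holds the sample indices of the window starting at t.
    idx = np.arange(n_points)[:, None] + tau * np.arange(dim + 1)[None, :]
    return x[idx]

def center_and_normalize(cloud):
    """Assumed preprocessing: subtract each window's mean, scale to unit norm."""
    centered = cloud - cloud.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave constant windows at the origin
    return centered / norms

# Two periods of a cosine sampled at 200 points (100 samples per period).
t = np.linspace(0, 4 * np.pi, 200, endpoint=False)
cloud = center_and_normalize(sliding_window_embedding(np.cos(t), dim=10, tau=5))
```

For a periodic input such as this cosine, the resulting points trace a closed loop in 11-dimensional space; a persistent-homology library could then be used to measure how prominent that loop is.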

Results

On the task of periodic/not-periodic classification, using synthetic data, SW1PerS outperforms all other algorithms in the low-noise regime. SW1PerS is shown to be the most shape-agnostic of the evaluated methods, and the only one to consistently classify damped signals as highly periodic. On biological data, and for several experiments, the lists of top 10% genes ranked with SW1PerS recover up to 67% of those generated with other popular algorithms. Moreover, the list of genes from the yeast metabolic cycle data that are highly ranked only by SW1PerS contains evidently non-cosine patterns (e.g. ECM33, CDC9, SAM1,2 and MSH6) with highly periodic expression profiles. In data from the yeast cell cycle, SW1PerS identifies genes not preferred by other algorithms, and hence not previously reported as periodic, but found in other experiments such as the universal growth rate response of Slavov. These genes are BOP3, CDC10, YIL108W, YER034W, MLP1, PAC2 and RTT101.

Conclusions

In biological systems with low noise, i.e. where periodic signals with interesting shapes are more likely to occur, SW1PerS can be used as a powerful tool in exploratory analyses. Indeed, by having an initial set of periodic genes with a rich variety of signal types, pattern/shape information can be included in the study of systems and the generation of hypotheses regarding the structure of gene regulatory networks.

【 License 】

   
2015 Perea et al.

Attachments
Files Size Format View
Fig. 9. 24KB Image download
Fig. 8. 19KB Image download
Fig. 7. 29KB Image download
Fig. 6. 52KB Image download
Fig. 5. 19KB Image download
Fig. 4. 65KB Image download
Fig. 3. 94KB Image download
Fig. 2. 63KB Image download
Fig. 1. 49KB Image download
【 Figures 】

Fig. 1.

Fig. 2.

Fig. 3.

Fig. 4.

Fig. 5.

Fig. 6.

Fig. 7.

Fig. 8.

Fig. 9.

【 References 】
  • [1]Deckard A, Anafi RC, Hogenesch JB, Haase SB, Harer J. Design and analysis of large-scale biological rhythm studies: a comparison of algorithms for detecting periodic signals in biological data. Bioinformatics. 2013; 29(24):3174-3180.
  • [2]Wu G, Zhu J, Yu J, Zhou L, Huang JZ, Zhang Z. Evaluation of five methods for genome-wide circadian gene identification. Journal of Biological Rhythms. 2014; 29(4):231-242.
  • [3]de Lichtenberg U, Jensen LJ, Fausbøll A, Jensen TS, Bork P, Brunak S. Comparison of computational methods for the identification of cell cycle-regulated genes. Bioinformatics. 2005; 21(7):1164-1171.
  • [4]Straume M. Methods in Enzymology, Vol. 383. Elsevier; 2004.
  • [5]Lomb N. Least-squares frequency analysis of unequally spaced data. Astrophysics and Space Science. 1976; 39:447-462.
  • [6]Scargle J. Studies in astronomical time series analysis. II-Statistical aspects of spectral analysis of unevenly spaced data. Astrophysical Journal. 1982; 263:835-853.
  • [7]Luan Y, Li H. Model-based methods for identifying periodically expressed genes based on time course microarray gene expression data. Bioinformatics. 2004; 20(3):332-339.
  • [8]Hughes M, Hogenesch JB, Kornacker K. JTK_CYCLE: An Efficient Nonparametric Algorithm for Detecting Rhythmic Components in Genome-Scale Data Sets. Journal of Biological Rhythms. 2010; 25(5):372-380.
  • [9]Ahnert S, Willbrand K, Brown F, Fink T. Unbiased pattern detection in microarray data series. Bioinformatics. 2006; 22(12):1471-1476.
  • [10]Cohen-Steiner D, Edelsbrunner H, Harer J, Mileyko Y. Lipschitz Functions Have L_p-Stable Persistence. Foundations of Computational Mathematics. 2010; 10(2):127-139.
  • [11]Orlando D, Lin C, Bernard A, Wang J, Socolar J, Iversen E, Hartemink A, Haase S. Global control of cell-cycle transcription by coupled CDK and network oscillators. Nature. 2008; 453(7197):944-947.
  • [12]Tu B, Kudlicki A, Rowicka M, McKnight S. Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes. Science. 2005; 310(5751):1152-1158.
  • [13]Hughes ME, DiTacchio L, Hayes KR, Vollmers C, Pulivarthy S, Baggs JE, Panda S, Hogenesch JB. Harmonics of circadian gene transcription in mammals. PLoS Genetics. 2009; 5(4):e1000442.
  • [14]Simon I, Barnett J, Hannett N, Harbison CT, Rinaldi NJ, Volkert TL, Wyrick JJ, Zeitlinger J, Gifford DK, Jaakkola TS. Serial Regulation of Transcriptional Regulators in the Yeast Cell Cycle. Cell. 2001; 106(6):697-708.
  • [15]Koike N, Yoo SH, Huang HC, Kumar V, Lee C, Kim TK, Takahashi JS. Transcriptional architecture and chromatin landscape of the core circadian clock in mammals. Science. 2012; 338(6105):349-354.
  • [16]Slavov N, Botstein D. Coupling among growth rate response, metabolic cycle, and cell division cycle in yeast. Molecular Biology of the Cell. 2011; 22(12):1997-2009.
  • [17]Hartwell LH. Genetic control of the cell division cycle in yeast. IV. Genes controlling bud emergence and cytokinesis. Experimental cell research. 1971; 69(2):265-276.
  • [18]Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell. 1998; 9(12):3273-3297.
  • [19]Ubersax JA, Woodbury EL, Quang PN, Paraz M, Blethrow JD, Shah K, Shokat KM, Morgan DO. Targets of the cyclin-dependent kinase Cdk1. Nature. 2003; 425(6960):859-864.
  • [20]Tkach JM, Yimit A, Lee AY, Riffle M, Costanzo M, Jaschob D, Hendry JA, Ou J, Moffat J, Boone C, Davis TN, Nislow C, Brown GW. Dissecting DNA damage response pathways by analysing protein localization and abundance changes during DNA replication stress. Nat. Cell Biol. 2012; 14(9):966-976.
  • [21]Hediger F, Dubrana K, Gasser SM. Myosin-like proteins 1 and 2 are not required for silencing or telomere anchoring, but act in the Tel1 pathway of telomere length control. Journal of Structural Biology. 2002; 140:79-91.
  • [22]Niu W, Li Z, Zhan W, Iyer VR, Marcotte EM. Mechanisms of Cell Cycle Control Revealed by a Systematic and Quantitative Overexpression Screen in S. cerevisiae. PLoS Genetics. 2008; 4(7):e1000120.
  • [23]Michel JJ, McCarville JF, Xiong Y. A role for Saccharomyces cerevisiae Cul8 ubiquitin ligase in proper anaphase progression. The Journal of Biological Chemistry. 2003; 278(25):22828-22837.
  • [24]Carlsson G. Topology and data. Bulletin of the American Mathematical Society. 2009; 46(2):255-308.
  • [25]Perea JA, Harer J. Sliding windows and persistence: An application of topological methods to signal analysis. Foundations of Computational Mathematics. 2014:1–40. doi:10.1007/s10208-014-9206-z
  • [26]Edelsbrunner H, Letscher D, Zomorodian A. Topological persistence and simplification. Discrete and Computational Geometry. 2002; 28(4):511-533.
  • [27]Comaniciu D, Meer P. Mean shift: A robust approach toward feature space analysis. Pattern Analysis and Machine Intelligence. 2002; 24(5):603-619.