Journal Article Details
BMC Bioinformatics
Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains
Methodology Article
Li C. Xia1, Jacob A. Cram2, Jed A. Fuhrman2, Fengzhu Sun3, Xiaoyi Liang4, Dongmei Ai4
Department of Medicine, Division of Oncology, Stanford University School of Medicine, 94305-5151, Stanford, CA, USA
Department of Statistics, The Wharton School, University of Pennsylvania, 19104, Philadelphia, PA, USA
Marine and Environmental Biology, Department of Biological Sciences, University of Southern California, 90089-0371, Los Angeles, CA, USA
Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, 90089-2910, Los Angeles, CA, USA
Centre for Computational Systems Biology, Fudan University, 200433, Shanghai, China
School of Mathematics and Physics, University of Science and Technology Beijing, 100083, Beijing, China
Keywords: Markov Chain; Time Series Data; Markov Chain Model; Local Trend; Permutation Approach
DOI: 10.1186/s12859-015-0732-8
Received 2015-03-17, accepted 2015-09-05, published 2015
Source: Springer
【 Abstract 】

Background: Local trend (i.e., shape) analysis of time series data reveals co-changing patterns in the dynamics of biological systems. However, the slow permutation procedures used to evaluate the statistical significance of local trend scores have limited its application to high-throughput time series data, e.g., data from studies based on next-generation sequencing technology.

Results: By extending the theory for the tail probability of the range of the sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (for series of more than 20 time points with no delay, and more than 30 time points with a delay of at most three time steps): when the approximate p-value is less than 0.05, the non-zero decimals of the p-values obtained by the approximation and by permutations are mostly the same. In addition, the approximate p-value is slightly larger than the permutation-based one, which makes hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large-scale all-versus-all comparisons possible. We also propose a hybrid approach that integrates the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use by analyzing the Plymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data, and find interesting organism co-occurrence dynamic patterns.

Availability: The software tool is integrated into the eLSA software package, which now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.
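To make the cost of the permutation procedure concrete, the following is a minimal Python sketch and not the eLSA implementation. It assumes a common formulation of local trend analysis that the abstract itself does not spell out: each series is reduced to its step-wise trend (+1, 0, -1), and the score is the largest absolute sum of trend products over any contiguous window, with no time delay and no normalization. The function names `trend_series`, `local_trend_score`, and `permutation_p_value`, the `threshold` parameter, and the choice to shuffle the raw series are all illustrative assumptions.

```python
import random


def trend_series(x, threshold=0.0):
    """Map a series to step-wise trends: +1 (up), -1 (down), 0 (flat)."""
    out = []
    for a, b in zip(x, x[1:]):
        diff = b - a
        out.append(1 if diff > threshold else -1 if diff < -threshold else 0)
    return out


def local_trend_score(x, y, threshold=0.0):
    """Largest absolute sum of trend products over a contiguous window.

    Assumed formulation with zero delay; positive and negative
    associations are tracked by two Kadane-style running sums.
    """
    tx, ty = trend_series(x, threshold), trend_series(y, threshold)
    best_pos = best_neg = run_pos = run_neg = 0
    for p in (a * b for a, b in zip(tx, ty)):
        run_pos = max(0, run_pos + p)  # co-moving run
        run_neg = max(0, run_neg - p)  # counter-moving run
        best_pos = max(best_pos, run_pos)
        best_neg = max(best_neg, run_neg)
    return max(best_pos, best_neg)


def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Estimate a p-value by shuffling x and re-scoring against y."""
    rng = random.Random(seed)
    observed = local_trend_score(x, y)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if local_trend_score(shuffled, y) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Under these assumptions, each pair of series costs `n_perm` full re-scorings, so an all-versus-all comparison scales with the number of pairs times the number of permutations. That is the cost a closed-form tail approximation avoids.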

【 License 】

CC BY   
© Xia et al. 2015
