Infection and Drug Resistance | Volume 14
Forecasting the Tuberculosis Incidence Using a Novel Ensemble Empirical Mode Decomposition-Based Data-Driven Hybrid Model in Tibet, China
Keywords: tuberculosis; incidence rate; time series analysis; statistical models; forecasting
DOI:
Source: DOAJ
【 Abstract 】
Jizhen Li,1 Yuhong Li,2 Ming Ye,3 Sanqiao Yao,1 Chongchong Yu,1 Lei Wang,4 Weidong Wu,1 Yongbin Wang1

1Department of Epidemiology and Health Statistics, School of Public Health, Xinxiang Medical University, Xinxiang, Henan Province, People’s Republic of China
2National Center for Tuberculosis Control and Prevention, China Center for Disease Control and Prevention, Beijing, People’s Republic of China
3Preventive Medicine Clinic, Xinxiang Center for Disease Control and Prevention, Xinxiang, Henan Province, People’s Republic of China
4Center for Musculoskeletal Surgery, Charité–Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany

Correspondence: Yongbin Wang, Department of Epidemiology and Health Statistics, School of Public Health, Xinxiang Medical University, Xinxiang, Henan Province, 453000, People’s Republic of China
Email: wybwho@163.com

Objective: This study aims to develop a novel data-driven hybrid model that fuses ensemble empirical mode decomposition (EEMD), a seasonal autoregressive integrated moving average (SARIMA) model, and a nonlinear autoregressive artificial neural network (NARNN), called the EEMD-SARIMA-NARNN model, to assess and forecast the epidemic patterns of tuberculosis (TB) in Tibet.

Methods: Monthly TB incidence data from January 2006 to December 2017 were obtained, and the time series was partitioned into a training subsample (January 2006 to December 2016) and a testing subsample (January to December 2017). The training set was used to develop the EEMD-SARIMA-NARNN combined model, whereas the testing set was used to validate its forecasting performance. The forecasting accuracy of this method was also compared with that of the basic SARIMA model, the basic NARNN model, the error-trend-seasonal (ETS) model, and the traditional SARIMA-NARNN mixture model.

Results: Across the forecasting accuracy measures (root-mean-square error, mean absolute deviation, mean error rate, mean absolute percentage error, and root-mean-square percentage error), the EEMD-SARIMA-NARNN combined method produced lower error rates than the other models. The descriptive statistics suggested that TB is a seasonal disease in Tibet, with a peak in late winter and early spring and a trough in autumn and early winter. TB incidence increased by a factor of 1.7 from 2006 to 2017 in Tibet, with an average annual percentage change of 5.8 (95% confidence interval: 3.5–8.1).

Conclusion: This novel data-driven hybrid method captures both the linear and nonlinear components of the TB incidence series better than the other models used in this study, which is of great help in estimating and forecasting the future epidemic trends of TB in Tibet. In addition, under present trends, strict precautionary measures are required to reduce the spread of TB in Tibet.
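
The abstract names five accuracy measures and a train/test split without defining them. The following is a minimal Python sketch, not the authors' implementation. It assumes the definitions commonly used in the forecasting literature (in particular, mean error rate taken as the summed absolute error divided by the summed observed values). The function name `forecast_errors` and the sample numbers are hypothetical and purely illustrative; they are not data from the study.

```python
import math


def forecast_errors(actual, predicted):
    """Return RMSE, MAD, MER, MAPE and RMSPE for two equal-length series.

    Assumed definitions (not given in the abstract):
      RMSE  = sqrt(mean((x - f)^2))
      MAD   = mean(|x - f|)
      MER   = sum(|x - f|) / sum(x)
      MAPE  = mean(|x - f| / x)
      RMSPE = sqrt(mean(((x - f) / x)^2))
    All observed values must be positive for the relative measures.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    if any(a <= 0 for a in actual):
        raise ValueError("observed values must be positive")

    n = len(actual)
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    rel_err = [e / a for e, a in zip(abs_err, actual)]

    return {
        "RMSE": math.sqrt(sum(e ** 2 for e in abs_err) / n),
        "MAD": sum(abs_err) / n,
        "MER": sum(abs_err) / sum(actual),
        "MAPE": sum(rel_err) / n,
        "RMSPE": math.sqrt(sum(r ** 2 for r in rel_err) / n),
    }


# Monthly index for January 2006 to December 2017, split as in the Methods:
# training = 2006 to 2016, testing = 2017.
months = [(year, month) for year in range(2006, 2018) for month in range(1, 13)]
train_months = [ym for ym in months if ym[0] <= 2016]
test_months = [ym for ym in months if ym[0] == 2017]

# Made-up values, used only to exercise the function.
actual = [10.0, 20.0, 40.0]
predicted = [8.0, 22.0, 40.0]
errors = forecast_errors(actual, predicted)
```

Lower values on all five measures over the 12 held-out months indicate a better forecast. The relative measures (MER, MAPE, RMSPE) are scale-free, which makes them easier to compare across series with different incidence levels.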
【 License 】
Unknown