Liu Tian, Ruan Dexin, Hou Qingbo, Chen Hongying. Construction of combinatorial prediction model for infectious diseases based on software R[J]. Disease Surveillance, 2023, 38(9): 1094-1100. DOI: 10.3784/jbjc.202211090482

Construction of combinatorial prediction model for infectious diseases based on software R

  •   Objective  To construct a combinatorial prediction model for infectious diseases using R, and to provide a reference for disease surveillance.
      Methods  The monthly incidence of hemorrhagic fever with renal syndrome (HFRS) in China and in Jilin, Liaoning and Heilongjiang provinces from 2004 to 2017 was used as the training data to fit the models, and the data from January to December 2018 were used to evaluate prediction. Four single models were selected: seasonal autoregressive integrated moving average (SARIMA), exponential smoothing (ETS), neural network autoregression (NNETAR), and the exponential smoothing state space model with trigonometric seasonality, Box-Cox transformation, ARMA errors, and trend and seasonal components (TBATS). The “forecastHybrid” package in R was used to construct the combinatorial models. The combination that gave every single model the same weight was recorded as combinatorial model A; the combination that weighted each single model according to its fit to the training data was recorded as combinatorial model B. Mean absolute percentage error (MAPE) and root mean square error (RMSE) were used to evaluate the fitting and prediction performance of the six models. For sensitivity analysis, the data from 2004 to 2011, 2004 to 2012, 2004 to 2013, 2004 to 2014, 2004 to 2015, 2004 to 2016 and 2004 to 2017 were each used as a training set to construct the models and predict the incidence from January to December of the following year. The ranks of the combinatorial models by fitted and predicted MAPE and RMSE were calculated to evaluate the stability of fitting and prediction performance.
      Results  All values below are listed in the order SARIMA, ETS, NNETAR, TBATS, combinatorial model A, combinatorial model B. The fitted MAPEs were 11.81%, 9.75%, 11.50%, 9.71%, 8.09%, 8.06% in China; 29.63%, 15.39%, 23.04%, 14.60%, 16.33%, 16.29% in Jilin; 19.76%, 15.48%, 3.93%, 15.24%, 12.66%, 7.08% in Liaoning; and 21.92%, 17.96%, 6.73%, 15.80%, 13.55%, 10.29% in Heilongjiang. The predicted MAPEs were 23.38%, 20.35%, 11.01%, 34.28%, 17.03%, 16.02% in China; 11.72%, 14.26%, 24.32%, 14.16%, 11.93%, 11.92% in Jilin; 28.09%, 27.57%, 29.19%, 27.32%, 26.91%, 26.49% in Liaoning; and 23.72%, 33.28%, 28.96%, 33.785%, 25%, 27.31% in Heilongjiang. The fitted RMSEs were 0.01, 0.01, 0.01, 0.01, 0.01, 0.01 in China; 0.08, 0.08, 0.05, 0.08, 0.05, 0.05 in Jilin; 0.08, 0.07, 0.01, 0.07, 0.04, 0.02 in Liaoning; and 0.16, 0.16, 0.04, 0.15, 0.08, 0.06 in Heilongjiang. The predicted RMSEs were 0.02, 0.01, 0.02, 0.02, 0.01, 0.01 in China; 0.03, 0.04, 0.07, 0.04, 0.05, 0.05 in Jilin; 0.07, 0.05, 0.05, 0.05, 0.05, 0.05 in Liaoning; and 0.13, 0.14, 0.11, 0.14, 0.12, 0.12 in Heilongjiang. The sensitivity analysis showed that, for overall fitting, combinatorial model B ranked first and combinatorial model A ranked 2–4. For prediction, the better-performing combinatorial model ranked first or second on both evaluation indicators.
      Conclusion  The fitting and prediction performance of the combinatorial models is better than that of the single models, and the combinatorial model whose weights are set according to the fit to the training data is the optimal model. Combinatorial models can be constructed with simple programming in R, and their application is worth promoting.
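
The two weighting schemes and the two error measures named in the Methods can be illustrated with a small sketch. It is written in Python only to show the arithmetic; the study itself used the “forecastHybrid” package in R, and nothing here reproduces that package's code. The function names, the toy numbers, and the choice to make model B's weights inversely proportional to each model's training RMSE are all assumptions made for illustration.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def equal_weights(n_models):
    """Scheme A: every single model gets the same weight."""
    return [1.0 / n_models] * n_models

def inverse_error_weights(errors):
    """Scheme B (assumed form): weight proportional to 1 / training error."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]

def combine(series_by_model, weights):
    """Weighted average of the models' values at each time point."""
    horizon = len(series_by_model[0])
    return [sum(w * s[t] for w, s in zip(weights, series_by_model))
            for t in range(horizon)]

# Invented toy data, not taken from the study. The two models' errors are
# mirror images of each other, so the results come out artificially clean.
actual = [1.0, 2.0, 4.0]
fitted_1 = [1.1, 1.8, 4.4]   # hypothetical single model 1
fitted_2 = [0.8, 2.4, 3.2]   # hypothetical single model 2

weights_a = equal_weights(2)
weights_b = inverse_error_weights([rmse(actual, fitted_1), rmse(actual, fitted_2)])

combined_a = combine([fitted_1, fitted_2], weights_a)
combined_b = combine([fitted_1, fitted_2], weights_b)

for name, series in [("model 1", fitted_1), ("model 2", fitted_2),
                     ("combination A", combined_a), ("combination B", combined_b)]:
    print(f"{name}: MAPE={mape(actual, series):.2f}%  RMSE={rmse(actual, series):.4f}")
```

In this sketch the weights are learned on the training fit and would then be applied unchanged to each model's 12-month forecast, which matches the description of combinatorial model B only under the stated assumption about how the weights are derived.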