Prediction and comparative analysis of varicella incidence in Qingyang, Gansu Province, based on LSTM and ARIMA models

  • Abstract:
    Objective To explore the application of long short-term memory (LSTM) and autoregressive integrated moving average (ARIMA) models in predicting varicella incidence trends in Qingyang, Gansu Province, and to provide data support for targeted varicella prevention, control, and early-warning measures and for the later evaluation of varicella vaccination strategies.
    Methods Weekly varicella case counts in Qingyang from January 2014 to March 2024 were used to build an LSTM model and an ARIMA model. Both models were then used to predict weekly case counts from March 2024 to March 2025, and their predictions were compared. Predictive performance was evaluated using root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). Finally, the better-performing model was applied to predict weekly varicella case counts in Qingyang for 2025-2026.
    Results A total of 10 075 varicella cases were reported from 2014 to 2024. The reported incidence rose from 19.74/100 000 in 2014 to 93.18/100 000 in 2024, with an average annual reported incidence of 41.27/100 000. The reported incidence was higher in males (44.20/100 000) than in females (38.43/100 000), and the difference was statistically significant (χ²=49.070, P<0.001). Cases were concentrated in individuals aged ≤18 years, who accounted for 85.94% (8 658 cases) of reported cases. Students were the main affected group (5 504 cases, 54.63%). On the test set, the LSTM model achieved an R² of 0.92, an RMSE of 6.22, an MAE of 4.76, and a MAPE of 0.20, whereas the ARIMA(2,1,1)(2,0,2)₅₂ model yielded an R² of -0.03, an RMSE of 26.29, an MAE of 18.72, and a MAPE of 0.50. The LSTM model outperformed the ARIMA model on all four evaluation metrics (R², MAPE, RMSE, and MAE), indicating higher predictive accuracy.
    Conclusion The LSTM model predicted varicella incidence better than the ARIMA model and can provide a theoretical basis and practical guidance for early warning, vaccination, and prevention and control of varicella.
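
The four evaluation metrics named in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `regression_metrics`, the toy weekly counts, and the choice to report MAPE as a fraction (consistent with values such as 0.20 and 0.50) and to skip weeks with zero observed cases are all assumptions. It also shows why R² can be negative, as reported for the ARIMA model: R² falls below zero whenever the predictions fit the test data worse than a constant equal to the observed mean.

```python
import math


def regression_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (as a fraction) and R^2 for two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined for weeks with zero observed cases; skipping them is an
    # assumption made for this sketch, not something stated in the abstract.
    nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
    mape = (
        sum(abs(e) / abs(a) for a, e in nonzero) / len(nonzero)
        if nonzero
        else float("nan")
    )

    # R^2 = 1 - SS_res / SS_tot; negative when the model does worse than the mean.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Hypothetical weekly case counts, for illustration only (not data from the study).
observed = [10, 20, 30, 40]
close_forecast = [12, 18, 33, 37]   # tracks the observed trend
poor_forecast = [40, 30, 20, 10]    # trends in the wrong direction

good = regression_metrics(observed, close_forecast)
poor = regression_metrics(observed, poor_forecast)

print("close forecast:", good)
print("poor forecast: ", poor)
```

In a comparison like the one described above, the same function would be applied to each model's test-set predictions against the reported weekly counts, so that all four metrics are computed on identical data.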

     
