Abstract:
Objective To explore the application of the long short-term memory (LSTM) and autoregressive integrated moving average (ARIMA) models in predicting varicella incidence, and to provide data support for targeted early warning, vaccination, and prevention and control of varicella.
Methods Weekly varicella incidence data for Qingyang, Gansu province, from January 2014 to March 2024 were collected to construct LSTM and ARIMA models. The constructed models were then used to fit the incidence of varicella from March 2024 to March 2025, and their results were compared. Model performance was evaluated using root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). Finally, the better-performing model was used to predict the weekly incidence of varicella from 2025 to 2026.
Results A total of 10 075 varicella cases were reported in Qingyang during 2014–2024. The reported incidence rate increased from 19.74/100 000 in 2014 to 93.18/100 000 in 2024, and the average annual reported incidence rate was 41.27/100 000. The incidence rate was significantly higher in men (44.20/100 000) than in women (38.43/100 000) (χ²=49.070, P<0.001). Most cases occurred in individuals aged ≤18 years (8 658 cases, 85.94%), and more than half of all cases were students (5 504 cases, 54.63%). On the test set, the LSTM model achieved an R² of 0.92, an RMSE of 6.22, an MAE of 4.76, and a MAPE of 0.20. In contrast, the ARIMA(2,1,1)(2,0,2)₅₂ model achieved an R² of −0.03, an RMSE of 26.29, an MAE of 18.72, and a MAPE of 0.50 on its test set. The LSTM model thus outperformed the ARIMA model on all four evaluation indicators (R², RMSE, MAE, and MAPE), indicating higher prediction accuracy.
Conclusion The LSTM model demonstrated superior prediction performance compared with the ARIMA model, providing a theoretical foundation and practical guidance for early warning, vaccination, and control of varicella.
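
Illustrative note on the evaluation indicators: the sketch below shows one common way to compute RMSE, MAE, MAPE, and R² for a series of observed and predicted values. It is not the authors' code. The function name `evaluate` and the toy numbers are hypothetical, and MAPE is assumed to be expressed as a fraction (so 0.20 would correspond to 20%), which the abstract does not state explicitly. The example also shows why R² can be negative, as reported for the ARIMA model: this happens when a model's squared error exceeds that of simply predicting the mean of the observed values.

```python
import math


def evaluate(actual, predicted):
    """Return RMSE, MAE, MAPE (as a fraction), and R2 for two equal-length series.

    Hypothetical helper for illustration only. MAPE is undefined when any
    observed value is zero, so this sketch assumes all observed values are non-zero.
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Residual sum of squares, shared by RMSE and R2.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n

    # Total sum of squares around the mean of the observed values.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot

    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2}


# Toy data (not from the study): four observed weekly counts.
actual = [10, 20, 30, 40]
close = [12, 18, 33, 39]   # tracks the observations closely
flat = [25, 25, 25, 25]    # always predicts the observed mean
poor = [40, 30, 20, 10]    # trends in the wrong direction

for name, pred in [("close", close), ("flat", flat), ("poor", poor)]:
    print(name, evaluate(actual, pred))
```

In this toy case the mean predictor yields an R² of zero and the wrong-direction predictor yields a negative R², so an R² slightly below zero suggests a model that explains essentially none of the variation in the test period.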