Objective To analyze the dynamic correlation between internet search data and actual data on pulmonary tuberculosis (TB) in China, construct a prediction model for pulmonary TB, explore the supplementary use of the Baidu index in TB prevention, control and surveillance, and provide evidence for pulmonary TB prevention and control in Jiangsu province.
Methods The Baidu index and actual incidence data of pulmonary TB from January 2011 to December 2019 were selected using the range selection method, and Pearson correlation analysis was used to analyze their dynamic correlation. A multiple linear regression model and an artificial neural network model were established. The models were tested with the incidence data of pulmonary TB from January to December 2020, and the predictive performance of the two models was evaluated using mean absolute error (MAE), mean absolute percentage error (MAPE) and goodness-of-fit (R^2).
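As an illustration of the analysis steps named in the Methods, the following is a minimal sketch in Python, not the authors' implementation. It assumes monthly series, assumes that the "dynamic correlation" can be approximated by correlating the search index at month t with incidence at month t + lag, and assumes that R^2 is computed as 1 - SS_res/SS_tot. The function names (`lagged_pearson`, `mae`, `mape`, `r_squared`) and the toy numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_pearson(search, cases, lag):
    """Correlate the search index at month t with cases at month t + lag.

    Assumption: a non-negative lag (in months) means the search index leads.
    """
    if lag < 0 or lag >= len(search) - 1:
        raise ValueError("lag must be non-negative and leave at least 2 pairs")
    if lag == 0:
        return pearson(search, cases)
    return pearson(search[:-lag], cases[lag:])


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy data only (hypothetical): cases follow the search index one month later.
    search = [1, 3, 2, 5, 4, 6, 5]
    cases = [0, 3, 7, 5, 11, 9, 13]
    for lag in (0, 1, 2):
        print(lag, round(lagged_pearson(search, cases, lag), 3))

    actual = [100, 200, 400]
    predicted = [110, 190, 400]
    print(mae(actual, predicted), mape(actual, predicted), r_squared(actual, predicted))
```

Scanning the lag from 0 upward and picking the lag with the strongest correlation is one simple way to judge how many months the search index leads reported incidence. The three error functions would then be applied to a held-out year of predictions.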
Results The prediction models based on the Baidu index could predict the timing of the next pulmonary TB epidemic 1-2 months in advance. For the artificial neural network models predicting 2 months and 1 month in advance, the MAE values were 273.75 and 357.99, the MAPE values were 8.86% and 11.53%, and the R^2 values were 0.75 and 0.6, respectively.
Conclusion The prediction model based on the Baidu index of internet searches has some power to predict the next wave of the pulmonary TB epidemic in advance and can be used to supplement and extend the indicators of traditional pulmonary TB surveillance.