Time series analysis of human brucellosis in mainland China by using Elman and Jordan recurrent neural networks.
ABSTRACT: BACKGROUND:Establishing epidemiological models and conducting predictions seems to be useful for the prevention and control of human brucellosis. Autoregressive integrated moving average (ARIMA) models can capture the long-term trends and the periodic variations in time series. However, these models cannot handle the nonlinear trends correctly. Recurrent neural networks can address problems that involve nonlinear time series data. In this study, we intended to build prediction models for human brucellosis in mainland China with Elman and Jordan neural networks. The fitting and forecasting accuracy of the neural networks were compared with a traditional seasonal ARIMA model. METHODS:The reported human brucellosis cases were obtained from the website of the National Health and Family Planning Commission of China. The human brucellosis cases from January 2004 to December 2017 were assembled as monthly counts. The training set observed from January 2004 to December 2016 was used to build the seasonal ARIMA model, Elman and Jordan neural networks. The test set from January 2017 to December 2017 was used to test the forecast results. The root mean squared error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) were used to assess the fitting and forecasting accuracy of the three models. RESULTS:There were 52,868 cases of human brucellosis in Mainland China from January 2004 to December 2017. We observed a long-term upward trend and seasonal variance in the original time series. In the training set, the RMSE and MAE of Elman and Jordan neural networks were lower than those in the ARIMA model, whereas the MAPE of Elman and Jordan neural networks was slightly higher than that in the ARIMA model. In the test set, the RMSE, MAE and MAPE of Elman and Jordan neural networks were far lower than those in the ARIMA model. CONCLUSIONS:The Elman and Jordan recurrent neural networks achieved much higher forecasting accuracy. These models are more suitable for forecasting nonlinear time series data, such as human brucellosis than the traditional ARIMA model.
Project description:OBJECTIVES:Haemorrhagic fever with renal syndrome (HFRS) is a serious threat to public health in China, accounting for almost 90% cases reported globally. Infectious disease prediction may help in disease prevention despite some uncontrollable influence factors. This study conducted a comparison between a hybrid model and two single models in forecasting the monthly incidence of HFRS in China. DESIGN:Time-series study. SETTING:The People's Republic of China. METHODS:Autoregressive integrated moving average (ARIMA) model, generalised regression neural network (GRNN) model and hybrid ARIMA-GRNN model were constructed by R V.3.4.3 software. The monthly reported incidence of HFRS from January 2011 to May 2018 were adopted to evaluate models' performance. Root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) were adopted to evaluate these models' effectiveness. Spatial stratified heterogeneity of the time series was tested by month and another GRNN model was built with a new series. RESULTS:The monthly incidence of HFRS in the past several years showed a slight downtrend and obvious seasonal variation. A total of four plausible ARIMA models were built and ARIMA(2,1,1) (2,1,1)12 model was selected as the optimal model in HFRS fitting. The smooth factors of the basic GRNN model and the hybrid model were 0.027 and 0.043, respectively. The single ARIMA model was the best in fitting part (MAPE=9.1154, MAE=89.0302, RMSE=138.8356) while the hybrid model was the best in prediction (MAPE=17.8335, MAE=152.3013, RMSE=196.4682). GRNN model was revised by building model with new series and the forecasting performance of revised model (MAPE=17.6095, MAE=163.8000, RMSE=169.4751) was better than original GRNN model (MAPE=19.2029, MAE=177.0356, RMSE=202.1684). CONCLUSIONS:The hybrid ARIMA-GRNN model was better than single ARIMA and basic GRNN model in forecasting monthly incidence of HFRS in China. It could be considered as a decision-making tool in HFRS prevention and control.
Project description:Hospital crowding is a rising problem, effective predicting and detecting managment can helpful to reduce crowding. Our team has successfully proposed a hybrid model combining both the autoregressive integrated moving average (ARIMA) and the nonlinear autoregressive neural network (NARNN) models in the schistosomiasis and hand, foot, and mouth disease forecasting study. In this paper, our aim is to explore the application of the hybrid ARIMA-NARNN model to track the trends of the new admission inpatients, which provides a methodological basis for reducing crowding.We used the single seasonal ARIMA (SARIMA), NARNN and the hybrid SARIMA-NARNN model to fit and forecast the monthly and daily number of new admission inpatients. The root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) were used to compare the forecasting performance among the three models. The modeling time range of monthly data included was from January 2010 to June 2016, July to October 2016 as the corresponding testing data set. The daily modeling data set was from January 4 to September 4, 2016, while the testing time range included was from September 5 to October 2, 2016.For the monthly data, the modeling RMSE and the testing RMSE, MAE and MAPE of SARIMA-NARNN model were less than those obtained from the single SARIMA or NARNN model, but the MAE and MAPE of modeling performance of SARIMA-NARNN model did not improve. For the daily data, all RMSE, MAE and MAPE of NARNN model were the lowest both in modeling stage and testing stage.Hybrid model does not necessarily outperform its constituents' performances. It is worth attempting to explore the reliable model to forecast the number of new admission inpatients from different data.
Project description:<b>Objectives: </b>Human brucellosis is a public health problem endangering health and property in China. Predicting the trend and the seasonality of human brucellosis is of great significance for its prevention. In this study, a comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more suitable for predicting the occurrence of brucellosis in mainland China.<br><br><b>Design: </b>Time-series study.<br><br><b>Setting: </b>Mainland China.<br><br><b>Methods: </b>Data on human brucellosis in mainland China were provided by the National Health and Family Planning Commission of China. The data were divided into a training set and a test set. The training set was composed of the monthly incidence of human brucellosis in mainland China from January 2008 to June 2018, and the test set was composed of the monthly incidence from July 2018 to June 2019. The mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE) were used to evaluate the effects of model fitting and prediction.<br><br><b>Results: </b>The number of human brucellosis patients in mainland China increased from 30?002 in 2008 to 40?328 in 2018. There was an increasing trend and obvious seasonal distribution in the original time series. For the training set, the MAE, RSME and MAPE of the ARIMA(0,1,1)×(0,1,1)<sub>12</sub> model were 338.867, 450.223 and 10.323, respectively, and the MAE, RSME and MAPE of the XGBoost model were 189.332, 262.458 and 4.475, respectively. For the test set, the MAE, RSME and MAPE of the ARIMA(0,1,1)×(0,1,1)<sub>12</sub> model were 529.406, 586.059 and 17.676, respectively, and the MAE, RSME and MAPE of the XGBoost model were 249.307, 280.645 and 7.643, respectively.<br><br><b>Conclusions: </b>The performance of the XGBoost model was better than that of the ARIMA model. The XGBoost model is more suitable for prediction cases of human brucellosis in mainland China.
Project description:<h4>Background</h4>Cases of hemorrhagic fever with renal syndrome (HFRS) are widely distributed in eastern Asia, especially in China, Russia, and Korea. It is proved to be a difficult task to eliminate HFRS completely because of the diverse animal reservoirs and effects of global warming. Reliable forecasting is useful for the prevention and control of HFRS.<h4>Methods</h4>Two hybrid models, one composed of nonlinear autoregressive neural network (NARNN) and autoregressive integrated moving average (ARIMA) the other composed of generalized regression neural network (GRNN) and ARIMA were constructed to predict the incidence of HFRS in the future one year. Performances of the two hybrid models were compared with ARIMA model.<h4>Results</h4>The ARIMA, ARIMA-NARNN ARIMA-GRNN model fitted and predicted the seasonal fluctuation well. Among the three models, the mean square error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) of ARIMA-NARNN hybrid model was the lowest both in modeling stage and forecasting stage. As for the ARIMA-GRNN hybrid model, the MSE, MAE and MAPE of modeling performance and the MSE and MAE of forecasting performance were less than the ARIMA model, but the MAPE of forecasting performance did not improve.<h4>Conclusion</h4>Developing and applying the ARIMA-NARNN hybrid model is an effective method to make us better understand the epidemic characteristics of HFRS and could be helpful to the prevention and control of HFRS.
Project description:It is a daunting task to eradicate tuberculosis completely in Heng County due to a large transient population, human immunodeficiency virus/tuberculosis coinfection, and latent infection. Thus, a high-precision forecasting model can be used for the prevention and control of tuberculosis. In this study, four models including a basic autoregressive integrated moving average (ARIMA) model, a traditional ARIMA-generalized regression neural network (GRNN) model, a basic GRNN model, and a new ARIMA-GRNN hybrid model were used to fit and predict the incidence of tuberculosis. Parameters including mean absolute error (MAE), mean absolute percentage error (MAPE), and mean square error (MSE) were used to evaluate and compare the performance of these models for fitting historical and prospective data. The new ARIMA-GRNN model had superior fit relative to both the traditional ARIMA-GRNN model and basic ARIMA model when applied to historical data and when used as a predictive model for forecasting incidence during the subsequent 6 months. Our results suggest that the new ARIMA-GRNN model may be more suitable for forecasting the tuberculosis incidence in Heng County than traditional models.
Project description:BACKGROUND:Hepatitis B virus (HBV) infection is a major public health threat in China for China has a hepatitis B prevalence of more than one million people in 2017 year. Disease incidence prediction may help hepatitis B prevention and control. This study intends to build and compare 2 forecasting models for hepatitis B incidence in China. METHODS:Autoregressive integrated moving average (ARIMA) model and grey model GM(1,1) were adopted to fit the monthly incidence of hepatitis B in China from March 2010 to October 2017. The fitting and forecasting performances of the 2 models were evaluated. The better one was adopted to predict the incidence from November 2017 to March 2018. Database was built by Excel 2016 and statistical analysis was completed using R 3.4.3 software. RESULTS:Descriptive analysis showed that the incidence of hepatitis B in China has seasonal variation and has shown a downward trend from 2010 to 2017. We selected the ARIMA (3,1,1) (0,1,2)12 model among all the ARIMA models for it has the lowest AIC value. Model expression of GM (1,1) was X(1) (k + 1) = 3386876.7478e0.0249k - 3289206.7428. The root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) of ARIMA(3,1,1)(0,1,2)12 model were lower than GM(1,1) model on fitting part and forecasting part. According to the forecast results, the incidence may have a slight fluctuation during the following months. CONCLUSIONS:The ARIMA model showed better hepatitis B fitting and forecasting performance than GM(1,1) model. It is a potential decision supportive tool for controlling hepatitis B in China before a predictive hepatitis B outbreak.
Project description:BACKGROUND:Accurate and reliable predictions of infectious disease can be valuable to public health organizations that plan interventions to decrease or prevent disease transmission. A great variety of models have been developed for this task. However, for different data series, the performance of these models varies. Hepatitis E, as an acute liver disease, has been a major public health problem. Which model is more appropriate for predicting the incidence of hepatitis E? In this paper, three different methods are used and the performance of the three methods is compared. METHODS:Autoregressive integrated moving average(ARIMA), support vector machine(SVM) and long short-term memory(LSTM) recurrent neural network were adopted and compared. ARIMA was implemented by python with the help of statsmodels. SVM was accomplished by matlab with libSVM library. LSTM was designed by ourselves with Keras, a deep learning library. To tackle the problem of overfitting caused by limited training samples, we adopted dropout and regularization strategies in our LSTM model. Experimental data were obtained from the monthly incidence and cases number of hepatitis E from January 2005 to December 2017 in Shandong province, China. We selected data from July 2015 to December 2017 to validate the models, and the rest was taken as training set. Three metrics were applied to compare the performance of models, including root mean square error(RMSE), mean absolute percentage error(MAPE) and mean absolute error(MAE). RESULTS:By analyzing data, we took ARIMA(1, 1, 1), ARIMA(3, 1, 2) as monthly incidence prediction model and cases number prediction model, respectively. Cross-validation and grid search were used to optimize parameters of SVM. Penalty coefficient C and kernel function parameter g were set 8, 0.125 for incidence prediction, and 22, 0.01 for cases number prediction. LSTM has 4 nodes. Dropout and L2 regularization parameters were set 0.15, 0.001, respectively. By the metrics of RMSE, we obtained 0.022, 0.0204, 0.01 for incidence prediction, using ARIMA, SVM and LSTM. And we obtained 22.25, 20.0368, 11.75 for cases number prediction, using three models. For MAPE metrics, the results were 23.5%, 21.7%, 15.08%, and 23.6%, 21.44%, 13.6%, for incidence prediction and cases number prediction, respectively. For MAE metrics, the results were 0.018, 0.0167, 0.011 and 18.003, 16.5815, 9.984, for incidence prediction and cases number prediction, respectively. CONCLUSIONS:Comparing ARIMA, SVM and LSTM, we found that nonlinear models(SVM, LSTM) outperform linear models(ARIMA). LSTM obtained the best performance in all three metrics of RSME, MAPE, MAE. Hence, LSTM is the most suitable for predicting hepatitis E monthly incidence and cases number.
Project description:BACKGROUND:Hepatitis is a serious public health problem with increasing cases and property damage in Heng County. It is necessary to develop a model to predict the hepatitis epidemic that could be useful for preventing this disease. METHODS:The autoregressive integrated moving average (ARIMA) model and the generalized regression neural network (GRNN) model were used to fit the incidence data from the Heng County CDC (Center for Disease Control and Prevention) from January 2005 to December 2012. Then, the ARIMA-GRNN hybrid model was developed. The incidence data from January 2013 to December 2013 were used to validate the models. Several parameters, including mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE) and mean square error (MSE), were used to compare the performance among the three models. RESULTS:The morbidity of hepatitis from Jan 2005 to Dec 2012 has seasonal variation and slightly rising trend. The ARIMA(0,1,2)(1,1,1)12 model was the most appropriate one with the residual test showing a white noise sequence. The smoothing factor of the basic GRNN model and the combined model was 1.8 and 0.07, respectively. The four parameters of the hybrid model were lower than those of the two single models in the validation. The parameters values of the GRNN model were the lowest in the fitting of the three models. CONCLUSIONS:The hybrid ARIMA-GRNN model showed better hepatitis incidence forecasting in Heng County than the single ARIMA model and the basic GRNN model. It is a potential decision-supportive tool for controlling hepatitis in Heng County.
Project description:BACKGROUND:The identification of statistical models for the accurate forecast and timely determination of the outbreak of infectious diseases is very important for the healthcare system. Thus, this study was conducted to assess and compare the performance of four machine-learning methods in modeling and forecasting brucellosis time series data based on climatic parameters. METHODS:In this cohort study, human brucellosis cases and climatic parameters were analyzed on a monthly basis for the Qazvin province-located in northwestern Iran- over a period of 9 years (2010-2018). The data were classified into two subsets of education (80%) and testing (20%). Artificial neural network methods (radial basis function and multilayer perceptron), support vector machine and random forest were fitted to each set. Performance analysis of the models were done using the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Root Error (MARE), and R2 criteria. RESULTS:The incidence rate of the brucellosis in Qazvin province was 27.43 per 100,000 during 2010-2019. Based on our results, the values of the RMSE (0.22), MAE (0.175), MARE (0.007) criteria were smaller for the multilayer perceptron neural network than their values in the other three models. Moreover, the R2 (0.99) value was bigger in this model. Therefore, the multilayer perceptron neural network exhibited better performance in forecasting the studied data. The average wind speed and mean temperature were the most effective climatic parameters in the incidence of this disease. CONCLUSIONS:The multilayer perceptron neural network can be used as an effective method in detecting the behavioral trend of brucellosis over time. Nevertheless, further studies focusing on the application and comparison of these methods are needed to detect the most appropriate forecast method for this disease.
Project description:In recent years, the incidence of hepatitis B (HB) in Guangxi is higher than that of the national level; it has been increasing, so it is urgent to do a good predictive research of HB incidence, which can help analyze the early warning of hepatitis B in Guangxi, China. In the study, the feasibility of predicting HB incidence in Guangxi by autoregressive integrated moving average (ARIMA) model method and Elman neural network (ElmanNN) method was discussed respectively, and the prediction accuracy of the two models was compared. Finally, we established the ARIMA (0, 1, 1) model and ElmanNN with 8 neurons. Both ARIMA (0, 1, 1) model and ElmanNN model had good performance, and their prediction accuracy were high. The fitting and prediction root-mean-square error (RMSE) and mean absolute error (MAE) of ElmanNN were smaller than those of ARIMA (0, 1, 1) model, which indicated that ElmanNN was superior to ARIMA (0, 1, 1) model in predicting the incidence of hepatitis B in Guangxi. Based on the ElmanNN, the HB incidence from September 2019 to December 2020 in Guangxi was predicted, the predicted results showed that the incidence of HB in 2020 was slightly higher than that in 2019 and the change trend was similar to that in 2019, for 2021 and beyond, the ElmanNN model could be used to continue the predictive analysis.