PMC:7033348

Real-time forecasts of the COVID-19 epidemic in China from February 5th to February 24th, 2020 Abstract The initial cluster of severe pneumonia cases that triggered the COVID-19 epidemic was identified in Wuhan, China in December 2019. While early cases of the disease were linked to a wet market, human-to-human transmission has driven the rapid spread of the virus throughout China. The Chinese government has implemented containment strategies of city-wide lockdowns, screening at airports and train stations, and isolation of suspected patients; however, the cumulative case count keeps growing every day. The ongoing outbreak presents a challenge for modelers, as limited data are available on the early growth trajectory, and the epidemiological characteristics of the novel coronavirus are yet to be fully elucidated. We use phenomenological models that have been validated during previous outbreaks to generate and assess short-term forecasts of the cumulative number of confirmed reported cases in Hubei province, the epicenter of the epidemic, and for the overall trajectory in China, excluding the province of Hubei. We collect daily reported cumulative confirmed cases for the 2019-nCoV outbreak for each Chinese province from the National Health Commission of China. Here, we provide 5, 10, and 15 day forecasts for five consecutive days, February 5th through February 9th, with quantified uncertainty based on a generalized logistic growth model, the Richards growth model, and a sub-epidemic wave model. Our most recent forecasts reported here, based on data up until February 9, 2020, largely agree across the three models presented and suggest an average range of 7409–7496 additional confirmed cases in Hubei and 1128–1929 additional cases in other provinces within the next five days. Models also predict an average total cumulative case count between 37,415 and 38,028 in Hubei and 11,588–13,499 in other provinces by February 24, 2020. 
Mean estimates and uncertainty bounds for both Hubei and other provinces have remained relatively stable in the last three reporting dates (February 7th – 9th). We also observe that each of the models predicts that the epidemic has reached saturation in both Hubei and other provinces. Our findings suggest that the containment strategies implemented in China are successfully reducing transmission and that the epidemic growth has slowed in recent days. Introduction The ongoing epidemic of the novel coronavirus (SARS-CoV-2) is primarily affecting mainland China and can be traced back to a cluster of severe pneumonia cases identified in Wuhan, China in December 2019 (Li et al., 2020; World Health Organization, 2020). Early cases of the disease have been linked to a live animal and seafood market in Wuhan, pointing to a zoonotic origin of the epidemic. However, human-to-human transmission has driven its rapid spread, with a total of 37,289 confirmed cases, including 813 deaths, in China and 302 confirmed cases imported into multiple countries as of February 9, 2020 (Chinese National Health Committee). While the early transmission potential of this novel coronavirus appeared similar to that of severe acute respiratory syndrome (SARS) (Riou & Althaus, 2020), the current tally of the epidemic has already surpassed the total cases reported for the SARS outbreaks in 2002–2003 (World Health Organization, 2003; Wu, Leung, & Leung, 2020; Zhang et al., 2020). The timing and location of the outbreak facilitated the rapid transmission of the virus within a highly mobile population. The initial reporting of observed cases occurred during the traditional Chinese New Year, when the largest population movement takes place every year (Ai et al., 2020). 
Further, Wuhan is a highly populated city with more than 11 million residents and is connected to many cities in China through public transportation, such as buses, trains, and flights (Lai et al., 2020; Read, Bridgen, Cummings, Ho, & Jewell, 2020). In the absence of pharmaceutical interventions, rapid action was required by the Chinese government to mitigate transmission within and outside of Wuhan. On January 23, 2020, the Chinese government implemented a strict lockdown of Wuhan, followed by several nearby cities in subsequent days; the lockdowns include temporarily suspending all public transportation and advising residents to remain at home (Du et al., 2020; Wu et al., 2020). Further, many high-speed rail stations and airports have implemented screening measures to detect travelers with a fever, specifically those traveling from Wuhan, and those with a fever are referred to public hospitals (Lai et al., 2020; Wu et al., 2020). Within hospitals, patients who fulfill clinical and epidemiological characteristics of 2019-nCoV are immediately isolated. The number of 2019-nCoV cases in Wuhan quickly outnumbered the available number of beds in hospitals, putting a substantial burden on the healthcare system. Consequently, the government rapidly built and launched two new hospitals with capacity for 1,600 and 1,000 beds, respectively, in Wuhan in addition to the existing 132 quarantine sites with more than 12,500 beds (Steinbuch, 2020). To anticipate additional resources to combat the epidemic, mathematical and statistical modeling tools can be useful to generate timely short-term forecasts of reported cases. These predictions can include estimates of expected morbidity burden that can help guide public health officials preparing the medical care and other resources needed to confront the epidemic. 
Short-term forecasts can also guide the intensity and type of interventions needed to mitigate an epidemic (Funk, Camacho, Kucharski, Eggo, & Edmunds, 2016; Shanafelt, Jones, Lima, Perrings, & Chowell, 2017). In the absence of vaccines or antiviral drugs for 2019-nCoV, the effective implementation of nonpharmaceutical interventions, such as personal protection and social distancing, will be critical to bring the epidemic under control. In this emerging epidemic, epidemiological data are limited, and the epidemiological parameters needed to calibrate elaborate mechanistic transmission models are not yet fully elucidated. Real-time short-term forecasts must be based on dynamic phenomenological models that have been validated during previous outbreaks (Bürger, Chowell, & Lara-Díaz, 2019; Chowell et al., 2016; Pell, Kuang, Viboud, & Chowell, 2018). We employ several dynamic models to generate and assess 5, 10, and 15 day ahead forecasts of the cumulative number of confirmed cases in Hubei province, the epicenter of the epidemic, and the overall trajectory of the epidemic in China excluding the province of Hubei. Methods Data We obtained daily updates of the cumulative number of reported confirmed cases for the 2019-nCoV epidemic across provinces in China from the National Health Commission of China website (Chinese National Health Commission). The data cover 34 areas, including provinces, municipalities, autonomous regions, and special administrative regions; here we refer to the regions collectively as provinces. Data updates were collected daily at 12 p.m. (GMT-5), between January 22, 2020 and February 9, 2020. The short time series is affected by irregularities and reporting lags, so we work with cumulative curves, which are smoother and likely yield more stable and reliable estimates. Therefore, we analyze the cumulative trajectory of the epidemic in Hubei province, the epicenter of the outbreak, as well as the cumulative aggregate trajectory of all other provinces. 
Models We generate short-term forecasts in real-time using three phenomenological models that have been previously used to derive short-term forecasts for epidemics of several infectious diseases, including SARS, Ebola, pandemic influenza, and dengue (Chowell, Tariq, & Hyman, 2019; Pell et al., 2018; Wang, Wu, & Yang, 2012). The generalized logistic growth model (GLM) extends the simple logistic growth model to accommodate sub-exponential growth dynamics with a scaling-of-growth parameter, p (Viboud, Simonsen, & Chowell, 2016). The Richards model also includes a scaling parameter, a, to allow for deviation from the symmetric logistic curve (Chowell, 2017; Richards, 1959; Wang et al., 2012). We also include a recently developed sub-epidemic wave model that supports complex epidemic trajectories, including multiple peaks (e.g., SARS in Singapore (Chowell et al., 2019)). In this approach, the observed reported curve is assumed to be the aggregate of multiple underlying sub-epidemics (Chowell et al., 2019). A detailed description of each of the models is included in the Supplement. Short-term forecasts We calibrate each model to the daily cumulative reported case counts for Hubei and other provinces (all except Hubei). While the outbreak began in December 2019, data on cumulative case counts are available starting on January 22, 2020. Therefore, the first calibration process includes 15 observations: from January 22, 2020 to February 5, 2020. Each subsequent calibration period increases by one day with each newly published daily data point, with the last calibration period between January 22, 2020 and February 9, 2020 (19 data points). We estimate the best-fit model solution to the reported data using nonlinear least squares fitting. 
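As a rough illustration of the two single-wave models named above, the sketch below integrates them numerically. It assumes the forms commonly given in the cited literature, dC/dt = r C^p (1 − C/K) for the GLM and dC/dt = r C (1 − (C/K)^a) for the Richards model; the exact formulations used here are in the Supplement and may differ. The function names, the fourth-order Runge-Kutta integrator, and all parameter values are hypothetical choices made for illustration, not estimates from this study.

```python
import math


def glm_rhs(c, r, p, K):
    """Generalized logistic growth (assumed form): dC/dt = r * C^p * (1 - C/K)."""
    return r * c ** p * (1.0 - c / K)


def richards_rhs(c, r, a, K):
    """Richards growth (assumed form): dC/dt = r * C * (1 - (C/K)^a)."""
    return r * c * (1.0 - (c / K) ** a)


def integrate(rhs, c0, days, steps_per_day=100, **params):
    """Integrate dC/dt = rhs(C) with classical RK4; returns C at days 0..days."""
    h = 1.0 / steps_per_day
    c = float(c0)
    curve = [c]
    for _ in range(days):
        for _ in range(steps_per_day):
            k1 = rhs(c, **params)
            k2 = rhs(c + 0.5 * h * k1, **params)
            k3 = rhs(c + 0.5 * h * k2, **params)
            k4 = rhs(c + h * k3, **params)
            c += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        curve.append(c)
    return curve


def logistic(t, r, K, c0):
    """Closed-form simple logistic curve, the p = 1 (or a = 1) special case."""
    return K / (1.0 + (K / c0 - 1.0) * math.exp(-r * t))


if __name__ == "__main__":
    # Illustrative parameter values only.
    glm_curve = integrate(glm_rhs, c0=400, days=30, r=0.3, p=0.9, K=35000)
    rich_curve = integrate(richards_rhs, c0=400, days=30, r=0.3, a=0.8, K=35000)
    for day in (0, 10, 20, 30):
        print(day, round(glm_curve[day]), round(rich_curve[day]))
```

With p = 1 the GLM reduces to the simple logistic model, and with a = 1 so does the Richards model, which gives a convenient sanity check against the closed-form `logistic` curve.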
This process yields the set of model parameters $\Theta$ that minimizes the sum of squared errors between the model $f(t,\Theta)$ and the data $y_t$, where $\Theta_{GLM} = (r, p, K)$, $\Theta_{Rich} = (r, a, K)$, and $\Theta_{Sub} = (r, p, K_0, q, C_{thr})$ correspond to the estimated parameter sets for the GLM, the Richards model, and the sub-epidemic model, respectively; parameter descriptions are provided in the Supplement. Thus, the best-fit solution $f(t,\hat{\Theta})$ is defined by the parameter set $\hat{\Theta} = \arg\min_{\Theta} \sum_{t=1}^{n} (f(t,\Theta) - y_t)^2$. We fix the initial condition to the first data point. We then use a parametric bootstrap approach to quantify uncertainty around the best-fit solution, assuming a Poisson error structure. A detailed description of this method is provided in prior studies (Chowell, 2017; Roosa & Chowell, 2019). The models are refitted to the M = 200 bootstrap datasets to obtain M parameter sets, which are used to define 95% confidence intervals for each parameter. Each of the M model solutions to the bootstrap curves is used to generate m = 30 simulations extended through a forecasting period of 15 days. These 6000 (M × m) curves are used to construct the 95% prediction intervals for the forecasts. Results We generated 5, 10, and 15 day ahead forecasts for Hubei and other provinces excluding Hubei for 5 consecutive dates: February 5, 2020 to February 9, 2020. Figs. 1–3 show the range of 5, 10, and 15 day ahead forecasts, respectively, by the date generated, and we compare the daily short-term forecasts of cumulative case counts across dates as more data become available. Current cumulative reported case counts as of February 9, 2020 are 27,100 for Hubei and 10,189 in other provinces (Chinese National Health Commission). Model calibration Our results for Hubei province indicate that the parameter estimates for the three models tend to stabilize and decrease in uncertainty as more data become available (Supplemental Table 1). 
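The least-squares calibration and parametric bootstrap described in the Methods can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the code used for the forecasts: it uses only the closed-form Richards curve as the model f(t, Θ), replaces a proper nonlinear least-squares solver with a crude pattern search (`fit`), applies the Poisson error structure to the daily increments of the fitted cumulative curve, and approximates Poisson draws with a rounded normal for large means. All function names are hypothetical.

```python
import math
import random


def richards(t, r, a, K, c0):
    """Closed-form Richards curve with initial condition C(0) = c0."""
    q = (K / c0) ** a - 1.0
    return K / (1.0 + q * math.exp(-r * a * t)) ** (1.0 / a)


def sse(theta, data):
    """Sum of squared errors; the initial condition is fixed to data[0]."""
    r, a, K = theta
    return sum((richards(t, r, a, K, data[0]) - y) ** 2
               for t, y in enumerate(data))


def fit(data, theta0, iters=200):
    """Crude pattern search standing in for a nonlinear least-squares solver."""
    theta, best, step = list(theta0), sse(theta0, data), 0.5
    for _ in range(iters):
        improved = False
        for i in range(3):
            for factor in (1.0 + step, 1.0 / (1.0 + step)):
                trial = theta[:]
                trial[i] *= factor
                if not 0.05 <= trial[1] <= 10.0:  # assumed safe range for a
                    continue
                err = sse(trial, data)
                if err < best:
                    theta, best, improved = trial, err, True
        if not improved:
            step *= 0.5  # shrink the step once no coordinate move helps
    return tuple(theta)


def poisson(lam, rng):
    """Poisson draw: Knuth's method for small means, normal approximation otherwise."""
    if lam <= 0:
        return 0
    if lam < 30:
        limit, k, p = math.exp(-lam), 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        return k
    return max(0, round(rng.gauss(lam, math.sqrt(lam))))


def simulate(mean_curve, rng):
    """One noisy cumulative path: Poisson increments around the mean curve."""
    total, path = mean_curve[0], [mean_curve[0]]
    for prev, cur in zip(mean_curve, mean_curve[1:]):
        total += poisson(cur - prev, rng)
        path.append(total)
    return path


def bootstrap_forecast(data, theta0, horizon=15, M=200, m=30, seed=1):
    """Best-fit parameters and a 95% prediction interval `horizon` days ahead."""
    rng = random.Random(seed)
    n = len(data)
    best = fit(data, theta0)
    best_curve = [richards(t, *best, data[0]) for t in range(n)]
    finals = []
    for _ in range(M):
        boot = simulate(best_curve, rng)   # bootstrap dataset
        theta_b = fit(boot, best)          # refit to the bootstrap dataset
        curve_b = [richards(t, *theta_b, boot[0]) for t in range(n + horizon)]
        finals.extend(simulate(curve_b, rng)[-1] for _ in range(m))
    finals.sort()
    lo = finals[int(0.025 * len(finals))]
    hi = finals[int(0.975 * len(finals)) - 1]
    return best, (lo, hi)
```

Calling `bootstrap_forecast(data, theta0)` with the defaults M = 200 and m = 30 produces the 6000 simulated end points from which the interval is read off; in practice a dedicated optimizer and an exact Poisson sampler would be preferable to the stand-ins used here.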
In particular, the growth rate r decreases and appears to be converging over time, particularly for the GLM and sub-epidemic model. Parameter K also follows this general trend, with prediction intervals decreasing significantly in width as more data become available. Importantly, the p estimates from the GLM indicate that the epidemic growth in Hubei is close to exponential (p = 0.99 (95% CI: 0.98, 1) – February 9th). Further, growth rate and scaling parameter estimates have remained relatively stable over the last three reporting dates, while estimates of K are still declining. This may correlate with the effectiveness of control measures or the slowing of the epidemic. For the trajectory that aggregates all other provinces (excluding Hubei), the parameter estimates follow trends that differ from those for Hubei (Supplemental Table 2). While the three models estimated stable and nearly equivalent growth rates in Hubei, the estimated growth rates for other provinces vary across models and do not follow a distinct trend as more data become available. However, the scaling and size parameters remain relatively stable across all dates. Further, the p estimates from the GLM reveal a consistent sub-exponential growth pattern in other provinces (p = 0.67 (95% CI: 0.64, 0.70) – February 9th). 5-days ahead forecasts The latest 5-day ahead forecasts, generated on February 9, 2020, estimate an average of 34,509–34,596 total cumulative cases in Hubei by February 14, 2020 across the three models (Fig. 1 a). For other provinces, the models predict an average range of 11,317–12,118 cumulative cases by February 14 (Fig. 1b). Based on cumulative reported cases as of February 9th, these estimates correspond with an average of 7409–7496 additional cases in Hubei and 1128–1929 additional cases in other provinces within the next 5 days. Fig. 
1 Forecasting results for 5-days ahead estimates, generated daily from February 5–9, 2020, of cumulative reported cases in Hubei (a) and other provinces (b). The mean case estimate is represented by the dots, while the lines represent the 95% prediction intervals for each model. Comparing the 5-day ahead forecasts generated daily on February 5–9, 2020, the GLM and Richards models yield comparable prediction intervals in Hubei, while the sub-epidemic model yields wider intervals than the other models. Also, 5 day ahead forecasts from the sub-epidemic model on February 5th and 6th predict significantly higher case counts in Hubei compared to forecasts generated on February 7th and beyond (Fig. 1a). For other provinces, the GLM and Richards model yield intervals of similar widths, but the GLM predicts higher case counts than the Richards model across all dates (Fig. 1b). Further, the sub-epidemic model has significantly wider prediction intervals compared to the other models for all forecasts for other provinces. While the uncertainty of the predictions decreases as more data became available in Hubei, the uncertainty of the predictions for other provinces remain relatively stable, compared to forecasts from earlier dates. 10-days ahead forecasts The 10 day ahead forecasts generated on February 9, 2020 from the three models estimate between 36,854 and 37,230 cumulative cases, on average, in Hubei by February 19, 2020 (Fig. 2 a). For other provinces, the latest 10 day ahead forecasts predict average cumulative case counts between 11,549 and 13,069 cases across the three models (Fig. 2b). These estimates correspond with an additional 9754–10,130 cases in Hubei and an additional 1360–2880 cases reported in other provinces on average in the next 10 days. Fig. 2 Forecasting results for 10-days ahead estimates, generated daily from February 5–9, 2020, of cumulative reported cases in Hubei (a) and other provinces (b). 
The mean case estimate is represented by the dots, while the lines represent the 95% prediction intervals for each model. 10 day ahead forecasts of case counts in Hubei generated on February 5th show significantly different results between the GLM and Richards versus the sub-epidemic model, with the sub-epidemic model predicting significantly higher case counts (Fig. 2a). For forecasts generated after February 5th, the prediction intervals of the three models are comparable, with the GLM intervals having the lowest uncertainty, followed by the Richards model (Fig. 2a). For other provinces, the sub-epidemic model yields significantly wider prediction intervals than the other two models. Like the 5 day ahead forecasts, the 10 day ahead prediction intervals become increasingly narrow for Hubei when including more data, but uncertainty remains relatively stable in other provinces. 15-days ahead forecasts The latest 15 day ahead forecasts predict a cumulative reported case count between 37,415 and 38,028 cases, on average, in Hubei by February 24, 2020. Further, the latest 15 day ahead forecasts suggest an average cumulative case count between 11,588 and 13,499 cases for other provinces. These forecasts correspond with an additional 10,315–10,928 cases in Hubei and an additional 1399–3310 cases in other provinces within the next 15 days. Again, the sub-epidemic model yields significantly higher forecasts for Hubei on February 5th, compared to the other models and compared to subsequent prediction intervals on following dates (Fig. 3 a). The width of prediction intervals decreases as more data are included for each of the models in both Hubei and other provinces. This is consistent with shorter-term forecasts in Hubei but differs from the pattern of shorter-term forecasts in other provinces. Fig. 3 Forecasting results for 15-days ahead estimates, generated daily from February 5–9, 2020, of cumulative reported cases in Hubei (a) and other provinces (b). 
The mean case estimate is represented by the dots, while the lines represent the 95% prediction intervals for each model. Discussion In this report, we provide timely short-term forecasts of the cumulative number of reported cases of the 2019-nCoV epidemic in Hubei province and other provinces in China as of February 9, 2020. As the epidemic continues, we are also publishing online daily 10-day ahead forecasts including each of the models presented here (Roosa & Chowell, 2020). Based on the three models calibrated to data up until February 9, 2020, we forecast a cumulative number of reported cases between 37,415 and 38,028 in Hubei Province and 11,588–13,499 in other provinces by February 24, 2020. Our models yield a good visual fit to the epidemic curves, based on residuals, with the sub-epidemic model outperforming the other models in terms of mean squared error (MSE) (Supplemental Tables 1 and 2). Parameter estimation results from the GLM consistently show that the epidemic growth is near exponential in Hubei and sub-exponential in other provinces. Overall, models predict similar ranges of short-term forecasts, except for those generated on February 5th, where the sub-epidemic model predicts significantly higher case counts than the other two models (Figs. 1–3). The sub-epidemic model predicts similar ranges to the other models for subsequent dates, so the higher ranges on February 5th may indicate that more data are required to inform the parameters of the sub-epidemic model. We observe that the width of the prediction intervals decreases on average as more data are included for forecasts in Hubei; however, this pattern is not obvious for our analysis based on other provinces. This can, in part, be attributed to the smaller case counts and smaller initial prediction interval range seen in other provinces. 
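The additional-case ranges quoted in the Results follow from subtracting the cumulative counts reported as of February 9, 2020 (27,100 in Hubei and 10,189 in other provinces) from the forecast ranges. The short check below reproduces that arithmetic from the figures stated in the text; the variable and function names are hypothetical.

```python
# Cumulative reported cases as of February 9, 2020 (from the text).
current = {"Hubei": 27100, "Other provinces": 10189}

# Average forecast ranges across the three models, by horizon in days (from the text).
forecasts = {
    5: {"Hubei": (34509, 34596), "Other provinces": (11317, 12118)},
    10: {"Hubei": (36854, 37230), "Other provinces": (11549, 13069)},
    15: {"Hubei": (37415, 38028), "Other provinces": (11588, 13499)},
}


def additional_cases(forecast_range, current_count):
    """Forecast cumulative range minus the current cumulative count."""
    low, high = forecast_range
    return low - current_count, high - current_count


for horizon, by_region in forecasts.items():
    for region, forecast_range in by_region.items():
        low, high = additional_cases(forecast_range, current[region])
        print(f"{horizon}-day, {region}: {low} to {high} additional cases")
```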
In other provinces, however, mean predictions and associated uncertainty remain relatively stable, while the mean estimates of 10 and 15 days ahead decrease significantly in Hubei (Fig. 2, Fig. 3). This suggests that the epidemic lasts longer in Hubei compared to other provinces (Fig. 4, Fig. 5, Fig. 6), which may be attributed to intensive control efforts and large-scale social distancing interventions. Therefore, it is not necessarily surprising that estimates from earlier dates, specifically prior to saturation, yield predictions with higher uncertainty. Fig. 4 15-day ahead GLM forecasts of cumulative reported 2019-nCoV cases in China – Hubei and other provinces – generated on February 9, 2020. Fig. 5 15-day ahead Richards forecasts of cumulative reported 2019-nCoV cases in China – Hubei and other provinces – generated on February 9, 2020. Fig. 6 15-day ahead sub-epidemic model forecasts of cumulative reported 2019-nCoV cases in China – Hubei and other provinces – generated on February 9, 2020. We retrieve the data from the Chinese media conglomerate Tencent (Chinese National Health Commission); however, the data show small differences in case counts compared to data of the epidemic reported by other sources (Johns Hopkins University Center for Systems Science and Engineering, 2020). Importantly, the curves of confirmed cases that we employ in our study are reported according to reporting date and could be influenced by testing capacity and other related factors. Further, there may be significant delays in identifying, isolating, and reporting cases in Hubei due to the magnitude of the epidemic, which could influence our predictions. Incidence curves according to the date of symptom onset could provide a clearer picture of the transmission dynamics during an epidemic. We also note that we analyzed the epidemic curves starting on January 22, 2020, but the epidemic started in December 2019. 
Hence, the first data point accumulates cases up until January 22, 2020, as data were not available prior to this date. The 2019-nCoV outbreak in China presents a significant challenge for modelers, as there are limited data available on the early growth trajectory, and epidemiological characteristics of the novel coronavirus have not been fully elucidated. Our timely short-term forecasts based on phenomenological models can be useful for real-time preparedness, such as anticipating the required number of hospital beds and other medical resources, as they provide an estimate of the number of cases hospitals will need to prepare for in the coming days. In future work, we plan to report the results of a retrospective analysis of forecasting performance across models based on various performance metrics. Of note, the case definition changed on February 12, 2020 to count clinical cases that have not been laboratory tested. As a result of this change in reporting, the province of Hubei experienced a jump in the number of cases on February 13th, 2020. This change in reporting will need to be taken into account in order to assess the accuracy of the forecasts reported here. In conclusion, our most recent forecasts, based on data for the last three days (February 7th – 9th, 2020), remained relatively stable. These models predict that the epidemic has reached a saturation point for both Hubei and other provinces. This likely reflects the impact of the wide spectrum of social distancing measures implemented by the Chinese government, which likely helped stabilize the epidemic. The forecasts presented are based on the assumption that current mitigation efforts will continue. Funding GC is supported by NSF grants 1610429 and 1633381. Ethics Not applicable. Data, code and materials Data will be made available in an online repository upon acceptance of manuscript. 
Author contributions KR and GC conducted forecasts and data analysis; YL retrieved and managed data; all authors contributed to writing and revising subsequent versions of the manuscript. All authors read and approved the final manuscript. Declaration of competing interest Authors declare no competing interests. Acknowledgements We thank Homma Rafi (Director of Communications, School of Public Health, Georgia State University) for creating and maintaining the online record of daily short-term forecasts. Peer review under responsibility of KeAi Communications Co., Ltd. Appendix A Supplementary data Supplementary data to this article can be found online at https://doi.org/10.1016/j.idm.2020.02.002.
