Journal Pre-proof
1,000,000 cases of COVID-19 outside of China: The date predicted by a simple heuristic
WWK, MAM, WP, AW-D, PFZ, DS, TA, AHZ, MD, and JM
Abstract. We forecast 1,000,000 COVID-19 cases outside of China by March 30, 2020 based on a heuristic and WHO situation reports. We do not model the COVID-19 pandemic; we model only the number of cases.
The proposed heuristic is based on the simple observation that the plot of the given data is well approximated by an exponential curve. The exponential curve is used for forecasting the growth of new cases. It has been tested against the most recent situation report: the error was 1.29% for the last day added, as predicted from the 57 previous WHO situation reports.
Due to potentially overwhelming numbers of severe COVID-19 patients, medical resources need to be allocated wisely. With hospital beds and life-saving machinery such as ventilators in limited supply, preparations should be made ahead of time on how to allocate these finite resources. More information about COVID-19 can be found in [2], [3], and [7]. The best course of action to "flatten the curve" is to follow WHO guidelines. The best way to keep hospitals under capacity is social distancing: limiting or cancelling large gatherings, only travelling when necessary, and keeping a distance from others all help to prevent the spread.
The presented heuristic is based on the exponential growth of the data collected by WHO situation reports for days 31 to 57. As pointed out in [4], the predictability could be improved by pairwise comparisons based on abductive reasoning [5]. Abduction is frequently used in diagnostic expert systems. The abductive reasoning (or inference) process was used for this study. It is a type of logical inference which starts with a set of observations and then searches for the simplest and most likely explanation for the observations. In our case, the most likely explanation is exponential growth. This process yields a plausible conclusion but may not always positively verify it. The abductive conclusions are heuristics (see [1]) and hence involve uncertainty, which is expressed by bounded rationality as satisficing. Satisficing is a decision-making process which takes the costs of optimization into account within the optimization process, thereby producing an efficient but suboptimal result. This can be compared with maximizing, which produces an optimal result at the expense of suboptimal costs.
Extrapolation is a mathematical estimation that predicts unknown future values from existing values. Compared to interpolation, which determines unknown values between existing values, extrapolation is less accurate. The best method for extrapolation depends on which method was used to initially acquire the data.
The WHO situation report #31 (see [7]) has been assumed as the starting data point since it shows, for the first time, over 1,000 cases outside China (see Fig. 1).
Because data from any individual country may be biased, or a country may be politically motivated to misreport it, we decided to use data from many countries; as such, any doctored data becomes statistically insignificant. In China, where COVID-19 originated, the situation seems to be under control, as Fig. 2 indicates.
For this reason, including data about China would skew the results or at least make them difficult to obtain.
Visual inspection suggested exponential growth, but this could not simply be assumed.
As such, we used R and its nls function to fit the curve. For more details see [8]. We consider a non-linear model of the form:
$$y_i = f(x_i) + \varepsilon_i, \qquad i = 1, \dots, n, \tag{1}$$
with an exponential function f(.) of the form:
$$f(x) = a \exp(b x). \tag{2}$$
In order to estimate the parameters a, b, we apply the non-linear least squares method, in which the residual sum of squares is minimized, see [8]:
$$\mathrm{RSS}(a, b) = \sum_{i=1}^{n} \left( y_i - a \exp(b x_i) \right)^2 \to \min, \tag{3}$$
where y_i is the total number of COVID-19 cases outside China reported on day x_i and \varepsilon_i is the error term. To estimate the parameters a and b we use the well-known nls function from R. The residual standard error is S_u = 1827. According to these results, we predict 1,000,000 COVID-19 cases outside of China by WHO situation report day 70/71, which is 31 March / 01 April (see Fig. 3).
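The least-squares fit of a * exp(b * x) described above can be sketched in a few lines. The sketch below is not the R code used in the study: it is a plain Gauss-Newton iteration written in Python (Gauss-Newton is the default algorithm of R's nls), and the data are synthetic values generated from made-up parameters, not WHO counts. The function names and the starting-value strategy are assumptions made for illustration.

```python
import math


def fit_exponential(xs, ys, iterations=50, tol=1e-10):
    """Fit y ~ a * exp(b * x) by minimizing the residual sum of squares.

    Gauss-Newton iteration; starting values come from a straight-line
    fit to log(y), which requires every y to be positive.
    """
    n = len(xs)
    logs = [math.log(y) for y in ys]
    x_mean = sum(xs) / n
    l_mean = sum(logs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (l - l_mean) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(l_mean - b * x_mean)

    for _ in range(iterations):
        e = [math.exp(b * x) for x in xs]               # df/da for each point
        jb = [a * x * ei for x, ei in zip(xs, e)]       # df/db for each point
        r = [y - a * ei for y, ei in zip(ys, e)]        # current residuals
        # Normal equations (J^T J) delta = J^T r for the 2 parameters.
        s_aa = sum(u * u for u in e)
        s_ab = sum(u * v for u, v in zip(e, jb))
        s_bb = sum(v * v for v in jb)
        g_a = sum(u * ri for u, ri in zip(e, r))
        g_b = sum(v * ri for v, ri in zip(jb, r))
        det = s_aa * s_bb - s_ab * s_ab
        da = (g_a * s_bb - s_ab * g_b) / det
        db = (s_aa * g_b - s_ab * g_a) / det
        a += da
        b += db
        if abs(da) <= tol * abs(a) and abs(db) <= tol * abs(b):
            break
    return a, b


def residual_sum_of_squares(xs, ys, a, b):
    """Sum of squared differences between the data and a * exp(b * x)."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))


def residual_standard_error(xs, ys, a, b):
    """sqrt(RSS / (n - 2)); two degrees of freedom are used by a and b."""
    return math.sqrt(residual_sum_of_squares(xs, ys, a, b) / (len(xs) - 2))


# Synthetic illustration only: days 31..57 with invented parameters.
days = list(range(31, 58))
cases = [2.5 * math.exp(0.2 * d) for d in days]
a_hat, b_hat = fit_exponential(days, cases)
print(a_hat, b_hat, residual_standard_error(days, cases, a_hat, b_hat))
```

With real, noisy counts the iteration may need better starting values or step damping; the sketch omits both for brevity.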
Up to the last day of the WHO situation reports, the plot contains the following lines:
(1) the blue line connecting the WHO data up to 18 March,
(2) the red line standing for 1,000,000 cases,
(3) the exponential curve computed by R to be as close as possible to the real data up to 18 March.
The vertical blue bar (Fig. 3) shows where the WHO data ends and where the predicted results start. For this reason, on the right hand side of the vertical bar there is only one line which is the computed exponential curve.
Evidently, we do not know for how many days such an exponential curve will remain an acceptable extrapolation; a million cases in 16 days, however, seems highly likely. Such a finding has considerable importance and should not be ignored.
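Once a and b are known, the day on which the curve a * exp(b * x) reaches a given level follows from solving a * exp(b * x) = N for x, which gives x = ln(N / a) / b. The short sketch below illustrates this step; the function names are hypothetical and the parameter values in the usage lines are invented for illustration, not the estimates obtained in the study.

```python
import math


def crossing_day(a, b, target=1_000_000):
    """Day x at which a * exp(b * x) equals target, i.e. ln(target / a) / b."""
    if a <= 0 or b <= 0 or target <= 0:
        raise ValueError("a, b and target must all be positive")
    return math.log(target / a) / b


def doubling_time(b):
    """Number of days in which a * exp(b * x) doubles: ln(2) / b."""
    if b <= 0:
        raise ValueError("b must be positive")
    return math.log(2) / b


# Invented parameters, for illustration only.
example_a, example_b = 2.0, 0.2
print(crossing_day(example_a, example_b), doubling_time(example_b))
```

The result is a fractional day number; rounding it up gives the first situation report on which the curve lies at or above the target.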
To the best of our knowledge, this may be the first study proposing a heuristic for computing parameters a and b for the approximating exponential curve a * exp(b * x), using x as the day number for the COVID-19 situation. The more people know about our finding, the better the chance that they will regard self-care as a major contribution to preventing the spread of COVID-19. Our assumptions do not consider the complexity of a pandemic. In particular, we do not consider flattening of the approximating exponential curve. It is simply a short-term prediction model, but it is very simple and we believe it is very accurate. By prediction standards, a 1.29% error is more than acceptable for short-term predictions.
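A one-day-ahead check of this kind can be sketched as follows: fit the curve to all days except the last, predict the last day, and compare with the reported value. Two assumptions are made here. First, the error is taken to be |predicted - actual| / actual, which the text does not spell out. Second, for brevity the fit is a straight-line least-squares fit to log(y), a simpler stand-in for nls that minimizes squared errors on the log scale rather than on the original scale. The data are synthetic and all names are hypothetical.

```python
import math


def loglinear_fit(xs, ys):
    """Fit y ~ a * exp(b * x) by ordinary least squares on log(y)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    x_mean = sum(xs) / n
    l_mean = sum(logs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (l - l_mean) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(l_mean - b * x_mean)
    return a, b


def percent_error(predicted, actual):
    """Absolute relative error of a prediction, in percent."""
    return abs(predicted - actual) / actual * 100.0


def holdout_last_day_error(xs, ys):
    """Fit on every point except the last, then score the last point."""
    a, b = loglinear_fit(xs[:-1], ys[:-1])
    predicted = a * math.exp(b * xs[-1])
    return percent_error(predicted, ys[-1])


# Synthetic illustration only: an exact exponential, so the error is near zero.
days = list(range(31, 59))
cases = [3.0 * math.exp(0.18 * d) for d in days]
print(holdout_last_day_error(days, cases))
```

Repeating the check over several consecutive final days, rather than a single one, would give a better picture of how stable the short-term error is.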
We regard the WHO situation report #31 as the starting data point since it shows over 1,000 cases outside China for the first time. The presented approach is based on a heuristic solution and makes the realistic assumption that the current trend can continue for the next 17 days. Obviously, it is an abstract mathematical model; reality may be different, and the COVID-19 situation may change in just a few days.