Statistical analyses

The monthly pattern of the cumulative number of COVID-19 cases in each country/region was visualized in relation to the geography, biome type, and climate (mean temperature and annual precipitation) of that location. In addition, the pattern of increasing COVID-19 case numbers was evaluated by country type, with individual countries classified into four types defined by the number of COVID-19 cases per week and the date of outbreak onset. To ensure the robustness of our results, we investigated the relationship between various environmental variables (climate, host susceptibility to COVID-19, international human mobility, and socioeconomic factors) and the number of COVID-19 cases (per 1 million population) using two different approaches: conventional multiple linear regression and random forest, a machine-learning model [15]. We separately modeled the cumulative number of COVID-19 cases (per 1 million population) in successive periods from December 2019 to June 30, 2020. In the multiple regression analysis, we set the log-scaled cumulative number of COVID-19 cases within a period as the response variable and used the following explanatory variables: climatic factors (mean temperature, squared mean temperature, and log-scaled monthly precipitation), socioeconomic conditions (log-scaled population density and GDP per person), international human mobility (the relative number of foreign visitors per population), and region-specific COVID-19 susceptibility (the percentage of people aged ≥ 65 years, the log-scaled relative incidence of malaria, and the BCG vaccination effect). To control for country/region-specific observation biases, we included as covariates the length of time (in days) since the first confirmed COVID-19 case in each country/region and the number of COVID-19 tests conducted (as a measure of sampling effort).
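As a schematic summary of the regression described above, the model for country/region i in a given period can be written as follows. The symbols are introduced here only for illustration and do not come from the original analysis. The exact coding of each term is an assumption; in particular, the text does not state whether the log transformation also applies to GDP per person, so G_i denotes that term as entered in the model.

```latex
\begin{aligned}
\log y_i ={}& \beta_0
  + \beta_1 T_i + \beta_2 T_i^{2} + \beta_3 \log P_i
  && \text{(climate)} \\
 &+ \beta_4 \log D_i + \beta_5 G_i
  && \text{(socioeconomic conditions)} \\
 &+ \beta_6 M_i
  && \text{(international mobility)} \\
 &+ \beta_7 A_i + \beta_8 \log R_i + \beta_9 B_i
  && \text{(susceptibility)} \\
 &+ \gamma_1 t_i + \gamma_2 n_i + \varepsilon_i
  && \text{(observation-bias covariates)}
\end{aligned}
```

Here y_i is the cumulative number of cases per 1 million population, T_i the mean temperature, P_i the monthly precipitation, D_i the population density, G_i the GDP per person, M_i the relative number of foreign visitors, A_i the percentage of people aged ≥ 65 years, R_i the relative incidence of malaria, B_i the BCG vaccination effect, t_i the number of days since the first confirmed case, n_i the number of tests conducted, and ε_i the residual error.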
In addition, we applied the trend surface method to account for spatial autocorrelation: we added as a covariate the first eigenvector of the geo-distance matrix among the countries or regions, which was computed from the geocoordinates of the largest city in each [16]. The explanatory power of the model was evaluated by the adjusted coefficient of determination (R²). We also calculated the relative importance of each explanatory variable in a regression model according to its partial coefficient of determination and identified the predominant variables that explained the variance in the response variable. The statistical significance of each variable was determined with an F-test. All the explanatory variables were standardized to have a mean of zero and a variance of one before these analyses. The explanatory factors of the regression model were compared among the four country types. In the random forest model, we used the same set of response and explanatory variables, as well as the same covariates. In each run of the random forest analysis, we generated 1,000 regression trees. The model performance was evaluated by the proportion of variance explained by the model. We evaluated the relative importance of each explanatory variable based on the increase in the mean squared error when the variable was permuted. Before these analyses, we tested for collinearity among the explanatory variables by calculating the variance inflation factor (VIF). Over the study period, the largest VIF was 8.56, which was also the value on June 30, 2020, indicating the absence of multicollinearity in the regression. To assess the effect of testing-effort bias on the number of confirmed cases, we conducted an additional analysis that included the number of tests conducted (i.e., sampling effort) in individual countries/regions as a covariate in the model.
Note that this analysis was applied to the data from 128/828 countries/regions, because testing data for many countries are currently unavailable (https://ourworldindata.org/covid-testing). All analyses were performed in the R environment for statistical computing [17]; the ‘sf’ package was used for graphics [18] and the ‘randomForest’ package was used for the random forest analysis [19].
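The collinearity check described above can be illustrated with a minimal sketch. It is written in Python with the standard library only, not in the R environment used for the actual analyses. It relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is equivalent to 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the others. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        standardized.append([(v - mean) / sd for v in col])
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(standardized[i], standardized[j])) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical toy predictors for six locations (not data from the study).
mean_temperature = [5.0, 12.0, 18.0, 25.0, 27.0, 9.0]
precipitation = [60.0, 80.0, 150.0, 200.0, 90.0, 40.0]
percent_aged_65 = [18.0, 15.0, 7.0, 4.0, 6.0, 20.0]

for name, value in zip(
    ["temperature", "precipitation", "aged >= 65"],
    vif([mean_temperature, precipitation, percent_aged_65]),
):
    print(f"{name}: VIF = {value:.2f}")
```

A VIF of 1 means that a predictor is uncorrelated with the others, and larger values indicate stronger collinearity. A common rule of thumb treats values above 10 as problematic.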