Simulating early outbreak trajectories

In a first step, we initialised simulations with one index case. For each primary case, we generated secondary cases according to a negative-binomial offspring distribution with mean R0 and dispersion k [7,8]. The dispersion parameter k quantifies the variability in the number of secondary cases and can be interpreted as a measure of the impact of superspreading events (the lower the value of k, the higher the impact of superspreading). The generation time interval D was assumed to be gamma-distributed with a shape parameter of 2 and a mean that varied between 7 and 14 days. We explored a wide range of parameter combinations (Table) and ran 1,000 stochastic simulations for each combination. This corresponds to a total of 3.52 million one-index-case simulations (22 × 20 × 8 × 1,000), which were run on UBELIX (http://www.id.unibe.ch/hpc), the high-performance computing cluster at the University of Bern, Switzerland.

Table. Parameter ranges for stochastic simulations of outbreak trajectories, 2019 novel coronavirus outbreak, China, 2019–2020

Parameter   Description                       Range                Number of values explored within the range
R0          Basic reproduction number         0.8–5.0              22 (equidistant)
k           Dispersion parameter              0.01–10              20 (equidistant on log10 scale)
D           Generation time interval (days)   7–14                 8 (equidistant)
n           Initial number of index cases     1–50                 6 (equidistant)
T           Date of zoonotic transmission     20 Nov–4 Dec 2019    Randomised for each index case

In a second step, we accounted for the uncertainty regarding the number of index cases n and the date T of the initial zoonotic (animal-to-human) transmissions at the wet market in Wuhan. An epidemic with several index cases can be considered as the aggregation of several independent epidemics with one index case each. We sampled (with replacement) n of the one-index-case epidemics, sampled a date of onset for each index case and aggregated the epidemic curves.
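As a rough illustration of the first step, a one-index-case branching process with negative-binomial offspring and gamma-distributed generation times can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation. The function names, the time horizon `horizon` and the soft size cap `max_cases` are assumptions introduced here for illustration. The negative-binomial draw uses the standard gamma-Poisson mixture: a gamma-distributed rate with shape k and scale R0/k, followed by a Poisson draw.

```python
import math
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method.

    The rate is processed in chunks of at most 30 to avoid underflow of
    exp(-lam); a sum of independent Poisson draws is again Poisson.
    """
    count = 0
    while lam > 0:
        step = min(lam, 30.0)
        limit = math.exp(-step)
        p = rng.random()
        while p > limit:
            count += 1
            p *= rng.random()
        lam -= step
    return count


def offspring(r0, k, rng):
    """Negative-binomial number of secondary cases (mean r0, dispersion k)."""
    rate = rng.gammavariate(k, r0 / k)  # shape k, scale r0 / k, so the mean is r0
    return poisson(rate, rng)


def simulate_outbreak(r0, k, mean_gt, horizon, max_cases, rng):
    """Infection times (days since the index infection) up to `horizon`.

    `max_cases` is a soft cap (an assumption of this sketch) that stops
    very large outbreaks from running indefinitely.
    """
    times = [0.0]   # the index case is infected at time 0
    queue = [0.0]   # cases whose secondary cases are still to be generated
    while queue and len(times) < max_cases:
        t = queue.pop()
        for _ in range(offspring(r0, k, rng)):
            # Generation time: gamma with shape 2 and mean mean_gt.
            child = t + rng.gammavariate(2.0, mean_gt / 2.0)
            if child <= horizon:
                times.append(child)
                queue.append(child)
    return sorted(times)


if __name__ == "__main__":
    rng = random.Random(2020)
    outbreak = simulate_outbreak(2.2, 0.5, 10.0, 60.0, 20000, rng)
    print("cases within 60 days:", len(outbreak))
```

With small values of k, most simulated index cases generate no secondary cases while a few generate many, which is the superspreading behaviour that the dispersion parameter is meant to capture.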
The date of onset for each index case was sampled uniformly from a 2-week interval around 27 November 2019, consistent with early phylogenetic analyses of eleven 2019-nCoV genomes [10]. This step was repeated 100 times for each combination of R0 (22 values), k (20 values), D (8 values) and n (6 values), giving a total of 2,112,000 simulated full epidemics (22 × 20 × 8 × 6 × 100) that incorporated the uncertainty in D, n and T.

Finally, we calculated the proportion of stochastic simulations that reached a total number of infected cases between 1,000 and 9,700 by 18 January 2020, the interval estimated by Imai et al. [6]. In a process related to approximate Bayesian computation (ABC), the parameter value combinations that led to simulations within that interval were treated as approximations to the posterior distributions of the parameters with uniform prior distributions.

Model simulations and analyses were performed in the R software for statistical computing [11]. Code files are available at https://github.com/jriou/wcov.
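The aggregation and acceptance steps described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, `pool` stands for a list of one-index-case trajectories (infection times in days since each index infection) for a single combination of R0, k and D, and the bounds 1,000 and 9,700 are treated as inclusive.

```python
import random
from datetime import date

WINDOW_START = date(2019, 11, 20)  # start of the 2-week window for T
WINDOW_DAYS = 14                   # window ends on 4 December 2019
CUTOFF = date(2020, 1, 18)         # date at which total cases are compared
LOW, HIGH = 1000, 9700             # target interval, assumed inclusive here


def aggregate_total(pool, n, rng):
    """Total infections by CUTOFF for n index cases.

    Each index case is assigned one trajectory drawn with replacement
    from `pool` and a start date drawn uniformly within the window.
    """
    cutoff_day = (CUTOFF - WINDOW_START).days
    total = 0
    for _ in range(n):
        times = rng.choice(pool)              # sampling with replacement
        start = rng.uniform(0, WINDOW_DAYS)   # days after WINDOW_START
        total += sum(1 for t in times if start + t <= cutoff_day)
    return total


def acceptance_proportion(totals, low=LOW, high=HIGH):
    """Share of simulated totals that fall within [low, high]."""
    return sum(low <= x <= high for x in totals) / len(totals)


if __name__ == "__main__":
    rng = random.Random(7)
    # Toy pool of three hand-written trajectories, for demonstration only.
    toy_pool = [[0.0], [0.0, 8.0, 15.0], [0.0, 5.0, 9.0, 12.0, 30.0]]
    totals = [aggregate_total(toy_pool, 50, rng) for _ in range(100)]
    print("accepted share:", acceptance_proportion(totals, low=100, high=200))
```

Computing this proportion for every parameter combination, and keeping the combinations in proportion to how often their simulations fall inside the interval, gives the rejection-style approximation to the posterior that the text describes.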