Second, these insertions are present not only in 2019-nCoV but also in three betaCoV sequences from bats: two (ZC45 and ZXC21) from Zhejiang, deposited in GenBank in 2018, and RaTG13 from Yunnan, obtained in 2013 [8]. RaTG13 is much more similar to 2019-nCoV than either ZC45 or ZXC21 (Figure 1A). The similarity of the spike protein between RaTG13 and 2019-nCoV is 97.7%. In the RaTG13 genome, two inserts (HKNNKS and RSYLTPGDSSSG) are identical to those in 2019-nCoV, one has a single T → I substitution (TNGIKR), and the fourth lacks the four C-terminal amino acids (QTNS----) (Figure 1B). ZC45 and ZXC21 are more divergent from 2019-nCoV than RaTG13, but both also contain similar insertions at three of the four insertion sites, the exception being insertion 4 (Figure 1B). Furthermore, many other CoVs have similar insertions, but with different sequences, at the insertion 1 position. These results clearly show that three of the four inserts existed naturally in three bat CoVs before 2019-nCoV was identified. This undoubtedly refutes the possibility that 2019-nCoV was generated by acquiring gene fragments from the HIV-1 genome. Instead, it is much more likely that 2019-nCoV originated from RaTG13-like CoVs.

Figure 1. Sequence and structure analysis of 2019-nCoV and bat coronaviruses. (A) Phylogenetic tree analysis of the spike gene sequences. (B) Sequence alignment of the suspected insertion sites in the 2019-nCoV and bat coronavirus sequences. Deletions in the alignment are shown as dashes. The insertion numbers are indicated at the top of the alignment. (C) Structure comparison of the four insertions in the CoV spike protein and HIV-1 gp120. The 2019-nCoV structure was modelled using the I-TASSER server with default parameters. Only the relevant domains, residues 1 to 708 (excluding residues 305 to 603), are shown as a ribbon diagram. The four insertions are labelled and coloured in red, blue, green and magenta, respectively. The HIV-1 gp120 structure (PDB 1GC1) is shown as a ribbon diagram. The V4, V5, V1/V2 and LE loops are labelled and coloured in red, blue, green and black, respectively.
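The insert-by-insert comparison between 2019-nCoV and RaTG13 described in the text can be illustrated with a short script. The following is a minimal sketch in Python, not the authors' analysis pipeline. The RaTG13 peptides are taken from the text. The 2019-nCoV peptide for insert 1 (TNGTKR) is inferred from the stated single T → I substitution, and the 2019-nCoV peptide for insert 4 (QTNSPRRA) is an assumption that this section does not state. The function name `compare_aligned` and the `INSERTS` table are hypothetical names introduced for illustration.

```python
from typing import Dict, Tuple

# Aligned insert peptides as (2019-nCoV, RaTG13); '-' marks an alignment gap.
# RaTG13 strings are quoted from the text. The 2019-nCoV strings for
# inserts 1 and 4 are ASSUMPTIONS (inferred or supplied), not given in the section.
INSERTS: Dict[int, Tuple[str, str]] = {
    1: ("TNGTKR", "TNGIKR"),              # one T -> I substitution
    2: ("HKNNKS", "HKNNKS"),              # identical
    3: ("RSYLTPGDSSSG", "RSYLTPGDSSSG"),  # identical
    4: ("QTNSPRRA", "QTNS----"),          # four C-terminal residues missing
}


def compare_aligned(ref: str, other: str) -> Tuple[int, int]:
    """Count (substitutions, gap positions) between two pre-aligned peptides."""
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    substitutions = 0
    gaps = 0
    for a, b in zip(ref, other):
        if a == b:
            continue
        if a == "-" or b == "-":
            gaps += 1
        else:
            substitutions += 1
    return substitutions, gaps


if __name__ == "__main__":
    for number, (ncov, ratg13) in INSERTS.items():
        subs, gaps = compare_aligned(ncov, ratg13)
        print(f"insert {number}: {subs} substitution(s), {gaps} gap position(s)")
```

Under these assumptions, inserts 2 and 3 show no differences, insert 1 shows one substitution, and insert 4 shows four gap positions, which matches the description of Figure 1B.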