Phylogenetic analyses of 2019-nCoV/SARS-CoV-2

To date, seven pathogenic HCoVs (Fig. 2a, b) have been found1,29: (i) 2019-nCoV/SARS-CoV-2, SARS-CoV, MERS-CoV, HCoV-OC43, and HCoV-HKU1 belong to the β genus, and (ii) HCoV-NL63 and HCoV-229E belong to the α genus. We performed phylogenetic analyses using 15 HCoV whole-genome sequences to inspect the evolutionary relationship of 2019-nCoV/SARS-CoV-2 with other HCoVs. The whole genomes of 2019-nCoV/SARS-CoV-2 had ~99.99% nucleotide sequence identity across three diagnosed patients (Supplementary Table S1). Among the six other known pathogenic HCoVs, 2019-nCoV/SARS-CoV-2 shares the highest nucleotide sequence identity (79.7%) with SARS-CoV, indicating a conserved evolutionary relationship between 2019-nCoV/SARS-CoV-2 and SARS-CoV (Fig. 2a).

Fig. 2 Phylogenetic analysis of coronaviruses. a Phylogenetic tree of coronavirus (CoV). The phylogenetic algorithm analyzed evolutionary conservation among the whole genomes of 15 coronaviruses. Red highlights the recently emerged coronavirus, 2019-nCoV/SARS-CoV-2. Numbers on the branches indicate bootstrap support values. The scale shows the evolutionary distance computed using the p-distance method. b Schematic plot of HCoV genomes. The genus and host of each virus are labeled on the left in different colors. Empty dark gray boxes represent accessory open reading frames (ORFs). c–e The 3D structures shown for SARS-CoV nsp12 (PDB ID: 6NUR) (c), spike (PDB ID: 6ACK) (d), and nucleocapsid (PDB ID: 2CJR) (e) were based on homology modeling. Genome information and phylogenetic analysis results are provided in Supplementary Tables S1 and S2.

HCoVs have five major protein regions for virus structure assembly and viral replication29: the replicase complex (ORF1ab), spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins (Fig. 2b).
The ORF1ab gene encodes the non-structural proteins (nsp) of the viral RNA synthesis complex, which are released through proteolytic processing30. nsp12 is the viral RNA-dependent RNA polymerase, which together with its co-factors nsp7 and nsp8 possesses high polymerase activity. The 3D structure of SARS-CoV nsp12 shows a large N-terminal extension (which binds to nsp7 and nsp8) and a polymerase domain (Fig. 2c). The spike is a transmembrane glycoprotein that plays a pivotal role in mediating viral infection by binding the host receptor31,32. Figure 2d shows the 3D structure of the SARS-CoV spike protein bound to the host receptor angiotensin-converting enzyme 2 (ACE2) (PDB ID: 6ACK). A recent study showed that 2019-nCoV/SARS-CoV-2 is able to utilize ACE2 as an entry receptor in ACE2-expressing cells33, suggesting potential drug targets for therapeutic development. Furthermore, the cryo-EM structure of the spike and biophysical assays reveal that the 2019-nCoV/SARS-CoV-2 spike binds ACE2 with higher affinity than the SARS-CoV spike34. In addition, the nucleocapsid is an important subunit for packaging the viral genome through protein oligomerization35; the structure of a single nucleocapsid is shown in Fig. 2e.

Protein sequence alignment analyses indicated that 2019-nCoV/SARS-CoV-2 is most evolutionarily conserved with respect to SARS-CoV (Supplementary Table S2). Specifically, the envelope and nucleocapsid proteins of 2019-nCoV/SARS-CoV-2 are two evolutionarily conserved regions, with sequence identities of 96% and 89.6%, respectively, compared to SARS-CoV (Supplementary Table S2). In contrast, the spike protein exhibited the lowest sequence conservation between 2019-nCoV/SARS-CoV-2 and SARS-CoV (sequence identity of 77%). The spike protein of 2019-nCoV/SARS-CoV-2 has only 31.9% sequence identity with that of MERS-CoV.
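
For readers unfamiliar with the sequence identity and p-distance measures used above, the following is a minimal Python sketch of how both can be computed from two sequences that have already been aligned. It is an illustration only, not the pipeline used in this study: the function names, the toy sequences, and the choice to skip gapped columns (pairwise deletion) are all assumptions made for the example.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes seq_a and seq_b come from the same alignment (equal length).
    Gapped columns are skipped (pairwise deletion), which is an assumption
    of this sketch rather than a convention stated in the text.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of compared sites that differ (1 - fractional identity)."""
    return 1.0 - percent_identity(seq_a, seq_b, gap) / 100.0


if __name__ == "__main__":
    # Hypothetical toy alignment, not real viral sequence data.
    toy_a = "ATGGCA-TAC"
    toy_b = "ATGACAT-GC"
    print(f"identity:   {percent_identity(toy_a, toy_b):.1f}%")
    print(f"p-distance: {p_distance(toy_a, toy_b):.2f}")
```

In this toy alignment, 8 ungapped columns are compared and 6 match, giving 75.0% identity and a p-distance of 0.25. Published identity values depend on the alignment method and on how gaps are treated, so figures computed this way may differ slightly from those reported in Supplementary Table S2.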