Id |
Subject |
Object |
Predicate |
Lexical cue |
T27 |
0-21 |
Sentence |
denotes |
Phylogenetic analysis |
T28 |
22-428 |
Sentence |
denotes |
To analyse the SARS-CoV-2 genomes obtained from the infected Chinese tourist (GISAID accession ID: EPI_ISL_412974) and the Italian patient (GISAID accession ID: EPI_ISL_412973) in a phylogenetic context, a dataset of 40 available SARS-CoV-2 complete genomes from different countries was retrieved from GISAID (https://www.gisaid.org/, last access 2 March 2020; Supplementary material). |
T29 |
429-564 |
Sentence |
denotes |
Sequence alignment was performed using MUltiple Sequence Comparison by Log-Expectation (MUSCLE) software (http://www.clustal.org) [6]. |
T30 |
565-861 |
Sentence |
denotes |
Estimation of the best fitting substitution model (Hasegawa, Kishino, and Yano, HKY model) and inference of the phylogenetic tree were conducted by a maximum likelihood approach using Molecular Evolutionary Genetics Analysis across Computing Platforms (MEGA X; https://www.megasoftware.net/) [7]. |
T31 |
862-938 |
Sentence |
denotes |
Support for the tree topology was estimated with 1,000 bootstrap replicates. |
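The resampling step behind bootstrap support can be sketched as follows. This is a minimal illustration only, not the MEGA X implementation: it draws alignment columns with replacement to build pseudo-replicate alignments and stops there, whereas a real analysis would infer a tree from each replicate and count how often each clade recurs. The function name `bootstrap_columns` and the toy alignment are hypothetical.

```python
import random

def bootstrap_columns(alignment, rng):
    """Return one pseudo-replicate: alignment columns resampled with replacement."""
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, allowing repeats.
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}

# Hypothetical toy alignment, not data from the study.
toy = {"ref": "ACGTACGTAC", "s1": "ACGTACGTAT", "s2": "ACGAACGTAC"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_columns(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["ref"]))
```

Each replicate has the same number of sites as the original alignment, and every column in it is a copy of some original column, so sequences stay aligned with one another.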
T32 |
939-1041 |
Sentence |
denotes |
The maximum likelihood phylogenetic tree in the Figure shows a main clade containing several clusters. |
T33 |
1042-1285 |
Sentence |
denotes |
The viral genome sequence of the Chinese tourist (GISAID accession ID: EPI_ISL_412974) was identical to that retrieved from one sample of another Chinese tourist, hospitalised at the same hospital in Rome (GISAID accession ID: EPI_ISL_410546). |
T34 |
1286-1409 |
Sentence |
denotes |
The latter was closely related to that of another sample taken from the same patient (GISAID accession ID: EPI_ISL_410545). |
T35 |
1410-1593 |
Sentence |
denotes |
These three genome sequences were located in a cluster with genomes mainly from Europe (England, France, Italy, Sweden), but also one from Australia (Figure, highlighted in dark red). |
T36 |
1594-1778 |
Sentence |
denotes |
Figure Phylogenetic analysis of two SARS-CoV-2 complete genome sequences retrieved in this study, with available complete sequences from different countries^a (n = 40 genome sequences) |
T37 |
1779-1786 |
Sentence |
denotes |
GISAID: |
T38 |
1787-1840 |
Sentence |
denotes |
Global Initiative on Sharing All Influenza Data; HKY: |
T39 |
1841-1877 |
Sentence |
denotes |
Hasegawa, Kishino, and Yano; MEGA X: |
T40 |
1878-2005 |
Sentence |
denotes |
Molecular Evolutionary Genetics Analysis across Computing Platforms; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2. |
T41 |
2006-2057 |
Sentence |
denotes |
Main clusters are highlighted in different colours. |
T42 |
2058-2143 |
Sentence |
denotes |
The Wuhan reference genome is in larger font (GenBank accession number: NC_045512.2). |
T43 |
2144-2268 |
Sentence |
denotes |
The filled circles represent the main supported clusters (bootstrap support values are indicated at the level of the nodes). |
T44 |
2269-2355 |
Sentence |
denotes |
The scale bar at the bottom of the tree represents 0.000050 nt substitutions per site. |
T45 |
2356-2502 |
Sentence |
denotes |
The cluster containing the viral sequence of the Chinese tourist who had visited Rome, Italy (GISAID accession ID: EPI_ISL_412974) is in dark red. |
T46 |
2503-2699 |
Sentence |
denotes |
This cluster includes viral sequences derived from two samples (sputum and nasopharyngeal swabs) of another Chinese tourist visiting Rome (GISAID accession IDs: EPI_ISL_410545 and EPI_ISL_410546). |
T47 |
2700-2918 |
Sentence |
denotes |
The viral genome sequence (GISAID accession ID: EPI_ISL_412973) derived from a patient from Lombardy, Italy, is in a cluster highlighted in green, which is different from that containing the Chinese tourist’s sequence. |
T48 |
2919-3014 |
Sentence |
denotes |
a The tree was built by using the best fitting substitution model (HKY) through MEGA X software. |
T49 |
3015-3349 |
Sentence |
denotes |
The genome sequence from the Italian patient in Lombardy (EPI_ISL_412973) was, in contrast, located in a different cluster that included two genome sequences from Germany (EPI_ISL_406862 Bavaria/Munich and EPI_ISL_412912 Baden-Wuerttemberg-1) and one genome sequence from Mexico (EPI_ISL_412972) (Figure, highlighted in green). |
T50 |
3350-3546 |
Sentence |
denotes |
In the tree, some other SARS-CoV-2 sequences collected in Europe segregated into clusters separate from the two clusters containing the patient sequences characterised in this study. |
T51 |
3547-3668 |
Sentence |
denotes |
For example, two sequences from England formed one cluster and three sequences from France formed another. |
T52 |
3669-4104 |
Sentence |
denotes |
Using an alignment, the single nt polymorphism (SNP) composition and the potentially resulting amino acid variation in the derived protein sequences, relative to the Wuhan reference sequences (MN908947 and NC_045512), were investigated for the genome sequences retrieved in this study, as well as for three other genome sequences (EPI_ISL_412972, EPI_ISL_412912, EPI_ISL_406862) that clustered with the sequence of the patient in Lombardy. |
T53 |
4105-4242 |
Sentence |
denotes |
The genome-wide SNPs are reported in Table 1 (positions are given with respect to the reference sequence; GenBank accession number: NC_045512). |
T54 |
4243-4338 |
Sentence |
denotes |
The corresponding amino-acid positions and variations inside the proteins are shown in Table 2. |
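The relationship between a genome nt position and the amino acid position within a protein can be sketched as below. The coding-sequence start coordinates are an assumption taken from the NC_045512.2 GenBank annotation and are not stated in the text; the helper name `codon_number` is hypothetical. The sketch does not account for the programmed ribosomal frameshift within ORF1ab, so it is only applied to ORF1ab positions upstream of it.

```python
# 1-based CDS start coordinates, assumed from the NC_045512.2 GenBank
# annotation (not given in the text).
CDS_START = {"ORF1ab": 266, "S": 21563, "ORF3a": 25393, "N": 28274}

def codon_number(nt_pos, gene):
    """Map a 1-based genome position to a 1-based codon number within a gene."""
    start = CDS_START[gene]
    if nt_pos < start:
        raise ValueError(f"position {nt_pos} lies upstream of {gene}")
    return (nt_pos - start) // 3 + 1

# SNP positions reported for the compared genomes, paired with their genes.
examples = [(3037, "ORF1ab"), (10265, "ORF1ab"), (11083, "ORF1ab"),
            (23403, "S"), (26144, "ORF3a"), (28881, "N"), (28883, "N")]
for pos, gene in examples:
    print(pos, gene, codon_number(pos, gene))
```

Under these assumed coordinates, nt 3037 falls in ORF1ab codon 924, nt 11083 in codon 3606, nt 23403 in codon 614 of the surface glycoprotein, nt 26144 in codon 251 of ORF3a, and nt 28881 and 28883 in codons 203 and 204 of the nucleocapsid protein.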
T55 |
4339-4536 |
Sentence |
denotes |
Table 1 Single nt polymorphisms (SNPs)^a deduced by comparison of two whole genome sequences of SARS-CoV-2 characterised in this study^b with selected SARS-CoV-2 sequences (n = 7 compared sequences) |
T56 |
4537-4685 |
Sentence |
denotes |
SARS-CoV-2 sequence ID (country from which the sequence originated) 241 3037 10265 11083 13206 14408 15806 23403 26144 28881 28882 28883 |
T57 |
4686-4815 |
Sentence |
denotes |
5' UTR ORF1ab gene ORF1ab gene ORF1ab gene ORF1ab gene ORF1ab gene ORF1ab gene Gene S ORF3a gene Gene N Gene N Gene N |
T58 |
4816-4869 |
Sentence |
denotes |
NC_045512 (China) C C G G C C A A G G G G |
T59 |
4870-4922 |
Sentence |
denotes |
MN908947 (China) C C G G C C A A G G G G |
T60 |
4923-4982 |
Sentence |
denotes |
EPI_ISL_412972 (Mexico) T T G G G T - G G A A C |
T61 |
4983-4991 |
Sentence |
denotes |
EPI_ISL: |
T62 |
4992-5044 |
Sentence |
denotes |
412912 (Germany) T T A G C T A G G A A C |
T63 |
5045-5053 |
Sentence |
denotes |
EPI_ISL: |
T64 |
5054-5106 |
Sentence |
denotes |
406862 (Germany) T T G G C C A G G G G G |
T65 |
5107-5165 |
Sentence |
denotes |
EPI_ISL_412973 (Italy) T T G G C T A G G G G G |
T66 |
5166-5224 |
Sentence |
denotes |
EPI_ISL_412974 (Italy) C C G T C C A A T G G G |
T67 |
5225-5447 |
Sentence |
denotes |
N: nucleocapsid protein; ORF: open reading frame; ORF1ab: ORF encoding polyprotein; S: surface glycoprotein; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; SNP: single nt polymorphism; UTR: untranslated region. |
T68 |
5448-5532 |
Sentence |
denotes |
a SNPs are shown according to nt positions in the genome sequence and gene location. |
T69 |
5533-5641 |
Sentence |
denotes |
b The two sequences characterised in this study are the ones from Italy (EPI_ISL_412973 and EPI_ISL_412974). |
T70 |
5642-5851 |
Sentence |
denotes |
Table 2 Amino acid variations^a deduced by comparing translations of two whole genome sequences of SARS-CoV-2 characterised in this study^b with those of selected SARS-CoV-2 sequences (n = 7 compared sequences) |
T71 |
5852-5925 |
Sentence |
denotes |
SARS-CoV-2 strains 924 3334 3606 4314 4704 5170 614 251 203 204 |
T72 |
5926-6059 |
Sentence |
denotes |
ORF1ab ORF1ab ORF1ab ORF1ab ORF1ab ORF1ab Surface glycoprotein ORF3a Nucleocapsid phosphoprotein Nucleocapsid phosphoprotein |
T73 |
6060-6107 |
Sentence |
denotes |
NC_045512 (China) F G L A P Q D G R G |
T74 |
6108-6154 |
Sentence |
denotes |
MN908947 (China) F G L A P Q D G R G |
T75 |
6155-6209 |
Sentence |
denotes |
EPI_ISL_412972 (Mexico) F G L G L -^c G G K R |
T76 |
6210-6218 |
Sentence |
denotes |
EPI_ISL: |
T77 |
6219-6265 |
Sentence |
denotes |
412912 (Germany) F S L A L Q G G K R |
T78 |
6266-6274 |
Sentence |
denotes |
EPI_ISL: |
T79 |
6275-6321 |
Sentence |
denotes |
406862 (Germany) F G L A P Q G G R G |
T80 |
6322-6374 |
Sentence |
denotes |
EPI_ISL_412973 (Italy) F G L A L Q G G R G |
T81 |
6375-6427 |
Sentence |
denotes |
EPI_ISL_412974 (Italy) F G F A P Q D V R G |
T82 |
6428-6545 |
Sentence |
denotes |
ORF: open reading frame; ORF1ab: ORF encoding polyprotein; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2. |
T83 |
6546-6720 |
Sentence |
denotes |
a The amino acid positions refer to those in each respective protein sequence of the Wuhan reference (GenBank accession number: MN908947), starting from the first methionine. |
T84 |
6721-6829 |
Sentence |
denotes |
b The two sequences characterised in this study are the ones from Italy (EPI_ISL_412973 and EPI_ISL_412974). |
T85 |
6830-6861 |
Sentence |
denotes |
c -: possible sequencing error. |
T86 |
6862-7109 |
Sentence |
denotes |
The genome sequence from the Chinese tourist hospitalised in Rome differed in two nt positions from that of the COVID-19 patient in Wuhan (NC_045512), while the genome sequence isolated from the Italian patient showed four nt variations (Table 1). |
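The counts in the preceding sentence can be re-derived from Table 1 with a short sketch. The nucleotide strings below are transcribed from the Table 1 rows in column order; the function name `snp_positions` is hypothetical, and gaps (`-`) are skipped rather than counted as SNPs, which is an assumption of this sketch.

```python
# Column order of Table 1 (nt positions relative to NC_045512).
POSITIONS = [241, 3037, 10265, 11083, 13206, 14408,
             15806, 23403, 26144, 28881, 28882, 28883]

# Rows transcribed from Table 1, one character per column.
TABLE1 = {
    "NC_045512":      "CCGGCCAAGGGG",
    "MN908947":       "CCGGCCAAGGGG",
    "EPI_ISL_412972": "TTGGGT-GGAAC",
    "EPI_ISL_412912": "TTAGCTAGGAAC",
    "EPI_ISL_406862": "TTGGCCAGGGGG",
    "EPI_ISL_412973": "TTGGCTAGGGGG",
    "EPI_ISL_412974": "CCGTCCAATGGG",
}

def snp_positions(seq_id, ref_id="NC_045512"):
    """List the positions where a sequence differs from the reference, skipping gaps."""
    ref = TABLE1[ref_id]
    seq = TABLE1[seq_id]
    return [pos for pos, r, s in zip(POSITIONS, ref, seq) if s != r and s != "-"]

for seq_id in ("EPI_ISL_412974", "EPI_ISL_412973"):
    print(seq_id, snp_positions(seq_id))
```

For the Chinese tourist (EPI_ISL_412974) this yields positions 11083 and 26144, and for the Italian patient (EPI_ISL_412973) positions 241, 3037, 14408 and 23403.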
T87 |
7110-7237 |
Sentence |
denotes |
For the sequence of the Chinese tourist, the first SNP inside ORF1ab (bps 3037, AA 924) did not result in an amino acid change. |
T88 |
7238-7771 |
Sentence |
denotes |
In Table 2, which covers five sequences characterised outside of China, eight missense mutations in total can be observed relative to the two Wuhan reference sequences: four are located in the ORF1ab polyprotein, of which only L3606F has previously been reported by Phan, 2020 [8]; one, D614G, is located in the surface glycoprotein and has been observed previously [8], but is not in the receptor-binding domain (RBD), which is responsible for virus entry into the host cell; one is in the ORF3a protein and two are in the nucleocapsid protein. |
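The tally of missense mutations per protein can be checked against Table 2 with the sketch below. The amino acid strings are transcribed from the Table 2 rows in column order, the protein labels are shortened (S for surface glycoprotein, N for nucleocapsid phosphoprotein), and the `-` entry flagged in the table as a possible sequencing error is ignored; the variable names are hypothetical.

```python
from collections import Counter

# Column order of Table 2: (protein, amino acid position).
COLUMNS = [("ORF1ab", 924), ("ORF1ab", 3334), ("ORF1ab", 3606), ("ORF1ab", 4314),
           ("ORF1ab", 4704), ("ORF1ab", 5170), ("S", 614), ("ORF3a", 251),
           ("N", 203), ("N", 204)]

REFERENCE = "FGLAPQDGRG"  # identical for NC_045512 and MN908947

# The five sequences characterised outside of China, transcribed from Table 2.
NON_CHINA = {
    "EPI_ISL_412972": "FGLGL-GGKR",
    "EPI_ISL_412912": "FSLALQGGKR",
    "EPI_ISL_406862": "FGLAPQGGRG",
    "EPI_ISL_412973": "FGLALQGGRG",
    "EPI_ISL_412974": "FGFAPQDVRG",
}

# Collect each distinct substitution once, in table column order.
missense = []
for i, (protein, pos) in enumerate(COLUMNS):
    alts = {seq[i] for seq in NON_CHINA.values()} - {REFERENCE[i], "-"}
    for alt in sorted(alts):
        missense.append((protein, f"{REFERENCE[i]}{pos}{alt}"))

per_protein = Counter(protein for protein, _ in missense)
print(missense)
print(dict(per_protein))
```

The sketch finds eight distinct substitutions: four in ORF1ab, one in the surface glycoprotein (D614G), one in ORF3a and two in the nucleocapsid protein.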
T89 |
7772-7944 |
Sentence |
denotes |
The sequence of the Chinese tourist hospitalised in Rome on 29 January (EPI_ISL_412974) had an F at amino acid position 3606 of ORF1ab, where the Wuhan reference genome has an L. |
T90 |
7945-8052 |
Sentence |
denotes |
In ORF3a, this sequence had a V at amino acid position 251, as opposed to a G in the references from Wuhan. |
T91 |
8053-8223 |
Sentence |
denotes |
Meanwhile, the sequence of the Italian patient from Lombardy (EPI_ISL_412973) had an L at amino acid position 4704, where the Wuhan reference genome has a P. |
T92 |
8224-8409 |
Sentence |
denotes |
It also had a mutation in the surface glycoprotein at amino acid position 614, where it showed a G, whereas the reference sequences from Wuhan have a D at that position. |
T93 |
8410-8581 |
Sentence |
denotes |
With regard to the nucleocapsid protein, the sequences from both the Italian patient and the Chinese tourist had the same amino acids as the Wuhan reference genomes. |