PMC:7140597 / 749-14677 JSONTXT 15 Projects

Id Subject Object Predicate Lexical cue
T7 0-298 Sentence denotes An outbreak of a viral respiratory illness (officially named coronavirus disease, COVID-19, by the World Health Organization) caused by the newly discovered severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) started around mid-December 2019 in the city of Wuhan, Hubei province, China [1].
T8 299-457 Sentence denotes The outbreak subsequently spread further and as at 31 March 2020, 750,890 cases of COVID-19 have been confirmed worldwide including 668,345 outside China [2].
T9 458-658 Sentence denotes Since 20 February 2020, sustained local transmission has been documented in Italy [3], where to date, 98,716 COVID-19 cases testing positive for SARS-CoV-2 have been diagnosed, with 10,943 deaths [4].
T10 659-878 Sentence denotes To gain further understanding of the molecular epidemiology of the outbreak in Italy, we characterised the full-genome sequences of two SARS-CoV-2 strains isolated from two patients diagnosed in the country.
T11 879-1090 Sentence denotes The first patient was a Chinese tourist from Wuhan diagnosed at the end of January, who had visited Rome and not been in areas of Italy later found to be the initially affected areas of the epidemic in Lombardy.
T12 1091-1254 Sentence denotes The second patient was an Italian person, with no apparent direct epidemiological link with China and who was diagnosed in the second half of February in Lombardy.
T13 1255-1369 Sentence denotes The sequences presented are analysed in the context of other available genome sequences from Europe and elsewhere.
T14 1371-1426 Sentence denotes Patients, virus cultivation and whole genome sequencing
T15 1427-1621 Sentence denotes The two patients in this study had both been hospitalised with an acute respiratory illness (pneumonia), showing a bilateral lung involvement with ground-glass opacity, requiring intensive care.
T16 1622-1732 Sentence denotes The Chinese patient had had onset of symptoms on 29 January 2020 and had been diagnosed in a hospital in Rome.
T17 1733-1852 Sentence denotes The Italian patient whose onset of symptoms had occurred on 10 February 2020 had been diagnosed in a hospital in Milan.
T18 1853-2032 Sentence denotes Biological samples from both patients had been confirmed as being SARS-CoV-2 positive by the National Reference Laboratory (NRL) of the Istituto Superiore di Sanità (ISS) in Rome.
T19 2033-2091 Sentence denotes The samples used for this study were nasopharyngeal swabs.
T20 2092-2268 Sentence denotes These had been respectively sampled on the same day of hospitalisation, when symptoms occurred, for the Chinese tourist and 10 days after symptom onset for the Italian patient.
T21 2269-2476 Sentence denotes An aliquot of each patient’s nasopharyngeal sample was used to generate in vitro cultures in Vero cells grown in modified Eagle’s medium (MEM; Gibco, Thermofisher, United Kingdom) supplemented with GlutaMAX.
T22 2477-2620 Sentence denotes A total of 140 µL of each culture’s supernatant was used for viral RNA extraction using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany).
T23 2621-2907 Sentence denotes The obtained genomic RNAs were retro-transcribed using the SuperScript III Reverse Transcriptase kit (Invitrogen, Carlsbad, United States (US)) and double-stranded DNAs were subsequently obtained using Klenow enzyme (Roche, Basel, Switzerland) according to the manufacturer’s instructions.
T24 2908-3125 Sentence denotes The Nextera XT kit was used for library preparations and whole genome sequencing was performed using the Illumina MiSeq Reagent Nano Kit, V2 (2 x 150 cycles) on the Illumina MiSeq instrument (Illumina, San Diego, US).
T25 3126-3323 Sentence denotes The reads were trimmed for quality and length and assembled by mapping to the reference genome from Wuhan, China (GenBank accession number: NC_045512.2) using Geneious Prime (www.geneious.com) [5].
T26 3324-3504 Sentence denotes Viral sequences from the two patients were deposited in the Global Initiative on Sharing All Influenza Data (GISAID; https://www.gisaid.org/epiflu-applications/next-hcov-19-app/ ).
T27 3506-3527 Sentence denotes Phylogenetic analysis
T28 3528-3934 Sentence denotes To analyse the obtained SARS-CoV-2 genomes respectively derived from the infected Chinese tourist (GISAID accession ID: EPI_ISL_412974) and the Italian patient (GISAID accession ID: EPI_ISL_412973) in a phylogenetic context, a dataset of 40 available SARS-CoV-2 complete genomes from different countries was retrieved from GISAID (https://www.gisaid.org/, last access 2 March 2020; Supplementary material).
T29 3935-4070 Sentence denotes Sequence alignment was performed using MUltiple Sequence Comparison by Log-Expectation (MUSCLE) software (http://www.clustal.org) [6].
T30 4071-4367 Sentence denotes Estimation of the best fitting substitution model (Hasegawa, Kishino, and Yano, HKY model) and inference of the phylogenetic tree were conducted by a maximum likelihood approach using Molecular Evolutionary Genetics Analysis across Computing Platforms (MEGA X; https://www.megasoftware.net/) [7].
T31 4368-4444 Sentence denotes Support for the tree topology was estimated with 1,000 bootstrap replicates.
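The bootstrap step mentioned above can be illustrated with a minimal sketch. It only shows how one replicate alignment is built by resampling columns with replacement; tree inference on each replicate, which MEGA X performs internally, is not reproduced here. The function name `bootstrap_alignment` and the toy sequences are hypothetical and are not data from the study.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    `alignment` maps sequence names to equal-length aligned strings.
    Columns are drawn with replacement, and the same column draw is
    applied to every sequence so that sites stay aligned.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, for illustration only)
toy = {
    "ref": "ACGTACGTAC",
    "s1": "ACGTACGTAT",
    "s2": "ATGTACGTAT",
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

In a full analysis, a tree would be inferred from each of the 1,000 replicates, and the support value of a node would be the percentage of replicate trees that contain the same grouping.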
T32 4445-4547 Sentence denotes The maximum likelihood phylogenetic tree in the Figure shows a main clade containing several clusters.
T33 4548-4791 Sentence denotes The viral genome sequence of the Chinese tourist (GISAID accession ID: EPI_ISL_412974) was identical to that retrieved from one sample of another Chinese tourist, hospitalised at the same hospital in Rome (GISAID accession ID: EPI_ISL_410546).
T34 4792-4915 Sentence denotes The latter was closely related to that of another sample taken from the same patient (GISAID accession ID: EPI_ISL_410545).
T35 4916-5099 Sentence denotes These three genome sequences were located in a cluster with genomes mainly from Europe (England, France, Italy, Sweden), but also one from Australia (Figure, highlighted in dark red).
T36 5100-5284 Sentence denotes Figure Phylogenetic analysis of two SARS-CoV-2 complete genome sequences retrieved in this study, with available complete sequences from different countries^a (n = 40 genome sequences)
T37 5285-5292 Sentence denotes GISAID:
T38 5293-5346 Sentence denotes Global Initiative on Sharing All Influenza Data; HKY:
T39 5347-5383 Sentence denotes Hasegawa, Kishino, and Yano; MEGA X:
T40 5384-5511 Sentence denotes Molecular Evolutionary Genetics Analysis across Computing Platforms; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.
T41 5512-5563 Sentence denotes Main clusters are highlighted in different colours.
T42 5564-5649 Sentence denotes The Wuhan reference genome is in larger font (GenBank accession number: NC_045512.2).
T43 5650-5774 Sentence denotes The filled circles represent the main supported clusters (bootstrap support values are indicated at the level of the nodes).
T44 5775-5861 Sentence denotes The scale bar at the bottom of the tree represents 0.000050 nt substitutions per site.
T45 5862-6008 Sentence denotes The cluster containing the viral sequence of the Chinese tourist who had visited Rome, Italy (GISAID accession ID: EPI_ISL_412974) is in dark red.
T46 6009-6205 Sentence denotes This cluster includes viral sequences derived from two samples (sputum and nasopharyngeal swabs) of another Chinese tourist visiting Rome (GISAID accession IDs: EPI_ISL_410545 and EPI_ISL_410546).
T47 6206-6424 Sentence denotes The viral genome sequence (GISAID accession ID: EPI_ISL_412973) derived from a patient from Lombardy, Italy, is in a cluster highlighted in green, which is different from that containing the Chinese tourist’s sequence.
T48 6425-6520 Sentence denotes a The tree was built using the best fitting substitution model (HKY) in MEGA X software.
T49 6521-6855 Sentence denotes The genome sequence from the Italian patient in Lombardy (EPI_ISL_412973) was, in contrast, located in a different cluster that included two genome sequences from Germany (EPI_ISL_406862 Bavaria/Munich and EPI_ISL_412912 Baden-Wuerttemberg-1) and one genome sequence from Mexico (EPI_ISL_412972) (Figure, highlighted in green).
T50 6856-7052 Sentence denotes In the tree, some sequences from other SARS-CoV-2 collected in Europe segregated in separate clusters from the two clusters containing the respective patient sequences characterised in this study.
T51 7053-7174 Sentence denotes There was for example a cluster formed by two sequences from England and a cluster formed by three sequences from France.
T52 7175-7610 Sentence denotes Using an alignment, the single nt polymorphism (SNP) composition and the potentially resulting variable amino acids in the derived protein sequences, compared with the Wuhan reference sequences (MN908947 and NC_045512), were investigated for the genome sequences retrieved in this study, as well as for three other genome sequences (EPI_ISL_412972, EPI_ISL_412912, EPI_ISL_406862) that clustered with the sequence of the patient in Lombardy.
T53 7611-7748 Sentence denotes The genome-wide SNPs are reported in Table 1 (positions are given with respect to the reference sequence; GenBank accession number: NC_045512).
T54 7749-7844 Sentence denotes The corresponding amino-acid positions and variations inside the proteins are shown in Table 2.
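The comparison against the reference described above can be sketched as follows. This is a minimal illustration, assuming the sequences are already aligned to the reference; it is not the authors' pipeline. The function name `snp_table` and the toy sequences are hypothetical. A gap in a sample is reported as a difference, in the same way that "-" appears as an entry in Table 1.

```python
def snp_table(reference, samples):
    """Report alignment columns where any sample differs from the reference.

    `reference` is the aligned reference string and `samples` maps names to
    aligned strings of the same length. Keys of the result are 1-based
    positions counted along the ungapped reference.
    """
    snps = {}
    ref_pos = 0
    for col, ref_base in enumerate(reference):
        if ref_base == "-":
            # Insertion relative to the reference: no reference coordinate.
            continue
        ref_pos += 1
        calls = {name: seq[col] for name, seq in samples.items()}
        if any(base != ref_base for base in calls.values()):
            snps[ref_pos] = {"ref": ref_base, **calls}
    return snps


# Toy aligned sequences (hypothetical, for illustration only)
toy_ref = "ACG-TACGT"
toy_samples = {
    "sample_A": "ACGATACTT",
    "sample_B": "ATG-TACGT",
}

result = snp_table(toy_ref, toy_samples)
for position, calls in sorted(result.items()):
    print(position, calls)
```

With the toy input, the variable sites are reference positions 2 and 7; the column where only `sample_A` carries an inserted base is skipped because it has no reference coordinate.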
T55 7845-8042 Sentence denotes Table 1 Single nt polymorphisms (SNPs)^a deduced by comparison of two whole genome sequences of SARS-CoV-2 characterised in this study^b with selected SARS-CoV-2 sequences (n = 7 compared sequences)
T56 8043-8191 Sentence denotes SARS-CoV-2 sequence ID (country from which the sequence originated) 241 3037 10265 11083 13206 14408 15806 23403 26144 28881 28882 28883
T57 8192-8321 Sentence denotes 5' UTR ORF1ab gene ORF1ab gene ORF 1ab gene ORF1ab gene ORF1ab gene ORF1ab gene Gene S ORF3a gene Gene N Gene N Gene N
T58 8322-8375 Sentence denotes NC_045512 (China) C C G G C C A A G G G G
T59 8376-8428 Sentence denotes MN908947 (China) C C G G C C A A G G G G
T60 8429-8488 Sentence denotes EPI_ISL_412972 (Mexico) T T G G G T - G G A A C
T61 8489-8497 Sentence denotes EPI_ISL:
T62 8498-8550 Sentence denotes 412912 (Germany) T T A G C T A G G A A C
T63 8551-8559 Sentence denotes EPI_ISL:
T64 8560-8612 Sentence denotes 406862 (Germany) T T G G C C A G G G G G
T65 8613-8671 Sentence denotes EPI_ISL_412973 (Italy) T T G G C T A G G G G G
T66 8672-8730 Sentence denotes EPI_ISL_412974 (Italy) C C G T C C A A T G G G
T67 8731-8953 Sentence denotes N: nucleocapsid protein; ORF: open reading frame; ORF1ab: ORF encoding polyprotein; S: surface glycoprotein; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; SNP: single nt polymorphism; UTR: untranslated region.
T68 8954-9038 Sentence denotes a SNPs are shown according to nt positions in the genome sequence and gene location.
T69 9039-9147 Sentence denotes b The two sequences characterised in this study are the ones from Italy (EPI_ISL_412973 and EPI_ISL_412974).
T70 9148-9357 Sentence denotes Table 2 Amino acid variations^a deduced by comparing translations of two whole genome sequences of SARS-CoV-2 characterised in this study^b with those of selected SARS-CoV-2 sequences (n = 7 compared sequences)
T71 9358-9431 Sentence denotes SARS-CoV-2 strains 924 3334 3606 4314 4704 5170 614 251 203 204
T72 9432-9565 Sentence denotes ORF1ab ORF1ab ORF1ab ORF1ab ORF1ab ORF1ab Surface glycoprotein ORF3a Nucleocapsid phosphoprotein Nucleocapsid phosphoprotein
T73 9566-9613 Sentence denotes NC_045512 (China) F G L A P Q D G R G
T74 9614-9660 Sentence denotes MN908947 (China) F G L A P Q D G R G
T75 9661-9715 Sentence denotes EPI_ISL_412972 (Mexico) F G L G L -^c G G K R
T76 9716-9724 Sentence denotes EPI_ISL:
T77 9725-9771 Sentence denotes 412912 (Germany) F S L A L Q G G K R
T78 9772-9780 Sentence denotes EPI_ISL:
T79 9781-9827 Sentence denotes 406862 (Germany) F G L A P Q G G R G
T80 9828-9880 Sentence denotes EPI_ISL_412973 (Italy) F G L A L Q G G R G
T81 9881-9933 Sentence denotes EPI_ISL_412974 (Italy) F G F A P Q D V R G
T82 9934-10051 Sentence denotes ORF: open reading frame; ORF1ab: ORF encoding polyprotein; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.
T83 10052-10226 Sentence denotes a The amino acid positions refer to those in each respective protein sequence of the Wuhan reference (GenBank accession number: MN908947), starting from the first methionine.
T84 10227-10335 Sentence denotes b The two sequences characterised in this study are the ones from Italy (EPI_ISL_412973 and EPI_ISL_412974).
T85 10336-10367 Sentence denotes c -: possible sequencing error.
T86 10368-10615 Sentence denotes The genome sequence from the Chinese tourist hospitalised in Rome differed in two nt positions from that of the COVID-19 patient in Wuhan (NC_045512), while the genome sequence isolated from the Italian patient showed four nt variations (Table 1).
T87 10616-10743 Sentence denotes For the sequence of the Chinese tourist, the first SNP inside ORF1ab (bps 3037, AA 924) did not result in an amino acid change.
T88 10744-11277 Sentence denotes In Table 2, which shows five sequences characterised outside China, eight missense mutations overall can be observed compared with the two Wuhan reference sequences: four are located in the ORF1ab polyprotein, of which only L3606F has previously been reported, by Phan [8]; one, D614G, is located in the surface glycoprotein and has been observed previously [8], but is not in the receptor-binding domain (RBD), which is responsible for virus entry into the host cell; one is in the ORF3a protein and two are in the nucleocapsid protein.
T89 11278-11450 Sentence denotes The sequence of the Chinese tourist hospitalised in Rome on 29 January (EPI_ISL_412974) had an F at amino acid position 3606 in ORF1ab, where the Wuhan reference genome has an L.
T90 11451-11558 Sentence denotes In ORF3a, this sequence had a V at amino acid position 251, as opposed to a G in the references from Wuhan.
T91 11559-11729 Sentence denotes Meanwhile, the sequence of the Italian patient from Lombardy (EPI_ISL_412973) had an L at amino acid position 4704, where the Wuhan reference genome has a P.
T92 11730-11915 Sentence denotes It also had a mutation in the surface glycoprotein at amino acid position 614, where it showed a G, whereas the reference sequences from Wuhan have a D at that position.
T93 11916-12087 Sentence denotes With regard to the nucleocapsid protein, the sequences from both the Italian patient and the Chinese tourist had the same amino acids as the Wuhan reference genomes.
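The correspondence between the nt positions in Table 1 and the amino acid positions in Table 2 can be checked with a short worked example. This is a sketch under stated assumptions: the start coordinates of the coding regions are taken from the GenBank annotation of NC_045512.2 and are not given in the text, and the function name `codon_index` is hypothetical. Positions 14408 and 15806 are left out because they lie downstream of the programmed ribosomal frameshift in ORF1ab, which this simple formula does not model.

```python
# First nt of each coding region in NC_045512.2.
# Assumption: values taken from the GenBank annotation, not from the text.
CDS_START = {"ORF1ab": 266, "S": 21563, "ORF3a": 25393, "N": 28274}


def codon_index(nt_pos, gene):
    """Return the 1-based codon (amino acid) number containing `nt_pos`.

    Valid only for positions read in the same frame as the start codon,
    i.e. upstream of the ORF1ab ribosomal frameshift for that gene.
    """
    offset = nt_pos - CDS_START[gene]
    if offset < 0:
        raise ValueError(f"position {nt_pos} lies upstream of {gene}")
    return offset // 3 + 1


# SNP positions from Table 1 paired with their gene
snps = [
    (3037, "ORF1ab"),
    (10265, "ORF1ab"),
    (11083, "ORF1ab"),
    (13206, "ORF1ab"),
    (23403, "S"),
    (26144, "ORF3a"),
    (28881, "N"),
    (28882, "N"),
    (28883, "N"),
]

amino_acid_positions = [codon_index(pos, gene) for pos, gene in snps]
for (pos, gene), aa in zip(snps, amino_acid_positions):
    print(f"nt {pos} ({gene}) -> amino acid {aa}")
```

Under these assumptions the computed positions agree with the column headers of Table 2. Positions 28881 and 28882 fall in the same codon (203), which is why the three SNPs in gene N correspond to only two amino acid positions, 203 and 204.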
T94 12089-12099 Sentence denotes Discussion
T95 12100-12360 Sentence denotes In this study, the full-length genomes of two SARS-CoV-2 strains (EPI_ISL_412973 and EPI_ISL_412974) isolated in Italy, one from an Italian patient and the other from a Chinese tourist visiting Rome, were completely sequenced and analysed after virus cultivation.
T96 12361-12538 Sentence denotes Compared to the viral genome sequence of the COVID-19 patient in Wuhan, the sequence from the Chinese tourist had two nt differences, while that of the Italian patient had four.
T97 12539-12658 Sentence denotes Phylogenetic analysis consistently placed the Italian patient’s strain in a distinct cluster from the tourist’s strain.
T98 12659-12910 Sentence denotes The strain of the Italian patient grouped with other viral strains identified in Germany and Mexico, while the strain from the Chinese tourist, related to the Wuhan virus strain, clustered with different European strains and a strain from Australia.
T99 12911-13127 Sentence denotes Other sequences from strains collected in Europe, which were included in the phylogenetic analysis, ended up in separate clusters from the ones respectively containing the sequences of the two patients reported here.
T100 13128-13331 Sentence denotes The results are consistent with several introductions of SARS-CoV-2 in Europe and/or further circulation of the single strain originating in Wuhan with concurrent evolution and accumulation of mutations.
T101 13332-13569 Sentence denotes The mutations found in the virus identified in Lombardy, compared with the reference Wuhan strain, and the amino acid changes identified should be further investigated to understand whether they may affect virus characteristics.
T102 13570-13811 Sentence denotes Some limitations need to be mentioned: first, the lack of epidemiological information available with most sequences deposited in the database; second, the limited number of genomes available at the time of the analysis, which constrained their selection.
T103 13812-13928 Sentence denotes Nevertheless, these data may be useful to understand the dynamics of the local transmission of SARS-CoV-2 in Europe.