PMC:7033720 / 3314-7229

Annotations

Id Subject Object Predicate Lexical cue
T25 0-21 Sentence denotes Materials and methods
T26 23-39 Sentence denotes Ethics statement
T27 40-133 Sentence denotes This study was approved by the Ethics Committee of the Zhongnan Hospital of Wuhan University.
T28 134-282 Sentence denotes The mNGS analyses of BALF samples were performed on existing samples collected during standard diagnostic tests, posing no extra burden to patients.
T29 284-302 Sentence denotes Sequence of events
T30 303-320 Sentence denotes 2nd January 2020.
T31 321-384 Sentence denotes Obtained BALF samples from two patients with unusual pneumonia.
T32 385-402 Sentence denotes 3rd January 2020.
T33 403-504 Sentence denotes Performed a SARS-specific RT-PCR assay, which yielded a partial RdRp fragment and revealed a potential pathogen.
T34 505-522 Sentence denotes 4th January 2020.
T35 523-623 Sentence denotes Extended the RdRp fragments, obtained more genome fragments, and started mNGS RNA library preparation.
T36 624-641 Sentence denotes 5th January 2020.
T37 642-681 Sentence denotes Completed mNGS RNA library preparation.
T38 682-699 Sentence denotes 6th January 2020.
T39 700-742 Sentence denotes Started mNGS sequencing on the MiSeq platform.
T40 743-760 Sentence denotes 7th January 2020.
T41 761-969 Sentence denotes Received sequencing data, started the pathogen identification pipeline, obtained the virus genome, corrected the genome end by mapping, and identified 2019-nCoV as the sole pathogen; the final CoV genome was 29,881 nt.
T42 970-987 Sentence denotes 8th January 2020.
T43 988-1043 Sentence denotes Performed genome comparisons and evolutionary analyses.
T44 1044-1260 Sentence denotes Since 3rd January 2020, progress reports have been sent promptly to the Chinese Center for Disease Control and Prevention (CDC), keeping pace with every advance we made in pathogen identification and characterization.
T45 1262-1296 Sentence denotes Library preparation and sequencing
T46 1297-1437 Sentence denotes Total RNA extracted from BALF samples (collected on 2nd January 2020) was subjected to metagenomic next-generation sequencing (mNGS).
T47 1438-1777 Sentence denotes The concentration of the RNA samples was low (<0.5 ng/µL) as measured with the Qubit RNA HS Assay Kit (Thermo Fisher Scientific); library preparation was therefore performed with the Trio RNA-Seq kit (NuGEN Technologies, USA), which targets low-concentration RNA samples and contains AnyDeplete probes that remove human ribosomal RNA.
T48 1778-1877 Sentence denotes The resulting libraries were subjected to 150 bp paired-end sequencing on an Illumina MiSeq platform.
T49 1878-1933 Sentence denotes The sequencing results were obtained in less than 24 h.
T50 1935-1974 Sentence denotes Pathogen discovery and characterization
T51 1975-2105 Sentence denotes To identify potential pathogens from the mNGS sequencing results, a pathogen discovery pipeline was run on the sequenced data.
T52 2106-2204 Sentence denotes Briefly, reads containing adaptor sequences and low-complexity regions were removed from the dataset.
T53 2205-2281 Sentence denotes Human reads were also removed by mapping against the reference human genome.
T54 2282-2561 Sentence denotes All non-human and non-repeat sequence reads were then compared to a reference virus database (downloaded from https://ftp.ncbi.nih.gov/blast/db/ref_viruses_rep_genomes.tar.gz) and the non-redundant protein database (nr) using the blastn and DIAMOND blastx programs [4], respectively.
T55 2562-2747 Sentence denotes Taxonomy lineage information was obtained for each BLAST hit by matching its accession number against the taxonomy database, and this information was subsequently used to identify reads of viral origin.
T56 2748-2834 Sentence denotes Bacterial pathogen identification was carried out using the MetaPhlAn2 program [5].
T57 2835-2966 Sentence denotes Reads were also assembled de novo using Megahit [6], with the virus genome identified based on the blast procedure described above.
T58 2967-3124 Sentence denotes To validate the assembled genome sequences, reads were subsequently mapped to the genomes and a majority consensus sequence was determined for each sample.
T59 3125-3285 Sentence denotes Minor variant calling was performed after mapping using the Geneious software package, with minimum coverage set to 20 and minimum variant frequency set to 0.05.
T60 3286-3421 Sentence denotes In addition to mapping, the virus genomes were also confirmed with Sanger sequencing using primers designed based on the NGS sequences.
T61 3423-3462 Sentence denotes Phylogenetic and recombination analyses
T62 3463-3565 Sentence denotes Reference sequences associated with CoVs were downloaded from GenBank and aligned using the MAFFT program.
T63 3566-3777 Sentence denotes Phylogenetic trees (from both amino acid and nucleotide alignments) were reconstructed using the maximum likelihood method in PhyML 3.0 [7], employing a best-fit substitution model and an SPR branch-swapping algorithm.
T64 3778-3915 Sentence denotes Recombination events were identified from phylogenetic analyses and confirmed with similarity plots implemented in the SimPlot program [8].
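
The consensus and minor-variant step described in sentences T58 and T59 can be illustrated with a minimal Python sketch. It is not the Geneious algorithm. It only applies the two thresholds stated in the text (minimum coverage 20, minimum variant frequency 0.05) to a simplified per-position pileup. The input format (one string of observed bases per genome position) and the function name `consensus_and_minor_variants` are assumptions made for illustration.

```python
from collections import Counter

MIN_COVERAGE = 20        # minimum read depth, as stated in the text
MIN_VARIANT_FREQ = 0.05  # minimum variant frequency, as stated in the text


def consensus_and_minor_variants(pileup, min_cov=MIN_COVERAGE,
                                 min_freq=MIN_VARIANT_FREQ):
    """Return (consensus, variants) for a simplified pileup.

    pileup   -- list of strings; element i holds the bases observed in the
                reads covering genome position i + 1 (hypothetical format).
    variants -- list of (position, base, frequency) tuples, 1-based positions.
    """
    consensus = []
    variants = []
    for pos, bases in enumerate(pileup, start=1):
        counts = Counter(bases.upper())
        depth = sum(counts.values())
        if depth == 0:
            consensus.append("N")  # no coverage: base is unknown
            continue
        # Majority consensus: the most frequent base at this position.
        # Ties resolve to the base seen first in the input string.
        major_base, _ = counts.most_common(1)[0]
        consensus.append(major_base)
        if depth < min_cov:
            continue  # too shallow to call minor variants
        for base, n in sorted(counts.items()):
            freq = n / depth
            if base != major_base and freq >= min_freq:
                variants.append((pos, base, freq))
    return "".join(consensus), variants


if __name__ == "__main__":
    demo = ["A" * 30, "A" * 27 + "G" * 3, "C" * 10 + "T" * 5, ""]
    seq, minor = consensus_and_minor_variants(demo)
    print(seq)
    print(minor)
```

In the demo, position 2 has depth 30 with three G reads, so G passes both thresholds and is reported. Position 3 has a T at one third of reads, but its depth of 15 is below the coverage cutoff, so it is not reported.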