Id | Subject | Object | Predicate | Lexical cue |
T50 | 0-39 | Sentence | denotes | Pathogen discovery and characterization |
T51 | 40-170 | Sentence | denotes | To identify potential pathogens from the mNGS sequencing results, a pathogen discovery pipeline was carried out on sequenced data. |
T52 | 171-269 | Sentence | denotes | Briefly, reads containing adaptor sequences and low-complex regions were removed from the dataset. |
T53 | 270-346 | Sentence | denotes | Human reads were also removed by mapping against the reference human genome. |
T54 | 347-626 | Sentence | denotes | All non-human and non-repeat sequence reads were then compared to a reference virus database (downloaded from https://ftp.ncbi.nih.gov/blast/db/ref_viruses_rep_genomes.tar.gz) and the non-redundant protein database (nr) using blastn and diamond blastx programs [4], respectively. |
T55 | 627-812 | Sentence | denotes | Taxonomy lineage information was obtained for each blast hits by matching the accession number with the taxonomy database, which was subsequently used to identify reads of virus origin. |
T56 | 813-899 | Sentence | denotes | Bacterial pathogen identification was carried out by using the Metaphlan2 program [5]. |
T57 | 900-1031 | Sentence | denotes | Reads were also assembled de novo using Megahit [6], with the virus genome identified based on the blast procedure described above. |
T58 | 1032-1189 | Sentence | denotes | To validate the assembled genome sequences, reads were subsequently mapped to the genomes and a majority consensus sequences were determined for each sample. |
T59 | 1190-1350 | Sentence | denotes | Minor variation calling was performed after mapping using Genious software package, with a minimum coverage set to 20 and minimum variant frequency set to 0.05. |
T60 | 1351-1486 | Sentence | denotes | In addition to mapping, the virus genomes were also confirmed with Sanger sequencing using primers designed based on the NGS sequences. |
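
The majority consensus and minor variant steps in T58 and T59 can be illustrated with a minimal Python sketch. It is not the algorithm of the software package named in the text, only an approximation of the stated thresholds (minimum coverage 20, minimum variant frequency 0.05). The pileup representation (one string of observed bases per genome position) and the function names `majority_consensus` and `minor_variants` are assumptions made for illustration.

```python
from collections import Counter

# Thresholds taken from the annotated text (T59).
MIN_COVERAGE = 20
MIN_VARIANT_FREQ = 0.05


def majority_consensus(pileup):
    """Return the most frequent base at each position ('N' where no reads map).

    `pileup` is a hypothetical input format: a list with one string per
    genome position, holding every base observed in the mapped reads.
    """
    consensus = []
    for column in pileup:
        counts = Counter(column)
        consensus.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(consensus)


def minor_variants(pileup, min_coverage=MIN_COVERAGE, min_freq=MIN_VARIANT_FREQ):
    """List (position, major_base, minor_base, frequency) for minor variants.

    Positions are 1-based. A position is skipped when its depth is below
    `min_coverage`; a non-majority base is reported when its frequency is
    at least `min_freq`.
    """
    variants = []
    for pos, column in enumerate(pileup, start=1):
        depth = len(column)
        if depth < min_coverage:
            continue
        counts = Counter(column)
        major_base = counts.most_common(1)[0][0]
        for base, n in sorted(counts.items()):
            if base == major_base:
                continue
            freq = n / depth
            if freq >= min_freq:
                variants.append((pos, major_base, base, freq))
    return variants


if __name__ == "__main__":
    example = [
        "A" * 20,            # uniform position, no variant
        "A" * 18 + "G" * 2,  # G at frequency 0.10, depth 20: reported
        "C" * 10 + "T" * 5,  # depth 15, below minimum coverage: skipped
        "T" * 99 + "C" * 1,  # C at frequency 0.01: below threshold
    ]
    print(majority_consensus(example))
    print(minor_variants(example))
```

In this sketch the coverage filter is applied before the frequency filter, so a low-depth position never yields a variant call regardless of how its bases are split. Whether the original analysis ordered the filters this way is not stated in the text.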