PMC:7519301 / 34614-35756

Annotations

    LitCovid-PD-MONDO

    {"project":"LitCovid-PD-MONDO","denotations":[{"id":"T180","span":{"begin":4,"end":12},"obj":"Disease"},{"id":"T181","span":{"begin":4,"end":8},"obj":"Disease"}],"attributes":[{"id":"A180","pred":"mondo_id","subj":"T180","obj":"http://purl.obolibrary.org/obo/MONDO_0005091"},{"id":"A181","pred":"mondo_id","subj":"T181","obj":"http://purl.obolibrary.org/obo/MONDO_0005091"}],"text":"All SARS-CoV-2 sequences available on GISAID as of May 18, 2020 (n = 27,989) were downloaded and deduplicated where possible, and those missing accurate dates (that is, only recording the month and/or year) were removed. Sequences were processed using the Biostrings package (version 2.48.0) in R (49). Sequences known to be linked through direct transmission were removed, and only the sample with the earliest date (chosen at random when multiple samples were taken on the same day) was retained. Sequences were then aligned with Mafft v7.467 using the -addfragments option to align to the reference sequence (Wuhan-Hu1, GISAID accession EPI_ISL_402125) (50). Insertions relative to Wuhan-Hu-1 were removed, and the 5′ and 3′ ends of sequences (where coverage was low) were excised, resulting in an alignment consisting of the 10 ORFs. Any sequences with less than 95% coverage of the ORFs (i.e., \u003e5% gaps) were removed, and 30 homoplasic sites likely due to sequencing artifacts identified by de Maio et al. were masked (https://github.com/W-L/ProblematicSites_SARS-CoV2/blob/master/archived_vcf/problematic_sites_sarsCov2.2020-05-27.vcf)."}

    LitCovid-PD-CLO

    {"project":"LitCovid-PD-CLO","denotations":[{"id":"T277","span":{"begin":55,"end":57},"obj":"http://purl.obolibrary.org/obo/CLO_0050510"},{"id":"T278","span":{"begin":1134,"end":1136},"obj":"http://purl.obolibrary.org/obo/CLO_0050509"}],"text":"All SARS-CoV-2 sequences available on GISAID as of May 18, 2020 (n = 27,989) were downloaded and deduplicated where possible, and those missing accurate dates (that is, only recording the month and/or year) were removed. Sequences were processed using the Biostrings package (version 2.48.0) in R (49). Sequences known to be linked through direct transmission were removed, and only the sample with the earliest date (chosen at random when multiple samples were taken on the same day) was retained. Sequences were then aligned with Mafft v7.467 using the -addfragments option to align to the reference sequence (Wuhan-Hu1, GISAID accession EPI_ISL_402125) (50). Insertions relative to Wuhan-Hu-1 were removed, and the 5′ and 3′ ends of sequences (where coverage was low) were excised, resulting in an alignment consisting of the 10 ORFs. Any sequences with less than 95% coverage of the ORFs (i.e., \u003e5% gaps) were removed, and 30 homoplasic sites likely due to sequencing artifacts identified by de Maio et al. were masked (https://github.com/W-L/ProblematicSites_SARS-CoV2/blob/master/archived_vcf/problematic_sites_sarsCov2.2020-05-27.vcf)."}

    LitCovid-PubTator

    {"project":"LitCovid-PubTator","denotations":[{"id":"493","span":{"begin":618,"end":621},"obj":"Gene"},{"id":"494","span":{"begin":4,"end":14},"obj":"Species"},{"id":"495","span":{"begin":1069,"end":1073},"obj":"Species"}],"attributes":[{"id":"A493","pred":"tao:has_database_id","subj":"493","obj":"Gene:3215"},{"id":"A494","pred":"tao:has_database_id","subj":"494","obj":"Tax:2697049"},{"id":"A495","pred":"tao:has_database_id","subj":"495","obj":"Tax:2697049"}],"namespaces":[{"prefix":"Tax","uri":"https://www.ncbi.nlm.nih.gov/taxonomy/"},{"prefix":"MESH","uri":"https://id.nlm.nih.gov/mesh/"},{"prefix":"Gene","uri":"https://www.ncbi.nlm.nih.gov/gene/"},{"prefix":"CVCL","uri":"https://web.expasy.org/cellosaurus/CVCL_"}],"text":"All SARS-CoV-2 sequences available on GISAID as of May 18, 2020 (n = 27,989) were downloaded and deduplicated where possible, and those missing accurate dates (that is, only recording the month and/or year) were removed. Sequences were processed using the Biostrings package (version 2.48.0) in R (49). Sequences known to be linked through direct transmission were removed, and only the sample with the earliest date (chosen at random when multiple samples were taken on the same day) was retained. Sequences were then aligned with Mafft v7.467 using the -addfragments option to align to the reference sequence (Wuhan-Hu1, GISAID accession EPI_ISL_402125) (50). Insertions relative to Wuhan-Hu-1 were removed, and the 5′ and 3′ ends of sequences (where coverage was low) were excised, resulting in an alignment consisting of the 10 ORFs. Any sequences with less than 95% coverage of the ORFs (i.e., \u003e5% gaps) were removed, and 30 homoplasic sites likely due to sequencing artifacts identified by de Maio et al. were masked (https://github.com/W-L/ProblematicSites_SARS-CoV2/blob/master/archived_vcf/problematic_sites_sarsCov2.2020-05-27.vcf)."}

    LitCovid-sentences

    {"project":"LitCovid-sentences","denotations":[{"id":"T211","span":{"begin":0,"end":220},"obj":"Sentence"},{"id":"T212","span":{"begin":221,"end":302},"obj":"Sentence"},{"id":"T213","span":{"begin":303,"end":498},"obj":"Sentence"},{"id":"T214","span":{"begin":499,"end":661},"obj":"Sentence"},{"id":"T215","span":{"begin":662,"end":837},"obj":"Sentence"},{"id":"T216","span":{"begin":838,"end":1142},"obj":"Sentence"}],"namespaces":[{"prefix":"_base","uri":"http://pubannotation.org/ontology/tao.owl#"}],"text":"All SARS-CoV-2 sequences available on GISAID as of May 18, 2020 (n = 27,989) were downloaded and deduplicated where possible, and those missing accurate dates (that is, only recording the month and/or year) were removed. Sequences were processed using the Biostrings package (version 2.48.0) in R (49). Sequences known to be linked through direct transmission were removed, and only the sample with the earliest date (chosen at random when multiple samples were taken on the same day) was retained. Sequences were then aligned with Mafft v7.467 using the -addfragments option to align to the reference sequence (Wuhan-Hu1, GISAID accession EPI_ISL_402125) (50). Insertions relative to Wuhan-Hu-1 were removed, and the 5′ and 3′ ends of sequences (where coverage was low) were excised, resulting in an alignment consisting of the 10 ORFs. Any sequences with less than 95% coverage of the ORFs (i.e., \u003e5% gaps) were removed, and 30 homoplasic sites likely due to sequencing artifacts identified by de Maio et al. were masked (https://github.com/W-L/ProblematicSites_SARS-CoV2/blob/master/archived_vcf/problematic_sites_sarsCov2.2020-05-27.vcf)."}

    2_test

    {"project":"2_test","denotations":[{"id":"32868447-23329690-132542412","span":{"begin":657,"end":659},"obj":"23329690"}],"text":"All SARS-CoV-2 sequences available on GISAID as of May 18, 2020 (n = 27,989) were downloaded and deduplicated where possible, and those missing accurate dates (that is, only recording the month and/or year) were removed. Sequences were processed using the Biostrings package (version 2.48.0) in R (49). Sequences known to be linked through direct transmission were removed, and only the sample with the earliest date (chosen at random when multiple samples were taken on the same day) was retained. Sequences were then aligned with Mafft v7.467 using the -addfragments option to align to the reference sequence (Wuhan-Hu1, GISAID accession EPI_ISL_402125) (50). Insertions relative to Wuhan-Hu-1 were removed, and the 5′ and 3′ ends of sequences (where coverage was low) were excised, resulting in an alignment consisting of the 10 ORFs. Any sequences with less than 95% coverage of the ORFs (i.e., \u003e5% gaps) were removed, and 30 homoplasic sites likely due to sequencing artifacts identified by de Maio et al. were masked (https://github.com/W-L/ProblematicSites_SARS-CoV2/blob/master/archived_vcf/problematic_sites_sarsCov2.2020-05-27.vcf)."}
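The annotated passage describes two quality-control steps applied to the alignment: dropping sequences with less than 95% coverage of the ORFs, and masking sites flagged as likely sequencing artifacts. The following is a minimal Python sketch of those two steps, not the authors' code (the passage says the sequences were processed with the Biostrings package in R). The function names, the choice to count both `-` and `N` as missing, and the 0-based site positions are assumptions made for illustration; the real masked positions come from the VCF file linked in the passage.

```python
from typing import Dict, Iterable

# Assumption: alignment gaps ("-") and ambiguous bases ("N"/"n") both count as missing.
MISSING_CHARS = frozenset("-Nn")


def coverage(aligned_seq: str) -> float:
    """Return the fraction of alignment columns that carry a called base."""
    if not aligned_seq:
        return 0.0
    called = sum(1 for base in aligned_seq if base not in MISSING_CHARS)
    return called / len(aligned_seq)


def filter_by_coverage(alignment: Dict[str, str],
                       min_coverage: float = 0.95) -> Dict[str, str]:
    """Keep only sequences whose coverage is at least min_coverage."""
    return {name: seq for name, seq in alignment.items()
            if coverage(seq) >= min_coverage}


def mask_sites(aligned_seq: str, sites: Iterable[int],
               mask_char: str = "N") -> str:
    """Replace the given 0-based alignment columns with mask_char."""
    chars = list(aligned_seq)
    for pos in sites:
        if 0 <= pos < len(chars):  # ignore positions outside the alignment
            chars[pos] = mask_char
    return "".join(chars)


if __name__ == "__main__":
    # Toy alignment with hypothetical sequence names; "seq_b" is 20% gaps.
    toy = {"seq_a": "ACGT" * 5, "seq_b": "ACGT" * 4 + "----"}
    kept = filter_by_coverage(toy)
    # Mask two hypothetical positions in every retained sequence.
    masked = {name: mask_sites(seq, [0, 3]) for name, seq in kept.items()}
    print(masked)
```

Filtering is done before masking in this sketch so that masked columns do not lower a sequence's measured coverage; the passage does not state which order the authors used.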