PMC:3570799 / 3785-8941 JSONTXT


    2_test

    {"project":"2_test","denotations":[{"id":"23450099-2231712-140447099","span":{"begin":107,"end":109},"obj":"2231712"},{"id":"23450099-16820507-140447100","span":{"begin":358,"end":360},"obj":"16820507"},{"id":"23450099-11934745-140447101","span":{"begin":2155,"end":2157},"obj":"11934745"},{"id":"23450099-10742046-140447102","span":{"begin":2158,"end":2160},"obj":"10742046"},{"id":"23450099-18853362-140447103","span":{"begin":2237,"end":2239},"obj":"18853362"},{"id":"23450099-22135293-140447104","span":{"begin":2944,"end":2946},"obj":"22135293"},{"id":"23450099-18522759-140447105","span":{"begin":3050,"end":3052},"obj":"18522759"},{"id":"23450099-22768363-140447106","span":{"begin":3342,"end":3344},"obj":"22768363"},{"id":"23450099-17267434-140447107","span":{"begin":3494,"end":3496},"obj":"17267434"},{"id":"23450099-17953748-140447108","span":{"begin":3543,"end":3545},"obj":"17953748"},{"id":"23450099-9103655-140447109","span":{"begin":4233,"end":4235},"obj":"9103655"},{"id":"23450099-20817437-140447110","span":{"begin":4236,"end":4238},"obj":"20817437"}],"text":"16S rRNA analysis\nThe single genomic 16S rRNA sequence of strain ATCC 8093T was compared using NCBI BLAST [30,31] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database [32] and the relative frequencies of taxa and keywords (reduced to their stem [33]) were determined, weighted by BLAST scores. The most frequently occurring genera were Ancylobacter (30.0%), Starkeya (13.4%), Agrobacterium (13.1%), Xanthobacter (12.4%) and Azorhizobium (11.5%) (98 hits in total). Regarding the three hits to sequences from members of the species, the average identity within HSPs was 99.5%, whereas the average coverage by HSPs was 92.8%. Among all other species, the one yielding the highest score was Ancylobacter rudongensis (AY056830), which corresponded to an identity of 98.1% and an HSP coverage of 98.4%. 
(Note that the Greengenes database uses the INSDC (= EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification.) The highest-scoring environmental sequence was EU835464 ('structure and quorum sensing reverse osmosis RO membrane biofilm clone 3M02'), which showed an identity of 98.4% and an HSP coverage of 100.0%. The most frequently occurring keywords within the labels of all environmental samples which yielded hits were 'skin' (6.0%), 'microbiom' (3.0%), 'human, tempor, topograph' (2.5%), 'compost' (2.1%) and 'dure' (2.1%) (152 hits in total) and fit only partially to the known habitat of the species. Environmental samples that yielded hits of a higher score than the highest scoring species were not found.\nFigure 1 shows the phylogenetic neighborhood of in a 16S rRNA based tree. The sequence of the single 16S rRNA gene copy in the genome differs by nine nucleotides from the previously published 16S rRNA sequence (D32247), which contains one ambiguous base call.\nFigure 1 Phylogenetic tree highlighting the position of S. novella relative to the type strains of the other species within the family Xanthobacteraceae (blue font color). The tree was inferred from 1,381 aligned characters [34,35] of the 16S rRNA gene sequence under the maximum likelihood (ML) criterion [36]. Hyphomicrobiaceae (green font color for those species that caused conflict according to the Parafit test, black color for the remaining ones; see below for the difference) were included in the dataset for use as outgroup taxa but then turned out to be intermixed with the target family; hence, the rooting shown was inferred by the midpoint-rooting method [29]. The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 550 ML bootstrap replicates [37] (left) and from 1,000 maximum-parsimony bootstrap replicates [38] (right) if larger than 60%. 
Lineages with type strain genome sequencing projects registered in GOLD [39] are labeled with one asterisk, those also listed as 'Complete and Published' with two asterisks (see [40] and CP000781 for Xanthobacter autotrophicus, CP002083 for Hyphomicrobium denitrificans and CP002292 for Rhodomicrobium vannielii). To measure conflict between 16S rRNA data and taxonomic classification in detail, we followed a constraint-based approach as described recently in detail [41], conducting both unconstrained searches and searches constrained for the monophyly of both families and using our own re-implementation of CopyCat [42] in conjunction with AxPcoords and AxParafit [43] was used to determine those leaves (species) whose placement significantly deviated between the constrained and the unconstrained tree.\nThe best-supported ML tree had a log likelihood of -12,191.55, whereas the best tree found under the constraint had a log likelihood of -12,329.92. The constrained tree was significantly worse than the globally best one in the SH test as implemented in RAxML [37,44] (α = 0.01). The best supported MP trees had a score of 1,926, whereas the best constrained trees found had a score of 1,982 and were also significantly worse in the KH test as implemented in PAUP [8,44] (α \u003c 0.0001). Accordingly, the current classification of the family as used in [45,46], on which the annotation of Figure 1 is based, is in significant conflict with the 16S rRNA data. Figure 1 also shows those species that cause phylogenetic conflict as detected using the ParaFit test (i.e., those with a p value \u003e 0.05 because ParaFit measures the significance of congruence) in green font color. According to our analyses, the Hyphomonadaceae genera (Blastochloris and Prosthecomicrobium) nested within the Xanthobacteraceae display significant conflict. 
In the constrained tree (data not shown), the Angulomicrobium-Methylorhabdus clade is placed at the base of the Xanthobacteraceae clade (forced to be monophyletic). For this reason, Angulomicrobium and Methylorhabdus were not detected as causing conflict (note that the ParaFit test essentially compares unrooted trees). A taxonomic revision of the group would probably need to start with the reassignment of these genera to different families."}

    MicrobeTaxon

    {"project":"MicrobeTaxon","denotations":[{"id":"T38","span":{"begin":4608,"end":4621},"obj":"59282"},{"id":"T39","span":{"begin":4584,"end":4599},"obj":"69657"},{"id":"T40","span":{"begin":1413,"end":1418},"obj":"9606"},{"id":"T41","span":{"begin":4626,"end":4644},"obj":"81894"},{"id":"T42","span":{"begin":4664,"end":4681},"obj":"335928"},{"id":"T43","span":{"begin":4758,"end":4773},"obj":"204473"},{"id":"T44","span":{"begin":4774,"end":4788},"obj":"61655"},{"id":"T45","span":{"begin":4824,"end":4841},"obj":"335928"},{"id":"T46","span":{"begin":4894,"end":4909},"obj":"204473"},{"id":"T47","span":{"begin":4914,"end":4928},"obj":"61655"},{"id":"T81","span":{"begin":1986,"end":1996},"obj":"921"},{"id":"T82","span":{"begin":2065,"end":2082},"obj":"335928"},{"id":"T83","span":{"begin":2242,"end":2259},"obj":"45401"},{"id":"T84","span":{"begin":3071,"end":3097},"obj":"280"},{"id":"T85","span":{"begin":3112,"end":3140},"obj":"53399"},{"id":"T86","span":{"begin":3158,"end":3182},"obj":"1069"},{"id":"T31","span":{"begin":58,"end":75},"obj":"921"},{"id":"T32","span":{"begin":448,"end":460},"obj":"99"},{"id":"T33","span":{"begin":470,"end":478},"obj":"152053"},{"id":"T34","span":{"begin":488,"end":501},"obj":"357"},{"id":"T35","span":{"begin":511,"end":523},"obj":"279"},{"id":"T36","span":{"begin":536,"end":548},"obj":"6"},{"id":"T37","span":{"begin":800,"end":824},"obj":"177413"}],"namespaces":[{"prefix":"_base","uri":"http://purl.bioontology.org/ontology/NCBITAXON/"}],"text":"16S rRNA analysis\nThe single genomic 16S rRNA sequence of strain ATCC 8093T was compared using NCBI BLAST [30,31] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database [32] and the relative frequencies of taxa and keywords (reduced to their stem [33]) were determined, weighted by BLAST scores. 
The most frequently occurring genera were Ancylobacter (30.0%), Starkeya (13.4%), Agrobacterium (13.1%), Xanthobacter (12.4%) and Azorhizobium (11.5%) (98 hits in total). Regarding the three hits to sequences from members of the species, the average identity within HSPs was 99.5%, whereas the average coverage by HSPs was 92.8%. Among all other species, the one yielding the highest score was Ancylobacter rudongensis (AY056830), which corresponded to an identity of 98.1% and an HSP coverage of 98.4%. (Note that the Greengenes database uses the INSDC (= EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification.) The highest-scoring environmental sequence was EU835464 ('structure and quorum sensing reverse osmosis RO membrane biofilm clone 3M02'), which showed an identity of 98.4% and an HSP coverage of 100.0%. The most frequently occurring keywords within the labels of all environmental samples which yielded hits were 'skin' (6.0%), 'microbiom' (3.0%), 'human, tempor, topograph' (2.5%), 'compost' (2.1%) and 'dure' (2.1%) (152 hits in total) and fit only partially to the known habitat of the species. Environmental samples that yielded hits of a higher score than the highest scoring species were not found.\nFigure 1 shows the phylogenetic neighborhood of in a 16S rRNA based tree. The sequence of the single 16S rRNA gene copy in the genome differs by nine nucleotides from the previously published 16S rRNA sequence (D32247), which contains one ambiguous base call.\nFigure 1 Phylogenetic tree highlighting the position of S. novella relative to the type strains of the other species within the family Xanthobacteraceae (blue font color). The tree was inferred from 1,381 aligned characters [34,35] of the 16S rRNA gene sequence under the maximum likelihood (ML) criterion [36]. 
Hyphomicrobiaceae (green font color for those species that caused conflict according to the Parafit test, black color for the remaining ones; see below for the difference) were included in the dataset for use as outgroup taxa but then turned out to be intermixed with the target family; hence, the rooting shown was inferred by the midpoint-rooting method [29]. The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 550 ML bootstrap replicates [37] (left) and from 1,000 maximum-parsimony bootstrap replicates [38] (right) if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [39] are labeled with one asterisk, those also listed as 'Complete and Published' with two asterisks (see [40] and CP000781 for Xanthobacter autotrophicus, CP002083 for Hyphomicrobium denitrificans and CP002292 for Rhodomicrobium vannielii). To measure conflict between 16S rRNA data and taxonomic classification in detail, we followed a constraint-based approach as described recently in detail [41], conducting both unconstrained searches and searches constrained for the monophyly of both families and using our own re-implementation of CopyCat [42] in conjunction with AxPcoords and AxParafit [43] was used to determine those leaves (species) whose placement significantly deviated between the constrained and the unconstrained tree.\nThe best-supported ML tree had a log likelihood of -12,191.55, whereas the best tree found under the constraint had a log likelihood of -12,329.92. The constrained tree was significantly worse than the globally best one in the SH test as implemented in RAxML [37,44] (α = 0.01). The best supported MP trees had a score of 1,926, whereas the best constrained trees found had a score of 1,982 and were also significantly worse in the KH test as implemented in PAUP [8,44] (α \u003c 0.0001). 
Accordingly, the current classification of the family as used in [45,46], on which the annotation of Figure 1 is based, is in significant conflict with the 16S rRNA data. Figure 1 also shows those species that cause phylogenetic conflict as detected using the ParaFit test (i.e., those with a p value \u003e 0.05 because ParaFit measures the significance of congruence) in green font color. According to our analyses, the Hyphomonadaceae genera (Blastochloris and Prosthecomicrobium) nested within the Xanthobacteraceae display significant conflict. In the constrained tree (data not shown), the Angulomicrobium-Methylorhabdus clade is placed at the base of the Xanthobacteraceae clade (forced to be monophyletic). For this reason, Angulomicrobium and Methylorhabdus were not detected as causing conflict (note that the ParaFit test essentially compares unrooted trees). A taxonomic revision of the group would probably need to start with the reassignment of these genera to different families."}
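
    Resolving denotation spans (illustrative sketch)

    Each denotation above points into the "text" field by character offsets ("begin" inclusive, "end" exclusive), and the MicrobeTaxon object additionally declares a "_base" namespace that prefixes its "obj" values. The following is a minimal sketch of how such records could be resolved, assuming the offsets count Unicode code points as Python string indices do. The function names `base_namespace` and `resolve_denotations` and the toy `sample` record are hypothetical and are not part of any PubAnnotation tooling.

```python
def base_namespace(doc):
    """Return the URI of the "_base" namespace, or "" if none is declared."""
    for ns in doc.get("namespaces", []):
        if ns.get("prefix") == "_base":
            return ns["uri"]
    return ""


def resolve_denotations(doc):
    """Return (id, covered text, expanded obj) for every denotation.

    Assumes span offsets index doc["text"] as Python string positions,
    with "begin" inclusive and "end" exclusive.
    """
    text = doc["text"]
    base = base_namespace(doc)
    rows = []
    for den in doc.get("denotations", []):
        begin = den["span"]["begin"]
        end = den["span"]["end"]
        rows.append((den["id"], text[begin:end], base + den["obj"]))
    return rows


# Toy record (hypothetical ids) shaped like the objects above.
sample = {
    "text": "BLAST [30,31]",
    "denotations": [
        {"id": "d1", "span": {"begin": 7, "end": 9}, "obj": "2231712"},
        {"id": "d2", "span": {"begin": 10, "end": 12}, "obj": "16820507"},
    ],
}

for row in resolve_denotations(sample):
    print(row)
```

    Because the spans are plain offsets, any edit to "text" that changes its length would invalidate every denotation that begins after the edit, so corrections to the embedded text are only safe when they preserve length.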