PMC:4502367 / 26283-31199 JSONTXT


    2_test

    {"project":"2_test","denotations":[{"id":"25917918-21513511-43386575","span":{"begin":268,"end":272},"obj":"21513511"},{"id":"25917918-19400631-43386576","span":{"begin":2584,"end":2588},"obj":"19400631"}],"text":"Gene distribution in Zymoseptoria species\nHomology relationships among the four Zymoseptoria species including orthology (interspecies) and paralogy (intraspecies) were analyzed by a comparison of the predicted proteomes using the software package SiLiX (Miele et al. 2011). Of the total 44,227 proteins predicted in the four Zymoseptoria species, 39,177 (88.6%) were clustered into 10,612 families. These families belong to three categories. First, there are 10,361 families of orthologous sequences present in at least two species. The different distributions of these families across the Zymoseptoria species are shown in Figure 1. Second, there are 236 families of orthologous sequences that also included species-specific paralogous sequences, i.e., independent duplication of a conserved gene occurred in one or more species. Third, there are 15 families of paralogous sequences, i.e., duplication of species-specific genes. The remaining 5050 sequences that could not be classified into a gene family were considered as proteins encoded by orphan genes.\nFigure 1 Venn diagram showing the distribution of predicted models in Z. tritici, Z. pseudotritici, Z. ardabiliae, and Z. brevis. The categorizations of core Zymoseptoria genes, orphan genes, and genes shared by three or two species were performed using a detailed characterization of gene orthology.\n\nOrphan genes:\nThe genome of Z. tritici encodes 1798 orphan genes (15.1% of all the predicted genes), of which 1221 are localized on core chromosomes and 577 on accessory chromosomes. However, accessory chromosomes are significantly enriched in orphan genes because they constitute 79.3% of the genes on these chromosomes compared to 10.9% on core chromosomes (χ2 test, P \u003c 2.2e-16). 
A large majority of the orphan genes (95.5%) have no in silico attributed function. In general, we could not link species-specific genes to any biological or pathogenicity-related function; however, this uniqueness supports the species-specific nature of the genes that have no homology with known proteins from organisms outside the Zymoseptoria species complex. Among the orphan sequences that could be assigned a function, we found predicted functions relating to molecule transport (drugs, proteins, amino acids), primary and secondary metabolism (oxidoreductase activity, nonribosomal peptide synthase), and organic compounds degradation (chitinase, glycoside hydrolase). Small secreted proteins (SSPs) of plant pathogens particularly play an important role as putative effectors during the host infection (Stergiopoulos and De Wit 2009), and we found in Z. tritici that SSP-encoding genes were significantly enriched in the orphan gene set compared to the whole proteome: 8% vs. 3.7% (χ2 test, P \u003c 2.2e-16). Orphan genes representing 9–11% of the predicted genes in Z. pseudotritici, Z. ardabiliae, and Z. brevis were likewise significantly enriched in SSP-encoding genes and genes that could not be assigned a function.\n\nParalogous genes:\nFamilies of paralogous genes were only found in Z. tritici (eight families) and Z. ardabiliae (seven families) and included 23 and 14 genes, respectively. In Z. ardabiliae, 57% of these duplicated sequences are SSP-encoding genes, whereas there were no SSP paralogs in Z. tritici. In Z. ardabiliae, only one family could be associated with a gene function (kinase), whereas in Z. tritici several families contained sequences with known protein domains of transposable elements.\n\nOrthologous genes:\nFamilies of orthologous genes within the four Zymoseptoria species were divided into three categories. The first category comprised the core set of Zymoseptoria genes and included 7786 families containing one gene per species (Figure 1). 
A large proportion of these genes (59.5%) could be associated with a predicted function, whereas 8.3% of them had no homology to genes in other organisms and thereby potentially represented a set of Zymoseptoria-specific genes. The second category comprised genes present in three out of the four species and included 1294 families. Genes in this second category appeared to be less conserved because only 33% of them could be associated with function information. Finally, the third category included 1281 families comprising genes only found in two of the four species. Compared to the two other categories, genes with a function represented a smaller fraction (18%).\n\nOrthologous genes with paralogy:\nA particular category of families was also identified where orthologous sequences were conserved in two or more species with independent duplications in one or more species. This category contained 236 families with 1552 sequences. Most of the families (87%) comprised genes present in the four species. The largest family included 15 genes all encoding an alcohol dehydrogenase with four copies found in Z. tritici, Z. pseudotritici, and Z. brevis and three copies in Z. ardabiliae."}
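
The counts quoted in the annotated text can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it only re-adds the numbers stated in the text, and the function name `check_counts` and the dictionary keys are hypothetical names chosen for illustration.

```python
# Consistency check of the gene-family counts quoted in the text.
# All numbers are copied from the text; check_counts and the key
# names are illustrative assumptions, not part of the source.

def check_counts():
    total_proteins = 44_227          # proteins predicted in the four species
    clustered = 39_177               # proteins clustered into families
    families_total = 10_612          # families reported by the clustering

    # Three family categories: orthologous, orthologous with paralogy, paralogous
    family_categories = [10_361, 236, 15]

    # Orthologous families split by number of species sharing them (4, 3, 2)
    orthologous_split = [7_786, 1_294, 1_281]

    # Z. tritici orphan genes on core and accessory chromosomes
    orphans_core, orphans_accessory = 1_221, 577

    # Alcohol dehydrogenase family: four copies in three species, three in one
    adh_copies = [4, 4, 4, 3]

    return {
        "clustered_percent": round(100 * clustered / total_proteins, 1),
        "orphan_sequences": total_proteins - clustered,
        "families_sum_matches": sum(family_categories) == families_total,
        "orthologous_sum_matches": sum(orthologous_split) == family_categories[0],
        "z_tritici_orphans": orphans_core + orphans_accessory,
        "adh_family_size": sum(adh_copies),
    }


if __name__ == "__main__":
    for key, value in check_counts().items():
        print(f"{key}: {value}")
```

Under these assumptions, each derived value can be compared against the figure stated in the text (88.6% clustered, 5050 orphan sequences, 1798 Z. tritici orphan genes, 15 genes in the largest family).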