PMC:5056902 / 5150-10571
Annotations
2_test
{"project":"2_test","denotations":[{"id":"27729838-20168995-44844998","span":{"begin":525,"end":527},"obj":"20168995"},{"id":"27729838-15781573-44844999","span":{"begin":2666,"end":2668},"obj":"15781573"},{"id":"27729838-17658954-44845000","span":{"begin":2666,"end":2670},"obj":"17658954"}],"text":"Results and Discussion\nAll of the data are summarized in Table 2. The genome size of the nine plants species varied from 48 Mb to 2 Gb. The mitochondria genome size also varied from 15.8 kb to 773 kb. There was no correlation between whole-genome size and mitochondrial whole-genome size. We drew a correlation chart between whole nuclear genome length and the sum of the inserted Numt lengths (Fig. 1). The larger the genome size, the more nuclear mitochondrial insertions there were. This confirms a previous study result [18]. The added green algae species also showed this tendency. One of the peculiarities of plant species is their many Numt hits. Except for green algae (C. reinhardtii and C. subellipsoidea), the number of BLAST hits after merging all overlapping hits ranged from 770 for A. thaliana to 14,509 for V. vinifera. Furthermore, when integrating all of the neighboring hits within 10 kb into one single event, the hit count ranged from 562 in A. thaliana to 9,022 in V. vinifera. This implies that the transposition of mitochondrial DNA of plants into chromosomal DNA is more preferable than in whale species [19].\nNext, we examined the size distribution of the inserted Numts. Here, we merged the neighboring hits within 10 kb into single events. The merged hits showed a high degree of variation in size—the shortest and largest being 25 bp and 107 kb, respectively (Table 2). The size distribution of Numt was also quite variable between species (Fig. 2). Over 70% of Numts were less than 400 bp in all of the analyzed plants. Green algae species that had shorter mitochondrial DNA than other species had over 80% in the group with less than 200 bp, especially in C. 
reinhardtii (over 96%). V. vinifera, which has a larger mitochondrial genome size than other plants, included 30% of Numts over 1 kb in size, and half of this group was over 5 kb. Z. mays, which has the largest genome and the second largest mitochondrial genome, and B. rapa, which has a relatively shorter genome than Z. mays, showed similar ratio distributions. In general, species having short mitochondrial genomes had a large ratio of short Numts. When comparing monocots and eudicots, there was no clearly shared feature. But, there were some differences when contrasting green algae and land plants. However, it is not a matter of the species group but rather a matter of genomic size variation. There are two kinds of closely related speciation events: one is between A. thaliana and B. rapa, and the other is between S. bicolor and Z. mays. In each of the speciation events, there were whole-genome triplication or duplication events, leading to B. rapa and Z. mays [2425]. Because of that, each pair has similar genomic content, but the within-pair Numt size distribution patterns differ. In general, B. rapa and Z. mays have lower ratios of long Numts than A. thaliana and S. bicolor, respectively. Genome triplication or duplication events may have split the long Numt sequences, reducing the number of long Numts. These patterns were also observed between G. max and V. vinifera.\nThe previous whale Numt study [19] also included a Numt size distribution analysis. The average whale genome size is 2.5 Gb, and the average mitochondrial genome size is 16 kb. Whales thus have a much larger nuclear genome and a smaller mitochondrial genome than the plants. In whales, Numts over 5 kb make up under 2% of the total, but in plants, they make up over 4%, and in V. vinifera, over 17%. 
Presumably, because the plant mitochondrial genome is 20-fold larger, longer Numt sequences still reside in the plant genome even after a second whole-genome duplication event.\nThe next analysis was the classification of Numt insertion loci by genic features (Table 3). In land plants, a substantial portion of Numt hits lay in intergenic regions; green algae were the exception, with over 70% of the hits found within genic boundaries. Within genic regions, over 90% of the hits overlapped exons. This contrasts with Numt hits in animals such as whales, where the total number of hits was quite low and fewer hits were found in exons than in introns [19]. When we calculated the relative abundance of each genic feature after accounting for the total size of each genic feature, exons were the most enriched feature in most plants (Fig. 3). Considering the importance of exons in biological processes, it may be tempting to speculate that the numerous Numt insertions into exons may affect the diversity of plant phenotypes.\nMany studies of Numts have been performed, but they usually lack details such as the correlation between genome size and inserted Numt size, the Numt size distribution, and the classification of loci by gene annotation. Our basic analysis shows interesting tendencies but is not yet sufficient to infer their biological meaning. Currently, not many plant genomes have been completely sequenced, and the accuracy of the available assemblies is somewhat compromised by high repeat content or high heterozygosity. To draw a clearer picture of the effect of Numt insertion in the nuclear genome, more population-level genomic data and more accurate genome sequences may be required. Nevertheless, Numts may be one of the key clues to the biological implications that genomic analysis has yet to explain."}