PMC:3091640 / 36212-39889


    2_test

{"project":"2_test","denotations":[{"id":"20646326-17145706-10480767","span":{"begin":214,"end":216},"obj":"17145706"},{"id":"20646326-12519987-10480768","span":{"begin":373,"end":375},"obj":"12519987"},{"id":"20646326-9514730-10480769","span":{"begin":684,"end":686},"obj":"9514730"},{"id":"20646326-12520028-10480770","span":{"begin":905,"end":907},"obj":"12520028"},{"id":"20646326-10592173-10480771","span":{"begin":1581,"end":1583},"obj":"10592173"},{"id":"20646326-11181991-10480772","span":{"begin":1753,"end":1755},"obj":"11181991"},{"id":"20646326-18436778-10480773","span":{"begin":2223,"end":2225},"obj":"18436778"},{"id":"20646326-7984417-10480774","span":{"begin":2517,"end":2519},"obj":"7984417"},{"id":"20646326-17488738-10480775","span":{"begin":2677,"end":2679},"obj":"17488738"}],"text":"Methods\n\nIdentification of DRR genes\nAll the annotated gene and protein sequences of rice chromosomes (TIGR release 6) were downloaded from The Institute for Genomic Research (TIGR) Rice Genome Annotation website [39]. For Arabidopsis, all the annotated gene and protein sequences (TAIR release 8) were downloaded from The Arabidopsis Information Resource (TAIR) database [40]. Redundant genes and genes annotated as transposable elements were removed to get a final list of non-redundant genes in rice and Arabidopsis. To identify DRR genes we used two different criteria. First, to select the correct putative orthologue, we performed global alignments using the FASTA BLAST suite [41] and rejected all alignments with less than 20% identity, unless a portion of the protein showed very strong similarity. Secondly, a possible conserved motif was searched for using the Conserved Domain Database (CDD) [42]. When neither of these two approaches produced a significant result, the corresponding gene was discarded. Potential DRR genes in the rice and Arabidopsis genomes were initially identified by sequence similarity searches using the DNA repair genes of human, Saccharomyces cerevisiae and E. coli as seed orthologue sequences and by keyword searches. Secondly, Arabidopsis DRR genes were used to identify the putative DNA repair genes in the rice genome. Next, we searched the primary literature and, where available, included experimental results concerning plant systems to authenticate the annotation. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database [43] was used as a guide for the genes involved in the main repair and recombination pathways, while we followed the classification of Wood et al. for the DNA repair genes [44].\n\nAnalysis of DRR genes\nThe percentage of amino acid identity between a given human/Saccharomyces cerevisiae/E. coli gene and its corresponding orthologue in Arabidopsis and rice was obtained from the Smith-Waterman alignment between the two sequences. Means were calculated using these values.\nDuplicated genes and their Ka, Ks and e-values in DRR proteins of rice, Arabidopsis, sorghum and Populus were extracted from the Plant Genome Duplication Database (PGDD) [45]. PGDD is a public database to identify and catalogue plant genes in terms of intra- or inter-genome syntenic relationships.\nPhylogenetic trees were generated for a group of DNA repair protein homologs. Protein sequences were aligned using the Clustal W multiple sequence alignment program [46]. Only unambiguously aligned positions (excluding poorly conserved and gap regions) were used in the phylogenetic analysis, which was performed using MEGA4 [47]. Neighbor-Joining and Parsimony methods were used for phylogenetic tree searching and inference. The phylogenetic trees were tested by bootstrap analysis with 10,000 replications and strict consensus trees were constructed. Similar topologies were found for both algorithms employed, so only the Neighbor-Joining trees are displayed. We used the PGDD naming convention for the Populus and sorghum datasets in this analysis because the original naming style for the predicted models was uninformative. 
The respective accession numbers used in this analysis and the original names of the predicted genes are given in Additional file 9.\nThe functional diversity of DRR proteins was analysed according to the GO rules using the Gene Ontology tools at http://www.agbase.msstate.edu. Only two independent sets of ontologies were used to describe a gene product: (1) the biological process in which the gene product participates and (2) the molecular function that describes the gene product's activities.\nAdditional methods are given in Additional file 10."}
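
The identity-based steps in the Methods text above (rejecting alignments with less than 20% identity, then averaging the percent identities of the retained orthologue pairs) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the toy sequences and the definition of percent identity used here (identical residues divided by the number of alignment columns that are not gap-only) are all assumptions made for illustration, and the exception described in the text for proteins with a strongly similar portion is not modelled.

```python
from statistics import mean

GAP = "-"


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: the denominator is the number of columns in which at
    least one sequence has a residue (gap-only columns are ignored).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == GAP and b == GAP)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != GAP)
    return 100.0 * matches / len(columns)


def filter_candidates(alignments, min_identity=20.0):
    """Return {name: identity} for alignments at or above the threshold."""
    kept = {}
    for name, (row_a, row_b) in alignments.items():
        identity = percent_identity(row_a, row_b)
        if identity >= min_identity:
            kept[name] = identity
    return kept


# Hypothetical, made-up aligned sequences (not data from the paper).
toy_alignments = {
    "candidate_1": ("MKT-AYIAK", "MKTQAYLAK"),
    "candidate_2": ("MKTAYIAKQ", "GHWPLRSDE"),
}

kept = filter_candidates(toy_alignments)
mean_identity = mean(kept.values())
print(sorted(kept), round(mean_identity, 2))
```

In this toy input only candidate_1 passes the 20% cut-off, so the mean is taken over a single value. Other definitions of percent identity (for example, dividing by the length of the shorter sequence) would give different numbers.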