PMC:4502367 / 11902-13115


    2_test

    {"project":"2_test","denotations":[{"id":"25917918-18757608-43386545","span":{"begin":44,"end":48},"obj":"18757608"},{"id":"25917918-2231712-43386546","span":{"begin":274,"end":278},"obj":"2231712"},{"id":"25917918-17379688-43386547","span":{"begin":345,"end":349},"obj":"17379688"},{"id":"25917918-14534192-43386548","span":{"begin":1035,"end":1039},"obj":"14534192"}],"text":"First, GeneMark-ES (Ter-Hovhannisyan et al. 2008) was used for ab initio predictions, because its self-training algorithm allowed the identification of high-quality gene models. Next, evidence from homology searches using tBLASTn (e-value cut-off of 1e-10) (Altschul et al. 1990) against a nonredundant protein database (UniRef90) (Suzek et al. 2007) and from reconstructed transcripts were used to filter the ab initio predicted gene models. Only the complete gene models predicted by GeneMark-ES with a support from the homology-based comparison (100% of coverage for each exon) and with the exact same exon–intron boundaries as in the reconstructed transcripts were selected for the training and testing of ab initio gene predictors. The 2693 selected gene models were divided into two sets: one for the training (training set) containing 1611 sequences (60%) and one for assessing prediction accuracy (test set) containing 1082 sequences (40%). The evaluation of the training process was performed using Augustus (Stanke and Waack 2003). For all species, better performance was obtained using datasets of Z. tritici. We used the Z. tritici training set to train the ab initio gene callers for all the species."}
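The annotated text reports that the 2693 selected gene models were divided into a training set of 1611 sequences (60%) and a test set of 1082 sequences (40%). It does not say how the models were assigned to each set. The following is a minimal sketch of one way to produce such a partition, assuming a seeded random shuffle. The function name `split_gene_models`, the `seed` parameter, and the `model_NNNN` identifiers are hypothetical and do not come from the source.

```python
import random


def split_gene_models(model_ids, n_train, seed=0):
    """Shuffle model IDs and split them into (training, test) lists.

    Assumption: a random assignment. The source only reports the
    resulting set sizes, not the procedure used to obtain them.
    """
    if not 0 <= n_train <= len(model_ids):
        raise ValueError("n_train must be between 0 and the number of models")
    shuffled = list(model_ids)             # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 2693 selected gene models.
model_ids = [f"model_{i:04d}" for i in range(2693)]

# Set sizes as reported in the text: 1611 for training, the rest for testing.
train_set, test_set = split_gene_models(model_ids, n_train=1611)

# Percentages rounded to whole numbers, as quoted in the text.
train_pct = round(100 * len(train_set) / len(model_ids))
test_pct = round(100 * len(test_set) / len(model_ids))
print(len(train_set), len(test_set), train_pct, test_pct)
```

The training size is passed as a count and not as a fraction because 60% of 2693 is not a whole number. The reported 1611 and 1082 correspond to roughly 59.8% and 40.2%, which round to the quoted 60% and 40%.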