PMC:1892782 / 14598-15628
Annotations
2_test
{"project":"2_test","denotations":[{"id":"17540014-15665081-1690050","span":{"begin":64,"end":66},"obj":"15665081"}],"text":"Alignments for training were taken from the same sources as in [15] including representatives for rRNAs, spliceosomal RNAs, tRNAs, miRNAs, small nucleolar RNAs, nuclear RNaseP and SRP RNA. Sequence similarity in this data set ranges from 47% to 99% mean pairwise identity in alignments of 40 nt to 400 nt length and of 2 to 6 sequences. The detailed distributions of mean pairwise identity, length, number of sequences and GU base pair content are given in the supplementary material (see Additional file 1). A total of 5886 ClustalW alignments, approximately equally representing these ncRNA families, were used for training after removing alignments that were not recognized as structured RNA by RNAz in both reading directions. This data set was split into two subsets of equal size, namely the positive and the negative training set. Alignments in the negative training set were transformed to the reverse complement and realigned with ClustalW, rather than simply taking the reverse-complementary alignment of the structured RNA."}
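The construction of the two training sets described in the text can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`reverse_complement`, `split_training_set`, `make_negative_input`), the random shuffle and the seed are assumptions made for illustration, since the text states only that the two subsets are of equal size. The realignment step with ClustalW is not shown; the sketch only prepares ungapped reverse-complemented sequences that could then be passed to an aligner.

```python
import random

# RNA complement table (assumption: sequences use the A/C/G/U alphabet).
_COMPLEMENT = str.maketrans("ACGUacgu", "UGCAugca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an RNA sequence.

    Gap characters are removed first, because the sequences are meant
    to be realigned rather than reused in their original alignment.
    """
    ungapped = seq.replace("-", "").replace(".", "")
    return ungapped.translate(_COMPLEMENT)[::-1]


def split_training_set(alignments, seed=0):
    """Split alignments into two halves of equal size.

    The random shuffle is an assumption; the text does not say how
    alignments were assigned to the positive and negative sets.
    """
    shuffled = list(alignments)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def make_negative_input(alignment):
    """Reverse-complement every sequence of one alignment.

    The result is an unaligned list of sequences, ready to be
    realigned with an external tool such as ClustalW.
    """
    return [reverse_complement(s) for s in alignment]
```

With 5886 alignments, `split_training_set` yields two subsets of 2943 alignments each. Removing the gaps before taking the reverse complement reflects the point made in the text: the negative examples come from a fresh alignment of the reverse-complemented sequences, not from mirroring the existing alignment columns.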