To test the performance of an alignment program, or the influence of program parameters on performance, we removed all gaps from the datasets, realigned the sequences with the program under test, and scored the new alignments with a modified sum-of-pairs score (SPS') and the structure conservation index (SCI). The SPS' scores the identity between test and reference alignments, whereas the SCI scores consensus secondary structure information; for details see Methods. The two scores were multiplied to yield the final RNA alignment score, termed BRALISCORE. To rank parameters and options of individual programs, or to rank different programs, we used Friedman rank sum and Wilcoxon signed rank tests; for details see Methods. For datasets with APSI above 80 %, different program options and even different programs produced only small differences in alignment quality, in accordance with previous benchmark results [22]. Because the alignment problem seems to be almost trivial at these high identities, and in order to reduce the number of alignments that need to be computed, we report results only for datasets with APSI ≤ 80 %.
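
The scoring and filtering steps described above can be illustrated with a minimal Python sketch. It is not the implementation used for the benchmark: the `Result` record, the function names, and the dictionary layout are hypothetical, and the SPS' and SCI values are assumed to have been computed beforehand as described in Methods. The sketch multiplies the two scores, discards datasets with APSI above 80 %, and computes the mean within-dataset rank of each program, which is the quantity a Friedman rank sum test is built on.

```python
from dataclasses import dataclass

APSI_CUTOFF = 80.0  # percent; datasets above this value are not reported


@dataclass(frozen=True)
class Result:
    """One realigned dataset (hypothetical record layout)."""
    dataset: str   # dataset identifier
    apsi: float    # average pairwise sequence identity, in percent
    sps: float     # modified sum-of-pairs score SPS', computed elsewhere
    sci: float     # structure conservation index, computed elsewhere


def braliscore(sps: float, sci: float) -> float:
    """Final alignment score: the product of SPS' and SCI."""
    return sps * sci


def scores_for_ranking(results):
    """Map dataset -> BRALISCORE, keeping only datasets with APSI <= cutoff."""
    return {
        r.dataset: braliscore(r.sps, r.sci)
        for r in results
        if r.apsi <= APSI_CUTOFF
    }


def mean_ranks(per_program):
    """Mean within-dataset rank per program (rank 1 = highest score).

    per_program maps program name -> {dataset -> score}. Only datasets
    scored by every program are used; tied scores share the average rank.
    """
    programs = sorted(per_program)
    shared = set.intersection(*(set(per_program[p]) for p in programs))
    if not shared:
        raise ValueError("no dataset was scored by all programs")
    totals = {p: 0.0 for p in programs}
    for d in shared:
        scores = {p: per_program[p][d] for p in programs}
        for p in programs:
            better = sum(1 for q in programs if scores[q] > scores[p])
            equal = sum(1 for q in programs if scores[q] == scores[p])
            # 'equal' counts p itself, so an untied program gets better + 1
            totals[p] += better + (equal + 1) / 2
    return {p: totals[p] / len(shared) for p in programs}
```

The significance tests themselves are left out of the sketch. If a library is preferred, SciPy provides `scipy.stats.friedmanchisquare` and `scipy.stats.wilcoxon`, which could be applied to the per-dataset scores returned by `scores_for_ranking`.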