PMC:1524773 / 1644-2825 JSON TXT


When the subject sequences aligned to a query sequence by a BLAST-based [1] comparison are viewed together, one can in many cases identify groups of sequences clustering around different subintervals of the query sequence. Often, the decision by eye as to which cluster a given sequence belongs to depends strongly on the order in which the sequences are presented. Fig. 1a) shows a schematic sketch of aligned sequences in random order. The sequences seem to form two, or maybe three, groups. In Fig. 1b) the same sequences are ordered according to how many positions they have in common, and colors indicate sequences that share a large part of their sequence. The algorithm finds three different clusters of sequences. A cluster in this sense can be defined as a subset of aligned sequences that have approximately the same length and that are aligned to approximately the same subinterval of the query sequence. As argued above, the human eye may be fooled by the ordering of the presented sequences, and humans are limited in the number of sequences they can group into clusters; the automatic and objective computation of a clustering is therefore an important task.
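To make the cluster definition concrete, the following is a minimal sketch and not the algorithm described in this paper. It assumes each aligned sequence is reduced to a half-open interval (start, end) on the query, measures similarity as the number of shared positions divided by the number of positions covered by either interval (a Jaccard index), and groups intervals whose similarity reaches a threshold. The names `jaccard` and `cluster_intervals`, the threshold of 0.7, and the example coordinates are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Assumption: an aligned sequence is represented only by its interval on the
# query, as (start, end) with end exclusive.
Interval = Tuple[int, int]


def jaccard(a: Interval, b: Interval) -> float:
    """Shared positions divided by positions covered by either interval."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - overlap
    return overlap / union if union > 0 else 0.0


def cluster_intervals(intervals: List[Interval],
                      threshold: float = 0.7) -> List[List[int]]:
    """Group interval indices by single linkage on Jaccard similarity.

    Two intervals end up in the same cluster if they are connected by a
    chain of pairs whose similarity is at least `threshold` (hypothetical
    default, chosen only for illustration).
    """
    parent = list(range(len(intervals)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(intervals)):
        for j in range(i + 1, len(intervals)):
            if jaccard(intervals[i], intervals[j]) >= threshold:
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(intervals)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up query coordinates for six hits.
hits = [(0, 100), (5, 105), (200, 300), (2, 98), (210, 295), (400, 450)]
clusters = cluster_intervals(hits)
print(clusters)  # [[0, 1, 3], [2, 4], [5]]
```

A high Jaccard value requires both a similar position and a similar length, which matches the two conditions in the definition above. The result does not depend on the order in which the hits are listed, apart from the index labels. Single linkage can chain together intervals that drift gradually along the query, so a stricter linkage may be preferable on real data.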
