PMC:1570465 / 36876-38152
Annotations
{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/1570465","sourcedb":"PMC","sourceid":"1570465","source_url":"https://www.ncbi.nlm.nih.gov/pmc/1570465","text":"An increasingly common way of finding regulatory sites is to look for them among upstream regions of a set of orthologous genes across species (e.g., [9]). In this case additional data, in the form of the phylogenetic tree relating the species, is available and can be exploited. This is especially important when closely related species are part of the input, and, unweighted, they contribute duplicate information and skew the alignment. We use a phylogenetic tree and branch lengths when calculating the edge weights in the graph, with highly diverged sequence pairs getting larger weights. The precise weighting scheme follows the ideas of weighted progressive alignment [42], in which weights αi are computed for every sequence i. The calculation sums branch lengths along the path from the tree root to the sequence at the leaf, splitting shared branches among the descendant leaves, and thereby reducing the weight for related sequences. In essence, we solve a multiple sequence alignment problem with weighted SP-score using match/mismatch, where the computed weight for a pair of positions in sequences i and j is multiplied by αi × αj. The rest of the algorithm operates as in the basic motif finding case above, employing the same LP formulation and DEE techniques.","tracks":[{"project":"2_test","denotations":[{"id":"16916460-11997340-1687083","span":{"begin":151,"end":152},"obj":"11997340"},{"id":"16916460-3118049-1687084","span":{"begin":676,"end":678},"obj":"3118049"}],"attributes":[{"subj":"16916460-11997340-1687083","pred":"source","obj":"2_test"},{"subj":"16916460-3118049-1687084","pred":"source","obj":"2_test"}]}],"config":{"attribute types":[{"pred":"source","value type":"selection","values":[{"id":"2_test","color":"#ec93b5","default":true}]}]}}