PMC:1459173 / 1503-4500
Annnotations
{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/1459173","sourcedb":"PMC","sourceid":"1459173","source_url":"https://www.ncbi.nlm.nih.gov/pmc/1459173","text":"Background\nLet s be a sequence of sets of characters from an alphabet Σ ⋃ {·}, where '.' ∉ Σ denotes a don't care (dot, for short) and the rest are solid characters, we use σ to denote a singleton character. For characters e1 and e2, we write e1 ∘ e2 if and only if e1 is a dot or e1 = e2. Allowing for spacers in a string is what makes it extensible. Such spacers are indicated by annotating the dot characters. Specifically, an annotated \".\" character is written as .α where α is a set of positive integers {α1, α2, ..., αk} or an interval α = [αl, αu], representing all integers between αl and αu including αl and αu. Whenever defined, d will denote the maximum number of consecutive dots allowed in a string. In such cases, for clarity of notation, we use the extensible wild card denoted by the dash symbol \"-\" instead of the annotated dot character, .[1,d] in the string. Note that '-' ∉ Σ. Thus a string of the form a.[1,d]b will be simply written as a-b. A motif m is extensible if it contains at least one annotated dot, otherwise m is rigid. Given an extensible string m, a rigid string m' is a realization of m if each annotated dot .α is replaced by l ∈ α dots. The collection of all such rigid realizations of m is denoted by R(m). A rigid string m occurs at position l on s if m[j] ∘ s[l + j - 1] holds for 1 ≤ j ≤ |m|. An extensible string m occurs at position l in s if there exists a realization m' of m that occurs at l. Note than an extensible string m could possibly occur more than once at a location on a sequence s. Throughout in the discussion we are interested mostly in the (unique) first left-most occurrence at each location.\nFor a sequence s and positive integer k, k ≤ |s|, a string (extensible or rigid) m is a motif of s with |m| \u003e 1 and location list m = (l1, l2, ..., lp), if both m[1] and m[|m|] are solid and m, |m| ≥ k, is the list of all and only the occurrences of m in s. Given a motif m let m[j1], m[j2], ... m[jl] be the l solid elements in the motif m. Then the sub-motifs of m are given as follows: for every ji, jt, the sub-motif m[ji ... jt] is obtained by dropping all the elements before (to the left of) ji and all elements after (to the right of) jt in m. We also say that m is a condensation for any of its sub-motifs. We are interested in motifs for which any condensation would disrupt the list of occurrences. Formally, let m1, m2, ..., mj be the motifs in a string s. A motif mi is maximal in length if there exists no ml, l ≠ i with and mi is a sub-motif of ml. A motif mi is maximal in composition if no dot character of mi can be replaced by a solid character that appears in all the locations in m. A motif mi is maximal in extension if no annotated dot character of mi can be replaced by a fixed length substring (without annotated dot characters) that appears in all the locations in m. A maximal motif is maximal in composition, in extension and in length. For an exhaustive description of these properties we refer the reader to [1].","divisions":[{"label":"title","span":{"begin":0,"end":10}},{"label":"p","span":{"begin":11,"end":1653}}],"tracks":[]}