PMC:1630425 / 34049-35952
Annnotations
2_test
{"project":"2_test","denotations":[{"id":"17049089-15501469-1690152","span":{"begin":458,"end":460},"obj":"15501469"},{"id":"17049089-7630711-1690153","span":{"begin":916,"end":918},"obj":"7630711"},{"id":"17049089-10673015-1690154","span":{"begin":919,"end":921},"obj":"10673015"},{"id":"17049089-15501469-1690155","span":{"begin":1149,"end":1151},"obj":"15501469"}],"text":"A novel kernel density method to measure oligomeric frequency in a iterative sequence maps of biological sequences (Chaos Game Representation or its generalization to alphabets longer than 4 units, Universal sequence Maps) was described and summarily illustrated. However, the illustration would not be complete without mapping the promoter regions of Bacillus subtilis and the recognition of the TATA box in the same sequences used in the preceding report [15], which motivated the development described here. This discussion will therefore focus on the representation and decomposition of sequence conservation, which can be detected by unlikely repetition of the conserved segment of because the conserved segment has an unlikely composition in the context of the remaining sequence. Accordingly, the illustration in Figure 4 uses the same 20, 100 unit long upstream promoter regions of B.subtilis obtained from [23,24], all having a known promoter sequence constituted by the sub-string TTGACA-(space)-TATAAT with at most one substitution (known as the TATA-box). The entropic properties of those sequences were discussed in the preceding work [15], were they were designated by the Es symbol. For the sake of reference, the Es concatenation is embedded in the software library provided with this report, and is retrieved when using the illustrative function paper_fig(4), which reproduces Figure 4 (this function can be used to reproduce the other three figures too, see Methods). The volume under the density distribution is, by definition, unitary (the normalized height is obtained by dividing H, equation 3, by N, the total number of sequence units). Therefore, the average value of the matrix underneath the 3D bar plot in Figure 4 is also unitary and sets the scale for the representation (scaled height axis is represented in the 3D view of the density distribution represented in Figure 4)."}