PMC:3475482 / 7944-8465

    BLAH6-GNI-CORPUS

    {"project":"BLAH6-GNI-CORPUS","denotations":[{"id":"T172","span":{"begin":381,"end":383},"obj":"protein"},{"id":"T173","span":{"begin":310,"end":312},"obj":"protein"},{"id":"T174","span":{"begin":299,"end":306},"obj":"protein"},{"id":"T175","span":{"begin":198,"end":200},"obj":"protein"},{"id":"T176","span":{"begin":148,"end":151},"obj":"protein"}],"text":"Definition 5 (Information)\nThe information carried by a DNA character or base in DNA sequence database D is defined as I(c) = -log|C|Pr(c,D), where |C| is the number of distinct characters in D and Pr(c) is the probability of c occurs in D.\nFor example, the occurrence probability of character A in Table 1 is Pr(A,D) = # of occurence(A)/|D|. So, the probability of character A is Pr(A,D) = 10/55 = 0.182 in our example database. Then, the information of character A in D is, I(A) = -log|C|Pr(A,D) = -log4(0.182) = 1.228."}

    BLAH6-GNI-CORPUS2

    {"project":"BLAH6-GNI-CORPUS2","denotations":[{"id":"T56","span":{"begin":490,"end":492},"obj":"protein"},{"id":"T57","span":{"begin":488,"end":489},"obj":"protein"},{"id":"T58","span":{"begin":476,"end":480},"obj":"protein"},{"id":"T59","span":{"begin":391,"end":393},"obj":"protein"},{"id":"T60","span":{"begin":381,"end":383},"obj":"protein"},{"id":"T61","span":{"begin":310,"end":312},"obj":"protein"},{"id":"T62","span":{"begin":299,"end":306},"obj":"protein"},{"id":"T63","span":{"begin":226,"end":227},"obj":"protein"},{"id":"T64","span":{"begin":201,"end":202},"obj":"protein"},{"id":"T65","span":{"begin":198,"end":200},"obj":"protein"},{"id":"T66","span":{"begin":149,"end":150},"obj":"protein"},{"id":"T67","span":{"begin":136,"end":137},"obj":"protein"},{"id":"T68","span":{"begin":133,"end":135},"obj":"protein"},{"id":"T69","span":{"begin":131,"end":132},"obj":"protein"},{"id":"T70","span":{"begin":121,"end":122},"obj":"protein"},{"id":"T71","span":{"begin":81,"end":93},"obj":"DNA"},{"id":"T72","span":{"begin":56,"end":59},"obj":"DNA"}],"text":"Definition 5 (Information)\nThe information carried by a DNA character or base in DNA sequence database D is defined as I(c) = -log|C|Pr(c,D), where |C| is the number of distinct characters in D and Pr(c) is the probability of c occurs in D.\nFor example, the occurrence probability of character A in Table 1 is Pr(A,D) = # of occurence(A)/|D|. So, the probability of character A is Pr(A,D) = 10/55 = 0.182 in our example database. Then, the information of character A in D is, I(A) = -log|C|Pr(A,D) = -log4(0.182) = 1.228."}
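The worked example in the annotated text (Definition 5) can be checked with a short Python sketch. It assumes that |D| is the total number of characters in the database and that |C| is the number of distinct characters, as the definition states. The function names `information` and `character_information` and the toy sequences are hypothetical and are not part of the source. Table 1 is not reproduced here, so only the counts quoted in the text (10 occurrences of A out of 55 characters, |C| = 4) are used.

```python
import math
from collections import Counter


def information(prob: float, alphabet_size: int) -> float:
    """Return I(c) = -log_{|C|} Pr(c, D) for one character."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    # math.log(x, base) computes the logarithm of x in the given base.
    return -math.log(prob, alphabet_size)


def character_information(database: list[str]) -> dict[str, float]:
    """Compute I(c) for every character in a list of sequences.

    Assumes |D| is the total character count across all sequences
    and |C| is the number of distinct characters observed.
    """
    counts = Counter("".join(database))
    total = sum(counts.values())
    alphabet_size = len(counts)
    return {c: information(n / total, alphabet_size) for c, n in counts.items()}


# Counts quoted in the text: character A occurs 10 times among 55 characters.
info_a = information(10 / 55, 4)
print(round(info_a, 2))  # 1.23

# Hypothetical toy database, for illustration only (not Table 1).
toy = character_information(["ACGT", "AACC"])
print(round(toy["G"], 2))  # 1.5
```

Evaluated without intermediate rounding, the example gives roughly 1.23, which agrees with the 1.228 reported in the text to two decimal places. Rarer characters receive higher information values, as the toy database shows for G.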