PMC:540021 / 1418-3739

    2_test

    {"project":"2_test","denotations":[{"id":"15608176-14681378-77178204","span":{"begin":277,"end":278},"obj":"14681378"},{"id":"15608176-14681379-77178205","span":{"begin":288,"end":289},"obj":"14681379"},{"id":"15608176-12520025-77178206","span":{"begin":302,"end":303},"obj":"12520025"},{"id":"15608176-14681377-77178207","span":{"begin":315,"end":316},"obj":"14681377"},{"id":"15608176-10592233-77178208","span":{"begin":327,"end":328},"obj":"10592233"},{"id":"15608176-12230034-77178209","span":{"begin":339,"end":340},"obj":"12230034"},{"id":"15608176-12520011-77178210","span":{"begin":356,"end":357},"obj":"12520011"},{"id":"15608176-11021970-77178211","span":{"begin":665,"end":666},"obj":"11021970"},{"id":"15608176-11021970-77178212","span":{"begin":1232,"end":1233},"obj":"11021970"},{"id":"15608176-11209054-77178212","span":{"begin":1232,"end":1233},"obj":"11209054"},{"id":"15608176-9783222-77178212","span":{"begin":1232,"end":1233},"obj":"9783222"},{"id":"15608176-12952881-77178213","span":{"begin":1321,"end":1323},"obj":"12952881"}],"text":"INTRODUCTION\nOne of the fundamental goals of the genomic era is to extract information about the function of proteins from sequence data on a large scale. To this end, many databases have been developed that group homologous protein sequences into families, for example, Pfam (1), SMART (2), TIGRFAMs (3), PROSITE (4), BLOCKS (5), PRINTS (6) and InterPro (7). InterPro, Pfam and SMART are the most widely used among these databases.\nThe membership of a protein to a particular family generally indicates the broad function it may perform. If more detailed functional aspects are sought, it is often necessary to analyze the subfamily membership within that family (8).\nA subfamily can be viewed as a set of proteins with related functions and domain organizations resulting from a particular line of evolution within a family. With the rapid growth of the sequence databases, the number of sequences belonging to a particular protein family is increasing sharply. As a consequence, it is becoming necessary to analyze the relationships between the numerous members of a protein family by categorizing them into subfamilies. Even though efforts have been made in this direction, they have only been applied to a handful of families (8–10). PANTHER is an exception, but is not freely available to the scientific community (11).\nMany protein families have evolved to accommodate a wide range of functions, with each subfamily performing a specific function even though the general function may be the same for all the subfamilies. Hence it is necessary to identify subfamilies in protein families and analyze them for function shifts to enable better functional annotation of protein sequences.\nConservation patterns in protein multiple sequence alignments can be used to analyze the evolutionary constraints operating on different subfamilies. We use here two kinds of sites to predict function shift between subfamilies. These are conservation shifting sites (CSS), which are conserved in two subfamilies but using different amino acid residues, and rate shifting sites (RSS), which have different evolutionary rates in two subfamilies.\nHere, we present a new database called FunShift that provides subfamily classifications and function shift analysis of the subfamilies derived from full alignments of the Pfam database."}
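
In the record above, each entry in "denotations" appears to mark a citation number in "text" using half-open character offsets ("begin" inclusive, "end" exclusive), with "obj" holding the cited reference's identifier. The following is a minimal sketch of how such spans could be resolved back to the text they cover. The function name `resolve_spans` and the shortened sample record are illustrative assumptions; they are not part of the original record or of any annotation tool's API.

```python
import json

# Shortened, hypothetical sample that mirrors the record's layout.
# The offsets here are recomputed for this short text; they are not
# the offsets used in the full record.
SAMPLE = """
{"project": "2_test",
 "denotations": [
   {"id": "a", "span": {"begin": 6, "end": 7}, "obj": "14681378"},
   {"id": "b", "span": {"begin": 17, "end": 18}, "obj": "14681379"}
 ],
 "text": "Pfam (1), SMART (2)"}
"""


def resolve_spans(record):
    """Return (obj, covered_text) pairs for every denotation.

    Assumes spans are half-open character offsets into record["text"].
    """
    text = record["text"]
    pairs = []
    for denotation in record["denotations"]:
        span = denotation["span"]
        pairs.append((denotation["obj"], text[span["begin"]:span["end"]]))
    return pairs


if __name__ == "__main__":
    for obj, covered in resolve_spans(json.loads(SAMPLE)):
        print(obj, repr(covered))
```

Because the offsets count characters in the decoded string, any change to "text" ahead of a span would shift that span, so the text and the offsets need to be kept in step.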