PubMed:1822233 / 113-116
A method for identifying a proposed carbohydrate-binding motif of proteins.
An examination of the binding sites of four carbohydrate-binding proteins (Escherichia coli lactose repressor, E. coli arabinose-binding protein, yeast hexokinase A and Concanavalin A) revealed certain similarities in the amino acid sequences and in the residues forming hydrogen bonds and hydrophobic interactions with the bound carbohydrate. These were: (i) Asx-Asx, hydrogen bonding to the pyranose ring oxygen and anomeric -OH group; (ii) Arg-X-X-X-(Ser/Thr), or the reverse sequence, with the Arg hydrogen bonding to the pyranose ring oxygen; (iii) Lys-(Ser/Thr)-X-X-Asp, or the reverse sequence and with interchange of the Lys-(Ser/Thr) positions, with hydrogen bonding of either or both the Lys and Asp residues to the -OH groups at carbons 2, 3, 4 or 6; (iv) a diaromatic sequence with possible hydrophobic interactions to the faces of the pyranose ring structure. An algorithm was devised to search the amino acid sequences of a large number of proteins, both those known to bind carbohydrates and those without known carbohydrate-binding activities, for the four amino acid sequence criteria. The algorithm incorporated a weighted distance value (WDV) to assess the approximate distance between any two criteria, with the WDV being based on the predicted secondary structure of the protein amino acid sequence. When the algorithm using criteria (i) and (ii) plus the WDV was applied to the sequences of 125 proteins, the method indicated the presence of the potential carbohydrate-binding site motif for 42% of proteins with known carbohydrate binding; only 8% of proteins were predicted as false positives, and the accuracy of the method was calculated to be 61.6%. (ABSTRACT TRUNCATED AT 250 WORDS)
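The four sequence criteria can be illustrated as pattern searches over a one-letter amino acid sequence. The following is a minimal sketch, not the authors' algorithm: the names CRITERIA and find_criteria are hypothetical, Asx is read as Asp or Asn (D/N), "diaromatic" is assumed to mean two adjacent residues from Phe, Trp and Tyr, and the WDV step is omitted because the abstract does not specify how it is computed.

```python
import re

# Regexes over one-letter amino acid codes. These are one interpretation of
# criteria (i)-(iv) in the abstract, not the paper's actual implementation.
CRITERIA = {
    # (i) Asx-Asx: Asx assumed to be Asp (D) or Asn (N)
    "i_asx_asx": r"[DN][DN]",
    # (ii) Arg-X-X-X-(Ser/Thr), or the reverse sequence
    "ii_arg_xxx_ser_thr": r"R...[ST]|[ST]...R",
    # (iii) Lys-(Ser/Thr)-X-X-Asp, Lys/(Ser/Thr) interchangeable, or reversed
    "iii_lys_ser_thr_xx_asp": r"K[ST]..D|[ST]K..D|D..[ST]K|D..K[ST]",
    # (iv) diaromatic pair: assumed here to be any two of Phe, Trp, Tyr
    "iv_diaromatic": r"[FWY][FWY]",
}


def find_criteria(sequence):
    """Return {criterion: [(start, matched_text), ...]} with 0-based starts."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in CRITERIA.items():
        # Wrapping the pattern in a lookahead reports overlapping occurrences
        # (at most one match per start position).
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start(), m.group(1)) for m in regex.finditer(seq)]
    return hits


if __name__ == "__main__":
    # Made-up toy sequence, chosen only to contain one hit per criterion.
    for criterion, found in find_criteria("MDNARGGASKTLLDFW").items():
        print(criterion, found)
```

A scan like this only locates candidate positions for each criterion. Judging whether two criteria lie close enough to form a binding site would still require a distance measure such as the WDV described above.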