Alignment studies like those above are helpful in forming initial hypotheses, but they are not convincing by themselves when the degree of match between aligned residues is weak. It is possible, however, that the amino acid composition of the subsequence and potential motif of interest is more important than the order of the residues within that subsequence. Based on many studies exemplified by the above, it was noted that tryptophan is a recurrent, albeit not absolutely essential, feature of the binding of sugars, including sialic acids. References to this also recur throughout the biochemical literature; see, for example, Ref. [14]. Fig. 2 shows the interaction of tryptophan with sialic acid in the influenza virus B neuraminidase (PDB entry 2BAT). As might be expected from their similar aromatic character, the alternative sugar-binding residues, and the residues that frequently support tryptophan in the binding site, tend to have aromatic sidechains: notably tyrosine (Y), sometimes phenylalanine (F), and also histidine (H). A preliminary survey of sequence motif patterns in sugar binding proteins suggests that the invariant amino acid residues across a family of proteins tend to be one or more of the above residues, supported by the negatively charged aspartate (D) and by asparagine (N), serine (S), threonine (T), glycine (G), and sometimes alanine (A), which provide the hydrogen bonding. However, particularly in regard to the non-aromatic residues, the binding of acidic and non-acidic sugars should probably be distinguished. As discussed later below, the charged amino acid residues glutamate (E), arginine (R), and lysine (K) also frequently make intimate contact with sialic acids, but they do so through proximity in three dimensions rather than as neighbors in a subsequence.
A likely relevant observation was that the first set of amino acid residues binding sialic acids (the set containing aspartate) tended to occur in a subsequence that adopted a local loop conformation, while the second set (the set containing glutamate) was frequently associated with α-helices, particularly their termini. However, this was an empirical and qualitative observation of a tendency, and a more objective quantification of the importance of the aspartate set is the purpose of the prediction algorithm developed below. Three-dimensional considerations nonetheless give insight and sometimes explain why influences can be somewhat indirect. Hydrogen bonding between the hydroxyl groups of carbohydrate ligands and polar amino acid residues at the binding site is typically supported by water-mediated hydrogen bonding networks, in which serine and threonine are fairly commonly involved. Even so, the most outstanding feature of carbohydrate binding sites from a three-dimensional perspective would appear to be the position and orientation of tryptophan (W), tyrosine (Y), and/or phenylalanine (F), which usually provide a hydrophobic plate for close interaction with the planar face of sugar rings, an interaction resembling hydrophobic stacking, as in Fig. 2. The importance of these residues, and to some extent of histidine (H), in a sequence motif therefore seems reasonable.

Fig. 2 The interaction of tryptophan with sialic acid in the influenza virus B neuraminidase (PDB entry 2BAT).
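
The suggestion that composition within a subsequence may matter more than residue order can be illustrated with a short sketch. The code below is not the prediction algorithm developed later; it is a minimal illustration that assumes a hypothetical window width of 7 residues and a hypothetical scoring rule (the fraction of residues in a window that belong to the aspartate set, with at least one aromatic residue required). Only the residue sets themselves are taken from the text.

```python
# Residue sets named in the text (one-letter codes).
AROMATIC = set("WYFH")      # tryptophan, tyrosine, phenylalanine, histidine
SUPPORT = set("DNSTGA")     # aspartate, asparagine, serine, threonine, glycine, alanine
ASPARTATE_SET = AROMATIC | SUPPORT


def window_composition(seq, width=7):
    """Score every window of `width` residues by composition only.

    Returns a list of (start, fraction_in_set, has_aromatic) tuples.
    The width of 7 is an illustrative assumption, not a value from the text.
    """
    seq = seq.upper()
    results = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        in_set = sum(1 for aa in window if aa in ASPARTATE_SET)
        has_aromatic = any(aa in AROMATIC for aa in window)
        results.append((start, in_set / width, has_aromatic))
    return results


def candidate_windows(seq, width=7, min_fraction=0.7):
    """Return (start, subsequence, fraction) for windows that contain an
    aromatic residue and meet the (assumed) composition threshold."""
    seq = seq.upper()
    return [
        (start, seq[start:start + width], fraction)
        for start, fraction, has_aromatic in window_composition(seq, width)
        if has_aromatic and fraction >= min_fraction
    ]
```

Because the score depends only on which residues are present, a window and any permutation of it receive the same value, which is the sense in which composition is treated here as more important than order. The charged residues E, R, and K are deliberately left out of the sets, since the text notes that their contacts with sialic acids arise in three dimensions rather than within a subsequence.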