Viruses and Sialic Acids: Rules of Engagement
Abstract
Viral infections are initiated by specific attachment of a virus particle to receptors at the surface of the host cell. For many viruses, these receptors are glycans that are linked to either a protein or a lipid. Glycans terminating in sialic acid and its derivatives serve as receptors for a large number of viruses, including several human pathogens. In combination with glycan array analyses, structural analyses of complexes of viruses with sialylated oligosaccharides have provided insights into the parameters that underlie each interaction. Here, we compare the currently available structural data on viral attachment proteins in complex with sialic acid and its variants. The objective is to define common parameters of recognition and to provide a platform for understanding the determinants of specificity. This information could be of use for the prediction of the location of sialic acid binding sites in viruses for which structural information is still lacking. An improved understanding of the principles that govern the recognition of sialic acid and sialylated oligosaccharides would also advance efforts to develop efficient antiviral agents.
The monosaccharide sialic acid decorates all eukaryotic cell surfaces, capping many different oligosaccharide structures on N- and O-linked glycoproteins as well as on glycolipids [1]. Glycans terminating in sialic acid have emerged as a key class of receptors for an impressive number of viruses, many of which are human pathogens. The highly pathogenic Influenza A, B and C viruses as well as the human parainfluenzaviruses attach to sialic acids (reviewed in [2-4]). Coxsackievirus A24 variant and Enterovirus 70, which cause Acute Hemorrhagic Conjunctivitis and have pandemic potential, also attach to sialylated oligosaccharides [5, 6]. Epidemic Keratoconjunctivitis (EKC) has been linked to several human D-type Adenoviruses, and one of these has recently been shown to attach to the disialylated GD1a motif [7]. The human JC and BK Polyomaviruses (JCV and BKV, respectively) cause a fatal demyelinating disease and kidney graft loss, respectively, in immunocompromised individuals. Both viruses use glycans terminating in sialic acid as their receptors [8, 9]. Moreover, the recently identified Merkel Cell Polyomavirus, a human oncovirus, likely uses the trisialylated ganglioside GT1b as a receptor [10, 11], and other mammalian polyomaviruses such as Simian Virus 40 (SV40) and murine Polyomavirus (Polyoma) also bind glycans terminating in sialic acid [12-14]. Most human noroviruses, the causative agents of violent gastrointestinal illnesses, attach to non-sialylated histo-blood group antigens, in contrast to murine norovirus, which binds to a sialylated oligosaccharide [15]. However, some strains of human noroviruses were recently shown to bind to the sialyl-Lewis X motif as well [16]. Rotaviruses cause severe gastroenteritis in children. They have long been classified into strains that can be inhibited by neuraminidase treatment, which cleaves sialic acid from glycan sequences on host cells, and those that are insensitive to it [17, 18].
Neuraminidase-insensitive strains were presumed to use non-sialylated receptors. Interestingly, the "neuraminidase-insensitive" rotavirus strain Wa was recently shown to attach to the ganglioside GM1, which carries a branching sialic acid [19]. Due to its branched structure, this particular carbohydrate is difficult to cleave with neuraminidases [20, 21]. SV40 had likewise been presumed to attach to a non-sialylated carbohydrate [22-24] before GM1 was identified as its receptor [12]. Sialic acid therefore has to be considered as a possible receptor component even for viruses whose infectivity cannot be modulated by treatment of cells with commonly used neuraminidases.
The structure of the most common sialic acid in humans, α-5-N-acetyl-neuraminic acid (Neu5Ac), features four protruding functional groups (carboxylate, hydroxyl, N-acetyl and glycerol functions). Compared to simpler monosaccharides, the large number of functional groups enables sialic acids to participate in an unparalleled number of hydrogen bonds, salt bridges and non-polar interactions. Since sialic acid is typically located at the terminus of a glycan, its functional groups are easily accessible for interactions. Perhaps it is not surprising, therefore, that the sialic acid itself serves as the major point of contact with the glycan-binding viral attachment protein in all cases where structural information of sufficient resolution is available [8, 13, 25-41].
In this review, we compare binding modes of sialic acids and sialylated receptors in representative structures of virus-receptor complexes in order to derive parameters that guide the recognition of sialic acid and its derivatives. Such parameters could be useful for the prediction of new sialic acid binding sites in viral proteins, or of altered modes of sialic acid (and its derivatives) binding in different viral serotypes and strains.
Several structures of viral attachment proteins in complex with sialylated compounds have been determined recently, providing new insights into viral specificity for glycan receptors [8, 13, 35-39, 41]. Taken together, the known structures now form a large database that is suitable for the closer examination of contacts in order to compare the modes of interaction and attempt to define common principles of sialic acid recognition. We have investigated here the mode of Neu5Ac binding for nine different viral attachment proteins (Fig. 1): the hemagglutinins (HAs) of Influenza A and B viruses [26, 34], the Adenovirus serotype 37 (Ad37) fiber knob [7], the canine Adenovirus serotype 2 (cAd2) fiber knob [39], the major capsid proteins VP1 of the polyomaviruses Polyoma [29], SV40 [13], and JCV [8], the attachment protein σ1 of human type 3 orthoreovirus [41], the attachment protein VP8* of Rhesus Rotavirus [30], and the hemagglutinin-neuraminidase (HN) of Newcastle Disease Virus (NDV) [31]. As the interactions with terminal sialic acid are very similar among different types of Influenza A HAs, the structure of the H3 type [26] was chosen to represent this group. In addition, we have analyzed contacts in the hemagglutinin-esterase-fusion (HEF) protein of Influenza C virus [27] and the hemagglutinin-esterase (HE) proteins of Bovine Coronavirus (BCoV) [35], Bovine Torovirus (BToV) [36] and Porcine Torovirus (PToV) [36] (Fig. 2). These four proteins bind to derivatives of Neu5Ac that are O-acetylated at position 9 or at positions 4 and 9. In cases where the attachment protein also has receptor-destroying enzymatic activity, such as in the HN and HE(F) proteins, only the sites that can clearly be attributed to attachment are considered here, thus excluding the dual-function neuraminidase site of some HN proteins [32, 42, 43].
The investigated viral attachment proteins are, for the most part, not homologous to one another and belong to unrelated viruses that differ in envelope structure and genome type. Nevertheless, their interactions with sialic acid display striking similarities (Figs. 1, 2). In all complexes, the sialic acid adopts essentially the same conformation, namely a trans conformation of the 5-N-acetyl group and an α-conformation at the anomeric carbon, which is dominant in biological oligosaccharides. Interestingly, all attachment proteins, including the ones that bind to O-acetylated compounds, make extensive contacts with one face of the sialic acid ring, while the other face is engaged by only a few contacts (Figs. 1A, 2A-C). A likely reason for this preference is that two key contacts are formed in a similar manner in all complexes. One of these contacts involves the negatively charged carboxylate group, which is most often recognized by two parallel hydrogen bonds or a salt bridge. Each of the analyzed proteins donates at least one hydrogen bond to a carboxylate oxygen atom. The second contact involves the nitrogen atom in the N-acetyl group. With only one exception, all proteins receive a hydrogen bond from this nitrogen atom. The spatial arrangement of the carboxylate group and the N-acetyl nitrogen thus helps distinguish sialic acids from other monosaccharides. Both groups project from the same face of the sialic acid ring, accounting for the preferential binding of this face of the monosaccharide. Apart from these two key interactions, the proteins engage in different hydrogen bonding patterns to various hydroxyl groups of the glycerol chain or the ring, or to additional acetyl substituents.
Neu5Ac reveals that only about 50% of the contact surface of Neu5Ac participates in such contacts (Figs. 1A, 2C). The resulting shape resembles a rimmed imprint of the binding face of Neu5Ac on the protein surface (Figs. 1A, 2C). In all complexes, van der Waals interactions are formed with the methyl group of the N-acetyl chain. However, the different proteins interacting with Neu5Ac sample different epitopes on the Neu5Ac contact surface. For example, JCV and SV40 VP1, Rhesus Rotavirus VP8*, and Influenza A HA all center their van der Waals contacts on the glycerol and N-acetyl chains (Fig. 1E, F, I, J). The surfaces of all four viruses feature subtle protrusions that separate the recessed areas in which the glycerol and N-acetyl chains are bound. Polyoma VP1, on the other hand, mainly contacts Neu5Ac from the other side, and does not interact with the glycerol chain at all (Fig. 1B). Examination of the binding surfaces demonstrates that shape complementarity is an important factor in the engagement of sialic acid. As the contact areas are quite small and the sialic acids are partially exposed to solvent, adding or removing a single contact can thus have significant effects on the affinity of a given virus for sialic acid or its variants.
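The figure legends for the Neu5Ac complexes highlight protein atoms within a 4.0 Å radius of the ligand. A minimal Python sketch of such a distance-cutoff selection follows. The coordinates, atom names and the function name `contacts` are hypothetical and invented for illustration, and treating a distance of exactly 4.0 Å as a contact is an assumption, since the source does not specify it.

```python
import math

CUTOFF = 4.0  # Å, the radius quoted in the figure legends


def contacts(ligand_atoms, protein_atoms, cutoff=CUTOFF):
    """Return names of protein atoms within `cutoff` Å of any ligand atom.

    Both arguments are lists of (name, (x, y, z)) tuples, coordinates in Å.
    """
    hits = []
    for p_name, p_xyz in protein_atoms:
        # A protein atom counts once if it is close to at least one ligand atom.
        if any(math.dist(p_xyz, l_xyz) <= cutoff for _, l_xyz in ligand_atoms):
            hits.append(p_name)
    return hits


# Hypothetical coordinates (Å); not taken from any deposited structure.
neu5ac = [("C1", (0.0, 0.0, 0.0)), ("N5", (2.5, 1.0, 0.0))]
protein = [
    ("ARG:NH1", (0.0, 2.9, 0.0)),  # 2.9 Å from C1
    ("SER:OG", (2.5, 1.0, 3.0)),   # 3.0 Å from N5
    ("LEU:CD1", (9.0, 9.0, 9.0)),  # well outside the cutoff
]

close_atoms = contacts(neu5ac, protein)
print(close_atoms)
```

Because the contact areas are small, moving a single atom across the cutoff changes the resulting list, which mirrors the sensitivity to single contacts discussed above.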
The parent compound of Neu5Ac, neuraminic acid, can feature numerous modifications that give rise to over 40 different known sialic acid variants [44, 45] . Several of these modifications are predominantly found on specific cell types and tissues, or in selected species. It is perhaps not surprising, therefore, that some viruses exploit this divergence and preferentially recognize sialic acids other than Neu5Ac. The database of viral protein structures contains few examples of viruses attaching to O-acetylated Neu5Ac [27, 35, 36] , but their analysis is nevertheless informative (Fig. 2) . While the key hydrogen bonds to one face of the sialic acid are the same as described for Neu5Ac, the distribution of van der Waals contacts is somewhat altered. In the four complexes, the majority of van der Waals contacts are centered around the unique 9-O-acetyl groups as well as the adjacent side of the N-acetyl group, while the opposite side of the ring does not engage in as many interactions (Fig. 2B) . The 9-O-acetyl group inserts deeply into tight-fitting protein cavities, providing selectivity for sialic acids modified in this manner. Recognition of different sialic acids is also a likely cause of changes in tropism and host range. The interactions of SV40 with GM1 ganglioside containing α-N-5-glycolyl neuraminic acid (Neu5Gc), a sialic acid present in simians but not humans, illustrate this point [46] . SV40 VP1 features a large pocket near the Neu5Ac N-acetyl group (Fig. 1F, 3B) , and it is tempting to speculate that this pocket serves to accommodate the additional hydroxyl group of Neu5Gc [13] . VP1 of the human JCV (Fig. 1E, 3A) , whose sialic acid binding site is largely similar to that of SV40 VP1, features a much smaller pocket that likely prefers the smaller human Neu5Ac over the simian Neu5Gc.
Sialic acid binding sites are often highly conserved in homologous viruses. This is evident when comparing different HA types of Influenza A, the capsid proteins of JCV and SV40 (Fig. 3A, B) , or the HE proteins of PToV and BToV (Fig. 2F, G) . In all three cases, the sialic acid engages the two homologous proteins using similar contacts and is therefore bound in the same orientation and position. However, at least one example exists where homologous proteins, the Ad37 and cAd2 fiber knobs, bind sialic acid at different locations (Fig. 1C, D) . Interestingly, there are several examples of highly homologous proteins that bind sialic acid at the same site but in different orientations. The VP1 proteins of Polyoma and SV40, for example, feature a very high level of sequence identity, and they bind sialic acid in generally similar areas on the protein surface. However, the orientations of the bound sialic acids differ markedly (Fig. 1B, F) [13] . Similarly, Influenza C HEF and BCoV HE bind sialic acid at the same position, but again in different orientations with respect to the proteins and with contacts provided by different structural elements (Fig. 2D, E) [35] . These two examples demonstrate the need for caution when modeling interactions with sialic acid based on a homologous structure.
As sialic acid is ubiquitous at the cell surface, interactions with subsequent carbohydrates are typically employed to define specificity and tropism. Glycan microarrays have been highly useful in revealing the determinants of such interactions [47]. The critical role of the context of the sialic acid (linkage type, as well as length, sequence and conformational preferences of the remaining oligosaccharide chain) is perhaps best illustrated by its influence on the host range of Influenza viruses. Briefly, human Influenza A viruses engage long glycans terminating in α2,6-linked sialic acid that preferentially adopt a bent conformation and that are expressed extensively in the upper airway epithelia of humans. Avian strains predominantly recognize shorter glycans that adopt a linear conformation and that often contain α2,3-linked sialic acid [48]. The vast database on Influenza A HA structures in complex with sialylated ligands, and concurrent glycan array analyses, have been the subject of several excellent recent reviews [2, 3], and will therefore not be discussed in detail here. However, glycan array screening has recently helped to unravel the identities of sialylated glycan receptors for two pathogenic human viruses, and structural biology has defined the nature of interaction in both cases [8]. We review below each of these two examples, which illuminate the importance of the context in which the terminal sialic acid is placed.
Several members of the polyomavirus family use sialylated receptors for cell attachment. Crystal structures of two members of the family in complex with their cognate receptors have been determined recently: SV40 VP1 has been crystallized in complex with the oligosaccharide portion of its ganglioside receptor GM1 [13] , whereas the structure of the VP1 protein of human JCV has been solved with the pentasaccharide receptor fragment LSTc (Lacto-series tetrasaccharide c) [8] . Both receptors feature terminal Neu5Ac, which is α2,3-linked in the branched GM1 molecule and α2,6-linked in the linear LSTc structure (Fig. 3) . In each case, glycan array screening has unequivocally identified the type of the receptor [13, 8] . Moreover, although both GM1 and LSTc were present on the arrays, JCV VP1 failed to interact with GM1, and SV40 VP1 also did not recognize the LSTc compound. Thus, both proteins are highly specific for their cognate receptors. A comparison of the two structures shows that the sialic acid portions of the two receptors form largely equivalent interactions with their respective proteins (Fig. 1E, F; Fig. 3 ). The remarkable specificity for each receptor can be attributed to contacts that involve the remaining parts of the oligosaccharides. The LSTc compound assumes a bent conformation, forming additional contacts via the N-acetyl group of its third sugar, GlcNAc, to N123 of JCV (Fig. 3A ). An α2,3-linked Neu5Ac would not adopt a similarly bent conformation, explaining why sialylparagloboside, which is identical to LSTc except for its α2,3-linked Neu5Ac, does not bind JCV. Modeling LSTc into the SV40 VP1 binding site by superimposing the two sialic acid structures suggests that LSTc could be tolerated by SV40. However, as the residue equivalent to N123 in JCV is a glycine (G131) in SV40, no favorable contacts to the GlcNAc residue can be generated (Fig. 3B) . The inability to form such an interaction most likely explains why SV40 cannot bind LSTc. 
It therefore appears that the formation of a very small number of contacts is largely responsible for defining the specificity of VP1 for LSTc. The reverse combination, a GM1 ligand bound to JCV, would likely be disfavored due to steric clashes with JCV VP1 residue S62, which is an alanine in SV40.
Due to their small contact surfaces and solvent-exposed binding sites, interactions between individual viral attachment proteins and sialylated oligosaccharides are typically of low affinity, with dissociation constants in the millimolar range [13, 28, 49, 50] . In many cases, high-affinity adherence to the target cell is achieved through the utilization of several low affinity binding sites. However, receptor clustering is not always necessary to achieve higher-affinity binding. The interaction of Ad37 with its recently identified glycan receptor GD1a illustrates another strategy. It has long been established that Ad37 fiber knobs bind receptors terminating in sialic acid [28, 51] , but the nature of the glycan has remained elusive. Glycan array screening has recently revealed that Ad37 fiber knobs specifically recognize the oligosaccharide GD1a, a disialylated compound that features two branches, each terminating in sialic acid [7] . A structural analysis of the trimeric Ad37 fiber knob in complex with GD1a established that the two terminal sialic acid residues bind to two different Ad37 fiber knob protomers in an identical manner, thus engaging two of the three possible binding sites [7] (Fig. 4) . This bivalent interaction results in a 250-fold higher affinity (Kd = 19 μM) [7] compared to the monovalent sialyllactose-Ad37 knob interaction (Kd = 5 mM) [28] . Thus, although each protomer in an Ad37 fiber knob would be able to bind sialic acid attached to different oligosaccharide structures, specificity for GD1a is generated by a multivalent interaction in which two protomers interact with the same receptor in an identical manner. It is conceivable that trivalent compounds that engage all three binding sites of the Ad37 fiber knobs would have even higher affinity, thus providing a platform for the development of antiviral inhibitors. 
Using such a strategy, a multivalent inhibitor has been developed that is able to neutralize pentameric shiga-like toxins with very high efficiency [52] . A similar strategy could be useful to develop molecules that inhibit viral attachment proteins, which usually occur as multimers at the viral surface.
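As a quick check of the affinity gain quoted above, the fold change and the corresponding free-energy difference can be computed from the two dissociation constants. The sketch below is illustrative only: the temperature of 298.15 K and the relation ΔΔG = RT ln(Kd,mono / Kd,bi) are assumptions made here, not values or methods taken from the cited studies, and the function names are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)
T = 298.15          # assumed temperature in K (not stated in the text)


def fold_change(kd_weak, kd_tight):
    """Ratio of two dissociation constants given in the same unit."""
    return kd_weak / kd_tight


def delta_delta_g(kd_weak, kd_tight, temperature=T):
    """Free-energy gain in kJ/mol, computed as RT * ln(kd_weak / kd_tight)."""
    return R * temperature * math.log(kd_weak / kd_tight)


kd_sialyllactose = 5e-3  # 5 mM, monovalent sialyllactose-Ad37 knob [28]
kd_gd1a = 19e-6          # 19 μM, bivalent GD1a-Ad37 knob [7]

ratio = fold_change(kd_sialyllactose, kd_gd1a)
ddg = delta_delta_g(kd_sialyllactose, kd_gd1a)
print(f"fold change in Kd: {ratio:.0f}")
print(f"free-energy gain: {ddg:.1f} kJ/mol")
```

With the quoted constants the ratio comes out at roughly 260, in line with the approximately 250-fold gain stated in the text.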
A large number of viruses, including many serious human pathogens, use sialylated oligosaccharides for cell attachment. Common principles of interaction can be established by comparing the sialic acid binding modes of different viruses. In most cases, interactions between a viral attachment protein and its glycan receptor involve primarily the sialic acid itself, which is bound with a relatively small contact area in a solvent-exposed region of the protein. Consistent with this, the affinities of such interactions are, at least in cases where they have been measured, very low. Nevertheless, many of the viruses discussed here achieve remarkable specificity for a single type of sialylated oligosaccharide by establishing a small number of auxiliary interactions with functional groups that lie beyond the sialic acid, and by excluding some possible ligands through steric clashes. The auxiliary interactions generally involve fewer hydrogen bonds and bury a smaller amount of surface compared to the interactions that involve the sialic acid itself. It thus appears that many viruses use the unique properties of sialic acid as a "hook" that allows them to adhere to the cell, and modulate binding in different strains or families by subtly altering structural elements in the vicinity of this hook. In a (so far) unique variation of this strategy, the Ad37 knob establishes selectivity for its GD1a glycan receptor by multivalent binding to a single receptor carrying two terminal sialic acid moieties, thus adhering to two identical "hooks" separated by a defined spacer. The prominence of sialic acid in viral attachment may form a basis for new approaches to combat viruses. Compounds that mimic sialic acid have already proved useful as inhibitors of the Influenza virus neuraminidase [53] and can also efficiently inhibit the receptor-binding site of the Influenza A virus hemagglutinin [54].
The structural analysis of the Ad37-GD1a interaction has also led to the design and synthesis of a trivalent compound designed to block attachment of adenoviruses that cause EKC [55]. Glycan microarrays have been extraordinarily useful in identifying the correct receptors for many viral proteins [8, 13, 47, 56], which is a prerequisite for structural studies. However, proper interpretation of the information provided by glycan array screening and structural analyses requires affinity data. Such data are often difficult to obtain and compare, and they are currently lacking for many complexes. Being able to correlate affinity measurements with structural data would significantly advance the design of antiviral agents, and, together with oligosaccharide expression data, help to explain viral tropism.
[34] ). In all cases, Neu5Ac is shown in stick representation and colored as in panel A. The protein surfaces are colored gray, with residues interacting with Neu5Ac shown in stick representation and colored by element. Protein atoms within a 4.0 Å radius around Neu5Ac are highlighted with colored spheres. In cases where the Neu5Ac binding site is formed by two protein chains, one of the chains is denoted with an asterisk.
The oligosaccharides are shown in stick representation and colored by element, with oxygens in red and nitrogens in blue. Monosaccharides that approach the protein closer than 4.0 Å are colored in bright orange, while those not contacting the protein are colored gray. Oligosaccharide atoms within a 4.0 Å radius around the proteins are highlighted as spheres. The protein surface is shown in gray. Residues that define the different oligosaccharide specificities of the two proteins are shown as sticks and colored blue and pink for SV40 and JCV VP1, respectively. Residues from a different polypeptide chain are denoted with an asterisk.
[7] . The three different Ad37 fiber knob protomers are shown in surface representation and are colored gray, light red and blue. The GD1a glycan is drawn in stick representation, with both Neu5Ac residues highlighted in color (carbons in orange, oxygens in red and nitrogens in blue). The bridging glycan residues are shown in dark gray. The third binding site (marked by "X") is blocked due to crystal contacts. The arrow indicates the viewing direction shown in panel B. (B). Interactions between Ad37 fiber knob residues and GD1a. Two different Ad37 fiber knob protomers are shown in transparent surface representation (white and light red). The third protomer is not shown for clarity. Ad37 residues contacting GD1a are shown in stick representation, with oxygens in red and nitrogens in blue. The GD1a glycan is shown in stick representation, with both terminal Neu5Ac residues highlighted in color (carbons in orange, oxygens in red and nitrogens in blue). The bridging glycan residues are shown in gray. Glycan atoms within a distance of 4 Å to Ad37 protein atoms are drawn as spheres. Hydrogen bonds to GD1a glycan are represented with black dashes. Residues from different protomers are denoted with an asterisk.