Methods for analysis of viral glycosylation

Historically, herpesviruses and togaviruses were some of the first viruses investigated for modification with carbohydrates, which led to important findings regarding general aspects of viral glycosylation. Early methods of analysis were based on monitoring the incorporation of radioactively labeled monosaccharides and amino acids into newly synthesized viral proteins during infection (Keller et al. 1970; Spear and Roizman 1970; Kim et al. 1976). It was thereby established that viral protein glycosylation took place at the cellular membranes and not in the cytoplasmic compartment (Spear and Roizman 1970). Early glycoprofiling experiments utilized pronase digestion of labeled glycoproteins followed by gel filtration, which allowed separation of the short glycopeptides bearing different glycan moieties (Honess and Roizman 1975; Schwarz et al. 1977). Moreover, differentially labeled sugars were used to provide insight into the putative composition of individual glycan structures (Sefton 1975). An important conclusion based on such experiments was that the extent of sugar incorporation into individual viral proteins differed depending on the infected cell type (Keller et al. 1970), establishing that viruses are dependent on the host glycosylation machinery. Use of glycosylation inhibitors, such as tunicamycin, 2-deoxy-D-glucose and glucosamine, provided additional insights into the regulation of glycan synthesis and its impact on viral replication, and showed that glycosylation of viral proteins is critical for infectivity and cell–cell fusion (Knowles and Person 1976; Leavitt et al. 1977; Olofsson and Lycke 1980; Herrler and Compans 1983; Lambert and Pons 1983; Mann et al. 1983). Studies using cell lines deficient in specific glycosyltransferases, or using intracellular transport inhibitors, provided additional means for investigating the biological consequences of disrupted glycan synthesis and maturation, respectively (Campadelli-Fiume et al.
1982; Serafini-Cessi, Dall’Olio, Scannavini, Campadelli-Fiume et al. 1983; Edwardson 1984). Alongside numerous studies addressing viral N-glycan composition and function, it was discovered that viral envelope glycoproteins could also be modified with O-linked glycans (Olofsson et al. 1981; Shida and Dales 1981; Niemann et al. 1982; Gruber and Levine 1985; Montalvo et al. 1985; Gong et al. 1987; Lundstrom et al. 1991). As for human proteins, the Golgi apparatus was identified as the site of viral O-glycosylation (Johnson and Spear 1983; Locker et al. 1992). Historically it was presumed that few viruses were O-glycosylated and the function of this modification remained undetermined for some time (Feldmann et al. 1991; Bernstein et al. 1994). One of the first functions of viral O-glycosylation was discovered in vaccinia virus, where it has been demonstrated that the hemagglutinating activity of glycoprotein HA was entirely dependent on O-linked glycans (Shida and Dales 1981). A similar carbohydrate-dependent function was described for rubella virus, where treatment with a mix of glycosidases removing all glycans resulted in inhibition of hemagglutination (Ho-Terry and Cohen 1984). Development of new biochemical methods facilitated the isolation and analysis of viral glycoproteins and glycans. Use of plant lectins or sera from immunized animals as well as vaccinated patients facilitated purification of viral envelope proteins and enabled analysis of glycans at the individual protein level (Eisenberg et al. 1979; Wenske et al. 1982; Friedrichs and Grose 1984; Respess et al. 1984; Montalvo et al. 1985). 
Introduction of chemical glycan release from the peptides or sequential enzymatic deglycosylation enabled more precise characterization of viral glycan size and composition compared to pronase digests (Burke and Keegstra 1979; Rasilo and Renkonen 1979), and led to determination of the type and structure of N- and O-linked glycans for many viruses (Pesonen 1979; Pesonen, Kuismanen et al. 1982; Pesonen, Ronnholm et al. 1982; Niemann et al. 1984). The subsequent introduction of reverse phase HPLC further facilitated glycopeptide analysis, allowing separation of larger glycopeptides generated by digestion of proteins with proteases of defined specificity, such as trypsin. Subsequent enzymatic glycan release enabled N-glycan analysis on isolated glycopeptides (Rosner and Robbins 1982; Cohen et al. 1983; Hsieh et al. 1983), and demonstrated site-specific glycan microheterogeneity in different hosts (Hsieh et al. 1983). Development and advancement of mass spectrometry-based applications had a large impact on the analysis of glycans and was quickly adopted in the virology field. More recent advances in mass spectrometry-based glycoprofiling and glycoproteomics of enveloped viruses are discussed in the following sections.

Glycoprofiling of viruses using mass spectrometry

Modern mass spectrometry-based methods of analysis enable robust characterization of N- or O-linked glycans in complex biological samples with unprecedented sensitivity and resolution. Such analysis has become routine practice in the characterization of recombinant therapeutics as well as vaccine candidates (Xie et al. 2011; Dubayle et al. 2015; Jacob et al. 2015). Matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) is one of the most commonly used tools in modern day glycan analysis. Combining MALDI-MS with lectin microarrays enables more reliable characterization of glycan structures, where lectin binding profiles can provide additional information on the glycosidic linkage (Mechref et al.
2003; Lei et al. 2015). Moreover, tandem MALDI-MS or tandem electrospray ionization MS allows fragmentation of selected precursor ions, enabling more detailed characterization of structures of a given composition (Mechref et al. 2003; Ritchie et al. 2010). The techniques were soon adapted for analysis of N-glycans on recombinant viral proteins and mature viral particles. Applying mass spectrometry-based analysis to viral glycans released from mature viral particles illustrated the heterogeneity of N-glycan structures present on a single protein. For example, detailed studies on Dengue virions revealed enormous heterogeneity of N-glycan structures modifying two putative sites of Dengue virus glycoprotein, with 19 distinct structures identified (Lei et al. 2015). In addition, MS-based glycoprofiling demonstrated distinct glycosylation patterns in different host species for Chikungunya virus (Lancaster et al. 2016), confirming earlier observations on togaviruses (Hsieh et al. 1983). MALDI-MS has also been used to compare glycan profiles of recombinantly expressed viral proteins and proteins isolated from native viruses. Comparison of N-glycan profiles on recombinant HIV-1 gp120 monomers and gp120/gp41 trimers, present on native viruses, revealed that while monomeric recombinant protein contained a high proportion of complex-type glycans, native trimers with preserved protein quaternary structure predominantly contained high-mannose type carbohydrates (Doores, Bonomelli et al. 2010; Bonomelli et al. 2011). The high-mannose type glycans consistently found in monomeric and trimeric formulations are part of the intrinsic mannose patch that contains densely spaced underprocessed glycans (Bonomelli et al. 2011). In contrast, analysis of dimeric viral glycoprotein of Hendra virus suggests that oligomerization does not always impede glycan processing (Bowden et al. 2010). 
Viral protein architecture, however, is not the only requirement for recapitulation of physiologically relevant glycosylation, as the repertoire of glycosyltransferases naturally also plays an important role. For example, N-glycans on gp120 on pseudovirions from the cell line HEK293T mainly carried α2,3-linked sialic acids, while N-glycans on gp120 from virions derived from peripheral blood mononuclear cells predominantly carried α2,6-linked sialic acids (Pritchard, Harvey et al. 2015). Such differences might have huge implications on immunogenicity of recombinant vaccine antigens, given that some broadly neutralizing antibodies to HIV-1 recognize glycan-containing epitopes (Doores and Burton 2010; Mouquet et al. 2012; McCoy and Burton 2017). Therefore, both antigen formulations and production cell lines must be considered very carefully. Another concern when analyzing recombinant viral proteins as the source of information for viral glycosylation, is the occasional need for artificial formulations of recombinant viral proteins that often lack natural processing and oligomerization signals when not assembled and presented in the viral context (Bowden et al. 2008; Ritchie et al. 2010). For example, it has been shown that lack of proteolytic cleavage of the membrane-truncated soluble HIV-1 envelope glycoprotein (gp140) resulted in aberrant glycosylation that altered the conformation of the trimer (AlSalmi et al. 2015). Ebola virus glycoprotein represents another example of a highly elaborate protein complex assembly. It exists as three distinct species due to transcriptional stuttering—sGP, GP and ssGP—two of which are secreted (Lee and Saphire 2009). The membrane bound GP is proteolytically processed to GP1 and GP2, and forms a trimer at the cell surface (Lee and Saphire 2009). Nevertheless, formulations of GPs that differ from the natural viral context have been pursued for glycoanalysis. N-glycan analysis was performed for soluble fragment of GP1 and sGP. 
The GP1 exhibited a mix of complex, hybrid and high-mannose structures, while sGP carried a higher proportion of processed structures (Ritchie et al. 2010). A more recent glycoprofiling study of stabilized monomeric GP1,2 constructs from five different strains of Ebola virus confirmed the presence of heterogeneous and mostly complex-type N-glycans (Collar et al. 2016). This is in contrast to a highly variable O-glycan pattern in the different strains (Collar et al. 2016). A few other recombinant glycoproteins from Nipah virus and Machupo virus were expressed in their monomeric forms and reported to predominantly carry complex-type N-glycans (Bowden et al. 2008, 2009). In conclusion, it is difficult to predict glycosylation patterns of native oligomeric protein complexes based on studies of recombinant monomeric proteins. Such studies are, however, sometimes the only option because of difficulties in expressing correctly folded and authentic protein complexes. MALDI-MS can also be applied for O-glycan analysis (Canis et al. 2010; Franc et al. 2013); however, it has not been widely used in the virology field and only a few examples of O-glycoprofiling of viral glycoproteins using MALDI-MS exist (Schmitt et al. 1999; Collar et al. 2016). This can probably be explained by lower interest in viral O-glycosylation as opposed to N-glycosylation, primarily due to poorly characterized functions of viral O-glycans and the presumption that most viral glycoproteins are not heavily O-glycosylated. In addition, more sophisticated methods allowing simultaneous determination of modified amino acids have rapidly taken over and have in recent years been widely used for analysis of viral O-glycosylation.

Glycoproteomics of viruses

While glycoprofiling studies provide important information on the distribution and composition of different glycan structures present on a given protein, they lack information on individual glycosite location, occupancy and glycan structure heterogeneity.
The recent development of instruments equipped with ECD or ETD MS2 fragmentation has allowed for simultaneous determination of the peptide sequence and the position to which the carbohydrate moiety is attached (Levery et al. 2015). When relevant reporter glycan oxonium ions are present and the glycosylation position within peptides is easy to predict, HCD MS2 fragmentation can also be used (Wuhrer et al. 2007). Interfacing the mass spectrometer with capillary liquid chromatography enables separation of complex proteome-scale glycopeptide mixtures and allows identification of thousands of glycopeptides in a single run. At the single protein level, it is now possible to determine the exact position, structure heterogeneity, and site occupancy for each individual glycosite (Wuhrer et al. 2007). Both N- and O-glycosylation can be analyzed by tandem MS. Biosynthetic features of N-glycosylation enable relatively easy identification and quantification of deglycosylated sites by MS2 sequencing. Because asparagine is deamidated to aspartate during enzymatic N-glycan removal, the site occupancy can be determined by calculating the intensity ratio of naked (Asn) and deglycosylated (Asp) peptides (Pabst et al. 2012; An et al. 2015). However, a more appropriate approach is to carry out the deglycosylation reaction in heavy-oxygen water (H₂¹⁸O), which discriminates deglycosylated sites from spontaneous Asn deamidation (Palmisano et al. 2012; Cao et al. 2017). It is also possible to evaluate the overall distribution of N-glycosite occupancy on intact glycoproteins by metabolically simplifying the glycans to homogeneous structures (Struwe et al. 2017). For O-linked glycans, complete O-glycan removal does not result in chemical peptide modification and cannot be used for identification of glycosylated amino acid positions (Levery et al. 2015).
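The N-glycosite occupancy estimate described above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the procedure of the cited studies: the function names and the intensity values are hypothetical, the mass shifts are the standard monoisotopic values for Asn-to-Asp conversion in ordinary and in ¹⁸O-labeled water, and equal ionization efficiency of the Asn and Asp peptide forms is assumed.

```python
# Monoisotopic mass shifts (Da) relative to the unmodified Asn peptide.
DEAMIDATION_16O = 0.984016  # Asn -> Asp in ordinary water
DEAMIDATION_18O = 2.988261  # Asn -> Asp with one 18O incorporated from labeled water


def classify_shift(delta_mass, tol=0.01):
    """Assign an observed mass shift to a peptide form (hypothetical helper)."""
    if abs(delta_mass) <= tol:
        return "unmodified"
    if abs(delta_mass - DEAMIDATION_16O) <= tol:
        return "deamidated in 16O water"
    if abs(delta_mass - DEAMIDATION_18O) <= tol:
        return "deglycosylated in 18O water"
    return "unassigned"


def site_occupancy(asn_intensity, asp_intensity):
    """Fraction of a site that was glycosylated, from peptide intensities.

    asn_intensity: signal of the naked (Asn) peptide.
    asp_intensity: signal of the deglycosylated (Asp) peptide.
    Assumes both forms ionize equally well, which is an approximation.
    """
    total = asn_intensity + asp_intensity
    if total <= 0:
        raise ValueError("no signal for this site")
    return asp_intensity / total


# Made-up intensities for a single site.
occupancy = site_occupancy(asn_intensity=10.0, asp_intensity=90.0)
```

With ordinary water, only the 0.984 Da shift is observed and it cannot be attributed unambiguously to enzymatic deglycosylation, which is why the labeled-water variant is preferred.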
ETD and ECD MS2 techniques allow analysis of intact O-glycopeptides by fragmentation of the peptide backbone without the loss of the O-glycan modification; however, they are of limited use for determining site occupancy (Wuhrer et al. 2007; Sihlbom et al. 2009; Zauner et al. 2012). Approximations are nevertheless often made by comparing intensities of nonmodified and glycosylated peptides (Brautigam et al. 2013; Stansell et al. 2015). The relative quantitation of both N- and O-glycosylated peptides is of limited accuracy due to different ionization efficiencies of nonglycosylated peptides and peptides carrying complex glycans. Depending on the complexity of the glycans, glycopeptides generally exhibit poorer ionization compared to peptides (Stavenhagen et al. 2013). The analysis of complex proteins therefore often requires enrichment with hydrophilic interaction chromatography or specific lectins, which is particularly relevant for proteome-wide applications (Bunkenborg et al. 2004; Zielinska et al. 2010; Khatri et al. 2014; Levery et al. 2015). However, such enrichment sacrifices information regarding site occupancy.

N-glycoproteomics

Tandem mass spectrometry has been widely used for characterization of N-glycosylation sites on viral proteins, with respect to individual site occupancy status (macroheterogeneity) and site-specific structural diversity (microheterogeneity), as these are important features that can affect protein–protein interactions and immunogenicity. Comprehensive glycoproteomic analysis has, for example, mapped N-linked glycosylation of seasonal influenza A virus H3N2 HA, identifying more than 90% site occupancy of all putative N-linked glycan sites (An et al. 2015). Moreover, the globular head glycosites, associated with host immune receptor interaction, were strictly high-mannose type (An et al. 2015).
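Putative N-glycosylation sites such as those referred to above are conventionally predicted from the protein sequence as N-X-S/T sequons, where X is any residue except proline. The following Python sketch shows one simple way to list them; the function name and the example sequence are hypothetical, and the presence of a sequon says nothing about whether the site is actually occupied.

```python
import re

# N-X-S/T with X != P. The lookahead leaves the following residues unconsumed,
# so overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues within N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Made-up peptide: N2 (NGT), N9 (NNS) and N10 (NST) qualify; N6 (NPS) does not.
positions = find_sequons("MNGTANPSNNST")
```

Comparing such a list of predicted positions with the sites detected by tandem MS is what allows statements such as "all predicted sequons were utilized".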
In a separate study on the highly pathogenic H5N1 influenza A virus, it was also shown that all potential N-glycosites were consistently occupied between several different strains (Blake et al. 2009), suggesting that N-glycosylation of HA is conserved between the different isolates of the same virus subtype, with a high occupancy of potential N-glycosites. In a similar way, it has been demonstrated that all predicted N-glycan sequons were utilized in Hepatitis C virus E2 and Murray Valley encephalitis virus NS1 (Blitvich et al. 2001; Iacob et al. 2008), where the majority of HCV E2 glycosites were modified with high-mannose type glycans (Iacob et al. 2008). On the densely glycosylated HIV-1 envelope glycoprotein several strategies have been employed to characterize the nature and location of the many glycans, with up to 27 mostly highly occupied N-glycosites identified, some of which were exclusively high-mannose type (Pabst et al. 2012; Go et al. 2013; Yang W et al. 2014). Tandem mass spectrometry has also been used for addressing differences in cell-type specific glycosylation and influence of protein conformation. Out of convenience, soluble HIV-1 gp120 or gp140 preparations are often analyzed. HIV-1 gp120 produced in CHO and 293 T cells had very similar occupancy, degree of fucosylation, sialylation, and glycan maturation, with a larger share of glycosites predominantly carrying hybrid and complex-type N-glycans (Go et al. 2013). The positions of some of the exclusively high-mannose N-glycans were located within the intrinsic mannose patch of gp120 (Go et al. 2013; Behrens et al. 2016). In contrast, a much higher proportion of the N-glycosites on gp140 expressed in CHO, recombinant trimers, or those on gp120 purified from native virions were modified with high-mannose type N-glycans (Pabst et al. 2012; Behrens et al. 2016; Panico et al. 2016; Go et al. 2017). 
This again signifies the importance of analyzing proteins in their native conformation to obtain relevant glycosylation patterns. Glycosylation in different cell lines was also investigated for Hendra virus recombinant glycoprotein G, where all seven potential N-glycan sites were occupied in HeLa cells (Colgrave et al. 2012). In contrast, only four sites were N-glycosylated in HEK293, although the degree of glycan maturation was similar in both cell lines (Colgrave et al. 2012). To summarize the results obtained from various N-glycoprofiling and glycoproteomic studies, it seems that most of the putative N-glycosylation sequons are glycosylated with high occupancy on viral proteins. However, the site occupancy of viral protein N-glycosylation can vary in different producer cell lines. Moreover, high-mannose type N-glycans may constitute a substantial, if not the major, proportion of viral N-glycosites, particularly when native protein structure and oligomerization are taken into account. Thus, results obtained from analysis of recombinant monomeric proteins should be interpreted with caution, both when considering the occupancy of individual N-glycosylation sites and when considering the complexity of glycan structures, as discussed above.

O-glycoproteomics

For decades very little information has been available regarding site-specific O-glycosylation of viral proteins. Some of the first described virus-derived site-specific O-glycans were on isoform M of HBV surface antigen purified from patient-derived viral particles (Schmitt et al. 1999). The O-glycosylation site was identified by combining MALDI analysis of exoglycosidase-treated glycopeptide and Edman sequencing of the underlying peptide. The position of the glycan attachment site was deduced by carboxypeptidase digestion and confirmed by collision-induced dissociation tandem mass spectrometry, representing some of the early glycoproteomic experiments on viral proteins (Schmitt et al. 1999).
The recent advances in mass spectrometry-based proteomics have resulted in numerous studies addressing O-glycosylation of individual viral glycoproteins. O-glycoproteomic analyses have been performed for several recombinant or isolated viral glycoproteins, including HIV-1 gp120, influenza A virus HA1, HCV E2, HSV-1 gC and Hendra virus glycoprotein G (Colgrave et al. 2012; Brautigam et al. 2013; Go et al. 2013; Yang W et al. 2014; Norden et al. 2015; Stansell et al. 2015). Recombinantly expressed gp120 and HA1 were glycosylated at a single position each, which in both cases was modified with core 1 or core 2 elongated O-glycans (Stansell et al. 2015). Interestingly, site occupancy was much lower in recombinant gp140 undergoing proteolytic cleavage to gp120, assuming equivalent ionization efficiencies of nonmodified and glycosylated peptides. Recombinant gp140 possessed shorter less sialylated, predominantly core 1 O-glycans, again underlining the importance of native protein conformation for analogous studies (Stansell et al. 2015). In contrast, gp120 purified from T-cell derived virions was devoid of the single site found glycosylated in recombinantly expressed gp120 (Stansell et al. 2015). This, however, might be related to the viral strain used, as the single site was reported to be glycosylated in a separate study using virions from a different HIV-1 strain (Yang W et al. 2014). In addition, treatment of passaged or plasma-derived HIV-1 virions with antibodies against O-linked carbohydrate structures resulted in inhibition of cell entry, and virus neutralization (Hansen et al. 1990, 1991), suggesting that HIV-1 gp120 can indeed be O-glycosylated in vivo. Comparison of recombinant gp120 O-glycosylation in two different cell lines revealed predominant core 1 O-glycosylation in CHO cells, compared to core 1, core 2 and core 4 in 293 T cells (Go et al. 2013). 
Similarly, O-glycosylation of Hendra virus glycoprotein G differed considerably between HeLa and HEK293 cells, with different numbers of O-glycosites identified and different core structures carried (Colgrave et al. 2012). Recombinant HCV E2 was O-glycosylated at six positions, with predominantly core 1 and core 2 O-glycan structures (Brautigam et al. 2013). More than 80% occupancy was estimated for five of the six sites, whereas one site had very low occupancy. Moreover, a high level of structural heterogeneity was observed for the O-glycans localized at the individual sites, with up to 14 different structures identified (Brautigam et al. 2013). A recent study on O-linked glycosylation of HSV-1 mucin-like protein gC provided some insight into O-glycan synthesis, suggesting that the eleven O-glycosites were added in an orderly fashion, before elongation took place (Norden et al. 2015). These studies underscore the high heterogeneity of O-glycan structures, which are both cell type and protein specific, and the need for careful selection of candidates and expression cell lines for clinical applications. Moreover, comprehensive analysis of immune responses elicited by these different structures would be highly beneficial.

Global O-glycoproteomics

While it is clear that single protein-targeted mass spectrometry approaches can provide comprehensive information on single site occupancy and structure heterogeneity, we lack robust methods for analysis of site-specific O-glycosylation in complex or proteome-wide samples. To solve this problem, we recently introduced a method for globally mapping O-glycosylation sites in glycoengineered cell lines lacking O-linked glycan elongation, which is based on Vicia villosa lectin (VVA) affinity enrichment of simple glycopeptides coupled to tandem mass spectrometry (Steentoft et al. 2011).
The method is also applicable for analysis of wild type cells, predominantly expressing core 1 O-glycans, by using peanut agglutinin (PNA) enrichment of desialylated glycopeptides (Yang Z et al. 2014). We have applied these methods in the analysis of O-glycosylation of HSV-1 infected human fibroblasts by performing a sequential enrichment with PNA and VVA, thus reporting the first comprehensive viral O-glycoproteome (Bagdonaite et al. 2015). This approach provides several clear advantages. First, it allows simultaneous analysis of all viral glycoproteins expressed in an infected cell. Second, the strategy takes into account the endogenous glycosylation of a permissive cell, dictated by the repertoire of glycosyltransferases, as well as native conformations of proteins and the cytopathic effects of viral infection. In contrast, cell lines unrelated to the natural host are often chosen for recombinant expression of viral proteins, and the glycosylation obtained in these cell lines does not always reflect the glycosylation pattern in a natural host. Using herpesviruses as a model system, we have applied the same method for defining the O-glycoproteomes of other members of the Herpesviridae family: HSV-2, VZV, HCMV and EBV (Bagdonaite et al. 2016; Iversen et al. 2016). The wide occurrence, associated complications and shortage of prophylactic measures make herpesviruses a relevant model system to analyze O-glycans and their importance in the viral life cycle (Vazquez et al. 2001; Cohen et al. 2006; Adjei et al. 2008; Kramer et al. 2008; Oxman 2010; Shiley and Blumberg 2010; Lopo et al. 2011; Sauerbrei et al. 2011; Levine et al. 2012; van Rijckevorsel et al. 2012; Astuto et al. 2013; Conde-Glez et al. 2013; Fishman 2013; Gorfinkel et al. 2013; Odland et al. 2013; Pembrey et al. 2013; Rowe et al. 2013; Awasthi and Friedman 2014; Bradley et al. 2014; Fu et al. 2014; Sabugo et al. 2014; Sili et al. 2014; Chen et al. 2015; Cohen 2015; Korndewal et al. 2015; Shaiegan et al. 2015).
In addition, the large proteomes of herpesviruses highlight the benefits of global viral O-glycoproteomics. Human herpesviruses encode seven to 12 glycoproteins associated with the viral particle; however, many more viral proteins possess signal peptides and transit through the host secretory pathway. Some of them have previously been investigated for glycan modifications in focused studies. N-linked glycans have been identified on viral envelope glycoproteins from all 8 human herpesviruses (Wenske et al. 1982; Edson and Thorley-Lawson 1983; Friedrichs and Grose 1984; Serafini-Cessi et al. 1984, 1985, 1989; Montalvo et al. 1985; Montalvo and Grose 1986, 1987; Gong et al. 1987; Britt and Vugler 1989; Gong and Kieff 1990; Okuno et al. 1990, 1992; Foa-Tomasi et al. 1992; Nolan and Morgan 1995; Pfeiffer et al. 1995; Hata et al. 1996; Mukai et al. 1997; Chandran et al. 1998; Pertel et al. 1998; Huber and Compton 1999; Li et al. 1999; Zhu et al. 1999; Baghian et al. 2000; Skrincosky et al. 2000; Wu et al. 2000; Maresova et al. 2000; Theiler and Compton 2002; Koyano et al. 2003; Paulsen et al. 2005; Yamagishi et al. 2008; Gore and Hutt-Fletcher 2009; Luo et al. 2015), where individual glycoproteins have been demonstrated to exhibit variable extent and pattern of glycan chain maturation (Wenske et al. 1982; Edson and Thorley-Lawson 1983; Friedrichs and Grose 1984; Serafini-Cessi et al. 1984, 1985, 1989; Montalvo et al. 1985; Montalvo and Grose 1986, 1987; Britt and Vugler 1989; Gong and Kieff 1990; Okuno et al. 1990, 1992; Huber and Compton 1999; Maresova et al. 2000; Theiler and Compton 2002; Yamagishi et al. 2008). A relatively smaller proportion of envelope glycoproteins of herpesviruses have been investigated in terms of O-glycosylation, and in some of these envelope proteins O-glycans have been detected by biochemical assays (Serafini-Cessi, Dall’Olio, Scannavini, Costanzo et al. 1983; Montalvo et al. 1985; Gong et al. 
1987; Montalvo and Grose 1987; Serafini-Cessi et al. 1988, 1989; Britt and Vugler 1989; Kari et al. 1992; Yao et al. 1993; Nolan and Morgan 1995; Borza and Hutt-Fletcher 1998; Cardinali et al. 1998; Lake et al. 1998; Peng et al. 1998; Torrisi et al. 1999; Zhu et al. 1999; Wu et al. 2000; Theiler and Compton 2002; Xiao et al. 2007). Only a few of these proteins have merited more thorough investigation with most attention devoted to proteins containing mucin-like domains. HSV-1 attachment factor gC was the first envelope glycoprotein described to carry O-glycans, acquiring distinct structures in different cell types (Olofsson et al. 1981, 1983; Dall’Olio et al. 1985; Lundstrom et al. 1987), and specific O-glycosites have recently been mapped to the mucin-like region (Bagdonaite et al. 2015; Norden et al. 2015). Furthermore, the HSV-2 and VZV orthologs were also found to be O-glycosylated (Zezulak and Spear 1983; Bagdonaite et al. 2016). Similarly, other mucin-like region-containing proteins such as HSV-1 gI, HSV-2 gG, EBV gp150 and gp350 have been shown to accommodate high density of O-glycosylation, and the types of O-glycan structures were identified for some of these proteins (Serafini-Cessi et al. 1985, 1989; Nolan and Morgan 1995; Borza and Hutt-Fletcher 1998; Norberg et al. 2007). The conserved viral fusion effector gB has been shown or predicted to be O-glycosylated in all herpesvirus subfamilies (Serafini-Cessi, Dall’Olio, Scannavini, Costanzo et al. 1983; Gong et al. 1987; Montalvo and Grose 1987; Britt and Vugler 1989). The era of proteome-wide mass spectrometry-based applications allowed robust characterization of viral O-glycoproteomes (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). The characterizations confirmed the identity of the majority of previously described O-glycoproteins of herpesviruses, and provided a tremendous expansion of site-specific O-glycosylation (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). 
While GalNAc-type O-glycosylation is often associated with dense glycosylation in mucin-like regions, it is also abundantly found in isolation or small clusters in human proteins (Steentoft et al. 2013), which is more difficult to predict. In agreement with this, we have demonstrated ample presence of isolated O-glycan sites on viral glycoproteins of HSV-1, HSV-2, VZV, HCMV and EBV by glycoproteomic approaches (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). Location of O-glycosites identified via proteome-wide MS/MS approaches with respect to protein structural features suggests possible involvement in the protein–protein interactions (Bagdonaite et al. 2015, 2016), as exemplified in subsequent sections. Large scale glycoproteomic analyses of human herpesviruses of varying phylogeny (HSV-1, HSV-2, VZV, HCMV and EBV) have made it possible to compare the O-glycosite patterns in homologous proteins (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). Comparison of O-glycosite conservation between alphaherpesviruses HSV-1 and HSV-2 suggests that sequence homology is an important determinant for O-glycosylation in closely related viruses (Figure 1A and F). Isolated homologous glycosites were mainly situated on highly homologous peptide stretches, whereas densely spaced glycosites in Pro/Ser/Thr-rich regions were glycosylated irrespective of low sequence identity, as expected (Bagdonaite et al. 2016). Several glycoproteins are homologous between all herpesviruses, including gB, gH, gL, gM and gN, of which gB, gH and gL comprise the conserved cell entry machinery (McGeoch et al. 2006). We identified a large number of O-glycosites on HSV-1 fusogenic effector gB, and predicted that a number of O-glycosites could be conserved in most, if not all, human herpesviruses (Figure 1A–C) (Bagdonaite et al. 2016). 
Based on multiple sequence alignments across investigated herpesvirus family members, enrichment of O-glycosylation was found in the extreme N-terminus of gB regardless of the underlying considerable sequence variation between different herpesviruses. This suggests that glycosylation patches are less dependent on the underlying sequence, and might serve a glycan specific function, such as protection of the N-terminal exposed region of gB from proteolytic degradation or immune recognition. In contrast, conserved single glycosites were predominantly found between HSV-1, HSV-2, and, to a smaller extent, VZV, suggesting that they mainly exert subfamily-specific functions. The conserved protein gH, which is another essential component of the fusion machinery, was found glycosylated in four out of five investigated viruses. Although no clear conserved pattern of glycosylation was observed, the O-glycosites were predominantly localized to the two exposed N-terminal domains involved in interaction with other viral proteins (Figure 1D and E) (Bagdonaite et al. 2016).

Fig. 1. O-glycosylation of herpesvirus conserved fusion machinery. (A), Crystal structure representation of HSV-1 gB monomer. From “Heldwein EE Lou H Bender FC Cohen GH Eisenberg RJ Harrison S 2006. Crystal structure of glycoprotein B from herpes simplex virus 1. Science, 313:217–220”. Reprinted with permission from AAAS. Blue boxes mark the parts of the molecule where O-glycans are consistently found between at least two investigated herpesviruses. Modified with permission from the authors. (B), (D) and (F), Conservation of O-linked glycosylation sites on homologous envelope glycoproteins of human herpesviruses (from Bagdonaite et al., 2016). Reprinted with permission. © 2008 The American Society for Biochemistry and Molecular Biology. All rights reserved. Clustal Omega server was used to align amino acid sequences of gB (B), gH (D) and gL (F) between HSV-1 (Bagdonaite et al. 2015), HSV-2 (Iversen et al.
2016), VZV (Bagdonaite et al. 2016), HCMV (Bagdonaite et al. 2016) and EBV (Bagdonaite et al. 2016). Protein backbones are depicted as broken black lines, where spaces represent gaps in the alignment. Individual alignments were drawn to scale (indicated below each graph). Sequence conservation is indicated above the aligned sequences for each set, and is represented by a greyscale barcode that maps to the clustal alignment score, as shown in the legend. In brief, for the clustal alignment score, an asterisk indicates positions with fully conserved residues, a colon indicates conservation of amino acids with strongly similar properties, whereas a period indicates conservation of amino acids with weakly similar properties. Predicted signal peptides and transmembrane regions are shaded in pink and blue, respectively. Unambiguous O-glycosylation sites are shown as yellow squares, whereas ambiguous sites are marked as yellow lines within the protein backbone, where the number below indicates the number of glycosites. An ambiguous O-glycosylation site from our previous publication (Bagdonaite et al. 2015, HSV-1 gB 109–123 (HexHexNAc)) was omitted from the graph, as we cannot exclude the possibility it could be part of an elongated structure on an adjacent site. Reference strain sequences were used for HSV-2, VZV and EBV due to incomplete or unavailable annotation of investigated strains. HSV-1—human herpes simplex virus type 1 (strain 17), HSV-2—human herpes simplex virus type 2 (strain HG52), VZV—varicella-zoster virus (strain Dumas), HCMV—human cytomegalovirus (strain Towne), EBV—Epstein-Barr virus (strain AG876). (C) and (E) Cartoon depiction of HSV-1 gB trimers (C) or gH–gL complexes and accessory proteins (E) of the five herpesviruses. O-glycosylation sites are shown as yellow squares. (B) and (C) Colored boxes mark association with herpesvirus gB domains as defined in (A). 
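The conservation barcode described in the caption above can be approximated from the consensus line of a Clustal alignment. The Python sketch below is only an illustration: the symbol meanings follow the caption, whereas the numeric grey levels, the function names and the example consensus string are assumptions made here and are not taken from the figure.

```python
# Clustal consensus symbols as explained in the caption.
# The numeric levels (3 = darkest) are an arbitrary choice for illustration.
GREY_LEVEL = {
    "*": 3,  # fully conserved residue
    ":": 2,  # residues with strongly similar properties
    ".": 1,  # residues with weakly similar properties
    " ": 0,  # no conservation, or a gap in the alignment
}


def conservation_barcode(consensus):
    """Translate a Clustal consensus line into a list of grey levels."""
    levels = []
    for symbol in consensus:
        if symbol not in GREY_LEVEL:
            raise ValueError(f"unexpected consensus symbol: {symbol!r}")
        levels.append(GREY_LEVEL[symbol])
    return levels


def fraction_fully_conserved(consensus):
    """Share of alignment columns marked with an asterisk."""
    if not consensus:
        raise ValueError("empty consensus line")
    return consensus.count("*") / len(consensus)


# Made-up consensus line covering five alignment columns.
barcode = conservation_barcode("*:. *")
```

Plotting such a list alongside the mapped O-glycosites makes it easy to see whether a glycosite falls in a conserved or a variable stretch of the alignment.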
In summary, global O-glycoproteomics of viruses opens up possibilities to rapidly “scan” the proteome of viruses for O-glycan modifications. Although the occupancy and the relevance of the individual glycan sites are still unknown, the information can be followed up with complementary techniques at the individual protein and glycosite level. The method can be applied to any human virus of interest, provided that relevant propagation systems are available. It has, of course, its limitations, such as the limited number of glycoforms that can be captured and its dependence on the availability of protein sequences in the databases, which is challenging when analyzing emerging or poorly annotated viruses, as well as clinical isolates. Another aim for the future is to make the results broadly available to the scientific community not only by means of publishing, but also by inclusion into public protein databases. Ideally, a virus database compiling structural data, sequence variability, available glycomic and glycoproteomic data as well as antigenic sites could be created to advance basic and applied research in virology. If sufficient experimental data is compiled, machine learning bioinformatic techniques could be applied to predict glycosylation patterns of emerging viral strains within distinct virus species or even families.