Global O-glycoproteomics

While it is clear that single protein-targeted mass spectrometry approaches can provide comprehensive information on single site occupancy and structure heterogeneity, we lack robust methods of analysis for site-specific O-glycosylation in complex or proteome-wide samples. To solve this problem, we recently introduced a method for globally mapping O-glycosylation sites in glycoengineered cell lines lacking O-linked glycan elongation, which is based on Vicia villosa lectin (VVA) affinity enrichment of simple glycopeptides coupled to tandem mass spectrometry (Steentoft et al. 2011). The method is also applicable to the analysis of wild-type cells, which predominantly express core 1 O-glycans, by using peanut agglutinin (PNA) enrichment of desialylated glycopeptides (Yang Z et al. 2014). We have applied these methods to the analysis of O-glycosylation in HSV-1-infected human fibroblasts by performing sequential enrichment with PNA and VVA, thus reporting the first comprehensive viral O-glycoproteome (Bagdonaite et al. 2015). This approach provides several clear advantages. First, it allows simultaneous analysis of all viral glycoproteins expressed in an infected cell. Second, the strategy takes into account the endogenous glycosylation of a permissive cell, dictated by its repertoire of glycosyltransferases, as well as native conformations of proteins and the cytopathic effects of viral infection. By contrast, irrelevant cell lines are often chosen for recombinant expression of viral proteins, and the glycosylation obtained in these cell lines does not always reflect the glycosylation pattern in a natural host. Using herpesviruses as a model system, we have applied the same method to define the O-glycoproteomes of other members of the Herpesviridae family: HSV-2, VZV, HCMV and EBV (Bagdonaite et al. 2016; Iversen et al. 2016).
The wide occurrence, associated complications and shortage of prophylactic measures make herpesviruses a relevant model system for analyzing O-glycans and their importance in the viral life cycle (Vazquez et al. 2001; Cohen et al. 2006; Adjei et al. 2008; Kramer et al. 2008; Oxman 2010; Shiley and Blumberg 2010; Lopo et al. 2011; Sauerbrei et al. 2011; Levine et al. 2012; van Rijckevorsel et al. 2012; Astuto et al. 2013; Conde-Glez et al. 2013; Fishman 2013; Gorfinkel et al. 2013; Odland et al. 2013; Pembrey et al. 2013; Rowe et al. 2013; Awasthi and Friedman 2014; Bradley et al. 2014; Fu et al. 2014; Sabugo et al. 2014; Sili et al. 2014; Chen et al. 2015; Cohen 2015; Korndewal et al. 2015; Shaiegan et al. 2015). In addition, the large proteomes of herpesviruses highlight the benefits of global viral O-glycoproteomics. Human herpesviruses encode 7 to 12 glycoproteins associated with the viral particle; however, many more viral proteins possess signal peptides and transit through the host secretory pathway. Some of them have previously been investigated for glycan modifications in focused studies. N-linked glycans have been identified on viral envelope glycoproteins from all eight human herpesviruses (Wenske et al. 1982; Edson and Thorley-Lawson 1983; Friedrichs and Grose 1984; Serafini-Cessi et al. 1984, 1985, 1989; Montalvo et al. 1985; Montalvo and Grose 1986, 1987; Gong et al. 1987; Britt and Vugler 1989; Gong and Kieff 1990; Okuno et al. 1990, 1992; Foa-Tomasi et al. 1992; Nolan and Morgan 1995; Pfeiffer et al. 1995; Hata et al. 1996; Mukai et al. 1997; Chandran et al. 1998; Pertel et al. 1998; Huber and Compton 1999; Li et al. 1999; Zhu et al. 1999; Baghian et al. 2000; Skrincosky et al. 2000; Wu et al. 2000; Maresova et al. 2000; Theiler and Compton 2002; Koyano et al. 2003; Paulsen et al. 2005; Yamagishi et al. 2008; Gore and Hutt-Fletcher 2009; Luo et al. 
2015), where individual glycoproteins have been demonstrated to exhibit a variable extent and pattern of glycan chain maturation (Wenske et al. 1982; Edson and Thorley-Lawson 1983; Friedrichs and Grose 1984; Serafini-Cessi et al. 1984, 1985, 1989; Montalvo et al. 1985; Montalvo and Grose 1986, 1987; Britt and Vugler 1989; Gong and Kieff 1990; Okuno et al. 1990, 1992; Huber and Compton 1999; Maresova et al. 2000; Theiler and Compton 2002; Yamagishi et al. 2008). A smaller proportion of herpesvirus envelope glycoproteins has been investigated for O-glycosylation, and in some of these envelope proteins O-glycans have been detected by biochemical assays (Serafini-Cessi, Dall’Olio, Scannavini, Costanzo et al. 1983; Montalvo et al. 1985; Gong et al. 1987; Montalvo and Grose 1987; Serafini-Cessi et al. 1988, 1989; Britt and Vugler 1989; Kari et al. 1992; Yao et al. 1993; Nolan and Morgan 1995; Borza and Hutt-Fletcher 1998; Cardinali et al. 1998; Lake et al. 1998; Peng et al. 1998; Torrisi et al. 1999; Zhu et al. 1999; Wu et al. 2000; Theiler and Compton 2002; Xiao et al. 2007). Only a few of these proteins have merited more thorough investigation, with most attention devoted to proteins containing mucin-like domains. The HSV-1 attachment factor gC was the first envelope glycoprotein described to carry O-glycans, acquiring distinct structures in different cell types (Olofsson et al. 1981, 1983; Dall’Olio et al. 1985; Lundstrom et al. 1987), and specific O-glycosites have recently been mapped to its mucin-like region (Bagdonaite et al. 2015; Norden et al. 2015). Furthermore, the HSV-2 and VZV orthologs were also found to be O-glycosylated (Zezulak and Spear 1983; Bagdonaite et al. 2016). Similarly, other proteins containing mucin-like regions, such as HSV-1 gI, HSV-2 gG, EBV gp150 and gp350, have been shown to accommodate a high density of O-glycosylation, and the types of O-glycan structures were identified for some of these proteins (Serafini-Cessi et al. 
1985, 1989; Nolan and Morgan 1995; Borza and Hutt-Fletcher 1998; Norberg et al. 2007). The conserved viral fusion effector gB has been shown or predicted to be O-glycosylated in all herpesvirus subfamilies (Serafini-Cessi, Dall’Olio, Scannavini, Costanzo et al. 1983; Gong et al. 1987; Montalvo and Grose 1987; Britt and Vugler 1989). The era of proteome-wide mass spectrometry-based applications has allowed robust characterization of viral O-glycoproteomes (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). These characterizations confirmed the identity of the majority of previously described O-glycoproteins of herpesviruses, and greatly expanded the number of known site-specific O-glycosylation sites (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). While GalNAc-type O-glycosylation is often associated with dense glycosylation in mucin-like regions, it is also abundantly found as isolated sites or small clusters in human proteins (Steentoft et al. 2013), which are more difficult to predict. In agreement with this, we have demonstrated ample presence of isolated O-glycan sites on viral glycoproteins of HSV-1, HSV-2, VZV, HCMV and EBV by glycoproteomic approaches (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). The location of O-glycosites identified via proteome-wide MS/MS approaches with respect to protein structural features suggests possible involvement in protein–protein interactions (Bagdonaite et al. 2015, 2016), as exemplified in subsequent sections. Large-scale glycoproteomic analyses of human herpesviruses of varying phylogeny (HSV-1, HSV-2, VZV, HCMV and EBV) have made it possible to compare the O-glycosite patterns in homologous proteins (Bagdonaite et al. 2015, 2016; Iversen et al. 2016). Comparison of O-glycosite conservation between the alphaherpesviruses HSV-1 and HSV-2 suggests that sequence homology is an important determinant of O-glycosylation in closely related viruses (Figure 1A and F). 
Isolated homologous glycosites were mainly situated on highly homologous peptide stretches, whereas densely spaced glycosites in Pro/Ser/Thr-rich regions were glycosylated irrespective of low sequence identity, as expected (Bagdonaite et al. 2016). Several glycoproteins are homologous between all herpesviruses, including gB, gH, gL, gM and gN, of which gB, gH and gL comprise the conserved cell entry machinery (McGeoch et al. 2006). We identified a large number of O-glycosites on the HSV-1 fusogenic effector gB, and predicted that a number of O-glycosites could be conserved in most, if not all, human herpesviruses (Figure 1A–C) (Bagdonaite et al. 2016). Based on multiple sequence alignments across the investigated herpesvirus family members, enrichment of O-glycosylation was found in the extreme N-terminus of gB despite considerable underlying sequence variation between different herpesviruses. This suggests that glycosylation patches are less dependent on the underlying sequence, and might serve a glycan-specific function, such as protection of the exposed N-terminal region of gB from proteolytic degradation or immune recognition. In contrast, conserved single glycosites were predominantly found between HSV-1, HSV-2, and, to a smaller extent, VZV, suggesting that they mainly exert subfamily-specific functions. The conserved protein gH, which is another essential component of the fusion machinery, was found to be glycosylated in four out of five investigated viruses. Although no clear conserved pattern of glycosylation was observed, the O-glycosites were predominantly localized to the two exposed N-terminal domains involved in interaction with other viral proteins (Figure 1D and E) (Bagdonaite et al. 2016).

Fig. 1. O-glycosylation of herpesvirus conserved fusion machinery. (A), Crystal structure representation of HSV-1 gB monomer. From “Heldwein EE, Lou H, Bender FC, Cohen GH, Eisenberg RJ, Harrison S. 2006. Crystal structure of glycoprotein B from herpes simplex virus 1. 
Science, 313:217–220”. Reprinted with permission from AAAS. Blue boxes mark the parts of the molecule where O-glycans are consistently found between at least two investigated herpesviruses. Modified with permission from the authors. (B), (D) and (F), Conservation of O-linked glycosylation sites on homologous envelope glycoproteins of human herpesviruses (from Bagdonaite et al., 2016). Reprinted with permission. © 2008 The American Society for Biochemistry and Molecular Biology. All rights reserved. The Clustal Omega server was used to align amino acid sequences of gB (B), gH (D) and gL (F) between HSV-1 (Bagdonaite et al. 2015), HSV-2 (Iversen et al. 2016), VZV (Bagdonaite et al. 2016), HCMV (Bagdonaite et al. 2016) and EBV (Bagdonaite et al. 2016). Protein backbones are depicted as broken black lines, where spaces represent gaps in the alignment. Individual alignments were drawn to scale (indicated below each graph). Sequence conservation is indicated above the aligned sequences for each set, and is represented by a greyscale barcode that maps to the Clustal alignment score, as shown in the legend. In brief, in the Clustal alignment score, an asterisk indicates positions with fully conserved residues, a colon indicates conservation of amino acids with strongly similar properties, and a period indicates conservation of amino acids with weakly similar properties. Predicted signal peptides and transmembrane regions are shaded in pink and blue, respectively. Unambiguous O-glycosylation sites are shown as yellow squares, whereas ambiguous sites are marked as yellow lines within the protein backbone, where the number below indicates the number of glycosites. An ambiguous O-glycosylation site from our previous publication (Bagdonaite et al. 2015, HSV-1 gB 109–123 (HexHexNAc)) was omitted from the graph, as we cannot exclude the possibility that it is part of an elongated structure on an adjacent site. 
Reference strain sequences were used for HSV-2, VZV and EBV due to incomplete or unavailable annotation of the investigated strains. HSV-1, human herpes simplex virus type 1 (strain 17); HSV-2, human herpes simplex virus type 2 (strain HG52); VZV, varicella-zoster virus (strain Dumas); HCMV, human cytomegalovirus (strain Towne); EBV, Epstein-Barr virus (strain AG876). (C) and (E) Cartoon depiction of HSV-1 gB trimers (C) or gH–gL complexes and accessory proteins (E) of the five herpesviruses. O-glycosylation sites are shown as yellow squares. (B) and (C) Colored boxes mark association with herpesvirus gB domains as defined in (A).

In summary, global O-glycoproteomics of viruses opens up the possibility of rapidly “scanning” viral proteomes for O-glycan modifications. Although the occupancy and the relevance of the individual glycan sites are still unknown, the information can be followed up with complementary techniques at the level of individual proteins and glycosites. The approach can be applied to any human virus of interest, provided that relevant propagation systems are available. The method, of course, has its limitations, such as the limited number of glycoforms that can be captured and its dependence on the availability of protein sequences in databases, which is challenging when analyzing emerging or poorly annotated viruses, as well as clinical isolates. Another aim for the future is to make the results broadly available to the scientific community, not only through publication but also through inclusion in public protein databases. Ideally, a virus database compiling structural data, sequence variability, available glycomic and glycoproteomic data as well as antigenic sites could be created to advance basic and applied research in virology. If sufficient experimental data are compiled, machine learning techniques could be applied to predict the glycosylation patterns of emerging viral strains within distinct virus species or even families.
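
The comparison of O-glycosite patterns in homologous proteins described above can be illustrated with a minimal sketch. It converts each glycosite from its position in the ungapped protein to its column in a multiple sequence alignment, then reports the columns that carry a glycosite in at least two viruses. The sequences, site positions, virus labels and function names are hypothetical toy values chosen for illustration only; they are not data from the cited studies, and the code is not the analysis pipeline used in them.

```python
from collections import defaultdict


def residue_to_column(aligned_seq):
    """Map 1-based positions in the ungapped protein to 0-based alignment columns."""
    mapping = {}
    position = 0
    for column, symbol in enumerate(aligned_seq):
        if symbol != "-":  # gap characters do not consume a residue
            position += 1
            mapping[position] = column
    return mapping


def shared_glycosite_columns(alignment, glycosites, min_viruses=2):
    """Return {column: [viruses]} for columns glycosylated in >= min_viruses viruses."""
    hits = defaultdict(set)
    for virus, aligned_seq in alignment.items():
        column_of = residue_to_column(aligned_seq)
        for position in glycosites.get(virus, ()):
            hits[column_of[position]].add(virus)
    return {
        column: sorted(viruses)
        for column, viruses in sorted(hits.items())
        if len(viruses) >= min_viruses
    }


# Hypothetical toy alignment (gaps as "-"); NOT real herpesvirus sequences.
alignment = {
    "virusA": "MAS-TPLG",
    "virusB": "MATQTP-G",
    "virusC": "M-SQSPLG",
}
# Hypothetical glycosites, given as 1-based positions in each ungapped protein.
glycosites = {
    "virusA": [3, 4],
    "virusB": [3, 5],
    "virusC": [2],
}

shared = shared_glycosite_columns(alignment, glycosites)
print(shared)
```

Working in alignment columns rather than raw residue numbers is what makes sites comparable between proteins of different length; raising `min_viruses` restricts the result to more broadly conserved positions.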