3D Structural Modeling of Glycosylated SARS-CoV-2 Trimer Immunogen Enables Predictions of Epitope Accessibility and Other Key Features

A 3D structure of the S trimer was generated by using a homology model of the S trimer described previously (based on PDB: 6VSB; Wrapp et al., 2020). Onto this 3D structure, we installed explicitly defined glycans at each glycosylated sequon based on one of three separate sets of criteria, thereby generating three different glycoform models for comparison, which we denote the “Abundance,” “Oxford Class,” and “Processed” models (STAR Methods; Table S1). These criteria were chosen to generate glycoform models that represent reasonable expectations for glycosylation microheterogeneity and that integrate cross-validating glycomic and glycoproteomic characterization of S and ACE2. The three glycoform models were subjected to multiple all-atom MD simulations with explicit water. Information from analyses of these structures is presented in Figure 4A along with the sequence of the SARS-CoV-2 S protomer. We also identified emerging variants in S among the viral genomes sequenced to date (Table S11). Inter-residue distances were measured in 3D space between the most α-carbon-distal atoms of the N-glycan sites and the Spike glycoprotein population variant sites (Figure 4B). Notably, several variants, such as D138H, H655Y, S939F, and L1203F, do not ablate an N-linked sequon but are sufficiently close in 3D space to N-glycosites to warrant further investigation.

Figure 4. 3D Structural Modeling of Glycosylated SARS-CoV-2 Spike Trimer Immunogen Reveals Predictions for Antigen Accessibility and Other Key Features
Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated SARS-CoV-2 S trimer immunogen.
(A) Sequence of the SARS-CoV-2 S immunogen displaying computed antigen accessibility and other information. Antigen accessibility is indicated by red shading across the amino acid sequence.
(B) Emerging variants confirmed by independent sequencing experiments were analyzed based on the 3D structure of SARS-CoV-2 S to generate a proximity chart to the determined N-linked glycosylation sites.
(C) SARS-CoV-2 S trimer immunogen model from MD simulation displaying Abundance glycoforms and antigen accessibility shaded in red for most accessible, white for partial, and black for inaccessible (see Video S1).
(D) SARS-CoV-2 S trimer immunogen model from MD simulation displaying Oxford Class glycoforms and sequence variants. Asterisk indicates not visible, whereas the box represents three amino acid variants that are clustered together in 3D space.
(E) SARS-CoV-2 S trimer immunogen model from MD simulation displaying Processed glycoforms, with Thr-323, which is O-glycosylated at low stoichiometry, shaded in yellow.

The percentage of simulation time that each S protein residue is accessible to a probe that approximates the size of an antibody variable domain was calculated for a model of the S trimer by using the Abundance glycoforms (Table S1) (Ferreira et al., 2018). The predicted antibody accessibility is visualized across the sequence, as well as mapped onto the 3D surface, via color shading (Figures 4A and 4C; Table S13; Video S1). Additionally, the Oxford Class glycoforms model (Table S1), which is arguably the most encompassing means for representing glycan microheterogeneity because it captures abundant structural topologies (Table S8), is shown with the sequence variant information (Figure 4D; Table S11).
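As a minimal sketch of the accessibility calculation described above, assume that a per-frame accessible/inaccessible call for each residue has already been produced by some probe-based analysis (the probe method itself, cited to Ferreira et al., 2018, is not reproduced here). The percentage of simulation time could then be tallied as follows. The function name `percent_accessible`, the input layout, and the toy values are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def percent_accessible(frames: List[Dict[int, bool]]) -> Dict[int, float]:
    """Return, per residue, the percentage of frames flagged as accessible.

    Each frame maps a residue number to True (probe can reach the residue)
    or False. A residue missing from a frame is counted as inaccessible
    in that frame (an assumption of this sketch).
    """
    if not frames:
        return {}
    counts: Dict[int, int] = {}
    for frame in frames:
        for residue, accessible in frame.items():
            counts[residue] = counts.get(residue, 0) + (1 if accessible else 0)
    n_frames = len(frames)
    return {residue: 100.0 * hits / n_frames for residue, hits in counts.items()}


# Toy input: four frames with made-up accessibility calls (not real data).
toy_frames = [
    {74: True, 149: True, 323: False},
    {74: True, 149: False, 323: False},
    {74: True, 149: True, 323: False},
    {74: False, 149: True, 323: False},
]
toy_result = percent_accessible(toy_frames)
```

Per-residue percentages of this kind are what a color scale (for example, red for most accessible through black for inaccessible) would be mapped onto, whether along the sequence or on the 3D surface.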
A substantial number of these variants occur in regions of high calculated epitope accessibility, as seen directly by comparison with Figure 4A or visually by comparison with Figure 4C (e.g., N74K, T76I, R78M, D138H, H146Y, S151I, D253G, and V483A; Table S14), suggesting potential selective pressure to avoid the host immune response. It is also interesting to note that three of the emerging variants would eliminate N-linked sequons in S: N74K and T76I would eliminate N-glycosylation of N74 (found in insert variable region 1 of CoV-2 S compared to CoV-1 S), and S151I would eliminate N-glycosylation of N149 (found in insert variable region 2) (Figures 4A and S7; Table S11). Lastly, the SARS-CoV-2 S Processed glycoform model is shown (Table S1) with amino acid T323 marked; this residue carries a modest amount of O-glycosylation (11% occupancy; Figure S6; Table S10), and the model thus represents the most heavily glycosylated form of S (Figure 4E).

Video S1. Glycosylated S Antigen Accessibility, Related to Figure 4C
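The observation that some substitutions eliminate an N-linked sequon while nearby ones do not can be illustrated with a short sketch. It assumes the standard N-X-S/T sequon definition (X is any residue except proline) and uses a made-up toy peptide with its own 1-based numbering rather than the real S sequence. The helper names (`sequon_sites`, `apply_variant`, `lost_sequons`) are hypothetical and are not part of the study's pipeline.

```python
import re
from typing import Set

# N-X-S/T with X != Pro; the lookahead lets overlapping sequons be found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_sites(seq: str) -> Set[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return {match.start() + 1 for match in SEQUON.finditer(seq)}


def apply_variant(seq: str, variant: str) -> str:
    """Apply a substitution written as ref residue, 1-based position, alt residue."""
    ref, pos, alt = variant[0], int(variant[1:-1]), variant[-1]
    if seq[pos - 1] != ref:
        raise ValueError(f"{variant}: expected {ref} at {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def lost_sequons(seq: str, variant: str) -> Set[int]:
    """Return sequon positions present in seq but absent after the variant."""
    return sequon_sites(seq) - sequon_sites(apply_variant(seq, variant))


# Toy peptide (not the real S sequence): Asn at 3, Thr at 5 form one sequon.
toy_peptide = "AANGTKRA"
toy_sites = sequon_sites(toy_peptide)
```

In this toy peptide, substituting the Asn (`N3K`) or the Thr (`T5I`) removes the sequon, whereas a substitution two residues further along (`R7M`) leaves it intact. This mirrors the distinction drawn in the text between sequon-eliminating variants and variants that are merely close to a glycosite.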