PMC:7443692

Id Subject Object Predicate Lexical cue
T1 0-84 Sentence denotes Virus-Receptor Interactions of Glycosylated SARS-CoV-2 Spike and Human ACE2 Receptor
T2 86-94 Sentence denotes Abstract
T3 95-298 Sentence denotes The SARS-CoV-2 betacoronavirus uses its highly glycosylated trimeric Spike protein to bind to the cell surface receptor angiotensin converting enzyme 2 (ACE2) glycoprotein and facilitate host cell entry.
T4 299-501 Sentence denotes We utilized glycomics-informed glycoproteomics to characterize site-specific microheterogeneity of glycosylation for a recombinant trimer Spike mimetic immunogen and for a soluble version of human ACE2.
T5 502-742 Sentence denotes We combined this information with bioinformatics analyses of natural variants and with existing 3D structures of both glycoproteins to generate molecular dynamics simulations of each glycoprotein both alone and interacting with one another.
T6 743-874 Sentence denotes Our results highlight roles for glycans in sterically masking polypeptide epitopes and directly modulating Spike-ACE2 interactions.
T7 875-1056 Sentence denotes Furthermore, our results illustrate the impact of viral evolution and divergence on Spike glycosylation, as well as the influence of natural variants on ACE2 receptor glycosylation.
T8 1057-1212 Sentence denotes Taken together, these data can facilitate immunogen design to achieve antibody neutralization and inform therapeutic strategies to inhibit viral infection.
T9 1214-1232 Sentence denotes Graphical Abstract
T10 1234-1244 Sentence denotes Highlights
T11 1245-1332 Sentence denotes • Site-specific N-linked microheterogeneity is defined at 22 sites of SARS-CoV-2 Spike
T12 1333-1413 Sentence denotes • Six sites of N-linked microheterogeneity of human ACE2 receptor are described
T13 1414-1504 Sentence denotes • Molecular dynamics simulations of Spike and ACE2 show essential roles for glycosylation
T14 1505-1586 Sentence denotes • We uncover roles for variants in protein-glycan and glycan-glycan interactions
T15 1588-1851 Sentence denotes Combining glycomics-informed glycoproteomics and bioinformatic analyses of variants with molecular dynamics simulations, Zhao et al. detail a role for glycan-protein and glycan-glycan interactions in the SARS-CoV-2 viral Spike protein-ACE2 human receptor complex.
T16 1853-1865 Sentence denotes Introduction
T17 1866-2089 Sentence denotes The SARS-CoV-2 coronavirus, a positive-sense single-stranded RNA virus, is responsible for the severe acute respiratory syndrome referred to as COVID-19 that was first reported in China in December 2019 (Zhou et al., 2020).
T18 2090-2321 Sentence denotes In approximately six months, this betacoronavirus has spread globally, with more than 14 million people testing positive worldwide resulting in greater than 600,000 deaths as of July 20, 2020 (https://coronavirus.jhu.edu/map.html).
T19 2322-2556 Sentence denotes The SARS-CoV-2 coronavirus is highly similar (nearly 80% identical at the genomic level) to SARS-CoV-1, which was responsible for the severe acute respiratory syndrome outbreak that began in 2002 (Lu et al., 2020; Zhong et al., 2003).
T20 2557-2767 Sentence denotes Furthermore, human SARS-CoV-2 at the whole-genome level is >95% identical to a bat coronavirus (RaTG13), the natural reservoir host for multiple coronaviruses (Xia, 2020; Zhang et al., 2020; Zhou et al., 2020).
T21 2768-3254 Sentence denotes Given the rapid appearance and spread of this virus, there is no current validated vaccine or SARS-CoV-2-specific targeting therapy that is clinically approved, although statins, heparin, and steroids look promising for lowering fatality rates, and antivirals likely reduce the duration of symptomatic disease presentation (Alijotas-Reig et al., 2020; Beigel et al., 2020; Beun et al., 2020; Dashti-Khavidaki and Khalili, 2020; Fedson et al., 2020; Shi et al., 2020; Tang et al., 2020).
T22 3255-3420 Sentence denotes SARS-CoV-2, like SARS-CoV-1, utilizes the host angiotensin-converting enzyme 2 (ACE2) for binding and entry into host cells (Hoffmann et al., 2020; Li et al., 2003).
T23 3421-3596 Sentence denotes Like many viruses, SARS-CoV-2 utilizes a Spike glycoprotein trimer for recognition and binding to the host cell entry receptor and for membrane fusion (Watanabe et al., 2019).
T24 3597-3977 Sentence denotes Given the importance of viral Spike proteins for targeting and entry into host cells along with their location on the viral surface, Spike proteins are often used as immunogens for vaccines to generate neutralizing antibodies and frequently targeted for inhibition by small molecules that might block host receptor binding and/or membrane fusion (Li, 2016; Watanabe et al., 2019).
T25 3978-4221 Sentence denotes In similar fashion, wild-type or catalytically impaired ACE2 has also been investigated as a potential therapeutic biologic that might interfere with the infection cycle of ACE2-targeting coronaviruses (Lei et al., 2020; Monteil et al., 2020).
T26 4222-4429 Sentence denotes Thus, a detailed understanding of SARS-CoV-2 Spike binding to ACE2 is critical for elucidating mechanisms of viral binding and entry, as well as for undertaking the rational design of effective therapeutics.
T27 4430-4594 Sentence denotes The SARS-CoV-2 Spike glycoprotein consists of two subunits, a receptor binding subunit (S1) and a membrane fusion subunit (S2) (Lu et al., 2020; Zhou et al., 2020).
T28 4595-4879 Sentence denotes The Spike glycoprotein assembles into stable homotrimers that together possess 66 canonical sequons for N-linked glycosylation (N-X-S/T, where X is any amino acid except P) as well as a number of potential O-linked glycosylation sites (Watanabe et al., 2020a; Watanabe et al., 2020b).
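The sequon rule quoted above (N-X-S/T, where X is any amino acid except P) can be expressed as a short pattern search. The following is a minimal sketch, not the authors' pipeline; the function name `find_sequons` and the toy peptide are hypothetical and are not taken from the Spike sequence.

```python
import re
from typing import List

# Canonical N-linked sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps the match one residue wide so that overlapping sequons
# (for example "NNTS") are each reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical toy peptide used only for illustration.
toy_peptide = "MNGTANPSQNNTSK"
print(find_sequons(toy_peptide))
```

Applied to a protomer sequence, a scan of this kind would list candidate sites only; whether a site is actually occupied is an experimental question, as the occupancy analysis in the Results shows.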
T29 4880-5203 Sentence denotes Interestingly, coronavirus virions bud into the lumen of the endoplasmic reticulum-Golgi intermediate compartment, ERGIC, raising unanswered questions regarding the precise mechanisms by which viral surface glycoproteins are processed as they traverse the secretory pathway (Stertz et al., 2007; Ujike and Taguchi, 2015).
T30 5204-5509 Sentence denotes Although this and similar studies (Shajahan et al., 2020; Watanabe et al., 2020a) analyze recombinant proteins, a previous study on SARS-CoV-1 suggested that glycosylation of the Spike can be impacted by this intracellular budding, and this remains to be investigated in SARS-CoV-2 (Ritchie et al., 2010).
T31 5510-5877 Sentence denotes Nonetheless, it has been proposed that this virus, and others, acquires a glycan coat sufficient and similar enough to endogenous host protein glycosylation that it serves as a glycan shield, facilitating immune evasion by masking non-self viral peptides with self-glycans (Stertz et al., 2007; Ujike and Taguchi, 2015; Watanabe et al., 2020b; Watanabe et al., 2019).
T32 5878-6236 Sentence denotes In parallel with their potential masking functions, glycan-dependent epitopes can elicit specific, even neutralizing, antibody responses, as has been described for HIV-1 (Duan et al., 2018; Escolano et al., 2019; Pinto et al., 2020; Seabright et al., 2020; Watanabe et al., 2019; Yu et al., 2018; https://www.biorxiv.org/content/10.1101/2020.06.30.178897v1).
T33 6237-6426 Sentence denotes Thus, understanding the glycosylation of the viral Spike trimer is fundamental for the development of efficacious vaccines, neutralizing antibodies, and therapeutic inhibitors of infection.
T34 6427-6542 Sentence denotes ACE2 is an integral membrane metalloproteinase that regulates the renin-angiotensin system (Tikellis et al., 2011).
T35 6543-6717 Sentence denotes Both SARS-CoV-1 and SARS-CoV-2 have co-opted ACE2 to function as the receptor by which these viruses attach and fuse with host cells (Hoffmann et al., 2020; Li et al., 2003).
T36 6718-6979 Sentence denotes ACE2 is cleavable by ADAM proteases at the cell surface (Lambert et al., 2005), resulting in the shedding of a soluble ectodomain that can be detected in apical secretions of various epithelial layers (gastric, airway, etc.) and in serum (Epelman et al., 2009).
T37 6980-7119 Sentence denotes The N-terminal extracellular domain of ACE2 contains six canonical sequons for N-linked glycosylation and several potential O-linked sites.
T38 7120-7368 Sentence denotes Several nonsynonymous single-nucleotide polymorphisms (SNPs) in the ACE2 gene have been identified in the human population and could potentially alter ACE2 glycosylation and/or affinity of the receptor for the viral Spike protein (Li et al., 2005).
T39 7369-7739 Sentence denotes Given that glycosylation can affect the half-life of circulating glycoproteins in addition to modulating the affinity of their interactions with receptors and immune/inflammatory signaling pathways (Marth and Grewal, 2008; Varki, 2017), understanding the impact of glycosylation of ACE2 with respect to its binding of SARS-CoV-2 Spike glycoprotein is of high importance.
T40 7740-8029 Sentence denotes The proposed use of soluble extracellular domains of ACE2 as decoy, competitive inhibitors for SARS-CoV-2 infection emphasizes the critical need for understanding the glycosylation profile of ACE2 so that optimally active biologics can be produced (Lei et al., 2020; Monteil et al., 2020).
T41 8030-8392 Sentence denotes To accomplish the task of characterizing site-specific glycosylation of the trimer Spike of SARS-CoV-2 and the host receptor ACE2, we began by expressing and purifying a stabilized, soluble trimer Spike glycoprotein mimetic immunogen (that we define here and forward as S, [Yu et al., 2020]) and a soluble version of the ACE2 glycoprotein from a human cell line.
T42 8393-8575 Sentence denotes We utilized multiple mass-spectrometry-based approaches, including glycomic and glycoproteomic approaches, to determine occupancy and site-specific heterogeneity of N-linked glycans.
T43 8576-8759 Sentence denotes Occupancy (i.e., the percent of any given residue being modified by a glycan) is an important consideration when developing neutralizing antibodies against a glycan-dependent epitope.
T44 8760-8871 Sentence denotes We also identified sites of O-linked glycosylation and the heterogeneity of the O-linked glycans on S and ACE2.
T45 8872-9087 Sentence denotes We leveraged this rich dataset, along with existing 3D-structures of both glycoproteins, to generate static and molecular dynamics (MD) models of S alone, and in complex with the glycosylated, soluble ACE2 receptor.
T46 9088-9363 Sentence denotes By combining bioinformatics characterization of viral evolution and variants of S and ACE2 with MD simulations of the glycosylated S-ACE2 interaction, we identified important roles for glycans in multiple processes, including receptor-viral binding and glycan shielding of S.
T47 9364-9602 Sentence denotes Our rich characterization of the recombinant, glycosylated S trimer mimetic immunogen of SARS-CoV-2 in complex with the soluble human ACE2 receptor provides a detailed platform for guiding rational vaccine, antibody, and inhibitor design.
T48 9604-9611 Sentence denotes Results
T49 9613-9722 Sentence denotes Expression, Purification, and Characterization of SARS-CoV-2 Spike Glycoprotein Trimer and Soluble Human ACE2
T50 9723-10229 Sentence denotes A trimer-stabilized, soluble variant of the SARS-CoV-2 S that contains 22 canonical N-linked glycosylation sequons per protomer and a soluble version of human ACE2 that contains six canonical N-linked glycosylation sequons (it lacks the seventh, most C-terminal sequon) (Figure 1A) were purified from the media of transfected HEK293 cells. The quaternary structure of the S trimer was confirmed by negative-stain EM (Figure 1B), and the purity of both proteins was examined by SDS-PAGE on Coomassie G-250-stained gels (Figure 1C).
T51 10230-10358 Sentence denotes In addition, proteolytic digestions followed by proteomic analyses confirmed that the proteins were highly purified (Table S12).
T52 10359-10558 Sentence denotes Finally, the N terminus of both the mature S and the soluble mature ACE2 were empirically determined via proteolytic digestions and liquid chromatography-tandem mass spectrometry (LC-MS/MS) analyses.
T53 10559-10787 Sentence denotes These results confirmed that both the secreted, mature forms of S protein and ACE2 begin with an N-terminal glutamine that has undergone condensation to form pyroglutamine at residues 14 and 18, respectively (Figures 1D and S1).
T54 10788-11035 Sentence denotes The N-terminal peptide observed for S also contains a glycan at Asn-0017 (Figure 1D), and mass spectrometry analysis of non-reducing proteolytic digestions confirmed that Cys-0015 of S is in a disulfide linkage with Cys-0136 (Figure S2; Table S2).
T55 11036-11369 Sentence denotes Given that SignalP (Almagro Armenteros et al., 2019) predicts signal sequence cleavage between Cys-0015 and Val-0016 but we observed cleavage between Ser-0013 and Gln-0014, we examined the possibility that an in-frame upstream methionine to the proposed start methionine (Figure 1A) might be used to initiate translation (Figure S3).
T56 11370-11589 Sentence denotes If one examines the predicted signal sequence cleavage using the in-frame Met that is encoded nine amino acids upstream, SignalP now predicts cleavage between the Ser and Gln that we observed in our studies (Figure S3).
T57 11590-11831 Sentence denotes To examine whether this impacted S expression, we expressed constructs that contained or did not contain the upstream 27 nucleotides in a pseudovirus (VSV) system expressing SARS-CoV-2 S (Figure S4) and in our HEK293 system (data not shown).
T58 11832-11953 Sentence denotes Both expression systems produced a similar amount of S regardless of which expression construct was utilized (Figure S4).
T59 11954-12159 Sentence denotes Thus, while the translation initiation start site has still not been fully defined, allowing for earlier translation in expression construct design did not have a significant impact on the generation of S.
T60 12160-12273 Sentence denotes Figure 1 Expression and Characterization of SARS-CoV-2 Spike Glycoprotein Trimer Immunogen and Soluble Human ACE2
T61 12274-12337 Sentence denotes (A) Sequences of SARS-CoV-2 S immunogen and soluble human ACE2.
T62 12338-12444 Sentence denotes The N-terminal pyroglutamines for both mature protein monomers are bolded, underlined, and shown in green.
T63 12445-12531 Sentence denotes The canonical N-linked glycosylation sequons are bolded, underlined, and shown in red.
T64 12532-12741 Sentence denotes (B and C) Negative stain electron microscopy of the purified trimer (B) and Coomassie G-250-stained reducing SDS-PAGE gels (C) confirmed purity of the SARS-CoV-2 S protein trimer and of the soluble human ACE2.
T65 12742-12772 Sentence denotes MWM, molecular weight markers.
T66 12773-12942 Sentence denotes (D) A representative Step-HCD fragmentation spectrum from mass-spectrometry analysis of a tryptic digest of S annotated manually based on search results from pGlyco 2.2.
T67 12943-13035 Sentence denotes This spectrum defines the N terminus of the mature protein monomer as (pyro-)glutamine 0014.
T68 13036-13197 Sentence denotes A representative N-glycan consistent with this annotation and our glycomics data (Figure 2) is overlaid by using the Symbol Nomenclature For Glycans (SNFG) code.
T69 13198-13234 Sentence denotes This complex glycan occurs at N0017.
T70 13235-13354 Sentence denotes Note that, as expected, the cysteine is carbamidomethylated, and the mass accuracy of the assigned peptide is 0.98 ppm.
T71 13355-13467 Sentence denotes On the sequence of the N-terminal peptide and in the spectrum, the assigned b (blue) and y (red) ions are shown.
T72 13468-13621 Sentence denotes In the spectrum, purple highlights glycan oxonium ions and green marks intact peptide fragment ions with various partial glycan sequences still attached.
T73 13622-13789 Sentence denotes Note that the green-labeled ions allow for limited topology to be extracted including defining that the fucose is on the core and not the antennae of the glycopeptide.
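The mass accuracy quoted in the legend above is expressed in parts per million (ppm). As a brief illustration of how such a figure is conventionally computed, the sketch below uses the standard relative-error formula; the function name `ppm_error` and the example m/z values are hypothetical and are not taken from the spectrum in Figure 1D.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million (signed)."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical m/z must be positive")
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

# Hypothetical values: a 0.001 m/z deviation at m/z 1000 is about 1 ppm.
print(round(ppm_error(1000.001, 1000.0), 3))
```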
T74 13791-13896 Sentence denotes Glycomics-Informed Glycoproteomics Reveals Site-Specific Microheterogeneity of SARS-CoV-2 S Glycosylation
T75 13897-13981 Sentence denotes We utilized multiple approaches to examine glycosylation of the SARS-CoV-2 S trimer.
T76 13982-14117 Sentence denotes First, the portfolio of glycans linked to SARS-CoV-2 S trimer immunogen was analyzed after their release from the polypeptide backbone.
T77 14118-14244 Sentence denotes N-glycans were released from protein by treatment with PNGase F, and O-glycans were subsequently released by beta-elimination.
T78 14245-14441 Sentence denotes After permethylation to enhance detection sensitivity and structural characterization, released glycans were analyzed by multi-stage mass spectrometry (MSn) (Aoki et al., 2007; Aoki et al., 2008).
T79 14442-14567 Sentence denotes Mass spectra were processed by GRITS Toolbox, and the resulting annotations were validated manually (Weatherly et al., 2019).
T80 14568-14724 Sentence denotes Glycan assignments were grouped by type and by additional structural features for relative quantification of profile characteristics (Figure 2A; Table S3).
T81 14725-14893 Sentence denotes This analysis quantified 49 N-glycans and revealed that 55% of the total glycan abundance was of the complex type, 17% was of the hybrid type, and 28% was high mannose.
T82 14894-15043 Sentence denotes Among the complex and hybrid N-glycans, we observed a high degree of core fucosylation and significant abundance of bisected and LacDiNAc structures.
T83 15044-15285 Sentence denotes We also observed sulfated N-linked glycans by using negative mode MSn analyses (Table S13), although signal intensity was too low in positive ion mode (at least 10-fold lower than any of the non-sulfated glycans) for accurate quantification.
T84 15286-15373 Sentence denotes In addition, we detected 15 O-glycans released from the S trimer (Figure S5; Table S4).
T85 15374-15512 Sentence denotes Figure 2 Glycomics-Informed Glycoproteomics Reveals Substantial Site-Specific Microheterogeneity of N-linked Glycosylation on SARS-CoV-2 S
T86 15513-15616 Sentence denotes (A) Glycans released from SARS-CoV-2 S protein trimer immunogen were permethylated and analyzed by MSn.
T87 15617-15738 Sentence denotes Structures were assigned and grouped by type and structural features, and prevalence was determined based on ion current.
T88 15739-15797 Sentence denotes The pie chart shows basic division by broad N-glycan type.
T89 15798-15866 Sentence denotes The bar graph provides additional detail about the glycans detected.
T90 15867-16040 Sentence denotes The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.
T91 16041-16265 Sentence denotes (B–E) Glycopeptides were prepared from SARS-CoV-2 S protein trimer immunogen by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.
T92 16266-16388 Sentence denotes Four representative sites of N-linked glycosylation with specific features of interest were chosen and are presented here.
T93 16389-16617 Sentence denotes N0074 (B) and N0149 (C) occur in variable insert regions of S relative to SARS-CoV and other related coronaviruses, and emerging variants of SARS-CoV-2 disrupt these two sites of glycosylation in S.
T94 16618-16676 Sentence denotes N0234 (D) contains the most high-mannose N-linked glycans.
T95 16677-16827 Sentence denotes N0801 (E) is an example of glycosylation in the S2 region of the immunogen and displays a high degree of hybrid glycosylation compared to other sites.
T96 16828-16910 Sentence denotes The abundance of each composition is graphed in terms of assigned spectral counts.
T97 16911-17031 Sentence denotes Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.
T98 17032-17163 Sentence denotes The abbreviations used here and throughout the manuscript are as follows: N, HexNAc; H, hexose; F, fucose; A, Neu5Ac; S, sulfation.
T99 17164-17328 Sentence denotes Note that the graphs for the other 18 sites and other graphs grouping the microheterogeneity observed by other properties are presented in Supplemental Information.
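The single-letter composition code defined in the legend above (N, HexNAc; H, hexose; F, fucose; A, Neu5Ac; S, sulfation) lends itself to simple parsing. The sketch below assumes compositions are written as letter-count pairs such as "N2H5", which is an illustrative convention rather than the format of the supplemental tables. The residue masses are commonly used monoisotopic values for native (underivatized) glycans, so they do not apply directly to the permethylated glycans analyzed in the glycomics experiments.

```python
import re
from typing import Dict

# Commonly used monoisotopic residue masses (Da) for native glycans.
RESIDUE_MASS = {
    "N": 203.07937,  # HexNAc
    "H": 162.05282,  # hexose
    "F": 146.05791,  # fucose (deoxyhexose)
    "A": 291.09542,  # Neu5Ac
    "S": 79.95682,   # sulfation (SO3)
}
WATER = 18.01056  # added once for the free reducing-end glycan

def parse_composition(text: str) -> Dict[str, int]:
    """Parse a string such as 'N4H5F1A2' into residue counts."""
    if not re.fullmatch(r"(?:[NHFAS]\d+)+", text):
        raise ValueError(f"unrecognized composition: {text!r}")
    counts: Dict[str, int] = {}
    for letter, number in re.findall(r"([NHFAS])(\d+)", text):
        counts[letter] = counts.get(letter, 0) + int(number)
    return counts

def glycan_mass(text: str) -> float:
    """Monoisotopic mass of the free, native glycan for a composition string."""
    counts = parse_composition(text)
    return sum(RESIDUE_MASS[k] * n for k, n in counts.items()) + WATER

# Man5GlcNAc2 written in the assumed letter-count convention.
print(parse_composition("N2H5"), round(glycan_mass("N2H5"), 4))
```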
T100 17329-17569 Sentence denotes To determine occupancy of N-linked glycans at each site, we employed a sequential deglycosylation approach by using Endoglycosidase H and PNGase F in the presence of 18O-H2O after tryptic digestion of S (Wang et al., 2020; Yu et al., 2018).
T101 17570-17701 Sentence denotes After LC-MS/MS analyses, the resulting data confirmed that 19 of the canonical sequons had occupancies greater than 95% (Table S5).
T102 17702-17866 Sentence denotes One canonical sequence, N0149, had insufficient spectral counts for quantification by this method, but subsequent analyses described below suggested high occupancy.
T103 17867-17972 Sentence denotes The two most C-terminal N-linked sites, N1173 and N1194, had reduced occupancy, 52% and 82%, respectively.
T104 17973-18152 Sentence denotes Reduced occupancy at these sites could reflect hindered en bloc transfer by the oligosaccharyltransferase (OST) due to primary amino acid sequences at or near the N-linked sequon.
T105 18153-18485 Sentence denotes Alternatively, this could reflect these two sites being post-translationally modified after release of the protein by the ribosome by a less efficient STT3B-containing OST, either due to activity or initial folding of the polypeptide, as opposed to co-translationally modified by the STT3A-containing OST (Ruiz-Canada et al., 2009).
T106 18486-18812 Sentence denotes None of the non-canonical sequons (three N-X-C sites and four N-G-L/I/V sites; Zielinska et al., 2010) showed significant occupancy (>5%), except for N0501, which showed moderate (19%) conversion to 18O-Asp that could be due to deamidation that is facilitated by glycine at the +1 position (Table S5) (Palmisano et al., 2012).
T107 18813-18968 Sentence denotes Further analysis of this site (see below) by direct glycopeptide analyses allowed us to determine that N0501 undergoes deamidation but is not glycosylated.
T108 18969-19130 Sentence denotes Thus, all 22 canonical sequences for N-linked glycosylation (N-X-S/T), and only these, are utilized, with only N1173 and N1194 demonstrating occupancies below 95%.
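The occupancy values discussed above can be thought of as a ratio of spectral counts. In sequential deglycosylation schemes of this kind, Endo H is generally understood to trim high-mannose and hybrid glycans to a single residual HexNAc on Asn, whereas PNGase F in 18O-water converts the remaining glycosylated Asn to 18O-labeled Asp. The sketch below assumes that spectra have already been sorted into those three peptide forms; the function name and the counts are hypothetical, and the authors' exact quantification procedure is described in their STAR Methods rather than here.

```python
def occupancy_percent(unmodified: int, hexnac_tagged: int, o18_asp: int) -> float:
    """Percent of spectra for a site in which the Asn carried a glycan.

    unmodified    : spectra with plain Asn (site unoccupied)
    hexnac_tagged : spectra with Asn + residual HexNAc (Endo H-sensitive glycans)
    o18_asp       : spectra with 18O-Asp (glycans released by PNGase F)
    """
    total = unmodified + hexnac_tagged + o18_asp
    if total == 0:
        raise ValueError("no spectra observed for this site")
    return 100.0 * (hexnac_tagged + o18_asp) / total

# Hypothetical counts for one site.
print(occupancy_percent(unmodified=5, hexnac_tagged=20, o18_asp=75))
```

A caveat that follows from the text: 18O-Asp can also arise from deamidation, as the authors note for N0501, so a ratio of this kind is best cross-checked against direct glycopeptide evidence.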
T109 19131-19293 Sentence denotes Next, we applied three different proteolytic digestion strategies to the SARS-CoV-2 S immunogen to maximize glycopeptide coverage by subsequent LC-MS/MS analyses.
T110 19294-19594 Sentence denotes Extended gradient nanoflow reverse-phase LC-MS/MS was carried out on a ThermoFisher Lumos Tribrid instrument using Step-HCD fragmentation on each of the samples (see STAR Methods for details, as well as Duan et al., 2018; Escolano et al., 2019; Wang et al., 2020; Yu et al., 2018; Zhou et al., 2017).
T111 19595-19908 Sentence denotes After data analyses using pGlyco 2.2.2 (Liu et al., 2017), Byonic (Bern et al., 2012), and manual validation of glycan compositions against our released glycomics findings (Figure 2A; Tables S3 and S13), we were able to determine the microheterogeneity at each of the 22 canonical sites (Figures 2B–2E; Table S6).
T112 19909-20017 Sentence denotes Notably, none of the non-canonical consensus sequences, including N0501, displayed any quantifiable glycans.
T113 20018-20145 Sentence denotes The N-glycosites N0074 (Figure 2B) and N0149 (Figure 2C) are highly processed and display a typical mammalian N-glycan profile.
T114 20146-20236 Sentence denotes N0149 is, however, modified with several hybrid N-glycan structures, whereas N0074 is not.
T115 20237-20427 Sentence denotes N0234 (Figure 2D) and N0801 (Figure 2E) have N-glycan profiles more similar to those found on other viruses such as HIV (Watanabe et al., 2019) that are dominated by high-mannose structures.
T116 20428-20582 Sentence denotes N0234 (Figure 2D) displays an abundance of Man7-Man9 high-mannose structures, suggesting stalled processing by early-acting ER and cis-Golgi mannosidases.
T117 20583-20780 Sentence denotes In contrast, N0801 (Figure 2E) is processed more efficiently to Man5 high-mannose and hybrid structures, suggesting that access to the glycan at this site by MGAT1 and α-Mannosidase II is hindered.
T118 20781-21048 Sentence denotes In general, for all 22 sites (Figures 2B–2E; Table S6), we observed underprocessing of complex glycan antennae (i.e., under-galactosylation and under-sialylation) and a high degree of core fucosylation in agreement with released glycan analyses (Figure 2A; Table S3).
T119 21049-21147 Sentence denotes We also observed a small percent of sulfated N-linked glycans at several sites (Tables S6 and S8).
T120 21148-21363 Sentence denotes Based on the assignments and the spectral counts for each topology, we were able to determine the percent of total N-linked glycan types (high-mannose, hybrid, or complex) present at each site (Figure 3; Table S7).
T121 21364-21610 Sentence denotes Notably, three of the sites (N0234, N0709, and N0717) displayed more than 50% high-mannose glycans, whereas 11 other sites (N0017, N0074, N0149, N0165, N0282, N0331, N0657, N1134, N1158, N1173, and N1194) were more than 90% complex when occupied.
T122 21611-21677 Sentence denotes The other eight sites were distributed between these two extremes.
T123 21678-21808 Sentence denotes Notably, only one site, N0717, had greater than 33% hybrid structures (45%); this site also had greater than 50% high mannose (55%).
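The per-site breakdown described above amounts to normalizing spectral counts by glycan type and then comparing the percentages with the thresholds mentioned in the text (more than 50% high-mannose, more than 90% complex). The sketch below is one way to express that bookkeeping; the function names, category labels, and counts are hypothetical, and unoccupied spectra are deliberately excluded because the text reports these percentages "when occupied".

```python
from typing import Dict

def type_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert spectral counts per glycan type into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no glycosylated spectra for this site")
    return {glycan_type: 100.0 * n / total for glycan_type, n in counts.items()}

def describe_site(percentages: Dict[str, float]) -> str:
    """Label a site using the thresholds mentioned in the text."""
    if percentages.get("high_mannose", 0.0) > 50.0:
        return "more than 50% high-mannose"
    if percentages.get("complex", 0.0) > 90.0:
        return "more than 90% complex"
    return "intermediate"

# Hypothetical spectral counts for two sites.
site_a = type_percentages({"high_mannose": 60, "hybrid": 30, "complex": 10})
site_b = type_percentages({"high_mannose": 2, "hybrid": 3, "complex": 95})
print(describe_site(site_a), "|", describe_site(site_b))
```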
T124 21809-22079 Sentence denotes To further evaluate the heterogeneity, we grouped all the topologies into the 20 classes recently described by the Crispin laboratory, adding two categories (sulfated and unoccupied) that we refer to here as the Oxford classification (Table S8) (Watanabe et al., 2020a).
T125 22080-22379 Sentence denotes Among other features observed, this classification allowed us to observe that although most sites with high-mannose structures were dominated by the Man5GlcNAc2 structure, N0234 and N0717 were dominated by the higher Man structures of Man8GlcNAc2 and Man7GlcNAc2, respectively (Figure S7; Table S8).
T126 22380-22603 Sentence denotes Limited processing at N0234 is in agreement with a recent report suggesting that high-mannose structures at this site help to stabilize the receptor-binding domain of S (www.biorxiv.org/content/10.1101/2020.06.11.146522v1).
T127 22604-22931 Sentence denotes Furthermore, applying the Oxford classifications to our dataset clearly demonstrates that the three most C-terminal sites (N1158, N1173, and N1194), dominated by complex-type glycans, were more often further processed (i.e., multiple antennae) and elaborated (i.e., galactosylation and sialylation) than other sites (Table S8).
T128 22932-23026 Sentence denotes Figure 3 SARS-CoV-2 S Immunogen N-glycan Sites Are Predominantly Modified by Complex N-glycans
T129 23027-23308 Sentence denotes N-glycan topologies were assigned to all 22 sites of the S protomer and the spectral counts for each of the three types of N-glycans (high-mannose, hybrid, and complex), as well as the unoccupied peptide spectral match counts at each site, were summed and visualized as pie charts.
T130 23309-23396 Sentence denotes Note that only N1173 and N1194 show an appreciable amount of the unoccupied amino acid.
T131 23397-23680 Sentence denotes We also analyzed our generated mass spectrometry data for the presence of O-linked glycans based on our glycomic findings (Figure S5; Table S4) and a recent manuscript suggesting significant levels of O-glycosylation of S1 and S2 when expressed independently (Shajahan et al., 2020).
T132 23681-23817 Sentence denotes We were able to confirm sites of O-glycan modification with microheterogeneity observed for the vast majority of these sites (Table S9).
T133 23818-24021 Sentence denotes However, occupancy at each site, determined by spectral counts, was observed to be very low (below 4%), except for Thr0323, which had a modestly higher but still low 11% occupancy (Figure S6; Table S10).
T134 24023-24157 Sentence denotes 3D Structural Modeling of Glycosylated SARS-CoV-2 Trimer Immunogen Enables Predictions of Epitope Accessibility and Other Key Features
T135 24158-24307 Sentence denotes A 3D structure of the S trimer was generated by using a homology model of the S trimer described previously (based on PDB: 6VSB; Wrapp et al., 2020).
T137 24308-24611 Sentence denotes Onto this 3D structure, we installed explicitly defined glycans at each glycosylated sequon based on one of three separate sets of criteria, thereby generating three different glycoform models for comparison that we denote as “Abundance,” “Oxford Class,” and “Processed” models (STAR Methods; Table S1).
T138 24612-24843 Sentence denotes These criteria were chosen in order to generate glycoform models that represent reasonable expectations for glycosylation microheterogeneity and integrate cross-validating glycomic and glycoproteomic characterization of S and ACE2.
T139 24844-24942 Sentence denotes The three glycoform models were subjected to multiple all-atom MD simulations with explicit water.
T140 24943-25069 Sentence denotes Information from analyses of these structures is presented in Figure 4A along with the sequence of the SARS-CoV-2 S protomer.
T141 25070-25179 Sentence denotes We also determined variants in S that are emerging in the virus that have been sequenced to date (Table S11).
T142 25180-25355 Sentence denotes The inter-residue distances were measured between the most α-carbon-distal atoms of the N-glycan sites and Spike glycoprotein population variant sites in 3D space (Figure 4B).
T143 25356-25579 Sentence denotes Notable from this analysis, there are several variants that don’t ablate the N-linked sequon but are sufficiently close in 3D space to N-glycosites, such as D138H, H655Y, S939F, and L1203F, to warrant further investigation.
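The proximity measurement described above, the distance between the most α-carbon-distal atoms of a glycosite and a variant site, can be sketched with plain coordinate arithmetic. The example below assumes each residue is supplied as a mapping from atom name to (x, y, z) coordinates in ångströms; the function names and the coordinates are hypothetical and are not drawn from the 6VSB-based model.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float, float]

def most_distal_atom(atoms: Dict[str, Coord]) -> Coord:
    """Return the coordinates of the atom farthest from the residue's CA atom."""
    ca = atoms["CA"]
    return max(atoms.values(), key=lambda xyz: math.dist(ca, xyz))

def inter_residue_distance(res_a: Dict[str, Coord], res_b: Dict[str, Coord]) -> float:
    """Distance between the most CA-distal atoms of two residues."""
    return math.dist(most_distal_atom(res_a), most_distal_atom(res_b))

# Hypothetical coordinates for a glycosite Asn and a nearby variant residue.
glycosite = {"CA": (0.0, 0.0, 0.0), "CB": (1.0, 0.0, 0.0), "ND2": (3.0, 0.0, 0.0)}
variant = {"CA": (10.0, 0.0, 0.0), "CB": (9.0, 0.0, 0.0), "NE2": (7.0, 0.0, 0.0)}
print(inter_residue_distance(glycosite, variant))
```

Measuring from side-chain tips rather than from α-carbons gives a smaller, arguably more relevant, separation when two side chains point toward each other, which is presumably why that reference point was chosen.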
T144 25580-25730 Sentence denotes Figure 4 3D Structural Modeling of Glycosylated SARS-CoV-2 Spike Trimer Immunogen Reveals Predictions for Antigen Accessibility and Other Key Features
T145 25731-25923 Sentence denotes Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated SARS-CoV-2 S trimer immunogen.
T146 25924-26031 Sentence denotes (A) Sequence of the SARS-CoV-2 S immunogen displaying computed antigen accessibility and other information.
T147 26032-26113 Sentence denotes Antigen accessibility is indicated by red shading across the amino acid sequence.
T148 26114-26317 Sentence denotes (B) Emerging variants confirmed by independent sequencing experiments were analyzed based on the 3D structure of SARS-CoV-2 S to generate a proximity chart to the determined N-linked glycosylation sites.
T149 26318-26531 Sentence denotes (C) SARS-CoV-2 S trimer immunogen model from MD simulation displaying abundance glycoforms and antigen accessibility shaded in red for most accessible, white for partial, and black for inaccessible (see Video S1).
T150 26532-26648 Sentence denotes (D) SARS-CoV-2 S trimer immunogen model from MD simulation displaying Oxford Class glycoforms and sequence variants.
T151 26649-26774 Sentence denotes Asterisk indicates not visible, whereas the box represents three amino acid variants that are clustered together in 3D space.
T152 26775-26946 Sentence denotes (E) SARS-CoV-2 S trimer immunogen model from MD simulation displaying processed glycoforms, with Thr-323, which is O-glycosylated at low stoichiometry, shaded in yellow.
T153 26947-27204 Sentence denotes The percentage of simulation time that each S protein residue is accessible to a probe that approximates the size of an antibody variable domain was calculated for a model of the S trimer by using the Abundance glycoforms (Table S1) (Ferreira et al., 2018).
T154 27205-27375 Sentence denotes The predicted antibody accessibility is visualized across the sequence, as well as mapped onto the 3D surface, via color shading (Figures 4A and 4C; Table S13; Video S1).
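The accessibility metric above reduces to a per-residue fraction of trajectory frames. The sketch below assumes that a probe-based calculation (not shown here) has already produced one True/False accessibility flag per residue per frame; the function name `percent_accessible` and the flag values are hypothetical.

```python
def percent_accessible(flags_by_residue):
    """Percentage of frames in which each residue is accessible to the probe.

    `flags_by_residue` maps a residue label to a list of booleans,
    one per simulation frame.
    """
    return {
        residue: 100.0 * sum(flags) / len(flags)
        for residue, flags in flags_by_residue.items()
    }

# Hypothetical flags for a four-frame trajectory, for illustration only.
flags = {
    "residue_A": [True, True, False, True],
    "residue_B": [False, False, False, True],
}

print(percent_accessible(flags))
```

The resulting percentages could then be mapped to a color scale along the sequence or on the 3D surface.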
T155 27376-27658 Sentence denotes Additionally, the Oxford Class glycoforms model (Table S1), which is arguably the most encompassing means for representing glycan microheterogeneity because it captures abundant structural topologies (Table S8), is shown with the sequence variant information (Figure 4D; Table S11).
T156 27659-27981 Sentence denotes A substantial number of these variants occur (directly by comparison to Figure 4A or visually by comparison to Figure 4C) in regions of high calculated epitope accessibility (e.g., N74K, T76I, R78M, D138H, H146Y, S151I, D253G, V483A, etc.; Table S14), suggesting potential selective pressure to avoid host immune response.
T157 27982-28330 Sentence denotes Also, it is interesting to note that three of the emerging variants would eliminate N-linked sequons in S; N74K and T76I would eliminate N-glycosylation of N74 (found in the insert variable region 1 of CoV-2 S compared to CoV-1 S), and S151I eliminates N-glycosylation of N149 (found in the insert variable region 2) (Figures 4A and S7; Table S11).
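Whether a substitution eliminates an N-linked sequon can be checked directly from the sequence. The sketch below uses the standard N-X-S/T sequon definition (X is any residue except proline). The toy peptide and its numbering are hypothetical and are not the Spike sequence, and the helper names are illustrative.

```python
import re

# N-linked sequon: Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N[^P][ST]")

def has_sequon_at(seq, asn_pos):
    """True if an N-X-S/T sequon starts at the 1-based position asn_pos."""
    start = asn_pos - 1
    return bool(SEQUON.fullmatch(seq[start:start + 3]))

def apply_variant(seq, variant):
    """Apply a substitution written like 'T4I' (1-based numbering)."""
    ref, pos, alt = variant[0], int(variant[1:-1]), variant[-1]
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]

# Toy peptide with a sequon (N-G-T) starting at position 2.
toy = "ANGTK"

for variant in ("N2K", "T4I", "G3A"):
    mutated = apply_variant(toy, variant)
    print(variant, "retains sequon:", has_sequon_at(mutated, 2))
```

In this toy case, substituting either the Asn or the Thr removes the sequon, whereas a substitution at the middle position (to anything other than proline) retains it. The same logic explains why N74K and T76I both eliminate glycosylation at N74.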
T158 28331-28586 Sentence denotes Lastly, the SARS-CoV-2 S Processed glycoform model is shown (Table S1) with amino acid T0323 marked; this residue carries a modest amount of O-glycosylation (11% occupancy; Figure S6; Table S10), so the model represents the most heavily glycosylated form of S (Figure 4E).
T159 28587-28596 Sentence denotes Video S1.
T160 28597-28655 Sentence denotes Glycosylated S Antigen Accessibility, Related to Figure 4C
T161 28657-28738 Sentence denotes Glycomics-Informed Glycoproteomics Reveals Complex N-linked Glycosylation of ACE2
T162 28739-28857 Sentence denotes We also analyzed ACE2 glycosylation utilizing the same glycomic and glycoproteomic approaches described for S protein.
T163 28858-29087 Sentence denotes Glycomic analyses of released N-linked glycans (Figure 5 A; Table S3) revealed that the majority of glycans on ACE2 are complex with limited high-mannose and hybrid glycans, and we were unable to detect sulfated N-linked glycans.
T164 29088-29294 Sentence denotes Glycoproteomic analyses revealed that occupancy was high (> 75%) at all six sites, and significant microheterogeneity dominated by complex N-glycans was observed for each site (Figures 5B–5G; Tables S5–S8).
T165 29295-29555 Sentence denotes We also observed, consistent with the O-glycomics (Figure S5; Table S4), that Ser 155 and several S/T residues at the C terminus of ACE2 outside of the peptidase domain were O-glycosylated, but stoichiometry was extremely low (less than 2%; Tables S9 and S10).
T166 29556-29676 Sentence denotes Figure 5 Glycomics-Informed Glycoproteomics of Soluble Human ACE2 Reveals High Occupancy, Complex N-linked Glycosylation
T167 29677-29765 Sentence denotes (A) Glycans released from soluble, purified ACE2 were permethylated and analyzed by MSn.
T168 29766-29884 Sentence denotes Structures were assigned, grouped by type and structural features, and prevalence was determined based on ion current.
T169 29885-29943 Sentence denotes The pie chart shows basic division by broad N-glycan type.
T170 29944-30012 Sentence denotes The bar graph provides additional detail about the glycans detected.
T171 30013-30186 Sentence denotes The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.
T172 30187-30392 Sentence denotes (B–G) Glycopeptides were prepared from soluble human ACE2 by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.
T173 30393-30452 Sentence denotes All six sites of N-linked glycosylation are presented here.
T174 30453-30567 Sentence denotes Displayed in the bar graphs are the individual compositions observed graphed in terms of assigned spectral counts.
T175 30568-30688 Sentence denotes Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.
T176 30689-30805 Sentence denotes The pie chart (analogous to Figure 3 for SARS-CoV-2 S) for each site is displayed in the upper corner of each panel.
T177 30806-30815 Sentence denotes (B) N053.
T178 30816-30825 Sentence denotes (C) N090.
T179 30826-30835 Sentence denotes (D) N103.
T180 30836-30845 Sentence denotes (E) N322.
T181 30846-30855 Sentence denotes (F) N432.
T182 30856-30919 Sentence denotes (G) N546, a site that does not exist in three in 10,000 people.
T183 30921-31014 Sentence denotes 3D Structural Modeling of Glycosylated, Soluble ACE2 Highlighting Glycosylation and Variants
T184 31015-31140 Sentence denotes We integrated our glycomics, glycoproteomics, and population variant analysis results with a 3D model of ACE2 (based on PDB:
T185 31141-31290 Sentence denotes 6M0J; Lan et al., 2020; see STAR Methods for details) to generate two versions of the soluble glycosylated ACE2 for visualization and MD simulations.
T186 31291-31509 Sentence denotes We visualized the ACE2 glycoprotein with the Abundance glycoform model simulated at each site as well as highlighting the naturally occurring variants observed in the human population (Figure 6 A; Video S2; Table S11).
T187 31510-31630 Sentence denotes Note that the Abundance glycoform model and the Oxford Class glycoform model for ACE2 are identical (Tables S1 and S8).
T188 31631-31818 Sentence denotes Notably, one site of N-linked glycosylation (N546) is predicted to not be present in three out of 10,000 humans based on naturally occurring variation in the human population (Table S11).
T189 31819-31888 Sentence denotes We also modeled ACE2 using the Processed glycoform model (Figure 6B).
T190 31889-31976 Sentence denotes In both models, the interaction domain with S is defined (Figures 6A and 6B; Video S2).
T191 31977-32043 Sentence denotes Figure 6 3D Structural Modeling of Glycosylated Soluble Human ACE2
T192 32044-32225 Sentence denotes Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated soluble human ACE2.
T193 32226-32358 Sentence denotes (A) Soluble human ACE2 model from MD simulations displaying abundance glycoforms, interaction surface with S, and sequence variants.
T194 32359-32450 Sentence denotes The N546 variant, which would remove N-linked glycosylation at that site, is boxed (see Video S2).
T195 32451-32563 Sentence denotes (B) Soluble human ACE2 model from MD simulations displaying processed glycoforms and interaction surface with S.
T196 32564-32573 Sentence denotes Video S2.
T197 32574-32627 Sentence denotes Glycosylated ACE2 with Variants, Related to Figure 6A
T198 32629-32780 Sentence denotes MD Simulation of the Glycosylated Trimer Spike of SARS-CoV-2 in Complex with Glycosylated, Soluble, Human ACE2 Reveals Protein and Glycan Interactions
T199 32781-32905 Sentence denotes MD simulations were performed to examine the co-complex (generated from a crystal structure of the ACE2-RBD co-complex, PDB:
T200 32906-33088 Sentence denotes 6M0J; Lan et al., 2020) of glycosylated S with glycosylated ACE2 using the three different glycoform models (Abundance, Oxford Class, and Processed; Table S1; Videos S5, S6, and S7).
T201 33089-33327 Sentence denotes Information from these analyses is laid out along the primary structure (sequence) of the SARS-CoV-2 S protomer and ACE2 highlighting regions of glycan-protein interaction observed in the MD simulations (Table S14; Videos S5, S6, and S7).
T202 33328-33533 Sentence denotes Interestingly, two glycans on ACE2 (at N090 and N322), which are highlighted in Figure 7 A and shown in a more close-up view in Figure 7B, are predicted to form interactions with the S protein (Table S15).
T203 33534-33756 Sentence denotes The N322 glycan interaction with the S trimer is outside of the receptor-binding domain, and the interaction is observed across multiple simulations and throughout each simulation (Figures 7A and 7B; Videos S5, S6, and S7).
T204 33757-34048 Sentence denotes The ACE2 glycan at N090 is close enough to the S trimer surface to repeatedly form interactions; however, the glycan arms interact with multiple regions of the surface over the course of the simulations, reflecting the relatively high degree of glycan dynamics (Figures 7A and 7B; Video S3).
T205 34049-34233 Sentence denotes Inter-molecule glycan-glycan interactions are also observed repeatedly between the glycan at N546 of ACE2 and those in the S protein at residues N0074 and N0165 (Figure 7D; Table S16).
T206 34234-34408 Sentence denotes Finally, a full view of the ACE2-S complex with Oxford class glycoforms on ACE2 illustrates the extensive glycosylation at the interface of the complex (Figure 7C; Video S4).
T207 34409-34566 Sentence denotes Figure 7 Interactions of Glycosylated Soluble Human ACE2 and Glycosylated SARS-CoV-2 S Trimer Immunogen Revealed By 3D-Structural Modeling and MD Simulations
T208 34567-34707 Sentence denotes (A) MD simulation of glycosylated soluble human ACE2 and glycosylated SARS-CoV-2 S trimer immunogen interaction (see Videos S5, S6, and S7).
T209 34708-34809 Sentence denotes ACE2 (top) is colored red with glycans in pink, whereas S is colored white with glycans in dark gray.
T210 34810-34892 Sentence denotes Highlighted are ACE2 glycans that interact with S that are magnified to the right.
T211 34893-35082 Sentence denotes (B) Magnification of ACE2-S interface highlighting ACE2 glycan interactions by using 3D-SNFG icons (Thieker et al., 2016) with S protein (pink) as well as ACE2-S glycan-glycan interactions.
T212 35083-35203 Sentence denotes (C) Magnification of dynamics trajectory of glycans at the interface of soluble human ACE2 and S (see Videos S3 and S4).
T213 35204-35213 Sentence denotes Video S3.
T214 35214-35263 Sentence denotes Interface of ACE2-S Complex, Related to Figure 7C
T215 35264-35273 Sentence denotes Video S4.
T216 35274-35327 Sentence denotes The Glycosylated ACE2-S Complex, Related to Figure 7C
T217 35328-35337 Sentence denotes Video S5.
T218 35338-35398 Sentence denotes Abundance Glycoforms on ACE2-S Complex, Related to Figure 7A
T219 35399-35408 Sentence denotes Video S6.
T220 35409-35472 Sentence denotes Oxford Class Glycoforms on ACE2-S Complex, Related to Figure 7A
T221 35473-35482 Sentence denotes Video S7.
T222 35483-35543 Sentence denotes Processed Glycoforms on ACE2-S Complex, Related to Figure 7A
T223 35545-35555 Sentence denotes Discussion
T224 35556-36023 Sentence denotes We have defined the glycomics-informed, site-specific microheterogeneity of 22 sites of N-linked glycosylation per monomer on a SARS-CoV-2 trimer and the six sites of N-linked glycosylation on a soluble version of its human ACE2 receptor by using a combination of mass spectrometry approaches coupled with evolutionary and variant sequence analyses to provide a detailed understanding of the glycosylation states of these glycoproteins (Figures 1, 2, 3, 4, 5, and 6).
T225 36024-36194 Sentence denotes Our results suggest essential roles for glycosylation in mediating receptor binding, antigenic shielding, and potentially the evolution/divergence of these glycoproteins.
T226 36195-36582 Sentence denotes The highly glycosylated SARS-CoV-2 Spike protein, unlike several other viral proteins including HIV-1 (Watanabe et al., 2019) but in agreement with another recent report (Watanabe et al., 2020a), presents significantly more processing of N-glycans toward complex glycosylation, suggesting that steric hindrance to processing enzymes is not a major factor at most sites (Figures 2 and 3).
T227 36583-36678 Sentence denotes However, the N-glycans still provide considerable shielding of the peptide backbone (Figure 4).
T228 36679-37091 Sentence denotes Our glycomics-guided glycoproteomic data are generally in strong agreement with the trimer immunogen data recently published by Crispin (Watanabe et al., 2020a), although we also observed sulfated N-linked glycans; were able to differentiate branching, bisected, and diLacNAc containing structures by glycomics; and observed less occupancy on the two most C-terminal N-linked sites by using a different approach.
T229 37092-37291 Sentence denotes Our detection of sulfated N-linked glycans at multiple sites on S is in agreement with a recent manuscript re-analyzing the Crispin data (https://www.biorxiv.org/content/10.1101/2020.05.31.125302v1).
T230 37292-37433 Sentence denotes Sulfated N-linked glycans could potentially play key roles in immune regulation and receptor binding as in other viruses (Wang et al., 2009).
T231 37434-37553 Sentence denotes This result is especially significant in that sulfated N-glycans were not observed when we performed glycomics on ACE2.
T232 37554-37818 Sentence denotes At each individual site, the glycans we observed on our immunogen appear to be slightly more processed, but our analysis and the Crispin group’s results (Watanabe et al., 2020a) are nearly superimposable at each site in terms of major features.
T233 37819-38033 Sentence denotes This agreement contrasts sharply with the comparison of our and Crispin’s data (Watanabe et al., 2020a) to those of the Azadi group (Shajahan et al., 2020), which analyzed S1 and S2 that had been expressed individually.
T234 38034-38310 Sentence denotes When expressed as two separate polypeptides and not purified for trimers, several unoccupied sites of N-linked glycosylation were observed and processing at several sites was significantly different (Shajahan et al., 2020) than we and others (Watanabe et al., 2020a) observed.
T235 38311-38602 Sentence denotes Although O-glycosylation has recently been reported for individually expressed S1 and S2 domains of the Spike glycoprotein (Shajahan et al., 2020), in trimeric form the level of O-glycosylation is extremely low, with the highest level of occupancy we observed being 11% at T0323 (Figure 4E).
T236 38603-38872 Sentence denotes The low level of O-linked occupancy we observed is in agreement with the Crispin group’s analysis of a Spike Trimer immunogen (Watanabe et al., 2020a) but differs significantly from the Azadi group’s analyses of individually expressed S1 and S2 (Shajahan et al., 2020).
T237 38873-39151 Sentence denotes Thus, the context in which the Spike protein is expressed and purified before analysis significantly alters the glycosylation of the protomer, which is reminiscent of previous studies of expression of the HIV-1 envelope Spike (Behrens et al., 2017; Watanabe et al., 2019).
T238 39152-39306 Sentence denotes The soluble ACE2 protein examined here contains six highly utilized sites of N-linked glycosylation dominated by complex type N-linked glycans (Figure 5).
T239 39307-39411 Sentence denotes O-glycans were also present on this glycoprotein but at very low levels of occupancy at all sites (<2%).
T240 39412-39622 Sentence denotes Our glycomics-informed glycoproteomics allowed us to assign defined sets of glycans to specific glycosylation sites on 3D-structures of S and ACE2 glycoproteins based on experimental evidence (Figures 4 and 6).
T241 39623-39862 Sentence denotes Similar to almost all glycoproteins, microheterogeneity is evident at most glycosylation sites of S and ACE2; each glycosylation site can be modified with one of several glycan structures, generating site-specific glycosylation portfolios.
T242 39863-39957 Sentence denotes For modeling purposes, however, explicit structures must be placed at each glycosylation site.
T243 39958-40128 Sentence denotes To capture the impact of microheterogeneity on S and ACE2 MD, we chose to generate glycoforms for modeling that represented reasonable portfolios of glycan types.
T244 40129-40410 Sentence denotes Using three glycoform models for S (Abundance, Oxford Class, and Processed) and two models for ACE2 (Abundance, which was equivalent to Oxford Class, and Processed), we generated three MD simulations of the co-complexes of these two glycoproteins (Figure 7; Videos S5, S6, and S7).
T245 40411-40579 Sentence denotes The observed interactions over time allowed us to evaluate glycan-protein contacts between the two proteins and examine potential glycan-glycan interactions (Figure 7).
T246 40580-40686 Sentence denotes We observed glycan-mediated interactions between the S trimer and glycans at N090, N322, and N546 of ACE2.
T247 40687-40838 Sentence denotes Thus, variations in glycan occupancy or processing at these sites could alter the affinity of the SARS-CoV-2–ACE2 interaction and modulate infectivity.
T248 40839-41094 Sentence denotes It is well established that glycosylation states vary depending on tissue and cell type as well as, in the case of humans, on age (Krištić et al., 2014), underlying disease (Pavić et al., 2018; Rudman et al., 2019), and ethnicity (Gebrehiwot et al., 2018).
T249 41095-41217 Sentence denotes Thus, glycosylation portfolios could in part be responsible for tissue tropism and individual susceptibility to infection.
T250 41218-41565 Sentence denotes The importance of glycosylation for S binding to ACE2 is even more emphatically demonstrated by the direct glycan-glycan interactions observed (Figure 7) between S glycans (at N0074 and N0165) and an ACE2 receptor glycan (at N546), adding an additional layer of complexity for interpreting the impact of glycosylation on individual susceptibility.
T251 41566-41691 Sentence denotes Several emerging variants of the virus appear to be altering N-linked glycosylation occupancy by disrupting N-linked sequons.
T252 41692-41911 Sentence denotes Interestingly, the two N-linked sequons in SARS-CoV-2 S directly impacted by variants, N0074 and N0149, are in divergent insert regions 1 and 2, respectively, of SARS-CoV-2 S in comparison with SARS-CoV-1 S (Figure 4A).
T253 41912-42155 Sentence denotes The N0074, in particular, is one of the S glycans that interact directly with ACE2 glycan (at N546; Figure 7), suggesting that glycan-glycan interactions could contribute to the unique infectivity differences between SARS-CoV-2 and SARS-CoV-1.
T254 42156-42375 Sentence denotes These sequon variants will also be important to examine in terms of glycan shielding that could influence immunogenicity and efficacy of neutralizing antibodies, as well as interactions with the host cell receptor ACE2.
T255 42376-42603 Sentence denotes Naturally occurring amino acid-changing SNPs in the ACE2 gene generate a number of variants including one variant, with a frequency of three in 10,000 humans, that eliminates a site of N-linked glycosylation at N546 (Figure 6).
T256 42604-42886 Sentence denotes Understanding the impact of ACE2 variants on glycosylation and more importantly on S binding, especially for N546S, which impacts the glycan-glycan interaction between S and ACE2 (Figure 7), should be prioritized in light of efforts to develop ACE2 as a potential decoy therapeutic.
T257 42887-43034 Sentence denotes Intelligent manipulation of ACE2 glycosylation could lead to more potent biologics capable of acting as better competitive inhibitors of S binding.
T258 43035-43369 Sentence denotes The data presented here, and related similar recent findings (Casalino et al., 2020; Watanabe et al., 2020a; Wrobel et al., 2020), provide a framework to facilitate the production of immunogens, vaccines, antibodies, and inhibitors as well as additional information regarding mechanisms by which glycan microheterogeneity is achieved.
T259 43370-43504 Sentence denotes However, considerable efforts still remain in order to fully understand the role of glycans in SARS-CoV-2 infection and pathogenicity.
T260 43505-43754 Sentence denotes Although HEK-expressed S and ACE2 provide a useful window for understanding human glycosylation of these proteins, glycoproteomic characterization after expression in cell lines of more direct relevance to disease and target tissue is sorely needed.
T261 43755-43952 Sentence denotes Although site occupancy could change depending on presentation and cell type (Struwe et al., 2018), processing of N-linked glycans will almost certainly be altered in a cell-type-dependent fashion.
T262 43953-44244 Sentence denotes Thus, analyses of the Spike trimer extracted from pseudoviruses, virion-like particles, and ultimately from infectious SARS-CoV-2 virions harvested from airway cells or patients will provide the most accurate view of how trimer immunogens reflect the true glycosylation pattern of the virus.
T263 44245-44452 Sentence denotes Detailed analyses of the impact of emerging variants in S and natural and designed-for-biologics variants of ACE2 on glycosylation and binding properties are important next steps for developing therapeutics.
T264 44453-44713 Sentence denotes Finally, it will be important to monitor the slow evolution of the virus to determine if existing sites of glycosylation are lost or new sites emerge with selective pressure that might alter the efficacy of vaccines, neutralizing antibodies, and/or inhibitors.
T265 44715-44727 Sentence denotes STAR★Methods
T266 44729-44748 Sentence denotes Key Resources Table
T267 44749-44786 Sentence denotes REAGENT or RESOURCE SOURCE IDENTIFIER
T268 44787-44832 Sentence denotes Chemicals, Peptides, and Recombinant Proteins
T269 44833-44868 Sentence denotes SARS-CoV-2 S protein This Study N/A
T270 44869-44902 Sentence denotes Human ACE2 protein This Study N/A
T271 44903-44948 Sentence denotes 2x Laemmli sample buffer Bio-Rad Cat#161-0737
T272 44949-45042 Sentence denotes Invitrogen NuPAGE 4 to 12%, Bis-Tris, Mini Protein Gel Thermo Fisher Scientific Cat#NP0321PK2
T273 45043-45112 Sentence denotes Coomassie Brilliant Blue G-250 Dye Thermo Fisher Scientific Cat#20279
T274 45113-45151 Sentence denotes Dithiothreitol Sigma Aldrich Cat#43815
T275 45152-45189 Sentence denotes Iodoacetamide Sigma Aldrich Cat#I1149
T276 45190-45215 Sentence denotes Trypsin Promega Cat#V5111
T277 45216-45239 Sentence denotes Lys-C Promega Cat#V1671
T278 45240-45263 Sentence denotes Arg-C Promega Cat#V1881
T279 45264-45287 Sentence denotes Glu-C Promega Cat#V1651
T280 45288-45312 Sentence denotes Asp-N Promega Cat#VA1160
T281 45313-45348 Sentence denotes Endoglycosidase H Promega Cat#V4871
T282 45349-45374 Sentence denotes PNGaseF Promega Cat#V4831
T283 45375-45435 Sentence denotes Chymotrypsin Athens Research and Technology Cat#16-19-030820
T284 45436-45486 Sentence denotes Alpha lytic protease New England BioLabs Cat#P8113
T285 45487-45540 Sentence denotes 18O water Cambridge Isotope Laboratories OLM-782-10-1
T286 45541-45583 Sentence denotes O-protease OpeRATOR Genovis Cat#G1-OP1-020
T287 45584-45598 Sentence denotes Deposited Data
T288 45599-45700 Sentence denotes MS data for site-specific N-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019937
T289 45701-45802 Sentence denotes MS data for site-specific O-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019940
T290 45803-45905 Sentence denotes MS data for deglycosylated N-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019938
T291 45906-45979 Sentence denotes MS data for disulfide bond analysis for SARS-CoV-2 S This Study PXD019939
T292 45980-46055 Sentence denotes MS data for N-linked glycomics deposited at GlycoPost This Study GPST000120
T293 46056-46131 Sentence denotes MS data for O-linked glycomics deposited at GlycoPost This Study GPST000121
T294 46132-46152 Sentence denotes Experimental Models:
T295 46153-46163 Sentence denotes Cell Lines
T296 46164-46192 Sentence denotes 293-F Cells GIBCO Cat#R79007
T297 46193-46218 Sentence denotes Vero-6 Cells ATCC CRL1586
T298 46219-46239 Sentence denotes Experimental Models:
T299 46240-46257 Sentence denotes Organisms/Strains
T300 46258-46293 Sentence denotes VSV(G)-Pseudoviruses This Study N/A
T301 46294-46317 Sentence denotes Software and Algorithms
T302 46318-46398 Sentence denotes pGlyco v2.2.2 Liu et al., 2017 http://pfind.ict.ac.cn/software/pGlyco/index.html
T303 46399-46464 Sentence denotes Proteome Discoverer v1.4 Thermo Fisher Scientific CAT#OPTON-30945
T304 46465-46548 Sentence denotes Byonic v3.8.13 Protein Metrics Inc. https://www.proteinmetrics.com/products/byonic/
T305 46549-46671 Sentence denotes ProteoIQ v2.7 Premier Biosoft (Bern et al., 2012) http://www.premierbiosoft.com/protein_quantification_software/index.html
T306 46672-46743 Sentence denotes GRITS Toolbox V1.1 Weatherly et al., 2019 http://www.grits-toolbox.org/
T307 46744-46829 Sentence denotes EMBOSS needle v6.6.0 Rice et al., 2000 https://www.ebi.ac.uk/Tools/psa/emboss_needle/
T308 46830-46886 Sentence denotes Biopython v1.76 Cock et al., 2009 https://biopython.org/
T309 46887-46934 Sentence denotes Rpdb v2.3 Julien Ide https://rdrr.io/cran/Rpdb/
T310 46935-47019 Sentence denotes SignalP V5.0 Almagro Armenteros et al., 2019 http://www.cbs.dtu.dk/services/SignalP/
T311 47020-47118 Sentence denotes LibreOFFICE Writer v6.4.4.2 The Document Foundation https://www.libreoffice.org/download/download/
T312 47119-47171 Sentence denotes GlyGen V1.5 York et al., 2020 https://www.glygen.org
T313 47172-47262 Sentence denotes GNOme V1.5.5 OBO Foundry https://github.com/glygen-glycan-data/GNOme/blob/master/README.md
T314 47263-47329 Sentence denotes GlyTouCan V3.1.0 Aoki-Kinoshita et al., 2016 https://glytoucan.org
T315 47330-47416 Sentence denotes Inkscape V1.0 Inkscape project contributors https://inkscape.org/release/inkscape-1.0/
T316 47417-47470 Sentence denotes ffmpeg V3.4 The FFmpeg developers https://ffmpeg.org/
T317 47471-47526 Sentence denotes Cygwin V3.1.5 Cygwin developers https://www.cygwin.com/
T318 47528-47549 Sentence denotes Resource Availability
T319 47551-47563 Sentence denotes Lead Contact
T320 47564-47772 Sentence denotes Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Lance Wells (lwells@ccrc.uga.edu) or alternatively by Peng Zhao (pengzhao@uga.edu).
T321 47774-47796 Sentence denotes Materials Availability
T322 47797-47845 Sentence denotes This study did not generate new unique reagents.
T323 47847-47873 Sentence denotes Data and Code Availability
T324 47874-48010 Sentence denotes The mass spectrometry proteomics data are available via ProteomeXchange with identifiers PXD019937, PXD019940, PXD019938, and PXD019939.
T325 48011-48119 Sentence denotes The mass spectrometry glycomics data are available via GlycoPost with identifiers GPST000120 and GPST000121.
T326 48121-48159 Sentence denotes Experimental Model and Subject Details
T327 48160-48271 Sentence denotes HEK293-F cells (GIBCO) were maintained and passaged in FreeStyle Media (GIBCO) containing 1% Pen Strep (GIBCO).
T328 48272-48433 Sentence denotes Vero-6 cells (ATCC) were maintained and passaged in DMEM medium supplemented with 10% fetal bovine serum and 1% Pen Strep (GIBCO) and amphotericin B antibiotics.
T329 48434-48510 Sentence denotes All cells were maintained at 37°C with 5% CO2 before and after transfection.
T330 48512-48526 Sentence denotes Method Details
T331 48528-48614 Sentence denotes Expression, Purification, and Characterization of SARS-CoV-2 S and Human ACE2 Proteins
T332 48615-49042 Sentence denotes To express a stabilized ectodomain of Spike protein, a synthetic gene encoding residues 1−1208 of SARS-CoV-2 Spike with the furin cleavage site (residues 682–685) replaced by a “GGSG” sequence, proline substitutions at residues 986 and 987, and a foldon trimerization motif followed by a C-terminal 6xHisTag was created and cloned into the mammalian expression vector pCMV-IRES-puro (Codex BioSolutions, Inc, Gaithersburg, MD).
T333 49043-49172 Sentence denotes The expression construct was transiently transfected in HEK293F cells using polyethylenimine (Polysciences, Inc, Warrington, PA).
T334 49173-49417 Sentence denotes Protein was purified from cell supernatants using Ni-NTA resin (QIAGEN, Germany); the eluted fractions containing S protein were pooled, concentrated, and further purified by gel filtration chromatography on a Superose 6 column (GE Healthcare).
T335 49418-49515 Sentence denotes Negative stain electron microscopy (EM) analysis was performed as described (Shaik et al., 2019).
T336 49516-49809 Sentence denotes Briefly, analysis was performed at room temperature with a magnification of 52,000x and a defocus value of 1.5 μm following low-dose procedures, using a Philips Tecnai F20 electron microscope (Thermo Fisher Scientific) equipped with a Gatan US4000 CCD camera and operated at voltage of 200 kV.
T337 49810-49955 Sentence denotes The DNA fragment encoding human ACE2 (1-615) with a 6xHis tag at the C terminus was synthesized by Genscript and cloned into the vector pCMV-IRES-puro.
T338 49956-50037 Sentence denotes The expression construct was transfected in HEK293F cells using polyethylenimine.
T339 50038-50114 Sentence denotes The medium was discarded and replaced with FreeStyle 293 medium after 6-8 h.
T340 50115-50240 Sentence denotes After incubation at 37°C with 5.5% CO2 for 5 days, the supernatant was collected and loaded onto Ni-NTA resin for purification.
T341 50241-50316 Sentence denotes The eluate was concentrated and further purified on a Superdex 200 column.
T342 50318-50373 Sentence denotes In-Gel Analysis of SARS-CoV-2 S and Human ACE2 Proteins
T343 50374-50634 Sentence denotes A 3.5-μg aliquot of SARS-CoV-2 S protein as well as a 2-μg aliquot of human ACE2 were combined with Laemmli sample buffer, analyzed on a 4%–12% Invitrogen NuPage Bis-Tris gel using the MES pH 6.5 running buffer, and stained with Coomassie Brilliant Blue G-250.
T344 50636-50728 Sentence denotes Analysis of N-linked and O-linked Glycans Released from SARS-CoV-2 S and Human ACE2 Proteins
T345 50729-50883 Sentence denotes Aliquots of approximately 25-50 μg of S or ACE2 protein were processed for glycan analysis as previously described (Aoki et al., 2007; Aoki et al., 2008).
T346 50884-50954 Sentence denotes For N-linked glycan analysis, the proteins were digested with trypsin.
T347 50955-51087 Sentence denotes Following trypsinization, glycopeptides were enriched by C18 Sep-Pak and subjected to PNGaseF digestion to release N-linked glycans.
T348 51088-51225 Sentence denotes Following PNGaseF digestion, released glycans were separated from residual glycosylated peptides bearing O-linked glycans by C18 Sep-Pak.
T349 51226-51345 Sentence denotes O-glycosylated peptides were eluted from the Sep-Pak and subjected to reductive β-elimination to release the O-glycans.
T350 51346-51463 Sentence denotes Another 25-50 μg aliquot of each protein was denatured with SDS and digested with PNGaseF to remove N-linked glycans.
T351 51464-51604 Sentence denotes The de-N-glycosylated, intact protein was precipitated with cold ethanol and then subjected to reductive β-elimination to release O-glycans.
T352 51605-51705 Sentence denotes The profiles of O-glycans released from peptides or from intact protein were found to be comparable.
T353 51706-51889 Sentence denotes N- and O-linked glycans released from glycoproteins were permethylated with methyl iodide according to the method of Anumula and Taylor prior to MS analysis (Anumula and Taylor, 1992).
T354 51890-52011 Sentence denotes Glycan structural analysis was performed using an LTQ-Orbitrap instrument (Orbitrap Discovery, Thermo Fisher Scientific).
T355 52012-52322 Sentence denotes Detection and relative quantification of the prevalence of individual glycans was accomplished using the total ion mapping (TIM) and neutral loss scan (NL scan) functionality of the Xcalibur software package version 2.0 (Thermo Fisher Scientific) as previously described (Aoki et al., 2007; Aoki et al., 2008).
T356 52323-52436 Sentence denotes Mass accuracy and detector response were tuned with a permethylated oligosaccharide standard in positive ion mode.
T357 52437-52558 Sentence denotes For fragmentation by collision-induced dissociation (CID in MS2 and MSn), normalized collision energy of 45% was applied.
T358 52559-52665 Sentence denotes Most permethylated glycans were identified as singly or doubly charged, sodiated species in positive mode.
T359 52666-52770 Sentence denotes Sulfated N-glycans were detected as singly or doubly charged, deprotonated species in negative ion mode.
T360 52771-52867 Sentence denotes Peaks for all charge states were deconvoluted by the charge state and summed for quantification.
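The charge-state handling described above can be illustrated with a minimal sketch. It assumes each peak has already been assigned to a glycan and a charge state; the function names, the tuple layout, and the use of a sodium adduct mass of 22.989218 Da are illustrative assumptions, not the authors' Xcalibur workflow.

```python
from collections import defaultdict

SODIUM_ION_MASS = 22.989218  # Da, monoisotopic Na+ (assumed adduct for permethylated glycans)


def neutral_mass(mz, charge, adduct_mass=SODIUM_ION_MASS):
    """Neutral mass M of an [M + z*adduct]^z+ ion observed at the given m/z."""
    return charge * (mz - adduct_mass)


def relative_prevalence(peaks):
    """peaks: iterable of (glycan_label, charge, intensity) tuples.

    Sums intensity over all charge states of each glycan and returns
    each glycan's share of the total signal in percent.
    """
    totals = defaultdict(float)
    for label, _charge, intensity in peaks:
        totals[label] += intensity
    grand_total = sum(totals.values())
    return {label: 100.0 * value / grand_total for label, value in totals.items()}
```

Summing across charge states before normalizing keeps a glycan that ionizes mainly as a doubly charged species from being under-counted relative to one seen mainly as a singly charged species.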
T361 52868-52920 Sentence denotes All spectra were manually interpreted and annotated.
T362 52921-53045 Sentence denotes The explicit identities of individual monosaccharide residues have been assigned based on known human biosynthetic pathways.
T363 53046-53242 Sentence denotes Graphical representations of monosaccharide residues are consistent with the Symbol Nomenclature for Glycans (SNFG), which has been broadly adopted by the glycomics community (Varki et al., 2015).
T364 53243-53431 Sentence denotes The MS-based glycomics data generated in these analyses and the associated annotations are presented in accordance with the MIRAGE standards and the Athens Guidelines (Wells et al., 2013).
T365 53432-53647 Sentence denotes Data annotation and assignment of glycan accession identifiers were facilitated by GRITS Toolbox, GlyTouCan, GNOme, and GlyGen (Kahsay et al., 2020; Tiemeyer et al., 2017; Weatherly et al., 2019; York et al., 2020).
T366 53649-53710 Sentence denotes Analysis of Disulfide Bonds for SARS-CoV-2 S Protein by LC-MS
T367 53711-53896 Sentence denotes Two 10-μg aliquots of SARS-CoV-2 S protein were denatured by incubating with 20% acetonitrile at room temperature and alkylated with 13.75 mM iodoacetamide at room temperature in the dark.
T368 53897-54031 Sentence denotes The two aliquots of proteins were then digested respectively using alpha lytic protease, or a combination of trypsin, Lys-C and Glu-C.
T369 54032-54107 Sentence denotes Following digestion, the proteins were deglycosylated by PNGaseF treatment.
T370 54108-54331 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T371 54332-54477 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T372 54478-54575 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T373 54576-54756 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.
T374 54757-54897 Sentence denotes The raw spectra were analyzed by Byonic (v3.8.13, Protein Metrics Inc.) with mass tolerance set as 20 ppm for both precursors and fragments.
T375 54898-54978 Sentence denotes The search output was filtered at 1% false discovery rate and 10 ppm mass error.
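As a worked illustration of the 10 ppm mass-error criterion, the sketch below computes the relative mass error in parts per million and keeps only matches within a tolerance. The function names and the (observed, theoretical) pair layout are hypothetical; the false discovery rate filter itself is applied by the search software and is not reproduced here.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(matches, max_ppm=10.0):
    """matches: iterable of (observed_mass, theoretical_mass) pairs.

    Returns the pairs whose absolute mass error does not exceed max_ppm.
    """
    return [
        (observed, theoretical)
        for observed, theoretical in matches
        if abs(ppm_error(observed, theoretical)) <= max_ppm
    ]
```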
T376 54979-55061 Sentence denotes The spectra assigned as cross-linked peptides were manually evaluated for Cys0015.
T377 55063-55161 Sentence denotes Analysis of Site-Specific N-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T378 55162-55341 Sentence denotes Four 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM dithiothreitol at 56°C and alkylated with 27.5 mM iodoacetamide at room temperature in the dark.
T379 55342-55517 Sentence denotes The four aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, a combination of trypsin and Glu-C, or a combination of Glu-C and AspN.
T380 55518-55689 Sentence denotes Three 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM dithiothreitol at 56°C and alkylated with 13.75 mM iodoacetamide at room temperature in the dark.
T381 55690-55833 Sentence denotes The three aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, or a combination of trypsin and Lys-C.
T382 55834-56057 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T383 56058-56203 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T384 56204-56301 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T385 56302-56669 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) were collected in the Orbitrap at 15k resolution. pGlyco v2.2.2 (Liu et al., 2017) was used for database searches with mass tolerance set as 20 ppm for both precursors and fragments.
T386 56670-56778 Sentence denotes The database search output was filtered to reach a 1% false discovery rate for glycans and 10% for peptides.
T387 56779-56878 Sentence denotes Quantitation was performed by calculating spectral counts for each glycan composition at each site.
T388 56879-56974 Sentence denotes Any N-linked glycan compositions identified by only one spectrum were removed from quantitation.
T389 56975-57060 Sentence denotes N-linked glycan compositions were categorized into 22 classes (including Unoccupied):
T390 57061-57403 Sentence denotes HexNAc(2)Hex(9∼5)Fuc(0∼1) was classified as M9 to M5 respectively; HexNAc(2)Hex(4∼1)Fuc(0∼1) was classified as M1-M4; HexNAc(3∼6)Hex(5∼9)Fuc(0)NeuAc(0∼1) was classified as Hybrid with HexNAc(3∼6)Hex(5∼9)Fuc(1∼2)NeuAc(0∼1) classified as F-Hybrid; Complex-type glycans are classified based on the number of antenna, fucosylation, and sulfation:
T391 57404-58142 Sentence denotes HexNAc(3)Hex(3∼4)Fuc(0)NeuAc(0∼1) is assigned as A1 with HexNAc(3)Hex(3∼4)Fuc(1∼2)NeuAc(0∼1) assigned as F-A1; HexNAc(4)Hex(3∼5)Fuc(0)NeuAc(0∼2) is assigned as A2/A1B with HexNAc(4)Hex(3∼5)Fuc(1∼5)NeuAc(0∼2) assigned as F-A2/A1B; HexNAc(5)Hex(3∼6)Fuc(0)NeuAc(0∼3) is assigned as A3/A2B with HexNAc(5)Hex(3∼6)Fuc(1∼3)NeuAc(0∼3) assigned as F-A3/A2B; HexNAc(6)Hex(3∼7)Fuc(0)NeuAc(0∼4) is assigned as A4/A3B with HexNAc(6)Hex(3∼7)Fuc(1∼3)NeuAc(0∼4) assigned as F-A4/A3B; HexNAc(7)Hex(3∼8)Fuc(0)NeuAc(0∼1) is assigned as A5/A4B with HexNAc(7)Hex(3∼8)Fuc(1∼3)NeuAc(0∼1) as F-A5/A4B; HexNAc(8)Hex(3∼9)Fuc(0) is assigned as A6/A5B with HexNAc(8)Hex(3∼9)Fuc(1) assigned as F-A6/A5B; any glycans identified with a sulfate are assigned as Sulfated.
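The composition ranges listed above can be turned into a lookup, sketched below. Two points are assumptions rather than part of the stated rules: the Hybrid ranges overlap some complex-type ranges (for example HexNAc(4)Hex(5)), so this sketch checks the complex-type rules first, and it treats HexNAc(8) compositions as carrying no NeuAc because none is listed. The function and table names are illustrative.

```python
HIGH_MANNOSE = {9: "M9", 8: "M8", 7: "M7", 6: "M6", 5: "M5"}

# HexNAc count -> (class name, max Hex, max Fuc, max NeuAc); minimum Hex is 3.
COMPLEX = {
    3: ("A1", 4, 2, 1),
    4: ("A2/A1B", 5, 5, 2),
    5: ("A3/A2B", 6, 3, 3),
    6: ("A4/A3B", 7, 3, 4),
    7: ("A5/A4B", 8, 3, 1),
    8: ("A6/A5B", 9, 1, 0),  # NeuAc limit of 0 is an assumption
}


def classify(hexnac, hexose, fuc=0, neuac=0, sulfate=0):
    """Return the class label for one N-glycan composition, or None if unmatched."""
    if sulfate > 0:
        return "Sulfated"
    if hexnac == 2 and fuc <= 1 and neuac == 0:
        if 5 <= hexose <= 9:
            return HIGH_MANNOSE[hexose]
        if 1 <= hexose <= 4:
            return "M1-M4"
    if hexnac in COMPLEX:  # assumed precedence: complex before hybrid
        name, max_hex, max_fuc, max_neuac = COMPLEX[hexnac]
        if 3 <= hexose <= max_hex and fuc <= max_fuc and neuac <= max_neuac:
            return ("F-" if fuc else "") + name
    if 3 <= hexnac <= 6 and 5 <= hexose <= 9 and fuc <= 2 and neuac <= 1:
        return ("F-" if fuc else "") + "Hybrid"
    return None
```

Together with Unoccupied, the labels produced here (M9 to M5, M1-M4, Hybrid, F-Hybrid, the six complex classes with their fucosylated counterparts, and Sulfated) account for the 22 classes.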
T392 58144-58216 Sentence denotes Analysis of Deglycosylated SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T393 58217-58397 Sentence denotes Three 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM dithiothreitol at 56°C and alkylated with 27.5 mM iodoacetamide at room temperature in the dark.
T394 58398-58514 Sentence denotes The three aliquots were then digested respectively using chymotrypsin, Asp-N, or a combination of trypsin and Glu-C.
T395 58515-58684 Sentence denotes Two 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM dithiothreitol at 56°C and alkylated with 13.75 mM iodoacetamide at room temperature in the dark.
T396 58685-58792 Sentence denotes The two aliquots were then digested respectively using chymotrypsin, or a combination of trypsin and Lys-C.
T397 58793-58927 Sentence denotes Following digestion, the proteins were deglycosylated by Endoglycosidase H followed by PNGaseF treatment in the presence of 18O water.
T398 58928-59151 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T399 59152-59297 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T400 59298-59395 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T401 59396-59582 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following collision-induced dissociation (CID) at 38% collision energy were collected in the ion trap.
T402 59583-59723 Sentence denotes The spectra were analyzed using SEQUEST (Proteome Discoverer 1.4) with mass tolerance set as 20 ppm for precursors and 0.5 Da for fragments.
T403 59724-59854 Sentence denotes The search output was filtered using ProteoIQ (v2.7) to reach a 1% false discovery rate at protein level and 10% at peptide level.
T404 59855-60073 Sentence denotes Occupancy of each N-linked glycosylation site was calculated using spectral counts assigned to the 18O-Asp-containing (PNGaseF-cleaved) and/or HexNAc-modified (EndoH-cleaved) peptides and their unmodified counterparts.
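The occupancy calculation described above amounts to a ratio of spectral counts. The sketch below is one straightforward reading of it: occupied counts are the sum of 18O-Asp-containing and HexNAc-modified peptide spectra, divided by the occupied plus unmodified counts. The function and argument names are illustrative.

```python
def site_occupancy(n_18o_asp, n_hexnac, n_unmodified):
    """Fraction of spectra at one sequon that derive from a glycosylated peptide.

    n_18o_asp:    spectra of 18O-Asp-containing (PNGaseF-cleaved) peptides
    n_hexnac:     spectra of HexNAc-modified (EndoH-cleaved) peptides
    n_unmodified: spectra of the unmodified counterpart
    Returns None when the site was not observed at all.
    """
    occupied = n_18o_asp + n_hexnac
    total = occupied + n_unmodified
    if total == 0:
        return None
    return occupied / total
```

For example, 30 PNGaseF-type, 60 EndoH-type, and 10 unmodified spectra give an occupancy of 0.9.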
T405 60075-60173 Sentence denotes Analysis of Site-Specific O-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T406 60174-60391 Sentence denotes Three 10-μg aliquots of SARS-CoV-2 S protein and one 10-μg aliquot of ACE2 protein were reduced by incubating with 5 mM dithiothreitol at 56°C and alkylated with 13.75 mM iodoacetamide at room temperature in the dark.
T407 60392-60509 Sentence denotes The four aliquots were then digested respectively using trypsin, Lys-C, Arg-C, or a combination of trypsin and Lys-C.
T408 60510-60629 Sentence denotes Following digestion, the proteins were deglycosylated by PNGaseF treatment and then digested with O-protease OpeRATOR®.
T409 60630-60853 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T410 60854-60999 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T411 61000-61097 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T412 61098-61372 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) or electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.
T413 61373-61491 Sentence denotes The raw spectra were analyzed by Byonic (v3.8.13) with mass tolerance set as 20 ppm for both precursors and fragments.
T414 61492-61593 Sentence denotes MS/MS filtering was applied to only allow for spectra where the oxonium ions of HexNAc were observed.
T415 61594-61674 Sentence denotes The search output was filtered at 1% false discovery rate and 10 ppm mass error.
T416 61675-61746 Sentence denotes The spectra assigned as O-linked glycopeptides were manually evaluated.
T417 61747-61846 Sentence denotes Quantitation was performed by calculating spectral counts for each glycan composition at each site.
T418 61847-61942 Sentence denotes Any O-linked glycan compositions identified by only one spectrum were removed from quantitation.
T419 61943-62136 Sentence denotes Occupancy of each O-linked glycosylation site was calculated using spectral counts assigned to any glycosylated peptides and their unmodified counterparts from searches without MS/MS filtering.
T420 62138-62195 Sentence denotes Sequence Analysis of SARS-CoV-2 S and Human ACE2 Proteins
T421 62196-62341 Sentence denotes The genomes of SARS-CoV as well as bat and pangolin coronavirus sequences reported to be closely related to SARS-CoV-2 were downloaded from NCBI.
T422 62342-62513 Sentence denotes The S protein sequences from all of those genomes were aligned using EMBOSS needle v6.6.0 (Rice et al., 2000) via the EMBL-EBI provided web service (Madeira et al., 2019).
T423 62514-62614 Sentence denotes Manual analysis was performed in the regions containing canonical N-glycosylation sequons (N-X-S/T).
T424 62615-62934 Sentence denotes For further sequence analysis of SARS-CoV-2 S variants, the genomes of SARS-CoV-2 were downloaded from NCBI and GISAID and further processed using Biopython 1.76 to extract all sequences annotated as “surface glycoprotein” and to remove any incomplete sequence as well as any sequence containing unassigned amino acids.
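The filtering step described above can be sketched as follows. The authors used Biopython 1.76; this illustration uses only the Python standard library so that it runs without extra dependencies. The definitions of "incomplete" (shorter than a caller-supplied minimum length) and "unassigned" (the residue code X) are assumptions, as are the function names.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_spike(records, min_length):
    """Keep complete, fully assigned sequences annotated as surface glycoprotein."""
    for header, seq in records:
        if "surface glycoprotein" not in header.lower():
            continue  # not an S protein record
        if len(seq) < min_length:
            continue  # assumed definition of an incomplete sequence
        if "X" in seq.upper():
            continue  # assumed marker for an unassigned amino acid
        yield header, seq
```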
T425 62935-63158 Sentence denotes For sequence analysis of human ACE2 variants, the single nucleotide polymorphisms (SNPs) of ACE2 were extracted from the NCBI dbSNP database and filtered for missense mutation entries with a reported minor allele frequency.
T426 63159-63320 Sentence denotes Manual analysis was performed on both SARS-CoV-2 S and human ACE2 variants to further examine the regions containing canonical N-glycosylation sequons (N-X-S/T).
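The sequon inspection was done manually, but the same check is easy to automate. The sketch below finds canonical N-X-S/T sequons, where X is any residue except proline, and compares a reference sequence with a same-length variant to report sequons that a substitution removes or creates. The function names are illustrative, and the comparison assumes substitutions only (no insertions or deletions).

```python
import re

# Lookahead lets overlapping sequons (for example N-N-T-S) both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of the Asn in each N-X-S/T sequon (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


def sequon_changes(reference, variant):
    """Return (lost, gained) sequon positions for a same-length variant."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    ref_sites = set(find_sequons(reference))
    var_sites = set(find_sequons(variant))
    return ref_sites - var_sites, var_sites - ref_sites
```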
T427 63321-63430 Sentence denotes LibreOffice Writer and its macro capabilities were used to shade regions on the linear sequence of S and ACE2.
T428 63432-63541 Sentence denotes 3D Structural Modeling and Molecular Dynamics Simulation of Glycosylated SARS-CoV-2 S and Human ACE2 Proteins
T429 63542-63844 Sentence denotes SARS-CoV-2 Spike (S) protein structure and ACE2 co-complex – A 3D structure of the prefusion form of the S protein (RefSeq: YP_009724390.1, UniProt: P0DTC2 SPIKE_SARS2), based on a Cryo-EM structure (PDB code 6VSB) (Wrapp et al., 2020), was obtained from the SWISS-MODEL server (swissmodel.expasy.org).
T430 63845-63911 Sentence denotes The model has 95% coverage (residues 27 to 1146) of the S protein.
T431 63912-64073 Sentence denotes The receptor binding domain (RBD) in the “open” conformation was replaced with the RBD from an ACE2 co-complex (PDB code 6M0J) by grafting residues C336 to V524.
T432 64074-64342 Sentence denotes Glycoform generation – Glycans (detected by glycomics) were selected for installation on glycosylated S and ACE2 sequons (detected by glycoproteomics) based on three sets of criteria designed to reasonably capture different aspects of glycosylation microheterogeneity.
T433 64343-64682 Sentence denotes We denote the first of these glycoform models as “Abundance.” The glycans selected for installation to generate the Abundance model were chosen because they were identified as the most abundant glycan structure (detected by glycomics) that matched the most abundant glycan composition (detected by glycoproteomics) at each individual site.
T434 64683-65068 Sentence denotes We denote the second glycoform model as “Oxford Class.” The glycans selected for installation to generate the Oxford Class model were chosen because they were the most abundant glycan structure, (detected by glycomics) that was contained within the most highly represented Oxford classification group (detected by glycoproteomics) at each individual site (Figure S7; Tables S1 and S8).
T435 65069-65503 Sentence denotes Finally, we denote the third glycoform model as “Processed.” The glycans selected for installation to generate the Processed model were chosen because they were the most highly trimmed, elaborated, or terminally decorated structure (detected by glycomics) that corresponded to a composition (detected by glycoproteomics) which was present at ≥ 1/3rd of the abundance of the most highly represented composition at each site (Table S1).
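The abundance-based parts of these selection criteria can be sketched in a few lines. Given spectral counts per glycan composition at one site, the first function returns the most abundant composition (the starting point for the Abundance model), and the second returns every composition present at one third or more of the top abundance (the candidate pool for the Processed model). Choosing the most highly processed structure from that pool relies on the glycomics data and expert judgment and is not modeled here. The names and the dictionary layout are illustrative assumptions.

```python
def most_abundant(counts):
    """counts: dict mapping glycan composition -> spectral count at one site."""
    return max(counts, key=counts.get)


def processed_candidates(counts):
    """Compositions whose count is at least 1/3 of the most abundant one.

    The comparison is done in integers (3 * n >= top) to avoid
    floating-point rounding at the threshold.
    """
    top = max(counts.values())
    return sorted(name for name, n in counts.items() if 3 * n >= top)
```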
T436 65504-65680 Sentence denotes 3D structures of the three glycoforms (Abundance, Oxford Class, Processed) were generated for the SARS-CoV-2 S protein alone, and in complex with the glycosylated ACE2 protein.
T437 65681-66044 Sentence denotes The glycoprotein builder available at GLYCAM-Web (www.glycam.org) was employed together with an in-house program that adjusts the asparagine side chain torsion angles and glycosidic linkages within known low-energy ranges (Nivedha et al., 2014) to relieve any atomic overlaps with the core protein, as described previously (Grant et al., 2016; Peng et al., 2017).
T438 66045-66244 Sentence denotes Energy minimization and Molecular dynamics (MD) simulations – Each glycosylated structure was placed in a periodic box of TIP3P water molecules with a 10 Å buffer between the solute and the box edge.
T439 66245-66440 Sentence denotes Energy minimization of all atoms was performed for 20,000 steps (10,000 steepest descent, followed by 10,000 conjugate gradient) under constant pressure (1 atm) and temperature (300 K) conditions.
T440 66441-66683 Sentence denotes All MD simulations were performed under nPT conditions with the CUDA implementation of the PMEMD (Götz et al., 2012; Salomon-Ferrer et al., 2013) simulation code, as present in the Amber14 software suite (University of California, San Diego).
T441 66684-66852 Sentence denotes The GLYCAM06j force field (Kirschner et al., 2008) and Amber14SB force field (Maier et al., 2015) were employed for the carbohydrate and protein moieties, respectively.
T442 66853-67046 Sentence denotes A Berendsen barostat with a time constant of 1 ps was employed for pressure regulation, while a Langevin thermostat with a collision frequency of 2 ps⁻¹ was employed for temperature regulation.
T443 67047-67099 Sentence denotes A nonbonded interaction cut-off of 8 Å was employed.
T444 67100-67209 Sentence denotes Long-range electrostatics were treated with the particle-mesh Ewald (PME) method (Darden and Pedersen, 1993).
T445 67210-67344 Sentence denotes Covalent bonds involving hydrogen were constrained with the SHAKE algorithm, allowing an integration time step of 2 fs to be employed.
T446 67345-67458 Sentence denotes The energy minimized coordinates were equilibrated at 300 K over 400 ps with restraints on the solute heavy atoms.
T447 67459-67707 Sentence denotes Each system was then equilibrated with restraints on the Cα atoms of the protein for 1 ns, prior to initiating 4 independent 250 ns production MD simulations with random starting seeds for a total time of 1 μs per system, with no restraints applied.
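To make the production settings concrete, the sketch below collects the stated parameters into an Amber-style &cntrl namelist and derives the step count from the run length and time step. The mapping of the described conditions onto specific flags (for example barostat=1 for Berendsen and ntt=3 with gamma_ln for Langevin) is my reading of the Amber input conventions, not the authors' actual input files, and the helper names are hypothetical.

```python
def steps(duration_ns, dt_fs=2.0):
    """Number of integration steps needed for a run of the given length."""
    return round(duration_ns * 1e6 / dt_fs)


# Assumed flag mapping for one 250 ns nPT production run.
PRODUCTION = {
    "imin": 0,         # molecular dynamics, not minimization
    "ntb": 2,          # constant-pressure periodic boundaries
    "ntp": 1,          # isotropic pressure scaling
    "barostat": 1,     # Berendsen barostat
    "taup": 1.0,       # pressure time constant, ps
    "ntt": 3,          # Langevin thermostat
    "gamma_ln": 2.0,   # collision frequency, ps^-1
    "temp0": 300.0,    # target temperature, K
    "cut": 8.0,        # nonbonded cut-off, angstrom
    "ntc": 2,          # SHAKE on bonds involving hydrogen
    "ntf": 2,          # omit force evaluation for constrained bonds
    "dt": 0.002,       # time step, ps (2 fs)
    "nstlim": steps(250),
    "ig": -1,          # random seed for each independent run
}


def to_mdin(title, settings):
    """Format a settings dict as an Amber &cntrl namelist."""
    body = ",\n".join(f"  {key}={value}" for key, value in settings.items())
    return f"{title}\n&cntrl\n{body},\n/\n"
```

With a 2 fs time step, each 250 ns run corresponds to 125,000,000 steps, and four such runs give the stated 1 μs per system.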
T448 67708-67735 Sentence denotes Antigenic surface analysis.
T449 67736-68090 Sentence denotes A series of 3D structure snapshots of the simulation were taken at 1 ns intervals and analyzed in terms of their ability to interact with a spherical probe based on the average size of hypervariable loops present in an antibody complementarity determining region (CDR), as described recently (https://www.biorxiv.org/content/10.1101/2020.04.07.030445v2).
T450 68091-68244 Sentence denotes The percentage of simulation time each residue was exposed to the AbASA probe was calculated and plotted onto both the 3D structure and primary sequence.
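The per-residue exposure percentage can be sketched as a simple tally over snapshots. The sketch assumes that a separate step has already decided, for each 1 ns snapshot, which residues are accessible to the probe; the input layout (one collection of residue identifiers per snapshot) and the function name are illustrative.

```python
from collections import Counter


def percent_exposed(frames):
    """frames: list with one collection of probe-accessible residue ids per snapshot.

    Returns residue id -> percentage of snapshots in which it was exposed.
    Residues that were never exposed are absent from the result (0%).
    """
    counts = Counter()
    for exposed in frames:
        counts.update(set(exposed))  # count each residue once per snapshot
    return {residue: 100.0 * n / len(frames) for residue, n in counts.items()}
```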
T451 68246-68311 Sentence denotes Analysis of SARS-CoV-2 Spike VSV Pseudoparticles (ppVSV-SARS-2-S)
T452 68312-68419 Sentence denotes 293T cells were transfected with an expression plasmid encoding SARS-CoV-2 Spike (pcDNAintron-SARS-2-SΔ19).
T453 68420-68532 Sentence denotes To increase cell surface expression, the last 19 amino acids containing the Golgi retention signal were removed.
T454 68533-68614 Sentence denotes Two SΔ19 constructs were compared, one starting at Met1 and the other at Met2.
T455 68615-68748 Sentence denotes Twenty-four h following transfection, cells were transduced with ppVSVΔG-VSV-G (particles that were pseudotyped with VSV-G in trans).
T456 68749-68831 Sentence denotes One h following transduction, cells were extensively washed and the media was replaced.
T457 68832-68946 Sentence denotes Supernatant containing particles was collected 12-24 h following transduction and cleared by centrifugation.
T458 68947-69002 Sentence denotes Cleared supernatant was frozen at −80°C for future use.
T459 69003-69100 Sentence denotes Vero E6 target cells were seeded in 24-well plates (5 × 10⁵ cells/mL) at a density of 80% coverage.
T460 69101-69290 Sentence denotes The following day, ppVSV-SARS-2-S/GFP particles were transduced into target cells for 60 min; particles pseudotyped with VSV-G, Lassa virus GP, or no glycoprotein were included as controls.
T461 69291-69511 Sentence denotes 24 h following transduction, transduced cells were released from the plate with trypsin, fixed with 4% formaldehyde, and GFP-positive virus-transduced cells were quantified using flow cytometry (Becton Dickinson BD-LSRII).
T462 69512-69748 Sentence denotes To quantify the ability of various SARS-CoV-2 S mutants to mediate fusion, effector cells (HEK293T) were transiently transfected with the indicated pcDNAintron-SARS-2-S expression vector or measles virus H and F (Brindley et al., 2014).
T463 69749-69869 Sentence denotes Effector cells were infected with MVA-T7 four h following transduction to produce the T7 polymerase (Paal et al., 2009).
T464 69870-70102 Sentence denotes Target cells naturally expressing the receptor ACE2 (Vero) or ACE2 negative cells (HEK293T) were transfected with pTM1-luciferase, which encodes for firefly luciferase under the control of a T7 promoter (Brindley and Plemper, 2010).
T465 70103-70208 Sentence denotes 24 h following transfection, the target cells were lifted and added to the effector cells at a 1:1 ratio.
T466 70209-70339 Sentence denotes 4 h following co-cultivation, cells were washed and lysed, and luciferase levels were quantified using Promega’s Steady-Glo substrate.
T467 70340-70455 Sentence denotes To visualize cell-to-cell fusion, Vero cells were co-transfected with pGFP and the pcDNAintron-SARS-2-S constructs.
T468 70456-70536 Sentence denotes 24 h following transfection, syncytia were visualized by fluorescence microscopy.
T469 70538-70577 Sentence denotes Quantification and Statistical Analysis
T470 70578-70740 Sentence denotes Raw glycoproteomic data from the mass spectrometers were searched using Proteome Discoverer v1.4 (SEQUEST), Protein Metrics Inc. Byonic v3.8.13, and pGlyco v2.2.2.
T472 70741-70873 Sentence denotes For data searches using Proteome Discoverer, the results were processed to apply false discovery rate filtering using ProteoIQ v2.7.
T473 70874-71046 Sentence denotes For the deglycosylated protein work, search results from SEQUEST were filtered in ProteoIQ with a 1% false discovery rate at the protein level and 10% at the peptide level.
T474 71047-71180 Sentence denotes For N-linked glycopeptide analysis, pGlyco was used with false discovery rate of 1% at the glycan level and 10% at the peptide level.
T475 71181-71297 Sentence denotes For disulfide bond analysis and O-glycopeptide searches, Byonic was used and the false discovery rate was set to 1%.
T476 71298-71350 Sentence denotes All mass spectrometry results were manually curated.
T477 71351-71603 Sentence denotes Antigen accessibility simulations were carried out as described in the Method Details section, and the mean of four simulations (three of length 350 ns, one of length 200 ns; amounting to 1.25 μs of total molecular dynamics simulation time) was used.
T478 71604-71922 Sentence denotes Glycan-glycan and glycan-peptide interactions were also calculated based on simulations as a percentage of time residues were in contact and averaged (mean) to produce the corresponding supplemental (colored) sequence figures with the raw numbers for coloring present also in each corresponding supplemental table tab.
T479 71923-72019 Sentence denotes 3D distances were computed using Rpdb as described in more detail in the Method Details section.
T480 72020-72116 Sentence denotes This data is presented using box & whisker plots with all underlying statistics calculated in R.
T481 72118-72142 Sentence denotes Supplemental Information
T482 72143-72155 Sentence denotes Document S1.
T483 72156-72169 Sentence denotes Figures S1–S7
T484 72170-72182 Sentence denotes Document S2.
T485 72183-72196 Sentence denotes Tables S1–S16
T486 72197-72209 Sentence denotes Document S2.
T487 72210-72247 Sentence denotes Article plus Supplemental Information
T488 72249-72264 Sentence denotes Acknowledgments
T489 72265-72446 Sentence denotes The authors would like to thank Protein Metrics for providing licenses for their software used here and the developers of pGlyco for productive discussions regarding their software.
T490 72447-72553 Sentence denotes We would also like to thank Galit Alter of the Ragon Institute for facilitating this collaborative effort.
T491 72554-72767 Sentence denotes This effort was facilitated by the ThermoFisher Scientific appointed Center of Excellence in Glycoproteomics at the Complex Carbohydrate Research Center at the University of Georgia (co-directed by M.T. and L.W.).
T492 72768-73141 Sentence denotes This research is supported in part by the National Institutes of Health R35GM119850 (N.E.L.), NNF10CC1016517 (N.E.L.), R01AI139238 (M.A.B.), R01AI147884-01A1S1 (B.C.); Massachusetts Consortium on Pathogen Readiness (B.C.); and National Institutes of Health U01CA207824 (R.J.W.), P41GM103390 (R.J.W.), P41GM103490 (M.T. and L.W.), U01GM125267 (M.T.), and R01GM130915 (L.W.).
T493 73142-73289 Sentence denotes The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
T494 73291-73311 Sentence denotes Author Contributions
T495 73312-73848 Sentence denotes Conceptualization, M.T., B.C., R.J.W., and L.W.; Methodology, Software, Validation, Formal Analysis, Investigation, Resources, and Data Curation, P.Z., J.L.P., O.C.G., Y.C., T.X., K.E.R., K.A., B.P.K., R.B., D.H.B., M.A.B., N.E.L., M.T., B.C., R.J.W., and L.W.; Writing—Original Draft, P.Z., J.L.P., and L.W.; Writing—Review & Editing, all authors; Visualization, P.Z., J.L.P., O.C.G., Y.C., M.T., B.C., R.J.W., and L.W.; Supervision, Project Administration, and Funding Acquisition, D.H.B., M.A.B., N.E.L., M.T., B.C., R.J.W., and L.W.
T496 73850-73874 Sentence denotes Declaration of Interests
T497 73875-73918 Sentence denotes The authors declare no competing interests.
T498 73919-74010 Sentence denotes Supplemental Information can be found online at https://doi.org/10.1016/j.chom.2020.08.004.