PMC:7443692 / 1853-72116

Id Subject Object Predicate Lexical cue
T16 0-12 Sentence denotes Introduction
T17 13-236 Sentence denotes The SARS-CoV-2 coronavirus, a positive-sense single-stranded RNA virus, is responsible for the severe acute respiratory syndrome referred to as COVID-19 that was first reported in China in December 2019 (Zhou et al., 2020).
T18 237-468 Sentence denotes In approximately six months, this betacoronavirus has spread globally, with more than 14 million people testing positive worldwide resulting in greater than 600,000 deaths as of July 20, 2020 (https://coronavirus.jhu.edu/map.html).
T19 469-703 Sentence denotes The SARS-CoV-2 coronavirus is highly similar (nearly 80% identical at the genomic level) to SARS-CoV-1, which was responsible for the severe acute respiratory syndrome outbreak that began in 2002 (Lu et al., 2020; Zhong et al., 2003).
T20 704-914 Sentence denotes Furthermore, human SARS-CoV-2 at the whole-genome level is >95% identical to a coronavirus (RaTG13) from bats, the natural reservoir host for multiple coronaviruses (Xia, 2020; Zhang et al., 2020; Zhou et al., 2020).
T21 915-1401 Sentence denotes Given the rapid appearance and spread of this virus, there is no current validated vaccine or SARS-CoV-2-specific targeting therapy that is clinically approved, although statins, heparin, and steroids look promising for lowering fatality rates, and antivirals likely reduce the duration of symptomatic disease presentation (Alijotas-Reig et al., 2020; Beigel et al., 2020; Beun et al., 2020; Dashti-Khavidaki and Khalili, 2020; Fedson et al., 2020; Shi et al., 2020; Tang et al., 2020).
T22 1402-1567 Sentence denotes SARS-CoV-2, like SARS-CoV-1, utilizes the host angiotensin-converting enzyme 2 (ACE2) for binding and entry into host cells (Hoffmann et al., 2020; Li et al., 2003).
T23 1568-1743 Sentence denotes Like many viruses, SARS-CoV-2 utilizes a Spike glycoprotein trimer for recognition and binding to the host cell entry receptor and for membrane fusion (Watanabe et al., 2019).
T24 1744-2124 Sentence denotes Given the importance of viral Spike proteins for targeting and entry into host cells along with their location on the viral surface, Spike proteins are often used as immunogens for vaccines to generate neutralizing antibodies and frequently targeted for inhibition by small molecules that might block host receptor binding and/or membrane fusion (Li, 2016; Watanabe et al., 2019).
T25 2125-2368 Sentence denotes In similar fashion, wild-type or catalytically impaired ACE2 has also been investigated as a potential therapeutic biologic that might interfere with the infection cycle of ACE2-targeting coronaviruses (Lei et al., 2020; Monteil et al., 2020).
T26 2369-2576 Sentence denotes Thus, a detailed understanding of SARS-CoV-2 Spike binding to ACE2 is critical for elucidating mechanisms of viral binding and entry, as well as for undertaking the rational design of effective therapeutics.
T27 2577-2741 Sentence denotes The SARS-CoV-2 Spike glycoprotein consists of two subunits, a receptor binding subunit (S1) and a membrane fusion subunit (S2) (Lu et al., 2020; Zhou et al., 2020).
T28 2742-3026 Sentence denotes The Spike glycoprotein assembles into stable homotrimers that together possess 66 canonical sequons for N-linked glycosylation (N-X-S/T, where X is any amino acid except P) as well as a number of potential O-linked glycosylation sites (Watanabe et al., 2020a; Watanabe et al., 2020b).
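The sequon rule stated above (N-X-S/T, where X is any amino acid except P) can be illustrated with a short scan over a protein sequence. This is a minimal sketch, not the authors' pipeline: the function name `find_sequons` and the toy sequence are hypothetical, and positions are reported 1-based on whatever sequence is passed in.

```python
import re

# N-X-S/T where X is any residue except proline.
# The lookahead keeps the match zero-width after N, so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(seq):
    """Return 1-based positions of Asn residues that start a canonical sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]

# Toy sequence (hypothetical): N-G-T matches, N-P-S does not, N-N-T and N-T-S overlap.
print(find_sequons("MNGTANPSNNTS"))
```

Running the same scan on each protomer and multiplying by three would give the per-trimer count of canonical sequons.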
T29 3027-3350 Sentence denotes Interestingly, coronavirus virions bud into the lumen of the endoplasmic reticulum-Golgi intermediate compartment, ERGIC, raising unanswered questions regarding the precise mechanisms by which viral surface glycoproteins are processed as they traverse the secretory pathway (Stertz et al., 2007; Ujike and Taguchi, 2015).
T30 3351-3656 Sentence denotes Although this and similar studies (Shajahan et al., 2020; Watanabe et al., 2020a) analyze recombinant proteins, a previous study on SARS-CoV-1 suggested that glycosylation of the Spike can be impacted by this intracellular budding, and this remains to be investigated in SARS-CoV-2 (Ritchie et al., 2010).
T31 3657-4024 Sentence denotes Nonetheless, it has been proposed that this virus, and others, acquires a glycan coat sufficient and similar enough to endogenous host protein glycosylation that it serves as a glycan shield, facilitating immune evasion by masking non-self viral peptides with self-glycans (Stertz et al., 2007; Ujike and Taguchi, 2015; Watanabe et al., 2020b; Watanabe et al., 2019).
T32 4025-4383 Sentence denotes In parallel with their potential masking functions, glycan-dependent epitopes can elicit specific, even neutralizing, antibody responses, as has been described for HIV-1 (Duan et al., 2018; Escolano et al., 2019; Pinto et al., 2020; Seabright et al., 2020; Watanabe et al., 2019; Yu et al., 2018; https://www.biorxiv.org/content/10.1101/2020.06.30.178897v1).
T33 4384-4573 Sentence denotes Thus, understanding the glycosylation of the viral Spike trimer is fundamental for the development of efficacious vaccines, neutralizing antibodies, and therapeutic inhibitors of infection.
T34 4574-4689 Sentence denotes ACE2 is an integral membrane metalloproteinase that regulates the renin-angiotensin system (Tikellis et al., 2011).
T35 4690-4864 Sentence denotes Both SARS-CoV-1 and SARS-CoV-2 have co-opted ACE2 to function as the receptor by which these viruses attach and fuse with host cells (Hoffmann et al., 2020; Li et al., 2003).
T36 4865-5126 Sentence denotes ACE2 is cleavable by ADAM proteases at the cell surface (Lambert et al., 2005), resulting in the shedding of a soluble ectodomain that can be detected in apical secretions of various epithelial layers (gastric, airway, etc.) and in serum (Epelman et al., 2009).
T37 5127-5266 Sentence denotes The N-terminal extracellular domain of ACE2 contains six canonical sequons for N-linked glycosylation and several potential O-linked sites.
T38 5267-5515 Sentence denotes Several nonsynonymous single-nucleotide polymorphisms (SNPs) in the ACE2 gene have been identified in the human population and could potentially alter ACE2 glycosylation and/or affinity of the receptor for the viral Spike protein (Li et al., 2005).
T39 5516-5886 Sentence denotes Given that glycosylation can affect the half-life of circulating glycoproteins in addition to modulating the affinity of their interactions with receptors and immune/inflammatory signaling pathways (Marth and Grewal, 2008; Varki, 2017), understanding the impact of glycosylation of ACE2 with respect to its binding of SARS-CoV-2 Spike glycoprotein is of high importance.
T40 5887-6176 Sentence denotes The proposed use of soluble extracellular domains of ACE2 as decoy, competitive inhibitors for SARS-CoV-2 infection emphasizes the critical need for understanding the glycosylation profile of ACE2 so that optimally active biologics can be produced (Lei et al., 2020; Monteil et al., 2020).
T41 6177-6539 Sentence denotes To accomplish the task of characterizing site-specific glycosylation of the trimer Spike of SARS-CoV-2 and the host receptor ACE2, we began by expressing and purifying a stabilized, soluble trimer Spike glycoprotein mimetic immunogen (that we define here and forward as S, [Yu et al., 2020]) and a soluble version of the ACE2 glycoprotein from a human cell line.
T42 6540-6722 Sentence denotes We utilized multiple mass-spectrometry-based approaches, including glycomic and glycoproteomic approaches, to determine occupancy and site-specific heterogeneity of N-linked glycans.
T43 6723-6906 Sentence denotes Occupancy (i.e., the percent of any given residue being modified by a glycan) is an important consideration when developing neutralizing antibodies against a glycan-dependent epitope.
T44 6907-7018 Sentence denotes We also identified sites of O-linked glycosylation and the heterogeneity of the O-linked glycans on S and ACE2.
T45 7019-7234 Sentence denotes We leveraged this rich dataset, along with existing 3D-structures of both glycoproteins, to generate static and molecular dynamics (MD) models of S alone, and in complex with the glycosylated, soluble ACE2 receptor.
T46 7235-7510 Sentence denotes By combining bioinformatics characterization of viral evolution and variants of S and ACE2 with MD simulations of the glycosylated S-ACE2 interaction, we identified important roles for glycans in multiple processes, including receptor-viral binding and glycan shielding of S.
T47 7511-7749 Sentence denotes Our rich characterization of the recombinant, glycosylated S trimer mimetic immunogen of SARS-CoV-2 in complex with the soluble human ACE2 receptor provides a detailed platform for guiding rational vaccine, antibody, and inhibitor design.
T48 7751-7758 Sentence denotes Results
T49 7760-7869 Sentence denotes Expression, Purification, and Characterization of SARS-CoV-2 Spike Glycoprotein Trimer and Soluble Human ACE2
T50 7870-8376 Sentence denotes A trimer-stabilized, soluble variant of the SARS-CoV-2 S that contains 22 canonical N-linked glycosylation sequons per protomer and a soluble version of human ACE2 that contains six canonical N-linked glycosylation sequons, lacking the most C-terminal seventh (Figure 1A), were purified from the media of transfected HEK293 cells; the quaternary structure of the S trimer was confirmed by negative-stain EM (Figure 1B), and the purity of both proteins was examined by Coomassie G-250-stained SDS-PAGE gels (Figure 1C).
T51 8377-8505 Sentence denotes In addition, proteolytic digestions followed by proteomic analyses confirmed that the proteins were highly purified (Table S12).
T52 8506-8705 Sentence denotes Finally, the N terminus of both the mature S and the soluble mature ACE2 were empirically determined via proteolytic digestions and liquid chromatography-tandem mass spectrometry (LC-MS/MS) analyses.
T53 8706-8934 Sentence denotes These results confirmed that both the secreted, mature forms of S protein and ACE2 begin with an N-terminal glutamine that has undergone condensation to form pyroglutamine at residues 14 and 18, respectively (Figures 1D and S1).
T54 8935-9182 Sentence denotes The N-terminal peptide observed for S also contains a glycan at Asn-0017 (Figure 1D), and mass spectrometry analysis of non-reducing proteolytic digestions confirmed that Cys-0015 of S is in a disulfide linkage with Cys-0136 (Figure S2; Table S2).
T55 9183-9516 Sentence denotes Given that SignalP (Almagro Armenteros et al., 2019) predicts signal sequence cleavage between Cys-0015 and Val-0016 but we observed cleavage between Ser-0013 and Gln-0014, we examined the possibility that a methionine in-frame with and upstream of the proposed start methionine (Figure 1A) might be used to initiate translation (Figure S3).
T56 9517-9736 Sentence denotes If one examines the predicted signal sequence cleavage using the in-frame Met that is encoded nine amino acids upstream, SignalP now predicts cleavage between the Ser and Gln that we observed in our studies (Figure S3).
T57 9737-9978 Sentence denotes To examine whether this impacted S expression, we expressed constructs that contained or did not contain the upstream 27 nucleotides in a pseudovirus (VSV) system expressing SARS-CoV-2 S (Figure S4) and in our HEK293 system (data not shown).
T58 9979-10100 Sentence denotes Both expression systems produced a similar amount of S regardless of which expression construct was utilized (Figure S4).
T59 10101-10306 Sentence denotes Thus, while the translation initiation start site has still not been fully defined, allowing for earlier translation in expression construct design did not have a significant impact on the generation of S.
T60 10307-10420 Sentence denotes Figure 1 Expression and Characterization of SARS-CoV-2 Spike Glycoprotein Trimer Immunogen and Soluble Human ACE2
T61 10421-10484 Sentence denotes (A) Sequences of SARS-CoV-2 S immunogen and soluble human ACE2.
T62 10485-10591 Sentence denotes The N-terminal pyroglutamines for both mature protein monomers are bolded, underlined, and shown in green.
T63 10592-10678 Sentence denotes The canonical N-linked glycosylation sequons are bolded, underlined, and shown in red.
T64 10679-10888 Sentence denotes (B and C) Negative stain electron microscopy of the purified trimer (B) and Coomassie G-250-stained reducing SDS-PAGE gels (C) confirmed purity of the SARS-CoV-2 S protein trimer and of the soluble human ACE2.
T65 10889-10919 Sentence denotes MWM, molecular weight markers.
T66 10920-11089 Sentence denotes (D) A representative Step-HCD fragmentation spectrum from mass-spectrometry analysis of a tryptic digest of S annotated manually based on search results from pGlyco 2.2.
T67 11090-11182 Sentence denotes This spectrum defines the N terminus of the mature protein monomer as (pyro-)glutamine 0014.
T68 11183-11344 Sentence denotes A representative N-glycan consistent with this annotation and our glycomics data (Figure 2) is overlaid by using the Symbol Nomenclature For Glycans (SNFG) code.
T69 11345-11381 Sentence denotes This complex glycan occurs at N0017.
T70 11382-11501 Sentence denotes Note that, as expected, the cysteine is carbamidomethylated, and the mass accuracy of the assigned peptide is 0.98 ppm.
T71 11502-11614 Sentence denotes On the sequence of the N-terminal peptide and in the spectrum, the assigned b (blue) and y (red) ions are shown.
T72 11615-11768 Sentence denotes In the spectrum, purple highlights glycan oxonium ions and green marks intact peptide fragment ions with various partial glycan sequences still attached.
T73 11769-11936 Sentence denotes Note that the green-labeled ions allow for limited topology to be extracted including defining that the fucose is on the core and not the antennae of the glycopeptide.
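The mass accuracy quoted in the Figure 1D legend (0.98 ppm) is a relative mass error. A minimal sketch of the standard parts-per-million calculation follows; the m/z values in the example are hypothetical and are chosen only to land on 0.98 ppm, they are not taken from the spectrum.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

# Hypothetical values: a 0.00098 offset on m/z 1000 is about 0.98 ppm.
print(round(ppm_error(1000.00098, 1000.0), 2))
```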
T74 11938-12043 Sentence denotes Glycomics-Informed Glycoproteomics Reveals Site-Specific Microheterogeneity of SARS-CoV-2 S Glycosylation
T75 12044-12128 Sentence denotes We utilized multiple approaches to examine glycosylation of the SARS-CoV-2 S trimer.
T76 12129-12264 Sentence denotes First, the portfolio of glycans linked to SARS-CoV-2 S trimer immunogen was analyzed after their release from the polypeptide backbone.
T77 12265-12391 Sentence denotes N-glycans were released from protein by treatment with PNGase F, and O-glycans were subsequently released by beta-elimination.
T78 12392-12588 Sentence denotes After permethylation to enhance detection sensitivity and structural characterization, released glycans were analyzed by multi-stage mass spectrometry (MSn) (Aoki et al., 2007; Aoki et al., 2008).
T79 12589-12714 Sentence denotes Mass spectra were processed by GRITS Toolbox, and the resulting annotations were validated manually (Weatherly et al., 2019).
T80 12715-12871 Sentence denotes Glycan assignments were grouped by type and by additional structural features for relative quantification of profile characteristics (Figure 2A; Table S3).
T81 12872-13040 Sentence denotes This analysis quantified 49 N-glycans and revealed that 55% of the total glycan abundance was of the complex type, 17% was of the hybrid type, and 28% was high mannose.
T82 13041-13190 Sentence denotes Among the complex and hybrid N-glycans, we observed a high degree of core fucosylation and significant abundance of bisected and LacDiNAc structures.
T83 13191-13432 Sentence denotes We also observed sulfated N-linked glycans by using negative mode MSn analyses (Table S13), although signal intensity was too low in positive ion mode (at least 10-fold lower than any of the non-sulfated glycans) for accurate quantification.
T84 13433-13520 Sentence denotes In addition, we detected 15 O-glycans released from the S trimer (Figure S5; Table S4).
T85 13521-13659 Sentence denotes Figure 2 Glycomics-Informed Glycoproteomics Reveals Substantial Site-Specific Microheterogeneity of N-linked Glycosylation on SARS-CoV-2 S
T86 13660-13763 Sentence denotes (A) Glycans released from SARS-CoV-2 S protein trimer immunogen were permethylated and analyzed by MSn.
T87 13764-13885 Sentence denotes Structures were assigned and grouped by type and structural features, and prevalence was determined based on ion current.
T88 13886-13944 Sentence denotes The pie chart shows basic division by broad N-glycan type.
T89 13945-14013 Sentence denotes The bar graph provides additional detail about the glycans detected.
T90 14014-14187 Sentence denotes The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.
T91 14188-14412 Sentence denotes (B–E) Glycopeptides were prepared from SARS-CoV-2 S protein trimer immunogen by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.
T92 14413-14535 Sentence denotes Four representative sites of N-linked glycosylation with specific features of interest were chosen and are presented here.
T93 14536-14764 Sentence denotes N0074 (B) and N0149 (C), which occur in variable insert regions of S compared to SARS-CoV and other related coronaviruses, are shown; emerging variants of SARS-CoV-2 disrupt these two sites of glycosylation in S.
T94 14765-14823 Sentence denotes N0234 (D) contains the most high-mannose N-linked glycans.
T95 14824-14974 Sentence denotes N0801 (E) is an example of glycosylation in the S2 region of the immunogen and displays a high degree of hybrid glycosylation compared to other sites.
T96 14975-15057 Sentence denotes The abundance of each composition is graphed in terms of assigned spectral counts.
T97 15058-15178 Sentence denotes Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.
T98 15179-15310 Sentence denotes The abbreviations used here and throughout the manuscript are as follows: N, HexNAc; H, hexose; F, fucose; A, Neu5Ac; S, sulfation.
T99 15311-15475 Sentence denotes Note that the graphs for the other 18 sites and other graphs grouping the microheterogeneity observed by other properties are presented in Supplemental Information.
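The composition abbreviations defined in the Figure 2 legend (N, HexNAc; H, hexose; F, fucose; A, Neu5Ac; S, sulfation) map directly onto a mass calculation. The sketch below assumes compositions are written as letter-count strings such as "N2H5" (a hypothetical format), uses standard monoisotopic residue masses for native, non-permethylated glycans, and adds one water for the free reducing end. It does not model permethylation or a peptide backbone.

```python
import re

# Monoisotopic residue masses (Da) for native, underivatized residues
RESIDUE_MASS = {
    "N": 203.07937,  # HexNAc
    "H": 162.05282,  # hexose
    "F": 146.05791,  # fucose (deoxyhexose)
    "A": 291.09542,  # Neu5Ac
    "S": 79.95682,   # sulfation (SO3)
}
WATER = 18.01056

def parse_composition(text):
    """Turn a string such as 'N4H5F1A2' into {'N': 4, 'H': 5, 'F': 1, 'A': 2}."""
    return {letter: int(count) for letter, count in re.findall(r"([NHFAS])(\d+)", text)}

def glycan_mass(text):
    """Monoisotopic mass of the free glycan for a letter-count composition."""
    comp = parse_composition(text)
    return sum(RESIDUE_MASS[k] * n for k, n in comp.items()) + WATER

# N2H5 corresponds to Man5GlcNAc2
print(round(glycan_mass("N2H5"), 3))
```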
T100 15476-15716 Sentence denotes To determine occupancy of N-linked glycans at each site, we employed a sequential deglycosylation approach by using Endoglycosidase H and PNGase F in the presence of 18O-H2O after tryptic digestion of S (Wang et al., 2020; Yu et al., 2018).
T101 15717-15848 Sentence denotes After LC-MS/MS analyses, the resulting data confirmed that 19 of the canonical sequons had occupancies greater than 95% (Table S5).
T102 15849-16013 Sentence denotes One canonical sequence, N0149, had insufficient spectral counts for quantification by this method, but subsequent analyses described below suggested high occupancy.
T103 16014-16119 Sentence denotes The two most C-terminal N-linked sites, N1173 and N1194, had reduced occupancy, 52% and 82% respectively.
T104 16120-16299 Sentence denotes Reduced occupancy at these sites could reflect hindered en bloc transfer by the oligosaccharyltransferase (OST) due to primary amino acid sequences at or near the N-linked sequon.
T105 16300-16632 Sentence denotes Alternatively, this could reflect these two sites being post-translationally modified after release of the protein by the ribosome by a less efficient STT3B-containing OST, either due to activity or initial folding of the polypeptide, as opposed to co-translationally modified by the STT3A-containing OST (Ruiz-Canada et al., 2009).
T106 16633-16959 Sentence denotes None of the non-canonical sequons (three N-X-C sites and four N-G-L/I/V sites; Zielinska et al., 2010) showed significant occupancy (>5%), except for N0501, which showed moderate (19%) conversion to 18O-Asp that could be due to deamidation that is facilitated by glycine at the +1 position (Table S5) (Palmisano et al., 2012).
T107 16960-17115 Sentence denotes Further analysis of this site (see below) by direct glycopeptide analyses allowed us to determine that N0501 undergoes deamidation but is not glycosylated.
T108 17116-17277 Sentence denotes Thus, all 22 canonical sequons for N-linked glycosylation (N-X-S/T), and only those sequons, are utilized, with only N1173 and N1194 demonstrating occupancies below 95%.
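Occupancy as defined in the Introduction (the percent of a given residue modified by a glycan) can be sketched from per-site spectral counts. The sketch assumes the usual reading of the sequential digestion: peptides retaining a HexNAc remnant after Endoglycosidase H and peptides carrying 18O-Asp after PNGase F both count as occupied, and peptides with unmodified Asn count as unoccupied. The function name, argument names, and counts are hypothetical, and the example does not correct for spontaneous deamidation such as that reported for N0501.

```python
def occupancy_percent(endo_h_remnant, o18_asp, unmodified):
    """Percent of peptide-spectrum matches at a site that carried a glycan.

    endo_h_remnant: counts with a residual HexNAc left by Endoglycosidase H
    o18_asp:        counts with Asn converted to 18O-Asp by PNGase F
    unmodified:     counts with unmodified Asn
    """
    occupied = endo_h_remnant + o18_asp
    total = occupied + unmodified
    if total == 0:
        raise ValueError("no spectral counts for this site")
    return 100.0 * occupied / total

# Hypothetical counts that reproduce a 52% occupancy
print(occupancy_percent(endo_h_remnant=10, o18_asp=42, unmodified=48))
```

A site with no counts raises an error rather than reporting 0%, which mirrors the text's treatment of N0149 as not quantifiable by this method.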
T109 17278-17440 Sentence denotes Next, we applied three different proteolytic digestion strategies to the SARS-CoV-2 S immunogen to maximize glycopeptide coverage by subsequent LC-MS/MS analyses.
T110 17441-17741 Sentence denotes Extended gradient nanoflow reverse-phase LC-MS/MS was carried out on a ThermoFisher Lumos Tribrid instrument using Step-HCD fragmentation on each of the samples (see STAR Methods for details, as well as Duan et al., 2018; Escolano et al., 2019; Wang et al., 2020; Yu et al., 2018; Zhou et al., 2017).
T111 17742-18055 Sentence denotes After data analyses using pGlyco 2.2.2 (Liu et al., 2017), Byonic (Bern et al., 2012), and manual validation of glycan compositions against our released glycomics findings (Figure 2A; Tables S3 and S13), we were able to determine the microheterogeneity at each of the 22 canonical sites (Figures 2B–2E; Table S6).
T112 18056-18164 Sentence denotes Notably, none of the non-canonical consensus sequences, including N0501, displayed any quantifiable glycans.
T113 18165-18292 Sentence denotes The N-glycosites N0074 (Figure 2B) and N0149 (Figure 2C) are highly processed and display a typical mammalian N-glycan profile.
T114 18293-18383 Sentence denotes N0149 is, however, modified with several hybrid N-glycan structures, whereas N0074 is not.
T115 18384-18574 Sentence denotes N0234 (Figure 2D) and N0801 (Figure 2E) have N-glycan profiles more similar to those found on other viruses such as HIV (Watanabe et al., 2019) that are dominated by high-mannose structures.
T116 18575-18729 Sentence denotes N0234 (Figure 2D) displays an abundance of Man7-Man9 high-mannose structures, suggesting stalled processing by early-acting ER and cis-Golgi mannosidases.
T117 18730-18927 Sentence denotes In contrast, N0801 (Figure 2E) is processed more efficiently to Man5 high-mannose and hybrid structures, suggesting that access to the glycan at this site by MGAT1 and α-Mannosidase II is hindered.
T118 18928-19195 Sentence denotes In general, for all 22 sites (Figures 2B–2E; Table S6), we observed underprocessing of complex glycan antennae (i.e., under-galactosylation and under-sialylation) and a high degree of core fucosylation in agreement with released glycan analyses (Figure 2A; Table S3).
T119 19196-19294 Sentence denotes We also observed a small percent of sulfated N-linked glycans at several sites (Tables S6 and S8).
T120 19295-19510 Sentence denotes Based on the assignments and the spectral counts for each topology, we were able to determine the percent of total N-linked glycan types (high-mannose, hybrid, or complex) present at each site (Figure 3; Table S7).
T121 19511-19757 Sentence denotes Notably, three of the sites (N0234, N0709, and N0717) displayed more than 50% high-mannose glycans, whereas 11 other sites (N0017, N0074, N0149, N0165, N0282, N0331, N0657, N1134, N1158, N1173, and N1194) were more than 90% complex when occupied.
T122 19758-19824 Sentence denotes The other eight sites were distributed between these two extremes.
T123 19825-19955 Sentence denotes Notably, only one site (N0717 at 45%), which also had greater than 50% high mannose (55%), had greater than 33% hybrid structures.
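The per-site breakdown described in this paragraph amounts to normalizing spectral counts by glycan type and then applying the thresholds quoted in the text (more than 50% high-mannose, more than 90% complex). The following is a minimal sketch with hypothetical function names and counts; it is not the authors' analysis code.

```python
def type_percentages(counts):
    """Convert spectral counts per glycan type into percentages of occupied spectra."""
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}

def label_site(counts):
    """Label a site using the thresholds quoted in the text."""
    pct = type_percentages(counts)
    if pct.get("high_mannose", 0.0) > 50.0:
        return "mostly high-mannose"
    if pct.get("complex", 0.0) > 90.0:
        return "mostly complex"
    return "mixed"

# Hypothetical counts shaped like the N0717 description (55% high-mannose, 45% hybrid)
site = {"high_mannose": 55, "hybrid": 45, "complex": 0}
print(type_percentages(site), label_site(site))
```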
T124 19956-20226 Sentence denotes To further evaluate the heterogeneity, we grouped all the topologies into the 20 classes recently described by the Crispin laboratory, adding two categories (sulfated and unoccupied) that we refer to here as the Oxford classification (Table S8) (Watanabe et al., 2020a).
T125 20227-20526 Sentence denotes Among other features observed, this classification allowed us to observe that although most sites with high-mannose structures were dominated by the Man5GlcNAc2 structure, N0234 and N0717 were dominated by the higher Man structures of Man8GlcNAc2 and Man7GlcNAc2, respectively (Figure S7; Table S8).
T126 20527-20750 Sentence denotes Limited processing at N0234 is in agreement with a recent report suggesting that high-mannose structures at this site help to stabilize the receptor-binding domain of S (www.biorxiv.org/content/10.1101/2020.06.11.146522v1).
T127 20751-21078 Sentence denotes Furthermore, applying the Oxford classifications to our dataset clearly demonstrates that the three most C-terminal sites (N1158, N1173, and N1194), dominated by complex-type glycans, were more often further processed (i.e., multiple antennae) and elaborated (i.e., galactosylation and sialylation) than other sites (Table S8).
T128 21079-21173 Sentence denotes Figure 3 SARS-CoV-2 S Immunogen N-glycan Sites Are Predominantly Modified by Complex N-glycans
T129 21174-21455 Sentence denotes N-glycan topologies were assigned to all 22 sites of the S protomer and the spectral counts for each of the three types of N-glycans (high-mannose, hybrid, and complex), as well as the unoccupied peptide spectral match counts at each site, were summed and visualized as pie charts.
T130 21456-21543 Sentence denotes Note that only N1173 and N1194 show an appreciable amount of the unoccupied amino acid.
T131 21544-21827 Sentence denotes We also analyzed our generated mass spectrometry data for the presence of O-linked glycans based on our glycomic findings (Figure S5; Table S4) and a recent manuscript suggesting significant levels of O-glycosylation of S1 and S2 when expressed independently (Shajahan et al., 2020).
T132 21828-21964 Sentence denotes We were able to confirm sites of O-glycan modification with microheterogeneity observed for the vast majority of these sites (Table S9).
T133 21965-22168 Sentence denotes However, occupancy at each site, determined by spectral counts, was observed to be very low (below 4%), except for Thr0323, which had a modestly higher but still low 11% occupancy (Figure S6; Table S10).
T134 22170-22304 Sentence denotes 3D Structural Modeling of Glycosylated SARS-CoV-2 Trimer Immunogen Enables Predictions of Epitope Accessibility and Other Key Features
T135 22305-22454 Sentence denotes A 3D structure of the S trimer was generated by using a homology model of the S trimer described previously (based on PDB: 6VSB; Wrapp et al., 2020).
T137 22455-22758 Sentence denotes Onto this 3D structure, we installed explicitly defined glycans at each glycosylated sequon based on one of three separate sets of criteria, thereby generating three different glycoform models for comparison that we denote as “Abundance,” “Oxford Class,” and “Processed” models (STAR Methods; Table S1).
T138 22759-22990 Sentence denotes These criteria were chosen in order to generate glycoform models that represent reasonable expectations for glycosylation microheterogeneity and integrate cross-validating glycomic and glycoproteomic characterization of S and ACE2.
T139 22991-23089 Sentence denotes The three glycoform models were subjected to multiple all-atom MD simulations with explicit water.
T140 23090-23216 Sentence denotes Information from analyses of these structures is presented in Figure 4A along with the sequence of the SARS-CoV-2 S protomer.
T141 23217-23326 Sentence denotes We also determined variants in S that are emerging in the virus that have been sequenced to date (Table S11).
T142 23327-23502 Sentence denotes The inter-residue distances were measured between the most α-carbon-distal atoms of the N-glycan sites and Spike glycoprotein population variant sites in 3D space (Figure 4B).
T143 23503-23726 Sentence denotes Notable from this analysis, there are several variants that don’t ablate the N-linked sequon but are sufficiently close in 3D space to N-glycosites, such as D138H, H655Y, S939F, and L1203F, to warrant further investigation.
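The proximity measurement described above (distance between the most α-carbon-distal atoms of two residues) can be sketched with plain coordinate tuples. The data layout, a dict mapping atom names to (x, y, z) coordinates with a "CA" entry, and the toy coordinates are assumptions for illustration; a real analysis would read atoms from the structural model.

```python
import math

def most_distal_atom(atoms):
    """Return the coordinates of the atom farthest from the residue's alpha carbon."""
    ca = atoms["CA"]
    return max(atoms.values(), key=lambda xyz: math.dist(ca, xyz))

def inter_residue_distance(res_a, res_b):
    """Distance between the most CA-distal atoms of two residues."""
    return math.dist(most_distal_atom(res_a), most_distal_atom(res_b))

# Toy coordinates (hypothetical), in the same units as the model
glycosite = {"CA": (0.0, 0.0, 0.0), "CB": (1.5, 0.0, 0.0), "ND2": (3.0, 0.0, 0.0)}
variant = {"CA": (10.0, 0.0, 0.0), "CB": (8.5, 0.0, 0.0)}
print(inter_residue_distance(glycosite, variant))
```

`math.dist` requires Python 3.8 or later.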
T144 23727-23877 Sentence denotes Figure 4 3D Structural Modeling of Glycosylated SARS-CoV-2 Spike Trimer Immunogen Reveals Predictions for Antigen Accessibility and Other Key Features
T145 23878-24070 Sentence denotes Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated SARS-CoV-2 S trimer immunogen.
T146 24071-24178 Sentence denotes (A) Sequence of the SARS-CoV-2 S immunogen displaying computed antigen accessibility and other information.
T147 24179-24260 Sentence denotes Antigen accessibility is indicated by red shading across the amino acid sequence.
T148 24261-24464 Sentence denotes (B) Emerging variants confirmed by independent sequencing experiments were analyzed based on the 3D structure of SARS-CoV-2 S to generate a proximity chart to the determined N-linked glycosylation sites.
T149 24465-24678 Sentence denotes (C) SARS-CoV-2 S trimer immunogen model from MD simulation displaying abundance glycoforms and antigen accessibility shaded in red for most accessible, white for partial, and black for inaccessible (see Video S1).
T150 24679-24795 Sentence denotes (D) SARS-CoV-2 S trimer immunogen model from MD simulation displaying Oxford Class glycoforms and sequence variants.
T151 24796-24921 Sentence denotes Asterisk indicates not visible, whereas the box represents three amino acid variants that are clustered together in 3D space.
T152 24922-25093 Sentence denotes (E) SARS-CoV-2 S trimer immunogen model from MD simulation displaying processed glycoforms plus shading of Thr-323 that has O-glycosylation at low stoichiometry in yellow.
T153 25094-25351 Sentence denotes The percentage of simulation time that each S protein residue is accessible to a probe that approximates the size of an antibody variable domain was calculated for a model of the S trimer by using the Abundance glycoforms (Table S1) (Ferreira et al., 2018).
T154 25352-25522 Sentence denotes The predicted antibody accessibility is visualized across the sequence, as well as mapped onto the 3D surface, via color shading (Figures 4A and 4C; Table S13; Video S1).
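The accessibility metric described here is the percentage of simulation time during which a residue can be reached by an antibody-sized probe. Assuming the per-frame probe test has already been run and stored as booleans (a hypothetical data layout; the probe calculation itself is not shown), the percentage reduces to a frame count:

```python
def percent_accessible(frames):
    """Percent of MD frames in which a residue was accessible to the probe."""
    if not frames:
        raise ValueError("no frames supplied")
    return 100.0 * sum(frames) / len(frames)

# Hypothetical per-frame results keyed by residue label
trajectory = {
    "N0074": [True, True, False, True],
    "N0234": [False, False, False, True],
}
for residue, frames in trajectory.items():
    print(residue, percent_accessible(frames))
```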
T155 25523-25805 Sentence denotes Additionally, the Oxford Class glycoforms model (Table S1), which is arguably the most encompassing means for representing glycan microheterogeneity because it captures abundant structural topologies (Table S8), is shown with the sequence variant information (Figure 4D; Table S11).
T156 25806-26128 Sentence denotes A substantial number of these variants occur (directly by comparison to Figure 4A or visually by comparison to Figure 4C) in regions of high calculated epitope accessibility (e.g., N74K, T76I, R78M, D138H, H146Y, S151I, D253G, V483A, etc.; Table S14), suggesting potential selective pressure to avoid host immune response.
T157 26129-26477 Sentence denotes Also, it is interesting to note that three of the emerging variants would eliminate N-linked sequons in S; N74K and T76I would eliminate N-glycosylation of N74 (found in the insert variable region 1 of CoV-2 S compared to CoV-1 S), and S151I eliminates N-glycosylation of N149 (found in the insert variable region 2) (Figures 4A and S7; Table S11).
T158 26478-26733 Sentence denotes Lastly, the SARS-CoV-2 S Processed glycoform model is shown (Table S1), along with marking amino acid T0323 that has a modest (11% occupancy, Figure S6; Table S10) amount of O-glycosylation to represent the most heavily glycosylated form of S (Figure 4E).
T159 26734-26743 Sentence denotes Video S1.
T160 26744-26802 Sentence denotes Glycosylated S Antigen Accessibility, Related to Figure 4C
T161 26804-26885 Sentence denotes Glycomics-Informed Glycoproteomics Reveals Complex N-linked Glycosylation of ACE2
T162 26886-27004 Sentence denotes We also analyzed ACE2 glycosylation utilizing the same glycomic and glycoproteomic approaches described for S protein.
T163 27005-27234 Sentence denotes Glycomic analyses of released N-linked glycans (Figure 5A; Table S3) revealed that the majority of glycans on ACE2 are complex with limited high-mannose and hybrid glycans, and we were unable to detect sulfated N-linked glycans.
T164 27235-27441 Sentence denotes Glycoproteomic analyses revealed that occupancy was high (> 75%) at all six sites, and significant microheterogeneity dominated by complex N-glycans was observed for each site (Figures 5B–5G; Tables S5–S8).
T165 27442-27702 Sentence denotes We also observed, consistent with the O-glycomics (Figure S5; Table S4), that Ser 155 and several S/T residues at the C terminus of ACE2 outside of the peptidase domain were O-glycosylated, but stoichiometry was extremely low (less than 2%; Tables S9 and S10).
T166 27703-27823 Sentence denotes Figure 5 Glycomics-Informed Glycoproteomics of Soluble Human ACE2 Reveals High Occupancy, Complex N-linked Glycosylation
T167 27824-27912 Sentence denotes (A) Glycans released from soluble, purified ACE2 were permethylated and analyzed by MSn.
T168 27913-28031 Sentence denotes Structures were assigned, grouped by type and structural features, and prevalence was determined based on ion current.
T169 28032-28090 Sentence denotes The pie chart shows basic division by broad N-glycan type.
T170 28091-28159 Sentence denotes The bar graph provides additional detail about the glycans detected.
T171 28160-28333 Sentence denotes The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.
T172 28334-28539 Sentence denotes (B–G) Glycopeptides were prepared from soluble human ACE2 by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.
T173 28540-28599 Sentence denotes All six sites of N-linked glycosylation are presented here.
T174 28600-28714 Sentence denotes Displayed in the bar graphs are the individual compositions observed graphed in terms of assigned spectral counts.
T175 28715-28835 Sentence denotes Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.
T176 28836-28952 Sentence denotes The pie chart (analogous to Figure 3 for SARS-CoV-2 S) for each site is displayed in the upper corner of each panel.
T177 28953-28962 Sentence denotes (B) N053.
T178 28963-28972 Sentence denotes (C) N090.
T179 28973-28982 Sentence denotes (D) N103.
T180 28983-28992 Sentence denotes (E) N322.
T181 28993-29002 Sentence denotes (F) N432.
T182 29003-29066 Sentence denotes (G) N546, a site that does not exist in three in 10,000 people.
T183 29068-29161 Sentence denotes 3D Structural Modeling of Glycosylated, Soluble, ACE2-Highlighting Glycosylation and Variants
T184 29162-29287 Sentence denotes We integrated our glycomics, glycoproteomics, and population variant analysis results with a 3D model of ACE2 (based on PDB:
T185 29288-29437 Sentence denotes 6M0J; Lan et al., 2020; see STAR Methods for details) to generate two versions of the soluble glycosylated ACE2 for visualization and MD simulations.
T186 29438-29656 Sentence denotes We visualized the ACE2 glycoprotein with the Abundance glycoform model simulated at each site as well as highlighting the naturally occurring variants observed in the human population (Figure 6 A; Video S2; Table S11).
T187 29657-29777 Sentence denotes Note that the Abundance glycoform model and the Oxford Class glycoform model for ACE2 are identical (Tables S1 and S8).
T188 29778-29965 Sentence denotes Notably, one site of N-linked glycosylation (N546) is predicted to not be present in three out of 10,000 humans based on naturally occurring variation in the human population (Table S11).
T189 29966-30035 Sentence denotes We also modeled ACE2 using the Processed glycoform model (Figure 6B).
T190 30036-30123 Sentence denotes In both models, the interaction domain with S is defined (Figures 6A and 6B; Video S2).
T191 30124-30190 Sentence denotes Figure 6 3D Structural Modeling of Glycosylated Soluble Human ACE2
T192 30191-30372 Sentence denotes Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated soluble human ACE2.
T193 30373-30505 Sentence denotes (A) Soluble human ACE2 model from MD simulations displaying abundance glycoforms, interaction surface with S, and sequence variants.
T194 30506-30597 Sentence denotes The N546 variant, which would remove N-linked glycosylation at that site, is boxed (see Video S2).
T195 30598-30710 Sentence denotes (B) Soluble human ACE2 model from MD simulations displaying processed glycoforms and interaction surface with S.
T196 30711-30720 Sentence denotes Video S2.
T197 30721-30774 Sentence denotes Glycosylated ACE2 with Variants, Related to Figure 6A
T198 30776-30927 Sentence denotes MD Simulation of the Glycosylated Trimer Spike of SARS-CoV-2 in Complex with Glycosylated, Soluble, Human ACE2 Reveals Protein and Glycan Interactions
T199 30928-31052 Sentence denotes MD simulations were performed to examine the co-complex (generated from a crystal structure of the ACE2-RBD co-complex, PDB:
T200 31053-31235 Sentence denotes 6M0J; Lan et al., 2020) of glycosylated S with glycosylated ACE2 with the three different glycoform models (Abundance, Oxford Class, and Processed; Table S1; Videos S5, S6, and S7).
T201 31236-31474 Sentence denotes Information from these analyses is laid out along the primary structure (sequence) of the SARS-CoV-2 S protomer and ACE2 highlighting regions of glycan-protein interaction observed in the MD simulations (Table S14; Videos S5, S6, and S7).
T202 31475-31680 Sentence denotes Interestingly, two glycans on ACE2 (at N090 and N322), which are highlighted in Figure 7 A and shown in a more close-up view in Figure 7B, are predicted to form interactions with the S protein (Table S15).
T203 31681-31903 Sentence denotes The N322 glycan interaction with the S trimer is outside of the receptor-binding domain, and the interaction is observed across multiple simulations and throughout each simulation (Figures 7A and 7B; Videos S5, S6, and S7).
T204 31904-32195 Sentence denotes The ACE2 glycan at N090 is close enough to the S trimer surface to repeatedly form interactions; however, the glycan arms interact with multiple regions of the surface over the course of the simulations, reflecting the relatively high degree of glycan dynamics (Figures 7A and 7B; Video S3).
T205 32196-32380 Sentence denotes Inter-molecule glycan-glycan interactions are also observed repeatedly between the glycan at N546 of ACE2 and those in the S protein at residues N0074 and N0165 (Figure 7D; Table S16).
T206 32381-32555 Sentence denotes Finally, a full view of the ACE2-S complex with Oxford class glycoforms on ACE2 illustrates the extensive glycosylation at the interface of the complex (Figure 7C; Video S4).
T207 32556-32713 Sentence denotes Figure 7 Interactions of Glycosylated Soluble Human ACE2 and Glycosylated SARS-CoV-2 S Trimer Immunogen Revealed By 3D-Structural Modeling and MD Simulations
T208 32714-32854 Sentence denotes (A) MD simulation of glycosylated soluble human ACE2 and glycosylated SARS-CoV-2 S trimer immunogen interaction (see Videos S5, S6, and S7).
T209 32855-32956 Sentence denotes ACE2 (top) is colored red with glycans in pink, whereas S is colored white with glycans in dark gray.
T210 32957-33039 Sentence denotes Highlighted are ACE2 glycans that interact with S that are magnified to the right.
T211 33040-33229 Sentence denotes (B) Magnification of ACE2-S interface highlighting ACE2 glycan interactions by using 3D-SNFG icons (Thieker et al., 2016) with S protein (pink) as well as ACE2-S glycan-glycan interactions.
T212 33230-33350 Sentence denotes (C) Magnification of dynamics trajectory of glycans at the interface of soluble human ACE2 and S (see Videos S3 and S4).
T213 33351-33360 Sentence denotes Video S3.
T214 33361-33410 Sentence denotes Interface of ACE2-S Complex, Related to Figure 7C
T215 33411-33420 Sentence denotes Video S4.
T216 33421-33474 Sentence denotes The Glycosylated ACE2-S Complex, Related to Figure 7C
T217 33475-33484 Sentence denotes Video S5.
T218 33485-33545 Sentence denotes Abundance Glycoforms on ACE2-S Complex, Related to Figure 7A
T219 33546-33555 Sentence denotes Video S6.
T220 33556-33619 Sentence denotes Oxford Class Glycoforms on ACE2-S Complex, Related to Figure 7A
T221 33620-33629 Sentence denotes Video S7.
T222 33630-33690 Sentence denotes Processed Glycoforms on ACE2-S Complex, Related to Figure 7A
T223 33692-33702 Sentence denotes Discussion
T224 33703-34170 Sentence denotes We have defined the glycomics-informed, site-specific microheterogeneity of 22 sites of N-linked glycosylation per monomer on a SARS-CoV-2 trimer and the six sites of N-linked glycosylation on a soluble version of its human ACE2 receptor by using a combination of mass spectrometry approaches coupled with evolutionary and variant sequence analyses to provide a detailed understanding of the glycosylation states of these glycoproteins (Figures 1, 2, 3, 4, 5, and 6).
T225 34171-34341 Sentence denotes Our results suggest essential roles for glycosylation in mediating receptor binding, antigenic shielding, and potentially the evolution/divergence of these glycoproteins.
T226 34342-34729 Sentence denotes The highly glycosylated SARS-CoV-2 Spike protein, unlike several other viral proteins including HIV-1 (Watanabe et al., 2019) but in agreement with another recent report (Watanabe et al., 2020a), presents significantly more processing of N-glycans toward complex glycosylation, suggesting that steric hindrance to processing enzymes is not a major factor at most sites (Figures 2 and 3).
T227 34730-34825 Sentence denotes However, the N-glycans still provide considerable shielding of the peptide backbone (Figure 4).
T228 34826-35238 Sentence denotes Our glycomics-guided glycoproteomic data are generally in strong agreement with the trimer immunogen data recently published by Crispin (Watanabe et al., 2020a), although we also observed sulfated N-linked glycans; were able to differentiate branching, bisected, and diLacNAc containing structures by glycomics; and observed less occupancy on the two most C-terminal N-linked sites by using a different approach.
T229 35239-35438 Sentence denotes Our detection of sulfated N-linked glycans at multiple sites on S is in agreement with a recent manuscript re-analyzing the Crispin data (https://www.biorxiv.org/content/10.1101/2020.05.31.125302v1).
T230 35439-35580 Sentence denotes Sulfated N-linked glycans could potentially play key roles in immune regulation and receptor binding as in other viruses (Wang et al., 2009).
T231 35581-35700 Sentence denotes This result is especially significant in that sulfated N-glycans were not observed when we performed glycomics on ACE2.
T232 35701-35965 Sentence denotes At each individual site, the glycans we observed on our immunogen appear to be slightly more processed, but the major features observed in our analysis and in the Crispin group’s results (Watanabe et al., 2020a) are nearly superimposable at each site.
T233 35966-36180 Sentence denotes This agreement differs substantially when comparing our and Crispin’s data (Watanabe et al., 2020a) to that of the Azadi group (Shajahan et al., 2020), which analyzed S1 and S2 that had been expressed individually.
T234 36181-36457 Sentence denotes When expressed as two separate polypeptides and not purified for trimers, several unoccupied sites of N-linked glycosylation were observed and processing at several sites was significantly different (Shajahan et al., 2020) than we and others (Watanabe et al., 2020a) observed.
T235 36458-36749 Sentence denotes Although O-glycosylation has recently been reported for individually expressed S1 and S2 domains of the Spike glycoprotein (Shajahan et al., 2020), in trimeric form the level of O-glycosylation is extremely low, with the highest level of occupancy we observed being 11% at T0323 (Figure 4E).
T236 36750-37019 Sentence denotes The low level of O-linked occupancy we observed is in agreement with the Crispin group’s analysis of a Spike Trimer immunogen (Watanabe et al., 2020a) but differs significantly from the Azadi group’s analyses of individually expressed S1 and S2 (Shajahan et al., 2020).
T237 37020-37298 Sentence denotes Thus, the context in which the Spike protein is expressed and purified before analysis significantly alters the glycosylation of the protomer, which is reminiscent of previous studies examining expression of the HIV-1 envelope Spike (Behrens et al., 2017; Watanabe et al., 2019).
T238 37299-37453 Sentence denotes The soluble ACE2 protein examined here contains six highly utilized sites of N-linked glycosylation dominated by complex type N-linked glycans (Figure 5).
T239 37454-37558 Sentence denotes O-glycans were also present on this glycoprotein but at very low levels of occupancy at all sites (<2%).
T240 37559-37769 Sentence denotes Our glycomics-informed glycoproteomics allowed us to assign defined sets of glycans to specific glycosylation sites on 3D-structures of S and ACE2 glycoproteins based on experimental evidence (Figures 4 and 6).
T241 37770-38009 Sentence denotes Similar to almost all glycoproteins, microheterogeneity is evident at most glycosylation sites of S and ACE2; each glycosylation site can be modified with one of several glycan structures, generating site-specific glycosylation portfolios.
T242 38010-38104 Sentence denotes For modeling purposes, however, explicit structures must be placed at each glycosylation site.
T243 38105-38275 Sentence denotes In order to capture the impact of microheterogeneity on S and ACE2 MD we chose to generate glycoforms for modeling that represented reasonable portfolios of glycan types.
T244 38276-38557 Sentence denotes Using three glycoform models for S (Abundance, Oxford Class, and Processed) and two models for ACE2 (Abundance, which was equivalent to Oxford Class, and Processed), we generated three MD simulations of the co-complexes of these two glycoproteins (Figure 7; Videos S5, S6, and S7).
T245 38558-38726 Sentence denotes The observed interactions over time allowed us to evaluate glycan-protein contacts between the two proteins and examine potential glycan-glycan interactions (Figure 7).
T246 38727-38833 Sentence denotes We observed glycan-mediated interactions between the S trimer and glycans at N090, N322, and N546 of ACE2.
T247 38834-38985 Sentence denotes Thus, variations in glycan occupancy or processing at these sites could alter the affinity of the SARS-CoV-2–ACE2 interaction and modulate infectivity.
T248 38986-39241 Sentence denotes It is well established that glycosylation states vary depending on tissue and cell type as well as, in the case of humans, on age (Krištić et al., 2014), underlying disease (Pavić et al., 2018; Rudman et al., 2019), and ethnicity (Gebrehiwot et al., 2018).
T249 39242-39364 Sentence denotes Thus, glycosylation portfolios could in part be responsible for tissue tropism and individual susceptibility to infection.
T250 39365-39712 Sentence denotes The importance of glycosylation for S binding to ACE2 is even more emphatically demonstrated by the direct glycan-glycan interactions observed (Figure 7) between S glycans (at N0074 and N0165) and an ACE2 receptor glycan (at N546), adding an additional layer of complexity for interpreting the impact of glycosylation on individual susceptibility.
T251 39713-39838 Sentence denotes Several emerging variants of the virus appear to be altering N-linked glycosylation occupancy by disrupting N-linked sequons.
T252 39839-40058 Sentence denotes Interestingly, the two N-linked sequons in SARS-CoV-2 S directly impacted by variants, N0074 and N0149, are in divergent insert regions 1 and 2, respectively, of SARS-CoV-2 S in comparison with SARS-CoV-1 S (Figure 4A).
T253 40059-40302 Sentence denotes N0074, in particular, is one of the S glycans that interact directly with an ACE2 glycan (at N546; Figure 7), suggesting that glycan-glycan interactions could contribute to the unique infectivity differences between SARS-CoV-2 and SARS-CoV-1.
T254 40303-40522 Sentence denotes These sequon variants will also be important to examine in terms of glycan shielding that could influence immunogenicity and efficacy of neutralizing antibodies, as well as interactions with the host cell receptor ACE2.
T255 40523-40750 Sentence denotes Naturally occurring amino acid-changing SNPs in the ACE2 gene generate a number of variants including one variant, with a frequency of three in 10,000 humans, that eliminates a site of N-linked glycosylation at N546 (Figure 6).
T256 40751-41033 Sentence denotes Understanding the impact of ACE2 variants on glycosylation and more importantly on S binding, especially for N546S, which impacts the glycan-glycan interaction between S and ACE2 (Figure 7), should be prioritized in light of efforts to develop ACE2 as a potential decoy therapeutic.
T257 41034-41181 Sentence denotes Intelligent manipulation of ACE2 glycosylation could lead to more potent biologics capable of acting as better competitive inhibitors of S binding.
T258 41182-41516 Sentence denotes The data presented here, and related recent findings (Casalino et al., 2020; Watanabe et al., 2020a; Wrobel et al., 2020), provide a framework to facilitate the production of immunogens, vaccines, antibodies, and inhibitors as well as additional information regarding mechanisms by which glycan microheterogeneity is achieved.
T259 41517-41651 Sentence denotes However, considerable efforts still remain in order to fully understand the role of glycans in SARS-CoV-2 infection and pathogenicity.
T260 41652-41901 Sentence denotes Although HEK-expressed S and ACE2 provide a useful window for understanding human glycosylation of these proteins, glycoproteomic characterization after expression in cell lines of more direct relevance to disease and target tissue is sorely needed.
T261 41902-42099 Sentence denotes Although site occupancy could change depending on presentation and cell type (Struwe et al., 2018), processing of N-linked glycans will almost certainly be altered in a cell-type-dependent fashion.
T262 42100-42391 Sentence denotes Thus, analyses of the Spike trimer extracted from pseudoviruses, virion-like particles, and ultimately from infectious SARS-CoV-2 virions harvested from airway cells or patients will provide the most accurate view of how trimer immunogens reflect the true glycosylation pattern of the virus.
T263 42392-42599 Sentence denotes Detailed analyses of the impact of emerging variants in S and natural and designed-for-biologics variants of ACE2 on glycosylation and binding properties are important next steps for developing therapeutics.
T264 42600-42860 Sentence denotes Finally, it will be important to monitor the slow evolution of the virus to determine if existing sites of glycosylation are lost or new sites emerge with selective pressure that might alter the efficacy of vaccines, neutralizing antibodies, and/or inhibitors.
T265 42862-42874 Sentence denotes STAR★Methods
T266 42876-42895 Sentence denotes Key Resources Table
T267 42896-42933 Sentence denotes REAGENT or RESOURCE SOURCE IDENTIFIER
T268 42934-42979 Sentence denotes Chemicals, Peptides, and Recombinant Proteins
T269 42980-43015 Sentence denotes SARS-CoV-2 S protein This Study N/A
T270 43016-43049 Sentence denotes Human ACE2 protein This Study N/A
T271 43050-43095 Sentence denotes 2x Laemmli sample buffer Bio-Rad Cat#161-0737
T272 43096-43189 Sentence denotes Invitrogen NuPAGE 4 to 12%, Bis-Tris, Mini Protein Gel Thermo Fisher Scientific Cat#NP0321PK2
T273 43190-43259 Sentence denotes Coomassie Brilliant Blue G-250 Dye Thermo Fisher Scientific Cat#20279
T274 43260-43298 Sentence denotes Dithiothreitol Sigma Aldrich Cat#43815
T275 43299-43336 Sentence denotes Iodoacetamide Sigma Aldrich Cat#I1149
T276 43337-43362 Sentence denotes Trypsin Promega Cat#V5111
T277 43363-43386 Sentence denotes Lys-C Promega Cat#V1671
T278 43387-43410 Sentence denotes Arg-C Promega Cat#V1881
T279 43411-43434 Sentence denotes Glu-C Promega Cat#V1651
T280 43435-43459 Sentence denotes Asp-N Promega Cat#VA1160
T281 43460-43495 Sentence denotes Endoglycosidase H Promega Cat#V4871
T282 43496-43521 Sentence denotes PNGaseF Promega Cat#V4831
T283 43522-43582 Sentence denotes Chymotrypsin Athens Research and Technology Cat#16-19-030820
T284 43583-43633 Sentence denotes Alpha lytic protease New England BioLabs Cat#P8113
T285 43634-43687 Sentence denotes 18O water Cambridge Isotope Laboratories OLM-782-10-1
T286 43688-43730 Sentence denotes O-protease OpeRATOR Genovis Cat#G1-OP1-020
T287 43731-43745 Sentence denotes Deposited Data
T288 43746-43847 Sentence denotes MS data for site-specific N-linked glycopeptides for SARS-Cov-2 S and human ACE2 This Study PXD019937
T289 43848-43949 Sentence denotes MS data for site-specific O-linked glycopeptides for SARS-Cov-2 S and human ACE2 This Study PXD019940
T290 43950-44052 Sentence denotes MS data for deglycosylated N-linked glycopeptides for SARS-Cov-2 S and human ACE2 This Study PXD019938
T291 44053-44126 Sentence denotes MS data for disulfide bond analysis for SARS-Cov-2 S This Study PXD019939
T292 44127-44202 Sentence denotes MS data for N-linked glycomics deposited at GlycoPost This Study GPST000120
T293 44203-44278 Sentence denotes MS data for O-linked glycomics deposited at GlycoPost This Study GPST000121
T294 44279-44299 Sentence denotes Experimental Models:
T295 44300-44310 Sentence denotes Cell Lines
T296 44311-44339 Sentence denotes 293-F Cells GIBCO Cat#R79007
T297 44340-44365 Sentence denotes Vero-6 Cells ATCC CRL1586
T298 44366-44386 Sentence denotes Experimental Models:
T299 44387-44404 Sentence denotes Organisms/Strains
T300 44405-44440 Sentence denotes VSV(G)-Pseudoviruses This Study N/A
T301 44441-44464 Sentence denotes Software and Algorithms
T302 44465-44545 Sentence denotes pGlyco v2.2.2 Liu et al., 2017 http://pfind.ict.ac.cn/software/pGlyco/index.html
T303 44546-44611 Sentence denotes Proteome Discoverer v1.4 Thermo Fisher Scientific CAT#OPTON-30945
T304 44612-44695 Sentence denotes Byonic v3.8.13 Protein Metrics Inc. https://www.proteinmetrics.com/products/byonic/
T305 44696-44818 Sentence denotes ProteoIQ v2.7 Premier Biosoft (Bern et al., 2012) http://www.premierbiosoft.com/protein_quantification_software/index.html
T306 44819-44890 Sentence denotes GRITS Toolbox V1.1 Weatherly et al., 2019 http://www.grits-toolbox.org/
T307 44891-44976 Sentence denotes EMBOSS needle v6.6.0 Rice et al., 2000 https://www.ebi.ac.uk/Tools/psa/emboss_needle/
T308 44977-45033 Sentence denotes Biopython v1.76 Cock et al., 2009 https://biopython.org/
T309 45034-45081 Sentence denotes Rpdb v2.3 Julien Ide https://rdrr.io/cran/Rpdb/
T310 45082-45166 Sentence denotes SignalP V5.0 Almagro Armenteros et al., 2019 http://www.cbs.dtu.dk/services/SignalP/
T311 45167-45265 Sentence denotes LibreOFFICE Writer v6.4.4.2 The Document Foundation https://www.libreoffice.org/download/download/
T312 45266-45318 Sentence denotes GlyGen V1.5 York et al., 2020 https://www.glygen.org
T313 45319-45409 Sentence denotes GNOme V1.5.5 OBO Foundry https://github.com/glygen-glycan-data/GNOme/blob/master/README.md
T314 45410-45476 Sentence denotes GlyTouCan V3.1.0 Aoki-Kinoshita et al., 2016 https://glytoucan.org
T315 45477-45563 Sentence denotes Inkscape V1.0 Inkscape project contributors https://inkscape.org/release/inkscape-1.0/
T316 45564-45617 Sentence denotes ffmpeg V3.4 The FFmpeg developers https://ffmpeg.org/
T317 45618-45673 Sentence denotes Cygwin V3.1.5 Cygwin developers https://www.cygwin.com/
T318 45675-45696 Sentence denotes Resource Availability
T319 45698-45710 Sentence denotes Lead Contact
T320 45711-45919 Sentence denotes Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Lance Wells (lwells@ccrc.uga.edu) or alternatively by Peng Zhao (pengzhao@uga.edu).
T321 45921-45943 Sentence denotes Materials Availability
T322 45944-45992 Sentence denotes This study did not generate new unique reagents.
T323 45994-46020 Sentence denotes Data and Code Availability
T324 46021-46157 Sentence denotes The mass spectrometry proteomics data are available via ProteomeXchange with identifiers PXD019937, PXD019940, PXD019938, and PXD019939.
T325 46158-46266 Sentence denotes The mass spectrometry glycomics data are available via GlycoPost with identifiers GPST000120 and GPST000121.
T326 46268-46306 Sentence denotes Experimental Model and Subject Details
T327 46307-46418 Sentence denotes HEK293-F cells (GIBCO) were maintained and passaged in FreeStyle Media (GIBCO) containing 1% Pen Strep (GIBCO).
T328 46419-46580 Sentence denotes Vero-6 cells (ATCC) were maintained and passaged in DMEM medium supplemented with 10% fetal bovine serum and 1% Pen Strep (GIBCO) and amphotericin B antibiotics.
T329 46581-46657 Sentence denotes All cells were maintained at 37°C with 5% CO2 before and after transfection.
T330 46659-46673 Sentence denotes Method Details
T331 46675-46761 Sentence denotes Expression, Purification, and Characterization of SARS-CoV-2 S and Human ACE2 Proteins
T332 46762-47189 Sentence denotes To express a stabilized ectodomain of Spike protein, a synthetic gene encoding residues 1−1208 of SARS-CoV-2 Spike with the furin cleavage site (residues 682–685) replaced by a “GGSG” sequence, proline substitutions at residues 986 and 987, and a foldon trimerization motif followed by a C-terminal 6xHisTag was created and cloned into the mammalian expression vector pCMV-IRES-puro (Codex BioSolutions, Inc, Gaithersburg, MD).
T333 47190-47319 Sentence denotes The expression construct was transiently transfected in HEK293F cells using polyethylenimine (Polysciences, Inc, Warrington, PA).
T334 47320-47564 Sentence denotes Protein was purified from cell supernatants using Ni-NTA resin (QIAGEN, Germany); the eluted fractions containing S protein were pooled, concentrated, and further purified by gel filtration chromatography on a Superose 6 column (GE Healthcare).
T335 47565-47662 Sentence denotes Negative stain electron microscopy (EM) analysis was performed as described (Shaik et al., 2019).
T336 47663-47956 Sentence denotes Briefly, analysis was performed at room temperature with a magnification of 52,000x and a defocus value of 1.5 μm following low-dose procedures, using a Philips Tecnai F20 electron microscope (Thermo Fisher Scientific) equipped with a Gatan US4000 CCD camera and operated at voltage of 200 kV.
T337 47957-48102 Sentence denotes The DNA fragment encoding human ACE2 (1-615) with a 6xHis tag at the C terminus was synthesized by Genscript and cloned into the vector pCMV-IRES-puro.
T338 48103-48184 Sentence denotes The expression construct was transfected in HEK293F cells using polyethylenimine.
T339 48185-48261 Sentence denotes The medium was discarded and replaced with FreeStyle 293 medium after 6-8 h.
T340 48262-48387 Sentence denotes After incubation at 37°C with 5.5% CO2 for 5 days, the supernatant was collected and loaded onto Ni-NTA resin for purification.
T341 48388-48463 Sentence denotes The eluate was concentrated and further purified on a Superdex 200 column.
T342 48465-48520 Sentence denotes In-Gel Analysis of SARS-CoV-2 S and Human ACE2 Proteins
T343 48521-48781 Sentence denotes A 3.5-μg aliquot of SARS-CoV-2 S protein as well as a 2-μg aliquot of human ACE2 were combined with Laemmli sample buffer, analyzed on a 4%–12% Invitrogen NuPage Bis-Tris gel using the MES pH 6.5 running buffer, and stained with Coomassie Brilliant Blue G-250.
T344 48783-48875 Sentence denotes Analysis of N-linked and O-linked Glycans Released from SARS-Cov-2 S and Human ACE2 Proteins
T345 48876-49030 Sentence denotes Aliquots of approximately 25-50 μg of S or ACE2 protein were processed for glycan analysis as previously described (Aoki et al., 2007; Aoki et al., 2008).
T346 49031-49101 Sentence denotes For N-linked glycan analysis, the proteins were digested with trypsin.
T347 49102-49234 Sentence denotes Following trypsinization, glycopeptides were enriched by C18 Sep-Pak and subjected to PNGaseF digestion to release N-linked glycans.
T348 49235-49372 Sentence denotes Following PNGaseF digestion, released glycans were separated from residual glycosylated peptides bearing O-linked glycans by C18 Sep-Pak.
T349 49373-49492 Sentence denotes O-glycosylated peptides were eluted from the Sep-Pak and subjected to reductive β-elimination to release the O-glycans.
T350 49493-49610 Sentence denotes Another 25-50 μg aliquot of each protein was denatured with SDS and digested with PNGaseF to remove N-linked glycans.
T351 49611-49751 Sentence denotes The de-N-glycosylated, intact protein was precipitated with cold ethanol and then subjected to reductive β-elimination to release O-glycans.
T352 49752-49852 Sentence denotes The profiles of O-glycans released from peptides or from intact protein were found to be comparable.
T353 49853-50036 Sentence denotes N- and O-linked glycans released from glycoproteins were permethylated with methyl iodide according to the method of Anumula and Taylor prior to MS analysis (Anumula and Taylor, 1992).
T354 50037-50158 Sentence denotes Glycan structural analysis was performed using an LTQ-Orbitrap instrument (Orbitrap Discovery, Thermo Fisher Scientific).
T355 50159-50469 Sentence denotes Detection and relative quantification of the prevalence of individual glycans was accomplished using the total ion mapping (TIM) and neutral loss scan (NL scan) functionality of the Xcalibur software package version 2.0 (Thermo Fisher Scientific) as previously described (Aoki et al., 2007; Aoki et al., 2008).
T356 50470-50583 Sentence denotes Mass accuracy and detector response were tuned with a permethylated oligosaccharide standard in positive ion mode.
T357 50584-50705 Sentence denotes For fragmentation by collision-induced dissociation (CID in MS2 and MSn), normalized collision energy of 45% was applied.
T358 50706-50812 Sentence denotes Most permethylated glycans were identified as singly or doubly charged, sodiated species in positive mode.
T359 50813-50917 Sentence denotes Sulfated N-glycans were detected as singly or doubly charged, deprotonated species in negative ion mode.
T360 50918-51014 Sentence denotes Peaks for all charge states were deconvoluted by the charge state and summed for quantification.
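The charge-state handling described above can be sketched as follows. This is only an illustration under stated assumptions, not the Xcalibur procedure: it assumes sodiated positive-mode ions of the form [M + zNa]z+, uses made-up peak values, and the function names and glycan labels are hypothetical.

```python
from collections import defaultdict

# Monoisotopic mass of the Na+ ion (Na atom minus one electron), in Da.
NA_ION_MASS = 22.98922


def neutral_mass(mz, z):
    """Neutral monoisotopic mass M for an [M + zNa]z+ ion observed at m/z."""
    return z * (mz - NA_ION_MASS)


def relative_prevalence(peaks):
    """Sum intensities over all charge states of each glycan and normalize.

    peaks: iterable of (glycan_label, mz, z, intensity) tuples.
    Returns {glycan_label: fraction of total summed intensity}.
    """
    totals = defaultdict(float)
    for label, _mz, _z, intensity in peaks:
        totals[label] += intensity
    grand_total = sum(totals.values())
    return {label: value / grand_total for label, value in totals.items()}


# Hypothetical peak list: glycan "A" is seen in two charge states, "B" in one.
example_peaks = [
    ("A", 1977.01, 1, 30.0),
    ("A", 1000.00, 2, 10.0),
    ("B", 1200.50, 2, 60.0),
]
print(relative_prevalence(example_peaks))
```

Summing across charge states before normalizing avoids underestimating glycans whose signal is split between singly and doubly charged species.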
T361 51015-51067 Sentence denotes All spectra were manually interpreted and annotated.
T362 51068-51192 Sentence denotes The explicit identities of individual monosaccharide residues have been assigned based on known human biosynthetic pathways.
T363 51193-51389 Sentence denotes Graphical representations of monosaccharide residues are consistent with the Symbol Nomenclature for Glycans (SNFG), which has been broadly adopted by the glycomics community (Varki et al., 2015).
T364 51390-51578 Sentence denotes The MS-based glycomics data generated in these analyses and the associated annotations are presented in accordance with the MIRAGE standards and the Athens Guidelines (Wells et al., 2013).
T365 51579-51794 Sentence denotes Data annotation and assignment of glycan accession identifiers were facilitated by GRITS Toolbox, GlyTouCan, GNOme, and GlyGen (Kahsay et al., 2020; Tiemeyer et al., 2017; Weatherly et al., 2019; York et al., 2020).
T366 51796-51857 Sentence denotes Analysis of Disulfide Bonds for SARS-Cov-2 S Protein by LC-MS
T367 51858-52043 Sentence denotes Two 10-μg aliquots of SARS-CoV-2 S protein were denatured by incubating with 20% acetonitrile at room temperature and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.
T368 52044-52178 Sentence denotes The two aliquots of proteins were then digested respectively using alpha lytic protease, or a combination of trypsin, Lys-C and Glu-C.
T369 52179-52254 Sentence denotes Following digestion, the proteins were deglycosylated by PNGaseF treatment.
T370 52255-52478 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T371 52479-52624 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T372 52625-52722 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T373 52723-52903 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.
T374 52904-53044 Sentence denotes The raw spectra were analyzed by Byonic (v3.8.13, Protein Metrics Inc.) with mass tolerance set as 20 ppm for both precursors and fragments.
T375 53045-53125 Sentence denotes The search output was filtered at 1% false discovery rate and 10 ppm mass error.
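The 10 ppm mass-error cut-off above can be sketched as follows. Only the mass-error part of the filter is shown; the false discovery rate is computed by the search engine and is not reproduced here. The function names are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_mass_filter(observed_mz, theoretical_mz, max_ppm=10.0):
    """True if the absolute mass error is within `max_ppm`."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= max_ppm
```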
T376 53126-53208 Sentence denotes The spectra assigned as cross-linked peptides were manually evaluated for Cys0015.
T377 53210-53308 Sentence denotes Analysis of Site-Specific N-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T378 53309-53488 Sentence denotes Four 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM of dithiothreitol at 56°C and alkylated by 27.5 mM of iodoacetamide at room temperature in the dark.
T379 53489-53664 Sentence denotes The four aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, a combination of trypsin and Glu-C, or a combination of Glu-C and AspN.
T380 53665-53836 Sentence denotes Three 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.
T381 53837-53980 Sentence denotes The three aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, or a combination of trypsin and Lys-C.
T382 53981-54204 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T383 54205-54350 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T384 54351-54448 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T385 54449-54816 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) were collected in the Orbitrap at 15k resolution. pGlyco v2.2.2 (Liu et al., 2017) was used for database searches with mass tolerance set as 20 ppm for both precursors and fragments.
T386 54817-54925 Sentence denotes The database search output was filtered to reach a 1% false discovery rate for glycans and 10% for peptides.
T387 54926-55025 Sentence denotes Quantitation was performed by calculating spectral counts for each glycan composition at each site.
T388 55026-55121 Sentence denotes Any N-linked glycan compositions identified by only one spectrum were removed from quantitation.
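A minimal sketch of spectral-count quantitation with removal of single-spectrum identifications is given below. It assumes a hypothetical input in which each identified spectrum contributes one (site, composition) pair; the names are illustrative only.

```python
from collections import Counter


def spectral_counts(identifications, min_spectra=2):
    """Count spectra per (site, composition) and drop singletons.

    `identifications` is an iterable of (site, composition) pairs,
    one pair per identified spectrum (hypothetical layout).
    """
    counts = Counter(identifications)
    # Keep only compositions supported by at least `min_spectra` spectra.
    return {key: n for key, n in counts.items() if n >= min_spectra}
```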
T389 55122-55207 Sentence denotes N-linked glycan compositions were categorized into 22 classes (including Unoccupied):
T390 55208-55550 Sentence denotes HexNAc(2)Hex(9∼5)Fuc(0∼1) was classified as M9 to M5, respectively; HexNAc(2)Hex(4∼1)Fuc(0∼1) was classified as M1-M4; HexNAc(3∼6)Hex(5∼9)Fuc(0)NeuAc(0∼1) was classified as Hybrid with HexNAc(3∼6)Hex(5∼9)Fuc(1∼2)NeuAc(0∼1) classified as F-Hybrid; Complex-type glycans are classified based on the number of antennae, fucosylation, and sulfation:
T391 55551-56289 Sentence denotes HexNAc(3)Hex(3∼4)Fuc(0)NeuAc(0∼1) is assigned as A1 with HexNAc(3)Hex(3∼4)Fuc(1∼2)NeuAc(0∼1) assigned as F-A1; HexNAc(4)Hex(3∼5)Fuc(0)NeuAc(0∼2) is assigned as A2/A1B with HexNAc(4)Hex(3∼5)Fuc(1∼5)NeuAc(0∼2) assigned as F-A2/A1B; HexNAc(5)Hex(3∼6)Fuc(0)NeuAc(0∼3) is assigned as A3/A2B with HexNAc(5)Hex(3∼6)Fuc(1∼3)NeuAc(0∼3) assigned as F-A3/A2B; HexNAc(6)Hex(3∼7)Fuc(0)NeuAc(0∼4) is assigned as A4/A3B with HexNAc(6)Hex(3∼7)Fuc(1∼3)NeuAc(0∼4) assigned as F-A4/A3B; HexNAc(7)Hex(3∼8)Fuc(0)NeuAc(0∼1) is assigned as A5/A4B with HexNAc(7)Hex(3∼8)Fuc(1∼3)NeuAc(0∼1) as F-A5/A4B; HexNAc(8)Hex(3∼9)Fuc(0) is assigned as A6/A5B with HexNAc(8)Hex(3∼9)Fuc(1) assigned as F-A6/A5B; any glycans identified with a sulfate are assigned as Sulfated.
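The classification rules above can be expressed as a small lookup, sketched here under stated assumptions. The text does not say how to resolve compositions that match both a Hybrid range and a complex-type range (for example HexNAc(4)Hex(5)), so this sketch checks the complex-type rules first; that ordering is an assumption. It also assumes NeuAc = 0 for the HexNAc(8) class and for high-mannose classes, because no NeuAc term is given for them, and it returns a hypothetical "Unclassified" label for anything outside the listed ranges.

```python
# HexNAc count -> (label, (min Hex, max Hex), max Fuc, max NeuAc)
COMPLEX_RULES = {
    3: ("A1", (3, 4), 2, 1),
    4: ("A2/A1B", (3, 5), 5, 2),
    5: ("A3/A2B", (3, 6), 3, 3),
    6: ("A4/A3B", (3, 7), 3, 4),
    7: ("A5/A4B", (3, 8), 3, 1),
    8: ("A6/A5B", (3, 9), 1, 0),  # NeuAc limit of 0 is an assumption
}


def classify(hexnac, hexose, fuc=0, neuac=0, sulfate=0):
    """Assign an N-glycan composition to one of the classes in the text."""
    if sulfate > 0:
        return "Sulfated"
    # High-mannose classes: HexNAc(2) with 1-9 hexoses and at most one fucose.
    if hexnac == 2 and fuc <= 1 and neuac == 0:
        if 5 <= hexose <= 9:
            return f"M{hexose}"
        if 1 <= hexose <= 4:
            return "M1-M4"
    # Complex-type classes, checked before Hybrid (ordering is an assumption).
    rule = COMPLEX_RULES.get(hexnac)
    if rule is not None:
        label, (lo, hi), max_fuc, max_neuac = rule
        if lo <= hexose <= hi and fuc <= max_fuc and neuac <= max_neuac:
            return ("F-" if fuc else "") + label
    # Hybrid classes.
    if 3 <= hexnac <= 6 and 5 <= hexose <= 9 and fuc <= 2 and neuac <= 1:
        return ("F-" if fuc else "") + "Hybrid"
    return "Unclassified"
```

Together with "Unoccupied", which is assigned to sites rather than compositions, these labels account for the 22 classes named in the text.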
T392 56291-56363 Sentence denotes Analysis of Deglycosylated SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T393 56364-56544 Sentence denotes Three 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM of dithiothreitol at 56°C and alkylated by 27.5 mM of iodoacetamide at room temperature in the dark.
T394 56545-56661 Sentence denotes The three aliquots were then digested respectively using chymotrypsin, Asp-N, or a combination of trypsin and Glu-C.
T395 56662-56831 Sentence denotes Two 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.
T396 56832-56939 Sentence denotes The two aliquots were then digested respectively using chymotrypsin, or a combination of trypsin and Lys-C.
T397 56940-57074 Sentence denotes Following digestion, the proteins were deglycosylated by Endoglycosidase H followed by PNGaseF treatment in the presence of 18O water.
T398 57075-57298 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T399 57299-57444 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T400 57445-57542 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T401 57543-57729 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following collision-induced dissociation (CID) at 38% collision energy were collected in the ion trap.
T402 57730-57870 Sentence denotes The spectra were analyzed using SEQUEST (Proteome Discoverer 1.4) with mass tolerance set as 20 ppm for precursors and 0.5 Da for fragments.
T403 57871-58001 Sentence denotes The search output was filtered using ProteoIQ (v2.7) to reach a 1% false discovery rate at protein level and 10% at peptide level.
T404 58002-58220 Sentence denotes Occupancy of each N-linked glycosylation site was calculated using spectral counts assigned to the 18O-Asp-containing (PNGaseF-cleaved) and/or HexNAc-modified (EndoH-cleaved) peptides and their unmodified counterparts.
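One plausible way to turn these spectral counts into a site occupancy is sketched below. The exact formula is not given in the text; this sketch assumes occupancy is the fraction of spectra carrying either modification among all spectra for that site. The function and argument names are hypothetical.

```python
def site_occupancy(n_18o_asp, n_hexnac, n_unmodified):
    """Fraction of spectra showing the site as occupied.

    n_18o_asp    spectra with 18O-Asp (PNGaseF-cleaved) peptides
    n_hexnac     spectra with HexNAc-modified (EndoH-cleaved) peptides
    n_unmodified spectra with the unmodified counterpart
    Returns None when no spectra were observed for the site.
    """
    total = n_18o_asp + n_hexnac + n_unmodified
    if total == 0:
        return None
    return (n_18o_asp + n_hexnac) / total
```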
T405 58222-58320 Sentence denotes Analysis of Site-Specific O-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T406 58321-58538 Sentence denotes Three 10-μg aliquots of SARS-CoV-2 S protein and one 10-μg aliquot of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.
T407 58539-58656 Sentence denotes The four aliquots were then digested respectively using trypsin, Lys-C, Arg-C, or a combination of trypsin and Lys-C.
T408 58657-58776 Sentence denotes Following digestion, the proteins were deglycosylated by PNGaseF treatment and then digested with O-protease OpeRATOR®.
T409 58777-59000 Sentence denotes The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T410 59001-59146 Sentence denotes The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T411 59147-59244 Sentence denotes The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T412 59245-59519 Sentence denotes Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) or electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.
T413 59520-59638 Sentence denotes The raw spectra were analyzed by Byonic (v3.8.13) with mass tolerance set as 20 ppm for both precursors and fragments.
T414 59639-59740 Sentence denotes MS/MS filtering was applied to only allow for spectra where the oxonium ions of HexNAc were observed.
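The oxonium-ion filter above can be illustrated with a short sketch. It checks only the intact HexNAc oxonium ion at m/z 204.0867; real search software may also consider HexNAc fragment ions, and the 20 ppm tolerance used here simply mirrors the search tolerance as an assumption. The function name is hypothetical.

```python
# m/z of the singly charged HexNAc oxonium ion (C8H14NO5+).
HEXNAC_OXONIUM_MZ = 204.0867


def has_hexnac_oxonium(peak_mzs, tol_ppm=20.0):
    """True if any fragment peak matches the HexNAc oxonium ion."""
    tol = HEXNAC_OXONIUM_MZ * tol_ppm / 1e6
    return any(abs(mz - HEXNAC_OXONIUM_MZ) <= tol for mz in peak_mzs)
```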
T415 59741-59821 Sentence denotes The search output was filtered at 1% false discovery rate and 10 ppm mass error.
T416 59822-59893 Sentence denotes The spectra assigned as O-linked glycopeptides were manually evaluated.
T417 59894-59993 Sentence denotes Quantitation was performed by calculating spectral counts for each glycan composition at each site.
T418 59994-60089 Sentence denotes Any O-linked glycan compositions identified by only one spectrum were removed from quantitation.
T419 60090-60283 Sentence denotes Occupancy of each O-linked glycosylation site was calculated using spectral counts assigned to any glycosylated peptides and their unmodified counterparts from searches without MS/MS filtering.
T420 60285-60342 Sentence denotes Sequence Analysis of SARS-CoV-2 S and Human ACE2 Proteins
T421 60343-60488 Sentence denotes The genomes of SARS-CoV as well as bat and pangolin coronavirus sequences reported to be closely related to SARS-CoV-2 were downloaded from NCBI.
T422 60489-60660 Sentence denotes The S protein sequences from all of those genomes were aligned using EMBOSS needle v6.6.0 (Rice et al., 2000) via the EMBL-EBI provided web service (Madeira et al., 2019).
T423 60661-60761 Sentence denotes Manual analysis was performed in the regions containing canonical N-glycosylation sequons (N-X-S/T).
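Although the analysis above was manual, candidate sequons can be located programmatically. The sketch below follows the usual convention that X in N-X-S/T is any residue except proline, which the text does not state explicitly; the function name is hypothetical.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]
```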
T424 60762-61081 Sentence denotes For further sequence analysis of SARS-CoV-2 S variants, the genomes of SARS-CoV-2 were downloaded from NCBI and GISAID and further processed using Biopython 1.76 to extract all sequences annotated as “surface glycoprotein” and to remove any incomplete sequence as well as any sequence containing unassigned amino acids.
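The filtering described above was done with Biopython 1.76; the sketch below reproduces the same logic with the standard library only, so that it runs without extra dependencies. How "incomplete" was defined is not stated, so the `min_length` threshold (defaulting to 1273, the length of the reference S protein) is an assumption, as is treating "X" as the unassigned-residue code. All function names are hypothetical.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_spike(records, min_length=1273):
    """Keep complete 'surface glycoprotein' records without unassigned residues."""
    for header, seq in records:
        if "surface glycoprotein" not in header.lower():
            continue
        # Drop short (assumed incomplete) sequences and any containing X.
        if len(seq) < min_length or "X" in seq.upper():
            continue
        yield header, seq
```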
T425 61082-61305 Sentence denotes For sequence analysis of human ACE2 variants, the single nucleotide polymorphisms (SNPs) of ACE2 were extracted from the NCBI dbSNP database and filtered for missense mutation entries with a reported minor allele frequency.
T426 61306-61467 Sentence denotes Manual analysis was performed on both SARS-CoV-2 S and human ACE2 variants to further examine the regions containing canonical N-glycosylation sequons (N-X-S/T).
T427 61468-61577 Sentence denotes LibreOffice Writer and its macro capabilities were used to shade regions on the linear sequence of S and ACE2.
T428 61579-61688 Sentence denotes 3D Structural Modeling and Molecular Dynamics Simulation of Glycosylated SARS-CoV-2 S and Human ACE2 Proteins
T429 61689-61991 Sentence denotes SARS-CoV-2 Spike (S) protein structure and ACE2 co-complex – A 3D structure of the prefusion form of the S protein (RefSeq: YP_009724390.1, UniProt: P0DTC2 SPIKE_SARS2), based on a Cryo-EM structure (PDB code 6VSB) (Wrapp et al., 2020), was obtained from the SWISS-MODEL server (swissmodel.expasy.org).
T430 61992-62058 Sentence denotes The model has 95% coverage (residues 27 to 1146) of the S protein.
T431 62059-62220 Sentence denotes The receptor binding domain (RBD) in the “open” conformation was replaced with the RBD from an ACE2 co-complex (PDB code 6M0J) by grafting residues C336 to V524.
T432 62221-62489 Sentence denotes Glycoform generation – Glycans (detected by glycomics) were selected for installation on glycosylated S and ACE2 sequons (detected by glycoproteomics) based on three sets of criteria designed to reasonably capture different aspects of glycosylation microheterogeneity.
T433 62490-62829 Sentence denotes We denote the first of these glycoform models as “Abundance.” The glycans selected for installation to generate the Abundance model were chosen because they were identified as the most abundant glycan structure (detected by glycomics) that matched the most abundant glycan composition (detected by glycoproteomics) at each individual site.
T434 62830-63215 Sentence denotes We denote the second glycoform model as “Oxford Class.” The glycans selected for installation to generate the Oxford Class model were chosen because they were the most abundant glycan structure, (detected by glycomics) that was contained within the most highly represented Oxford classification group (detected by glycoproteomics) at each individual site (Figure S7; Tables S1 and S8).
T435 63216-63650 Sentence denotes Finally, we denote the third glycoform model as “Processed.” The glycans selected for installation to generate the Processed model were chosen because they were the most highly trimmed, elaborated, or terminally decorated structure (detected by glycomics) that corresponded to a composition (detected by glycoproteomics) which was present at ≥ 1/3rd of the abundance of the most highly represented composition at each site (Table S1).
T436 63651-63827 Sentence denotes 3D structures of the three glycoforms (Abundance, Oxford Class, Processed) were generated for the SARS-CoV-2 S protein alone, and in complex with the glycosylated ACE2 protein.
T437 63828-64191 Sentence denotes The glycoprotein builder available at GLYCAM-Web (www.glycam.org) was employed together with an in-house program that adjusts the asparagine side chain torsion angles and glycosidic linkages within known low-energy ranges (Nivedha et al., 2014) to relieve any atomic overlaps with the core protein, as described previously (Grant et al., 2016; Peng et al., 2017).
T438 64192-64391 Sentence denotes Energy minimization and Molecular dynamics (MD) simulations – Each glycosylated structure was placed in a periodic box of TIP3P water molecules with a 10 Å buffer between the solute and the box edge.
T439 64392-64587 Sentence denotes Energy minimization of all atoms was performed for 20,000 steps (10,000 steepest descent, followed by 10,000 conjugate gradient) under constant pressure (1 atm) and temperature (300 K) conditions.
T440 64588-64830 Sentence denotes All MD simulations were performed under nPT conditions with the CUDA implementation of the PMEMD (Götz et al., 2012; Salomon-Ferrer et al., 2013) simulation code, as present in the Amber14 software suite (University of California, San Diego).
T441 64831-64999 Sentence denotes The GLYCAM06j force field (Kirschner et al., 2008) and Amber14SB force field (Maier et al., 2015) were employed for the carbohydrate and protein moieties, respectively.
T442 65000-65193 Sentence denotes A Berendsen barostat with a time constant of 1 ps was employed for pressure regulation, while a Langevin thermostat with a collision frequency of 2 ps⁻¹ was employed for temperature regulation.
T443 65194-65246 Sentence denotes A nonbonded interaction cut-off of 8 Å was employed.
T444 65247-65356 Sentence denotes Long-range electrostatics were treated with the particle-mesh Ewald (PME) method (Darden and Pedersen, 1993).
T445 65357-65491 Sentence denotes Covalent bonds involving hydrogen were constrained with the SHAKE algorithm, allowing an integration time step of 2 fs to be employed.
T446 65492-65605 Sentence denotes The energy-minimized coordinates were equilibrated at 300 K over 400 ps with restraints on the solute heavy atoms.
T447 65606-65854 Sentence denotes Each system was then equilibrated with restraints on the Cα atoms of the protein for 1 ns, prior to initiating 4 independent 250 ns production MD simulations with random starting seeds for a total time of 1 μs per system, with no restraints applied.
T448 65855-65882 Sentence denotes Antigenic surface analysis.
T449 65883-66237 Sentence denotes A series of 3D structure snapshots of the simulation were taken at 1 ns intervals and analyzed in terms of their ability to interact with a spherical probe based on the average size of hypervariable loops present in an antibody complementarity determining region (CDR), as described recently (https://www.biorxiv.org/content/10.1101/2020.04.07.030445v2).
T450 66238-66391 Sentence denotes The percentage of simulation time each residue was exposed to the AbASA probe was calculated and plotted onto both the 3D structure and primary sequence.
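The per-residue exposure percentage can be computed from per-snapshot results as sketched below. The probe calculation itself is not reproduced; the sketch assumes a hypothetical input in which each snapshot is already reduced to the set of residue identifiers accessible to the probe.

```python
def percent_exposed(frames):
    """Percentage of snapshots in which each residue is probe-accessible.

    `frames` is a list with one set of exposed residue ids per snapshot
    (hypothetical layout). Residues never exposed are omitted.
    """
    n_frames = len(frames)
    if n_frames == 0:
        return {}
    counts = {}
    for exposed in frames:
        for residue in exposed:
            counts[residue] = counts.get(residue, 0) + 1
    return {res: 100.0 * c / n_frames for res, c in counts.items()}
```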
T451 66393-66458 Sentence denotes Analysis of SARS-CoV-2 Spike VSV Pseudoparticles (ppVSV-SARS-2-S)
T452 66459-66566 Sentence denotes 293T cells were transfected with an expression plasmid encoding SARS-CoV-2 Spike (pcDNAintron-SARS-2-SΔ19).
T453 66567-66679 Sentence denotes To increase cell surface expression, the last 19 amino acids containing the Golgi retention signal were removed.
T454 66680-66761 Sentence denotes Two SΔ19 constructs were compared, one started with Met1 and the other with Met2.
T455 66762-66895 Sentence denotes Twenty-four h following transfection, cells were transduced with ppVSVΔG-VSV-G (particles that were pseudotyped with VSV-G in trans).
T456 66896-66978 Sentence denotes One h following transduction cells were extensively washed and media was replaced.
T457 66979-67093 Sentence denotes Supernatant containing particles was collected 12-24 h following transduction and cleared through centrifugation.
T458 67094-67149 Sentence denotes Cleared supernatant was frozen at −80°C for future use.
T459 67150-67247 Sentence denotes Target cells Vero E6 were seeded in 24-well plates (5 × 10⁵ cells/mL) at a density of 80% coverage.
T460 67248-67437 Sentence denotes The following day, ppVSV-SARS-2-S/GFP particles were transduced into target cells for 60 min; particles pseudotyped with VSV-G, Lassa virus GP, or no glycoprotein were included as controls.
T461 67438-67658 Sentence denotes 24 h following transduction, transduced cells were released from the plate with trypsin, fixed with 4% formaldehyde, and GFP-positive virus-transduced cells were quantified using flow cytometry (Becton Dickinson BD-LSRII).
T462 67659-67895 Sentence denotes To quantify the ability of various SARS-CoV-2 S mutants to mediate fusion, effector cells (HEK293T) were transiently transfected with the indicated pcDNAintron-SARS-2-S expression vector or measles virus H and F (Brindley et al., 2014).
T463 67896-68016 Sentence denotes Effector cells were infected with MVA-T7 four h following transduction to produce the T7 polymerase (Paal et al., 2009).
T464 68017-68249 Sentence denotes Target cells naturally expressing the receptor ACE2 (Vero) or ACE2 negative cells (HEK293T) were transfected with pTM1-luciferase, which encodes for firefly luciferase under the control of a T7 promoter (Brindley and Plemper, 2010).
T465 68250-68355 Sentence denotes 24 h following transfection, the target cells were lifted and added to the effector cells at a 1:1 ratio.
T466 68356-68486 Sentence denotes 4 h following co-cultivation, cells were washed, lysed and luciferase levels were quantified using Promega’s Steady-Glo substrate.
T467 68487-68602 Sentence denotes To visualize cell-to-cell fusion, Vero cells were co-transfected with pGFP and the pcDNAintron-SARS-2-S constructs.
T468 68603-68683 Sentence denotes 24 h following transfection, syncytia were visualized by fluorescence microscopy.
T469 68685-68724 Sentence denotes Quantification and Statistical Analysis
T470 68725-68887 Sentence denotes Raw glycoproteomic data from the mass spectrometers were searched using Proteome Discoverer v1.4 (SEQUEST), Protein Metrics Inc. Byonic v3.8.13, and pGlyco v2.2.2.
T472 68888-69020 Sentence denotes For data searches using Proteome Discoverer, the results were processed to apply false discovery rate filtering using ProteoIQ v2.7.
T473 69021-69193 Sentence denotes For the deglycosylated protein work, search results from SEQUEST were filtered in ProteoIQ with a 1% false discovery rate at the protein level and 10% at the peptide level.
T474 69194-69327 Sentence denotes For N-linked glycopeptide analysis, pGlyco was used with false discovery rate of 1% at the glycan level and 10% at the peptide level.
T475 69328-69444 Sentence denotes For disulfide bond analysis and O-glycopeptide searches, Byonic was used and the false discovery rate was set to 1%.
T476 69445-69497 Sentence denotes All mass spectrometry results were manually curated.
T477 69498-69750 Sentence denotes Antigen accessibility simulations were carried out as described in the Method Details section and the mean of four simulations (three of length 350 ns, one of length 200 ns; amounting to 1.25 μs of total molecular dynamics simulation time) was utilized.
T478 69751-70069 Sentence denotes Glycan-glycan and glycan-peptide interactions were also calculated based on simulations as a percentage of time residues were in contact and averaged (mean) to produce the corresponding supplemental (colored) sequence figures with the raw numbers for coloring present also in each corresponding supplemental table tab.
T479 70070-70166 Sentence denotes 3D distances were computed using Rpdb as described in more detail in the Method Details section.
T480 70167-70263 Sentence denotes This data is presented using box & whisker plots with all underlying statistics calculated in R.