PubAnnotation

Id	Subject	Object	Predicate	Lexical cue
T16	0-12	Sentence	denotes	Introduction
T17	13-236	Sentence	denotes	The SARS-CoV-2 coronavirus, a positive-sense single-stranded RNA virus, is responsible for the severe acute respiratory syndrome referred to as COVID-19 that was first reported in China in December 2019 (Zhou et al., 2020).
T18	237-468	Sentence	denotes	In approximately six months, this betacoronavirus has spread globally, with more than 14 million people testing positive worldwide resulting in greater than 600,000 deaths as of July 20, 2020 (https://coronavirus.jhu.edu/map.html).
T19	469-703	Sentence	denotes	The SARS-CoV-2 coronavirus is highly similar (nearly 80% identical at the genomic level) to SARS-CoV-1, which was responsible for the severe acute respiratory syndrome outbreak that began in 2002 (Lu et al., 2020; Zhong et al., 2003).
T20	704-914	Sentence	denotes	Furthermore, human SARS-CoV-2 at the whole-genome level is >95% identical to a bat coronavirus (RaTG13), the natural reservoir host for multiple coronaviruses (Xia, 2020; Zhang et al., 2020; Zhou et al., 2020).
T21	915-1401	Sentence	denotes	Given the rapid appearance and spread of this virus, there is no current validated vaccine or SARS-CoV-2-specific targeting therapy that is clinically approved, although statins, heparin, and steroids look promising for lowering fatality rates, and antivirals likely reduce the duration of symptomatic disease presentation (Alijotas-Reig et al., 2020; Beigel et al., 2020; Beun et al., 2020; Dashti-Khavidaki and Khalili, 2020; Fedson et al., 2020; Shi et al., 2020; Tang et al., 2020).
T22	1402-1567	Sentence	denotes	SARS-CoV-2, like SARS-CoV-1, utilizes the host angiotensin-converting enzyme 2 (ACE2) for binding and entry into host cells (Hoffmann et al., 2020; Li et al., 2003).
T23	1568-1743	Sentence	denotes	Like many viruses, SARS-CoV-2 utilizes a Spike glycoprotein trimer for recognition and binding to the host cell entry receptor and for membrane fusion (Watanabe et al., 2019).
T24	1744-2124	Sentence	denotes	Given the importance of viral Spike proteins for targeting and entry into host cells along with their location on the viral surface, Spike proteins are often used as immunogens for vaccines to generate neutralizing antibodies and frequently targeted for inhibition by small molecules that might block host receptor binding and/or membrane fusion (Li, 2016; Watanabe et al., 2019).
T25	2125-2368	Sentence	denotes	In similar fashion, wild-type or catalytically impaired ACE2 has also been investigated as a potential therapeutic biologic that might interfere with the infection cycle of ACE2-targeting coronaviruses (Lei et al., 2020; Monteil et al., 2020).
T26	2369-2576	Sentence	denotes	Thus, a detailed understanding of SARS-CoV-2 Spike binding to ACE2 is critical for elucidating mechanisms of viral binding and entry, as well as for undertaking the rational design of effective therapeutics.
T27	2577-2741	Sentence	denotes	The SARS-CoV-2 Spike glycoprotein consists of two subunits, a receptor binding subunit (S1) and a membrane fusion subunit (S2) (Lu et al., 2020; Zhou et al., 2020).
T28	2742-3026	Sentence	denotes	The Spike glycoprotein assembles into stable homotrimers that together possess 66 canonical sequons for N-linked glycosylation (N-X-S/T, where X is any amino acid except P) as well as a number of potential O-linked glycosylation sites (Watanabe et al., 2020a; Watanabe et al., 2020b).
T29	3027-3350	Sentence	denotes	Interestingly, coronaviruses virions bud into the lumen of the endoplasmic reticulum-Golgi intermediate compartment, ERGIC, raising unanswered questions regarding the precise mechanisms by which viral surface glycoproteins are processed as they traverse the secretory pathway (Stertz et al., 2007; Ujike and Taguchi, 2015).
T30	3351-3656	Sentence	denotes	Although this and similar studies (Shajahan et al., 2020; Watanabe et al., 2020a) analyze recombinant proteins, a previous study on SARS-CoV-1 suggested that glycosylation of the Spike can be impacted by this intracellular budding, and this remains to be investigated in SARS-CoV-2 (Ritchie et al., 2010).
T31	3657-4024	Sentence	denotes	Nonetheless, it has been proposed that this virus, and others, acquires a glycan coat sufficient and similar enough to endogenous host protein glycosylation that it serves as a glycan shield, facilitating immune evasion by masking non-self viral peptides with self-glycans (Stertz et al., 2007; Ujike and Taguchi, 2015; Watanabe et al., 2020b; Watanabe et al., 2019).
T32	4025-4383	Sentence	denotes	In parallel with their potential masking functions, glycan-dependent epitopes can elicit specific, even neutralizing, antibody responses, as has been described for HIV-1 (Duan et al., 2018; Escolano et al., 2019; Pinto et al., 2020; Seabright et al., 2020; Watanabe et al., 2019; Yu et al., 2018; https://www.biorxiv.org/content/10.1101/2020.06.30.178897v1).
T33	4384-4573	Sentence	denotes	Thus, understanding the glycosylation of the viral Spike trimer is fundamental for the development of efficacious vaccines, neutralizing antibodies, and therapeutic inhibitors of infection.
T34	4574-4689	Sentence	denotes	ACE2 is an integral membrane metalloproteinase that regulates the renin-angiotensin system (Tikellis et al., 2011).
T35	4690-4864	Sentence	denotes	Both SARS-CoV-1 and SARS-CoV-2 have co-opted ACE2 to function as the receptor by which these viruses attach and fuse with host cells (Hoffmann et al., 2020; Li et al., 2003).
T36	4865-5126	Sentence	denotes	ACE2 is cleavable by ADAM proteases at the cell surface (Lambert et al., 2005), resulting in the shedding of a soluble ectodomain that can be detected in apical secretions of various epithelial layers (gastric, airway, etc.) and in serum (Epelman et al., 2009).
T37	5127-5266	Sentence	denotes	The N-terminal extracellular domain of ACE2 contains six canonical sequons for N-linked glycosylation and several potential O-linked sites.
T38	5267-5515	Sentence	denotes	Several nonsynonymous single-nucleotide polymorphisms (SNPs) in the ACE2 gene have been identified in the human population and could potentially alter ACE2 glycosylation and/or affinity of the receptor for the viral Spike protein (Li et al., 2005).
T39	5516-5886	Sentence	denotes	Given that glycosylation can affect the half-life of circulating glycoproteins in addition to modulating the affinity of their interactions with receptors and immune/inflammatory signaling pathways (Marth and Grewal, 2008; Varki, 2017), understanding the impact of glycosylation of ACE2 with respect to its binding of SARS-CoV-2 Spike glycoprotein is of high importance.
T40	5887-6176	Sentence	denotes	The proposed use of soluble extracellular domains of ACE2 as decoy, competitive inhibitors for SARS-CoV-2 infection emphasizes the critical need for understanding the glycosylation profile of ACE2 so that optimally active biologics can be produced (Lei et al., 2020; Monteil et al., 2020).
T41	6177-6539	Sentence	denotes	To accomplish the task of characterizing site-specific glycosylation of the trimer Spike of SARS-CoV-2 and the host receptor ACE2, we began by expressing and purifying a stabilized, soluble trimer Spike glycoprotein mimetic immunogen (that we define here and forward as S, [Yu et al., 2020]) and a soluble version of the ACE2 glycoprotein from a human cell line.
T42	6540-6722	Sentence	denotes	We utilized multiple mass-spectrometry-based approaches, including glycomic and glycoproteomic approaches, to determine occupancy and site-specific heterogeneity of N-linked glycans.
T43	6723-6906	Sentence	denotes	Occupancy (i.e., the percent of any given residue being modified by a glycan) is an important consideration when developing neutralizing antibodies against a glycan-dependent epitope.
T44	6907-7018	Sentence	denotes	We also identified sites of O-linked glycosylation and the heterogeneity of the O-linked glycans on S and ACE2.
T45	7019-7234	Sentence	denotes	We leveraged this rich dataset, along with existing 3D-structures of both glycoproteins, to generate static and molecular dynamics (MD) models of S alone, and in complex with the glycosylated, soluble ACE2 receptor.
T46	7235-7510	Sentence	denotes	By combining bioinformatics characterization of viral evolution and variants of S and ACE2 with MD simulations of the glycosylated S-ACE2 interaction, we identified important roles for glycans in multiple processes, including receptor-viral binding and glycan shielding of S.
T47	7511-7749	Sentence	denotes	Our rich characterization of the recombinant, glycosylated S trimer mimetic immunogen of SARS-CoV-2 in complex with the soluble human ACE2 receptor provides a detailed platform for guiding rational vaccine, antibody, and inhibitor design.
T48	7751-7758	Sentence	denotes	Results
T49	7760-7869	Sentence	denotes	Expression, Purification, and Characterization of SARS-CoV-2 Spike Glycoprotein Trimer and Soluble Human ACE2
T50	7870-8376	Sentence	denotes	A trimer-stabilized, soluble variant of the SARS-CoV-2 S that contains 22 canonical N-linked glycosylation sequons per protomer and a soluble version of human ACE2 that contains six, lacking the most C-terminal seventh, canonical N-linked glycosylation sequons (Figure 1 A) were purified from the media of transfected HEK293 cells, and the quaternary structure confirmed by negative EM staining for the S trimer (Figure 1B) and purity examined by SDS-PAGE Coomassie G-250 stained gels for both (Figure 1C).
T51	8377-8505	Sentence	denotes	In addition, proteolytic digestions followed by proteomic analyses confirmed that the proteins were highly purified (Table S12).
T52	8506-8705	Sentence	denotes	Finally, the N terminus of both the mature S and the soluble mature ACE2 were empirically determined via proteolytic digestions and liquid chromatography-tandem mass spectrometry (LC-MS/MS) analyses.
T53	8706-8934	Sentence	denotes	These results confirmed that both the secreted, mature forms of S protein and ACE2 begin with an N-terminal glutamine that has undergone condensation to form pyroglutamine at residues 14 and 18, respectively (Figures 1D and S1).
T54	8935-9182	Sentence	denotes	The N-terminal peptide observed for S also contains a glycan at Asn-0017 (Figure 1D), and mass spectrometry analysis of non-reducing proteolytic digestions confirmed that Cys-0015 of S is in a disulfide linkage with Cys-0136 (Figure S2; Table S2).
T55	9183-9516	Sentence	denotes	Given that SignalP (Almagro Armenteros et al., 2019) predicts signal sequence cleavage between Cys-0015 and Val-0016 but we observed cleavage between Ser-0013 and Gln-0014, we examined the possibility that an in-frame upstream methionine to the proposed start methionine (Figure 1A) might be used to initiate translation (Figure S3).
T56	9517-9736	Sentence	denotes	If one examines the predicted signal sequence cleavage using the in-frame Met that is encoded nine amino acids upstream, SignalP now predicts cleavage between the Ser and Gln that we observed in our studies (Figure S3).
T57	9737-9978	Sentence	denotes	To examine whether this impacted S expression, we expressed constructs that contained or did not contain the upstream 27 nucleotides in a pseudovirus (VSV) system expressing SARS-CoV-2 S (Figure S4) and in our HEK293 system (data not shown).
T58	9979-10100	Sentence	denotes	Both expression systems produced a similar amount of S regardless of which expression construct was utilized (Figure S4).
T59	10101-10306	Sentence	denotes	Thus, while the translation initiation start site has still not been fully defined, allowing for earlier translation in expression construct design did not have a significant impact on the generation of S.
T60	10307-10420	Sentence	denotes	Figure 1 Expression and Characterization of SARS-CoV-2 Spike Glycoprotein Trimer Immunogen and Soluble Human ACE2
T61	10421-10484	Sentence	denotes	(A) Sequences of SARS-CoV-2 S immunogen and soluble human ACE2.
T62	10485-10591	Sentence	denotes	The N-terminal pyroglutamines for both mature protein monomers are bolded, underlined, and shown in green.
T63	10592-10678	Sentence	denotes	The canonical N-linked glycosylation sequons are bolded, underlined, and shown in red.
T64	10679-10888	Sentence	denotes	(B and C) Negative stain electron microscopy of the purified trimer (B) and Coomassie G-250-stained reducing SDS-PAGE gels (C) confirmed purity of the SARS-CoV-2 S protein trimer and of the soluble human ACE2.
T65	10889-10919	Sentence	denotes	MWM, molecular weight markers.
T66	10920-11089	Sentence	denotes	(D) A representative Step-HCD fragmentation spectrum from mass-spectrometry analysis of a tryptic digest of S annotated manually based on search results from pGlyco 2.2.
T67	11090-11182	Sentence	denotes	This spectrum defines the N terminus of the mature protein monomer as (pyro-)glutamine 0014.
T68	11183-11344	Sentence	denotes	A representative N-glycan consistent with this annotation and our glycomics data (Figure 2) is overlaid by using the Symbol Nomenclature For Glycans (SNFG) code.
T69	11345-11381	Sentence	denotes	This complex glycan occurs at N0017.
T70	11382-11501	Sentence	denotes	Note that, as expected, the cysteine is carbamidomethylated, and the mass accuracy of the assigned peptide is 0.98 ppm.
T71	11502-11614	Sentence	denotes	On the sequence of the N-terminal peptide and in the spectrum, the assigned b (blue) and y (red) ions are shown.
T72	11615-11768	Sentence	denotes	In the spectrum, purple highlights glycan oxonium ions and green marks intact peptide fragment ions with various partial glycan sequences still attached.
T73	11769-11936	Sentence	denotes	Note that the green-labeled ions allow for limited topology to be extracted including defining that the fucose is on the core and not the antennae of the glycopeptide.
T74	11938-12043	Sentence	denotes	Glycomics-Informed Glycoproteomics Reveals Site-Specific Microheterogeneity of SARS-CoV-2 S Glycosylation
T75	12044-12128	Sentence	denotes	We utilized multiple approaches to examine glycosylation of the SARS-CoV-2 S trimer.
T76	12129-12264	Sentence	denotes	First, the portfolio of glycans linked to SARS-CoV-2 S trimer immunogen was analyzed after their release from the polypeptide backbone.
T77	12265-12391	Sentence	denotes	N-glycans were released from protein by treatment with PNGase F, and O-glycans were subsequently released by beta-elimination.
T78	12392-12588	Sentence	denotes	After permethylation to enhance detection sensitivity and structural characterization, released glycans were analyzed by multi-stage mass spectrometry (MSn) (Aoki et al., 2007; Aoki et al., 2008).
T79	12589-12714	Sentence	denotes	Mass spectra were processed by GRITS Toolbox, and the resulting annotations were validated manually (Weatherly et al., 2019).
T80	12715-12871	Sentence	denotes	Glycan assignments were grouped by type and by additional structural features for relative quantification of profile characteristics (Figure 2 A; Table S3).
T81	12872-13040	Sentence	denotes	This analysis quantified 49 N-glycans and revealed that 55% of the total glycan abundance was of the complex type, 17% was of the hybrid type, and 28% was high mannose.
T82	13041-13190	Sentence	denotes	Among the complex and hybrid N-glycans, we observed a high degree of core fucosylation and significant abundance of bisected and LacDiNAc structures.
T83	13191-13432	Sentence	denotes	We also observed sulfated N-linked glycans by using negative mode MSn analyses (Table S13), although signal intensity was too low in positive ion mode (at least 10-fold lower than any of the non-sulfated glycans) for accurate quantification.
T84	13433-13520	Sentence	denotes	In addition, we detected 15 O-glycans released from the S trimer (Figure S5; Table S4).
T85	13521-13659	Sentence	denotes	Figure 2 Glycomics-Informed Glycoproteomics Reveals Substantial Site-Specific Microheterogeneity of N-linked Glycosylation on SARS-CoV-2 S
T86	13660-13763	Sentence	denotes	(A) Glycans released from SARS-CoV-2 S protein trimer immunogen were permethylated and analyzed by MSn.
T87	13764-13885	Sentence	denotes	Structures were assigned and grouped by type and structural features, and prevalence was determined based on ion current.
T88	13886-13944	Sentence	denotes	The pie chart shows basic division by broad N-glycan type.
T89	13945-14013	Sentence	denotes	The bar graph provides additional detail about the glycans detected.
T90	14014-14187	Sentence	denotes	The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.
T91	14188-14412	Sentence	denotes	(B–E) Glycopeptides were prepared from SARS-CoV-2 S protein trimer immunogen by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.
T92	14413-14535	Sentence	denotes	Four representative sites of N-linked glycosylation with specific features of interest were chosen and are presented here.
T93	14536-14764	Sentence	denotes	N0074 (B) and N0149 (C) are shown that occur in variable insert regions of S compared to SARS-CoV and other related coronaviruses, and there are emerging variants of SARS-CoV-2 that disrupt these two sites of glycosylation in S.
T94	14765-14823	Sentence	denotes	N0234 (D) contains the most high-mannose N-linked glycans.
T95	14824-14974	Sentence	denotes	N0801 (E) is an example of glycosylation in the S2 region of the immunogen and displays a high degree of hybrid glycosylation compared to other sites.
T96	14975-15057	Sentence	denotes	The abundance of each composition is graphed in terms of assigned spectral counts.
T97	15058-15178	Sentence	denotes	Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.
T98	15179-15310	Sentence	denotes	The abbreviations used here and throughout the manuscript are as follows: N, HexNAc; H, hexose; F, fucose; A, Neu5Ac; S, sulfation.
T99	15311-15475	Sentence	denotes	Note that the graphs for the other 18 sites and other graphs grouping the microheterogeneity observed by other properties are presented in Supplemental Information.
T100	15476-15716	Sentence	denotes	To determine occupancy of N-linked glycans at each site, we employed a sequential deglycosylation approach by using Endoglycosidase H and PNGase F in the presence of 18O-H2O after tryptic digestion of S (Wang et al., 2020; Yu et al., 2018).
T101	15717-15848	Sentence	denotes	After LC-MS/MS analyses, the resulting data confirmed that 19 of the canonical sequons had occupancies greater than 95% (Table S5).
T102	15849-16013	Sentence	denotes	One canonical sequence, N0149, had insufficient spectral counts for quantification by this method, but subsequent analyses described below suggested high occupancy.
T103	16014-16119	Sentence	denotes	The two most C-terminal N-linked sites, N1173 and N1194, had reduced occupancy, 52% and 82% respectively.
T104	16120-16299	Sentence	denotes	Reduced occupancy at these sites could reflect hindered en bloc transfer by the oligosaccharyltransferase (OST) due to primary amino acid sequences at or near the N-linked sequon.
T105	16300-16632	Sentence	denotes	Alternatively, this could reflect these two sites being post-translationally modified after release of the protein by the ribosome by a less efficient STT3B-containing OST, either due to activity or initial folding of the polypeptide, as opposed to co-translationally modified by the STT3A-containing OST (Ruiz-Canada et al., 2009).
T106	16633-16959	Sentence	denotes	None of the non-canonical sequons (three N-X-C sites and four N-G-L/I/V sites; Zielinska et al., 2010) showed significant occupancy (>5%), except for N0501, which showed moderate (19%) conversion to 18O-Asp that could be due to deamidation that is facilitated by glycine at the +1 position (Table S5) (Palmisano et al., 2012).
T107	16960-17115	Sentence	denotes	Further analysis of this site (see below) by direct glycopeptide analyses allowed us to determine that N0501 undergoes deamidation but is not glycosylated.
T108	17116-17277	Sentence	denotes	Thus, all, and only the, 22 canonical sequences for N-linked glycosylation (N-X-S/T) are utilized, with only N1173 and N1194 demonstrating occupancies below 95%.
T109	17278-17440	Sentence	denotes	Next, we applied three different proteolytic digestion strategies to the SARS-CoV-2 S immunogen to maximize glycopeptide coverage by subsequent LC-MS/MS analyses.
T110	17441-17741	Sentence	denotes	Extended gradient nanoflow reverse-phase LC-MS/MS was carried out on a ThermoFisher Lumos Tribrid instrument using Step-HCD fragmentation on each of the samples (see STAR Methods for details, as well as Duan et al., 2018; Escolano et al., 2019; Wang et al., 2020; Yu et al., 2018; Zhou et al., 2017).
T111	17742-18055	Sentence	denotes	After data analyses using pGlyco 2.2.2 (Liu et al., 2017), Byonic (Bern et al., 2012), and manual validation of glycan compositions against our released glycomics findings (Figure 2A; Tables S3 and S13), we were able to determine the microheterogeneity at each of the 22 canonical sites (Figures 2B–2E; Table S6).
T112	18056-18164	Sentence	denotes	Notably, none of the non-canonical consensus sequences, including N0501, displayed any quantifiable glycans.
T113	18165-18292	Sentence	denotes	The N-glycosites N0074 (Figure 2B) and N0149 (Figure 2C) are highly processed and display a typical mammalian N-glycan profile.
T114	18293-18383	Sentence	denotes	N0149 is, however, modified with several hybrid N-glycan structures, whereas N0074 is not.
T115	18384-18574	Sentence	denotes	N0234 (Figure 2D) and N0801 (Figure 2E) have N-glycan profiles more similar to those found on other viruses such as HIV (Watanabe et al., 2019) that are dominated by high-mannose structures.
T116	18575-18729	Sentence	denotes	N0234 (Figure 2D) displays an abundance of Man7-Man9 high-mannose structures, suggesting stalled processing by early-acting ER and cis-Golgi mannosidases.
T117	18730-18927	Sentence	denotes	In contrast, N0801 (Figure 2E) is processed more efficiently to Man5 high-mannose and hybrid structures, suggesting that access to the glycan at this site by MGAT1 and α-Mannosidase II is hindered.
T118	18928-19195	Sentence	denotes	In general, for all 22 sites (Figures 2B–2E; Table S6), we observed underprocessing of complex glycan antennae (i.e., under-galactosylation and under-sialylation) and a high degree of core fucosylation in agreement with released glycan analyses (Figure 2A; Table S3).
T119	19196-19294	Sentence	denotes	We also observed a small percent of sulfated N-linked glycans at several sites (Tables S6 and S8).
T120	19295-19510	Sentence	denotes	Based on the assignments and the spectral counts for each topology, we were able to determine the percent of total N-linked glycan types (high-mannose, hybrid, or complex) present at each site (Figure 3 ; Table S7).
T121	19511-19757	Sentence	denotes	Notably, three of the sites (N0234, N0709, and N0717) displayed more than 50% high-mannose glycans, whereas 11 other sites (N0017, N0074, N0149, N0165, N0282, N0331, N0657, N1134, N1158, N1173, and N1194) were more than 90% complex when occupied.
T122	19758-19824	Sentence	denotes	The other eight sites were distributed between these two extremes.
T123	19825-19955	Sentence	denotes	Notably, only one site (N0717 at 45%), which also had greater than 50% high mannose (55%), had greater than 33% hybrid structures.
T124	19956-20226	Sentence	denotes	To further evaluate the heterogeneity, we grouped all the topologies into the 20 classes recently described by the Crispin laboratory, adding two categories (sulfated and unoccupied) that we refer to here as the Oxford classification (Table S8) (Watanabe et al., 2020a).
T125	20227-20526	Sentence	denotes	Among other features observed, this classification allowed us to observe that although most sites with high-mannose structures were dominated by the Man5GlcNAc2 structure, N0234 and N0717 were dominated by the higher Man structures of Man8GlcNAc2 and Man7GlcNAc2, respectively (Figure S7; Table S8).
T126	20527-20750	Sentence	denotes	Limited processing at N0234 is in agreement with a recent report suggesting that high-mannose structures at this site help to stabilize the receptor-binding domain of S (www.biorxiv.org/content/10.1101/2020.06.11.146522v1).
T127	20751-21078	Sentence	denotes	Furthermore, applying the Oxford classifications to our dataset clearly demonstrates that the three most C-terminal sites (N1158, N1173, and N1194), dominated by complex-type glycans, were more often further processed (i.e., multiple antennae) and elaborated (i.e., galactosylation and sialylation) than other sites (Table S8).
T128	21079-21173	Sentence	denotes	Figure 3 SARS-CoV-2 S Immunogen N-glycan Sites Are Predominantly Modified by Complex N-glycans
T129	21174-21455	Sentence	denotes	N-glycan topologies were assigned to all 22 sites of the S protomer and the spectral counts for each of the three types of N-glycans (high-mannose, hybrid, and complex), as well as the unoccupied peptide spectral match counts at each site, were summed and visualized as pie charts.
T130	21456-21543	Sentence	denotes	Note that only N1173 and N1194 show an appreciable amount of the unoccupied amino acid.
T131	21544-21827	Sentence	denotes	We also analyzed our generated mass spectrometry data for the presence of O-linked glycans based on our glycomic findings (Figure S5; Table S4) and a recent manuscript suggesting significant levels of O-glycosylation of S1 and S2 when expressed independently (Shajahan et al., 2020).
T132	21828-21964	Sentence	denotes	We were able to confirm sites of O-glycan modification with microheterogeneity observed for the vast majority of these sites (Table S9).
T133	21965-22168	Sentence	denotes	However, occupancy at each site, determined by spectral counts, was observed to be very low (below 4%), except for Thr0323, which had a modestly higher but still low 11% occupancy (Figure S6; Table S10).
T134	22170-22304	Sentence	denotes	3D Structural Modeling of Glycosylated SARS-CoV-2 Trimer Immunogen Enables Predictions of Epitope Accessibility and Other Key Features
T135	22305-22454	Sentence	denotes	A 3D structure of the S trimer was generated by using a homology model of the S trimer described previously (based on PDB: 6VSB; Wrapp et al., 2020).
T137	22455-22758	Sentence	denotes	Onto this 3D structure, we installed explicitly defined glycans at each glycosylated sequon based on one of three separate sets of criteria, thereby generating three different glycoform models for comparison that we denote as “Abundance,” “Oxford Class,” and “Processed” models (STAR Methods; Table S1).
T138	22759-22990	Sentence	denotes	These criteria were chosen in order to generate glycoform models that represent reasonable expectations for glycosylation microheterogeneity and integrate cross-validating glycomic and glycoproteomic characterization of S and ACE2.
T139	22991-23089	Sentence	denotes	The three glycoform models were subjected to multiple all-atom MD simulations with explicit water.
T140	23090-23216	Sentence	denotes	Information from analyses of these structures is presented in Figure 4 A along with the sequence of the SARS-CoV-2 S protomer.
T141	23217-23326	Sentence	denotes	We also determined variants in S that are emerging in the virus that have been sequenced to date (Table S11).
T142	23327-23502	Sentence	denotes	The inter-residue distances were measured between the most α-carbon-distal atoms of the N-glycan sites and Spike glycoprotein population variant sites in 3D space (Figure 4B).
T143	23503-23726	Sentence	denotes	Notable from this analysis, there are several variants that don’t ablate the N-linked sequon but are sufficiently close in 3D space to N-glycosites, such as D138H, H655Y, S939F, and L1203F, to warrant further investigation.
T144	23727-23877	Sentence	denotes	Figure 4 3D Structural Modeling of Glycosylated SARS-CoV-2 Spike Trimer Immunogen Reveals Predictions for Antigen Accessibility and Other Key Features
T145	23878-24070	Sentence	denotes	Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated SARS-CoV-2 S trimer immunogen.
T146	24071-24178	Sentence	denotes	(A) Sequence of the SARS-CoV-2 S immunogen displaying computed antigen accessibility and other information.
T147	24179-24260	Sentence	denotes	Antigen accessibility is indicated by red shading across the amino acid sequence.
T148	24261-24464	Sentence	denotes	(B) Emerging variants confirmed by independent sequencing experiments were analyzed based on the 3D structure of SARS-CoV-2 S to generate a proximity chart to the determined N-linked glycosylation sites.
T149	24465-24678	Sentence	denotes	(C) SARS-CoV-2 S trimer immunogen model from MD simulation displaying abundance glycoforms and antigen accessibility shaded in red for most accessible, white for partial, and black for inaccessible (see Video S1).
T150	24679-24795	Sentence	denotes	(D) SARS-CoV-2 S trimer immunogen model from MD simulation displaying Oxford Class glycoforms and sequence variants.
T151	24796-24921	Sentence	denotes	Asterisk indicates not visible, whereas the box represents three amino acid variants that are clustered together in 3D space.
T152	24922-25093	Sentence	denotes	(E) SARS-CoV-2 S trimer immunogen model from MD simulation displaying processed glycoforms plus shading of Thr-323 that has O-glycosylation at low stoichiometry in yellow.
T153	25094-25351	Sentence	denotes	The percentage of simulation time that each S protein residue is accessible to a probe that approximates the size of an antibody variable domain was calculated for a model of the S trimer by using the Abundance glycoforms (Table S1) (Ferreira et al., 2018).
T154	25352-25522	Sentence	denotes	The predicted antibody accessibility is visualized across the sequence, as well as mapped onto the 3D surface, via color shading (Figures 4A and 4C; Table S13; Video S1).
T155	25523-25805	Sentence	denotes	Additionally, the Oxford Class glycoforms model (Table S1), which is arguably the most encompassing means for representing glycan microheterogeneity because it captures abundant structural topologies (Table S8), is shown with the sequence variant information (Figure 4D; Table S11).
T156	25806-26128	Sentence	denotes	A substantial number of these variants occur (directly by comparison to Figure 4A or visually by comparison to Figure 4C) in regions of high calculated epitope accessibility (e.g., N74K, T76I, R78M, D138H, H146Y, S151I, D253G, V483A, etc.; Table S14), suggesting potential selective pressure to avoid host immune response.
T157	26129-26477	Sentence	denotes	Also, it is interesting to note that three of the emerging variants would eliminate N-linked sequons in S; N74K and T76I would eliminate N-glycosylation of N74 (found in the insert variable region 1 of CoV-2 S compared to CoV-1 S), and S151I eliminates N-glycosylation of N149 (found in the insert variable region 2) (Figures 4A and S7; Table S11).
T158	26478-26733	Sentence	denotes	Lastly, the SARS-CoV-2 S Processed glycoform model is shown (Table S1), along with marking amino acid T0323 that has a modest (11% occupancy, Figure S6; Table S10) amount of O-glycosylation to represent the most heavily glycosylated form of S (Figure 4E).
T159	26734-26743	Sentence	denotes	Video S1.
T160	26744-26802	Sentence	denotes	Glycosylated S Antigen Accessibility, Related to Figure 4C
T161	26804-26885	Sentence	denotes	Glycomics-Informed Glycoproteomics Reveals Complex N-linked Glycosylation of ACE2
T162	26886-27004	Sentence	denotes	We also analyzed ACE2 glycosylation utilizing the same glycomic and glycoproteomic approaches described for S protein.
T163	27005-27234	Sentence	denotes	Glycomic analyses of released N-linked glycans (Figure 5 A; Table S3) revealed that the majority of glycans on ACE2 are complex with limited high-mannose and hybrid glycans, and we were unable to detect sulfated N-linked glycans.
T164	27235-27441	Sentence	denotes	Glycoproteomic analyses revealed that occupancy was high (> 75%) at all six sites, and significant microheterogeneity dominated by complex N-glycans was observed for each site (Figures 5B–5G; Tables S5–S8).
T165	27442-27702	Sentence	denotes	We also observed, consistent with the O-glycomics (Figure S5; Table S4), that Ser 155 and several S/T residues at the C terminus of ACE2 outside of the peptidase domain were O-glycosylated, but stoichiometry was extremely low (less than 2%; Tables S9 and S10).
T166	27703-27823	Sentence	denotes	Figure 5 Glycomics-Informed Glycoproteomics of Soluble Human ACE2 Reveals High Occupancy, Complex N-linked Glycosylation
T167	27824-27912	Sentence	denotes	(A) Glycans released from soluble, purified ACE2 were permethylated and analyzed by MSn.
T168	27913-28031	Sentence	denotes	Structures were assigned, grouped by type and structural features, and prevalence was determined based on ion current.
T169	28032-28090	Sentence	denotes	The pie chart shows basic division by broad N-glycan type.
T170	28091-28159	Sentence	denotes	The bar graph provides additional detail about the glycans detected.
T171	28160-28333	Sentence	denotes	The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.
T172	28334-28539	Sentence	denotes	(B–G) Glycopeptides were prepared from soluble human ACE2 by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.
T173	28540-28599	Sentence	denotes	All six sites of N-linked glycosylation are presented here.
T174	28600-28714	Sentence	denotes	Displayed in the bar graphs are the individual compositions observed graphed in terms of assigned spectral counts.
T175	28715-28835	Sentence	denotes	Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.
T176	28836-28952	Sentence	denotes	The pie chart (analogous to Figure 3 for SARS-CoV-2 S) for each site is displayed in the upper corner of each panel.
T177	28953-28962	Sentence	denotes	(B) N053.
T178	28963-28972	Sentence	denotes	(C) N090.
T179	28973-28982	Sentence	denotes	(D) N103.
T180	28983-28992	Sentence	denotes	(E) N322.
T181	28993-29002	Sentence	denotes	(F) N432.
T182	29003-29066	Sentence	denotes	(G) N546, a site that is absent in three in 10,000 people.
T183	29068-29161	Sentence	denotes	3D Structural Modeling of Glycosylated, Soluble, ACE2-Highlighting Glycosylation and Variants
T184	29162-29437	Sentence	denotes	We integrated our glycomics, glycoproteomics, and population variant analyses results with a 3D model of ACE2 (based on PDB: 6M0J; Lan et al., 2020; see STAR Methods for details) to generate two versions of the soluble glycosylated ACE2 for visualization and MD simulations.
T186	29438-29656	Sentence	denotes	We visualized the ACE2 glycoprotein with the Abundance glycoform model simulated at each site as well as highlighting the naturally occurring variants observed in the human population (Figure 6A; Video S2; Table S11).
T187	29657-29777	Sentence	denotes	Note that the Abundance glycoform model and the Oxford Class glycoform model for ACE2 are identical (Tables S1 and S8).
T188	29778-29965	Sentence	denotes	Notably, one site of N-linked glycosylation (N546) is predicted to not be present in three out of 10,000 humans based on naturally occurring variation in the human population (Table S11).
T189	29966-30035	Sentence	denotes	We also modeled ACE2 using the Processed glycoform model (Figure 6B).
T190	30036-30123	Sentence	denotes	In both models, the interaction domain with S is defined (Figures 6A and 6B; Video S2).
T191	30124-30190	Sentence	denotes	Figure 6 3D Structural Modeling of Glycosylated Soluble Human ACE2
T192	30191-30372	Sentence	denotes	Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated soluble human ACE2.
T193	30373-30505	Sentence	denotes	(A) Soluble human ACE2 model from MD simulations displaying abundance glycoforms, interaction surface with S, and sequence variants.
T194	30506-30597	Sentence	denotes	The N546 variant that would remove N-linked glycosylation at that site is boxed (see Video S2).
T195	30598-30710	Sentence	denotes	(B) Soluble human ACE2 model from MD simulations displaying processed glycoforms and interaction surface with S.
T196	30711-30720	Sentence	denotes	Video S2.
T197	30721-30774	Sentence	denotes	Glycosylated ACE2 with Variants, Related to Figure 6A
T198	30776-30927	Sentence	denotes	MD Simulation of the Glycosylated Trimer Spike of SARS-CoV-2 in Complex with Glycosylated, Soluble, Human ACE2 Reveals Protein and Glycan Interactions
T199	30928-31235	Sentence	denotes	MD simulations were performed to examine the co-complex (generated from a crystal structure of the ACE2-RBD co-complex, PDB: 6M0J; Lan et al., 2020) of glycosylated S with glycosylated ACE2 with the three different glycoforms models (Abundance, Oxford Class, and Processed; Table S1; Videos S5, S6, and S7).
T201	31236-31474	Sentence	denotes	Information from these analyses is laid out along the primary structure (sequence) of the SARS-CoV-2 S protomer and ACE2 highlighting regions of glycan-protein interaction observed in the MD simulations (Table S14; Videos S5, S6, and S7).
T202	31475-31680	Sentence	denotes	Interestingly, two glycans on ACE2 (at N090 and N322), which are highlighted in Figure 7A and shown in a more close-up view in Figure 7B, are predicted to form interactions with the S protein (Table S15).
T203	31681-31903	Sentence	denotes	The N322 glycan interaction with the S trimer is outside of the receptor-binding domain, and the interaction is observed across multiple simulations and throughout each simulation (Figures 7A and 7B; Videos S5, S6, and S7).
T204	31904-32195	Sentence	denotes	The ACE2 glycan at N090 is close enough to the S trimer surface to repeatedly form interactions; however, the glycan arms interact with multiple regions of the surface over the course of the simulations, reflecting the relatively high degree of glycan dynamics (Figures 7A and 7B; Video S3).
T205	32196-32380	Sentence	denotes	Inter-molecule glycan-glycan interactions are also observed repeatedly between the glycan at N546 of ACE2 and those in the S protein at residues N0074 and N0165 (Figure 7D; Table S16).
T206	32381-32555	Sentence	denotes	Finally, a full view of the ACE2-S complex with Oxford class glycoforms on ACE2 illustrates the extensive glycosylation at the interface of the complex (Figure 7C; Video S4).
T207	32556-32713	Sentence	denotes	Figure 7 Interactions of Glycosylated Soluble Human ACE2 and Glycosylated SARS-CoV-2 S Trimer Immunogen Revealed By 3D-Structural Modeling and MD Simulations
T208	32714-32854	Sentence	denotes	(A) MD simulation of glycosylated soluble human ACE2 and glycosylated SARS-CoV-2 S trimer immunogen interaction (see Videos S5, S6, and S7).
T209	32855-32956	Sentence	denotes	ACE2 (top) is colored red with glycans in pink, whereas S is colored white with glycans in dark gray.
T210	32957-33039	Sentence	denotes	Highlighted are ACE2 glycans that interact with S that are magnified to the right.
T211	33040-33229	Sentence	denotes	(B) Magnification of ACE2-S interface highlighting ACE2 glycan interactions by using 3D-SNFG icons (Thieker et al., 2016) with S protein (pink) as well as ACE2-S glycan-glycan interactions.
T212	33230-33350	Sentence	denotes	(C) Magnification of dynamics trajectory of glycans at the interface of soluble human ACE2 and S (see Videos S3 and S4).
T213	33351-33360	Sentence	denotes	Video S3.
T214	33361-33410	Sentence	denotes	Interface of ACE2-S Complex, Related to Figure 7C
T215	33411-33420	Sentence	denotes	Video S4.
T216	33421-33474	Sentence	denotes	The Glycosylated ACE2-S Complex, Related to Figure 7C
T217	33475-33484	Sentence	denotes	Video S5.
T218	33485-33545	Sentence	denotes	Abundance Glycoforms on ACE2-S Complex, Related to Figure 7A
T219	33546-33555	Sentence	denotes	Video S6.
T220	33556-33619	Sentence	denotes	Oxford Class Glycoforms on ACE2-S Complex, Related to Figure 7A
T221	33620-33629	Sentence	denotes	Video S7.
T222	33630-33690	Sentence	denotes	Processed Glycoforms on ACE2-S Complex, Related to Figure 7A
T223	33692-33702	Sentence	denotes	Discussion
T224	33703-34170	Sentence	denotes	We have defined the glycomics-informed, site-specific microheterogeneity of 22 sites of N-linked glycosylation per monomer on a SARS-CoV-2 trimer and the six sites of N-linked glycosylation on a soluble version of its human ACE2 receptor by using a combination of mass spectrometry approaches coupled with evolutionary and variant sequence analyses to provide a detailed understanding of the glycosylation states of these glycoproteins (Figures 1, 2, 3, 4, 5, and 6).
T225	34171-34341	Sentence	denotes	Our results suggest essential roles for glycosylation in mediating receptor binding, antigenic shielding, and potentially the evolution/divergence of these glycoproteins.
T226	34342-34729	Sentence	denotes	The highly glycosylated SARS-CoV-2 Spike protein, unlike several other viral proteins including HIV-1 (Watanabe et al., 2019) but in agreement with another recent report (Watanabe et al., 2020a), presents significantly more processing of N-glycans toward complex glycosylation, suggesting that steric hindrance to processing enzymes is not a major factor at most sites (Figures 2 and 3).
T227	34730-34825	Sentence	denotes	However, the N-glycans still provide considerable shielding of the peptide backbone (Figure 4).
T228	34826-35238	Sentence	denotes	Our glycomics-guided glycoproteomic data are generally in strong agreement with the trimer immunogen data recently published by Crispin (Watanabe et al., 2020a), although we also observed sulfated N-linked glycans; were able to differentiate branching, bisected, and diLacNAc containing structures by glycomics; and observed less occupancy on the two most C-terminal N-linked sites by using a different approach.
T229	35239-35438	Sentence	denotes	Our detection of sulfated N-linked glycans at multiple sites on S is in agreement with a recent manuscript re-analyzing the Crispin data (https://www.biorxiv.org/content/10.1101/2020.05.31.125302v1).
T230	35439-35580	Sentence	denotes	Sulfated N-linked glycans could potentially play key roles in immune regulation and receptor binding as in other viruses (Wang et al., 2009).
T231	35581-35700	Sentence	denotes	This result is especially significant in that sulfated N-glycans were not observed when we performed glycomics on ACE2.
T232	35701-35965	Sentence	denotes	At each individual site, the glycans we observed on our immunogen appear to be slightly more processed, but the major features at each site in our analysis and in the Crispin group’s results (Watanabe et al., 2020a) are nearly superimposable.
T233	35966-36180	Sentence	denotes	By contrast, both our data and Crispin’s (Watanabe et al., 2020a) differ substantially from those of the Azadi group (Shajahan et al., 2020), which analyzed S1 and S2 that had been expressed individually.
T234	36181-36457	Sentence	denotes	When expressed as two separate polypeptides and not purified for trimers, several unoccupied sites of N-linked glycosylation were observed and processing at several sites was significantly different (Shajahan et al., 2020) than we and others (Watanabe et al., 2020a) observed.
T235	36458-36749	Sentence	denotes	Although O-glycosylation has recently been reported for individually expressed S1 and S2 domains of the Spike glycoprotein (Shajahan et al., 2020), in trimeric form the level of O-glycosylation is extremely low, with the highest level of occupancy we observed being 11% at T0323 (Figure 4E).
T236	36750-37019	Sentence	denotes	The low level of O-linked occupancy we observed is in agreement with the Crispin group’s analysis of a Spike Trimer immunogen (Watanabe et al., 2020a) but differs significantly from the Azadi group’s analyses of individually expressed S1 and S2 (Shajahan et al., 2020).
T237	37020-37298	Sentence	denotes	Thus, the context in which the Spike protein is expressed and purified before analysis significantly alters the glycosylation of the protomer, which is reminiscent of previous studies examining expression of the HIV-1 envelope Spike (Behrens et al., 2017; Watanabe et al., 2019).
T238	37299-37453	Sentence	denotes	The soluble ACE2 protein examined here contains six highly utilized sites of N-linked glycosylation dominated by complex type N-linked glycans (Figure 5).
T239	37454-37558	Sentence	denotes	O-glycans were also present on this glycoprotein but at very low levels of occupancy at all sites (<2%).
T240	37559-37769	Sentence	denotes	Our glycomics-informed glycoproteomics allowed us to assign defined sets of glycans to specific glycosylation sites on 3D-structures of S and ACE2 glycoproteins based on experimental evidence (Figures 4 and 6).
T241	37770-38009	Sentence	denotes	Similar to almost all glycoproteins, microheterogeneity is evident at most glycosylation sites of S and ACE2; each glycosylation site can be modified with one of several glycan structures, generating site-specific glycosylation portfolios.
T242	38010-38104	Sentence	denotes	For modeling purposes, however, explicit structures must be placed at each glycosylation site.
T243	38105-38275	Sentence	denotes	To capture the impact of microheterogeneity on the MD simulations of S and ACE2, we chose to generate glycoforms for modeling that represented reasonable portfolios of glycan types.
T244	38276-38557	Sentence	denotes	Using three glycoform models for S (Abundance, Oxford Class, and Processed) and two models for ACE2 (Abundance, which was equivalent to Oxford Class, and Processed), we generated three MD simulations of the co-complexes of these two glycoproteins (Figure 7; Videos S5, S6, and S7).
T245	38558-38726	Sentence	denotes	The observed interactions over time allowed us to evaluate glycan-protein contacts between the two proteins and examine potential glycan-glycan interactions (Figure 7).
T246	38727-38833	Sentence	denotes	We observed glycan-mediated interactions between the S trimer and glycans at N090, N322, and N546 of ACE2.
T247	38834-38985	Sentence	denotes	Thus, variations in glycan occupancy or processing at these sites could alter the affinity of the SARS-CoV-2–ACE2 interaction and modulate infectivity.
T248	38986-39241	Sentence	denotes	It is well established that glycosylation states vary depending on tissue and cell type as well as, in the case of humans, on age (Krištić et al., 2014), underlying disease (Pavić et al., 2018; Rudman et al., 2019), and ethnicity (Gebrehiwot et al., 2018).
T249	39242-39364	Sentence	denotes	Thus, glycosylation portfolios could in part be responsible for tissue tropism and individual susceptibility to infection.
T250	39365-39712	Sentence	denotes	The importance of glycosylation for S binding to ACE2 is even more emphatically demonstrated by the direct glycan-glycan interactions observed (Figure 7) between S glycans (at N0074 and N0165) and an ACE2 receptor glycan (at N546), adding an additional layer of complexity for interpreting the impact of glycosylation on individual susceptibility.
T251	39713-39838	Sentence	denotes	Several emerging variants of the virus appear to be altering N-linked glycosylation occupancy by disrupting N-linked sequons.
T252	39839-40058	Sentence	denotes	Interestingly, the two N-linked sequons in SARS-CoV-2 S directly impacted by variants, N0074 and N0149, are in divergent insert regions 1 and 2, respectively, of SARS-CoV-2 S in comparison with SARS-CoV-1 S (Figure 4A).
T253	40059-40302	Sentence	denotes	The N0074 glycan, in particular, is one of the S glycans that interact directly with an ACE2 glycan (at N546; Figure 7), suggesting that glycan-glycan interactions could contribute to the unique infectivity differences between SARS-CoV-2 and SARS-CoV-1.
T254	40303-40522	Sentence	denotes	These sequon variants will also be important to examine in terms of glycan shielding that could influence immunogenicity and efficacy of neutralizing antibodies, as well as interactions with the host cell receptor ACE2.
T255	40523-40750	Sentence	denotes	Naturally occurring amino acid-changing SNPs in the ACE2 gene generate a number of variants including one variant, with a frequency of three in 10,000 humans, that eliminates a site of N-linked glycosylation at N546 (Figure 6).
T256	40751-41033	Sentence	denotes	Understanding the impact of ACE2 variants on glycosylation and more importantly on S binding, especially for N546S, which impacts the glycan-glycan interaction between S and ACE2 (Figure 7), should be prioritized in light of efforts to develop ACE2 as a potential decoy therapeutic.
T257	41034-41181	Sentence	denotes	Intelligent manipulation of ACE2 glycosylation could lead to more potent biologics capable of acting as better competitive inhibitors of S binding.
T258	41182-41516	Sentence	denotes	The data presented here, and related similar recent findings (Casalino et al., 2020; Watanabe et al., 2020a; Wrobel et al., 2020), provide a framework to facilitate the production of immunogens, vaccines, antibodies, and inhibitors as well as additional information regarding mechanisms by which glycan microheterogeneity is achieved.
T259	41517-41651	Sentence	denotes	However, considerable efforts still remain in order to fully understand the role of glycans in SARS-CoV-2 infection and pathogenicity.
T260	41652-41901	Sentence	denotes	Although HEK-expressed S and ACE2 provide a useful window for understanding human glycosylation of these proteins, glycoproteomic characterization after expression in cell lines of more direct relevance to disease and target tissue is sorely needed.
T261	41902-42099	Sentence	denotes	Although site occupancy could change depending on presentation and cell type (Struwe et al., 2018), processing of N-linked glycans will almost certainly be altered in a cell-type-dependent fashion.
T262	42100-42391	Sentence	denotes	Thus, analyses of the Spike trimer extracted from pseudoviruses, virion-like particles, and ultimately from infectious SARS-CoV-2 virions harvested from airway cells or patients will provide the most accurate view of how trimer immunogens reflect the true glycosylation pattern of the virus.
T263	42392-42599	Sentence	denotes	Detailed analyses of the impact of emerging variants in S and natural and designed-for-biologics variants of ACE2 on glycosylation and binding properties are important next steps for developing therapeutics.
T264	42600-42860	Sentence	denotes	Finally, it will be important to monitor the slow evolution of the virus to determine if existing sites of glycosylation are lost or new sites emerge with selective pressure that might alter the efficacy of vaccines, neutralizing antibodies, and/or inhibitors.
T265	42862-42874	Sentence	denotes	STAR★Methods
T266	42876-42895	Sentence	denotes	Key Resources Table
T267	42896-42933	Sentence	denotes	REAGENT or RESOURCE SOURCE IDENTIFIER
T268	42934-42979	Sentence	denotes	Chemicals, Peptides, and Recombinant Proteins
T269	42980-43015	Sentence	denotes	SARS-CoV-2 S protein This Study N/A
T270	43016-43049	Sentence	denotes	Human ACE2 protein This Study N/A
T271	43050-43095	Sentence	denotes	2x Laemmli sample buffer Bio-Rad Cat#161-0737
T272	43096-43189	Sentence	denotes	Invitrogen NuPAGE 4 to 12%, Bis-Tris, Mini Protein Gel Thermo Fisher Scientific Cat#NP0321PK2
T273	43190-43259	Sentence	denotes	Coomassie Brilliant Blue G-250 Dye Thermo Fisher Scientific Cat#20279
T274	43260-43298	Sentence	denotes	Dithiothreitol Sigma Aldrich Cat#43815
T275	43299-43336	Sentence	denotes	Iodoacetamide Sigma Aldrich Cat#I1149
T276	43337-43362	Sentence	denotes	Trypsin Promega Cat#V5111
T277	43363-43386	Sentence	denotes	Lys-C Promega Cat#V1671
T278	43387-43410	Sentence	denotes	Arg-C Promega Cat#V1881
T279	43411-43434	Sentence	denotes	Glu-C Promega Cat#V1651
T280	43435-43459	Sentence	denotes	Asp-N Promega Cat#VA1160
T281	43460-43495	Sentence	denotes	Endoglycosidase H Promega Cat#V4871
T282	43496-43521	Sentence	denotes	PNGaseF Promega Cat#V4831
T283	43522-43582	Sentence	denotes	Chymotrypsin Athens Research and Technology Cat#16-19-030820
T284	43583-43633	Sentence	denotes	Alpha lytic protease New England BioLabs Cat#P8113
T285	43634-43687	Sentence	denotes	18O water Cambridge Isotope Laboratories OLM-782-10-1
T286	43688-43730	Sentence	denotes	O-protease OpeRATOR Genovis Cat#G1-OP1-020
T287	43731-43745	Sentence	denotes	Deposited Data
T288	43746-43847	Sentence	denotes	MS data for site-specific N-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019937
T289	43848-43949	Sentence	denotes	MS data for site-specific O-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019940
T290	43950-44052	Sentence	denotes	MS data for deglycosylated N-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019938
T291	44053-44126	Sentence	denotes	MS data for disulfide bond analysis for SARS-CoV-2 S This Study PXD019939
T292	44127-44202	Sentence	denotes	MS data for N-linked glycomics deposited at GlycoPost This Study GPST000120
T293	44203-44278	Sentence	denotes	MS data for O-linked glycomics deposited at GlycoPost This Study GPST000121
T294	44279-44299	Sentence	denotes	Experimental Models:
T295	44300-44310	Sentence	denotes	Cell Lines
T296	44311-44339	Sentence	denotes	293-F Cells GIBCO Cat#R79007
T297	44340-44365	Sentence	denotes	Vero-6 Cells ATCC CRL1586
T298	44366-44386	Sentence	denotes	Experimental Models:
T299	44387-44404	Sentence	denotes	Organisms/Strains
T300	44405-44440	Sentence	denotes	VSV(G)-Pseudoviruses This Study N/A
T301	44441-44464	Sentence	denotes	Software and Algorithms
T302	44465-44545	Sentence	denotes	pGlyco v2.2.2 Liu et al., 2017 http://pfind.ict.ac.cn/software/pGlyco/index.html
T303	44546-44611	Sentence	denotes	Proteome Discoverer v1.4 Thermo Fisher Scientific CAT#OPTON-30945
T304	44612-44695	Sentence	denotes	Byonic v3.8.13 Protein Metrics Inc. https://www.proteinmetrics.com/products/byonic/
T305	44696-44818	Sentence	denotes	ProteoIQ v2.7 Premier Biosoft (Bern et al., 2012) http://www.premierbiosoft.com/protein_quantification_software/index.html
T306	44819-44890	Sentence	denotes	GRITS Toolbox V1.1 Weatherly et al., 2019 http://www.grits-toolbox.org/
T307	44891-44976	Sentence	denotes	EMBOSS needle v6.6.0 Rice et al., 2000 https://www.ebi.ac.uk/Tools/psa/emboss_needle/
T308	44977-45033	Sentence	denotes	Biopython v1.76 Cock et al., 2009 https://biopython.org/
T309	45034-45081	Sentence	denotes	Rpdb v2.3 Julien Ide https://rdrr.io/cran/Rpdb/
T310	45082-45166	Sentence	denotes	SignalP V5.0 Almagro Armenteros et al., 2019 http://www.cbs.dtu.dk/services/SignalP/
T311	45167-45265	Sentence	denotes	LibreOFFICE Writer v6.4.4.2 The Document Foundation https://www.libreoffice.org/download/download/
T312	45266-45318	Sentence	denotes	GlyGen V1.5 York et al., 2020 https://www.glygen.org
T313	45319-45409	Sentence	denotes	GNOme V1.5.5 OBO Foundry https://github.com/glygen-glycan-data/GNOme/blob/master/README.md
T314	45410-45476	Sentence	denotes	GlyTouCan V3.1.0 Aoki-Kinoshita et al., 2016 https://glytoucan.org
T315	45477-45563	Sentence	denotes	Inkscape V1.0 Inkscape project contributors https://inkscape.org/release/inkscape-1.0/
T316	45564-45617	Sentence	denotes	ffmpeg V3.4 The FFmpeg developers https://ffmpeg.org/
T317	45618-45673	Sentence	denotes	Cygwin V3.1.5 Cygwin developers https://www.cygwin.com/
T318	45675-45696	Sentence	denotes	Resource Availability
T319	45698-45710	Sentence	denotes	Lead Contact
T320	45711-45919	Sentence	denotes	Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Lance Wells (lwells@ccrc.uga.edu) or alternatively by Peng Zhao (pengzhao@uga.edu).
T321	45921-45943	Sentence	denotes	Materials Availability
T322	45944-45992	Sentence	denotes	This study did not generate new unique reagents.
T323	45994-46020	Sentence	denotes	Data and Code Availability
T324	46021-46157	Sentence	denotes	The mass spectrometry proteomics data are available via ProteomeXchange with identifiers PXD019937, PXD019940, PXD019938, and PXD019939.
T325	46158-46266	Sentence	denotes	The mass spectrometry glycomics data are available via GlycoPost with identifiers GPST000120 and GPST000121.
T326	46268-46306	Sentence	denotes	Experimental Model and Subject Details
T327	46307-46418	Sentence	denotes	HEK293-F cells (GIBCO) were maintained and passaged in FreeStyle Media (GIBCO) containing 1% Pen Strep (GIBCO).
T328	46419-46580	Sentence	denotes	Vero-6 cells (ATCC) were maintained and passaged in DMEM medium supplemented with 10% fetal bovine serum and 1% Pen Strep (GIBCO) and amphotericin B antibiotics.
T329	46581-46657	Sentence	denotes	All cells were maintained at 37°C with 5% CO2 before and after transfection.
T330	46659-46673	Sentence	denotes	Method Details
T331	46675-46761	Sentence	denotes	Expression, Purification, and Characterization of SARS-CoV-2 S and Human ACE2 Proteins
T332	46762-47189	Sentence	denotes	To express a stabilized ectodomain of Spike protein, a synthetic gene encoding residues 1−1208 of SARS-CoV-2 Spike with the furin cleavage site (residues 682–685) replaced by a “GGSG” sequence, proline substitutions at residues 986 and 987, and a foldon trimerization motif followed by a C-terminal 6xHisTag was created and cloned into the mammalian expression vector pCMV-IRES-puro (Codex BioSolutions, Inc, Gaithersburg, MD).
T333	47190-47319	Sentence	denotes	The expression construct was transiently transfected in HEK293F cells using polyethylenimine (Polysciences, Inc, Warrington, PA).
T334	47320-47564	Sentence	denotes	Protein was purified from cell supernatants using Ni-NTA resin (QIAGEN, Germany), the eluted fractions containing S protein were pooled, concentrated, and further purified by gel filtration chromatography on a Superose 6 column (GE Healthcare).
T335	47565-47662	Sentence	denotes	Negative stain electron microscopy (EM) analysis was performed as described (Shaik et al., 2019).
T336	47663-47956	Sentence	denotes	Briefly, analysis was performed at room temperature with a magnification of 52,000x and a defocus value of 1.5 μm following low-dose procedures, using a Philips Tecnai F20 electron microscope (Thermo Fisher Scientific) equipped with a Gatan US4000 CCD camera and operated at a voltage of 200 kV.
T337	47957-48102	Sentence	denotes	The DNA fragment encoding human ACE2 (1-615) with a 6xHis tag at the C terminus was synthesized by Genscript and cloned into the vector pCMV-IRES-puro.
T338	48103-48184	Sentence	denotes	The expression construct was transfected in HEK293F cells using polyethylenimine.
T339	48185-48261	Sentence	denotes	The medium was discarded and replaced with FreeStyle 293 medium after 6-8 h.
T340	48262-48387	Sentence	denotes	After incubation at 37°C with 5.5% CO2 for 5 days, the supernatant was collected and loaded onto Ni-NTA resin for purification.
T341	48388-48463	Sentence	denotes	The eluate was concentrated and further purified on a Superdex 200 column.
T342	48465-48520	Sentence	denotes	In-Gel Analysis of SARS-CoV-2 S and Human ACE2 Proteins
T343	48521-48781	Sentence	denotes	A 3.5-μg aliquot of SARS-CoV-2 S protein as well as a 2-μg aliquot of human ACE2 were combined with Laemmli sample buffer, analyzed on a 4%–12% Invitrogen NuPage Bis-Tris gel using the MES pH 6.5 running buffer, and stained with Coomassie Brilliant Blue G-250.
T344	48783-48875	Sentence	denotes	Analysis of N-linked and O-linked Glycans Released from SARS-CoV-2 S and Human ACE2 Proteins
T345	48876-49030	Sentence	denotes	Aliquots of approximately 25-50 μg of S or ACE2 protein were processed for glycan analysis as previously described (Aoki et al., 2007; Aoki et al., 2008).
T346	49031-49101	Sentence	denotes	For N-linked glycan analysis, the proteins were digested with trypsin.
T347	49102-49234	Sentence	denotes	Following trypsinization, glycopeptides were enriched by C18 Sep-Pak and subjected to PNGaseF digestion to release N-linked glycans.
T348	49235-49372	Sentence	denotes	Following PNGaseF digestion, released glycans were separated from residual glycosylated peptides bearing O-linked glycans by C18 Sep-Pak.
T349	49373-49492	Sentence	denotes	O-glycosylated peptides were eluted from the Sep-Pak and subjected to reductive β-elimination to release the O-glycans.
T350	49493-49610	Sentence	denotes	Another 25-50 μg aliquot of each protein was denatured with SDS and digested with PNGaseF to remove N-linked glycans.
T351	49611-49751	Sentence	denotes	The de-N-glycosylated, intact protein was precipitated with cold ethanol and then subjected to reductive β-elimination to release O-glycans.
T352	49752-49852	Sentence	denotes	The profiles of O-glycans released from peptides or from intact protein were found to be comparable.
T353	49853-50036	Sentence	denotes	N- and O-linked glycans released from glycoproteins were permethylated with methyl iodide according to the method of Anumula and Taylor prior to MS analysis (Anumula and Taylor, 1992).
T354	50037-50158	Sentence	denotes	Glycan structural analysis was performed using an LTQ-Orbitrap instrument (Orbitrap Discovery, Thermo Fisher Scientific).
T355	50159-50469	Sentence	denotes	Detection and relative quantification of the prevalence of individual glycans was accomplished using the total ion mapping (TIM) and neutral loss scan (NL scan) functionality of the Xcalibur software package version 2.0 (Thermo Fisher Scientific) as previously described (Aoki et al., 2007; Aoki et al., 2008).
T356	50470-50583	Sentence	denotes	Mass accuracy and detector response were tuned with a permethylated oligosaccharide standard in positive ion mode.
T357	50584-50705	Sentence	denotes	For fragmentation by collision-induced dissociation (CID in MS2 and MSn), normalized collision energy of 45% was applied.
T358	50706-50812	Sentence	denotes	Most permethylated glycans were identified as singly or doubly charged, sodiated species in positive mode.
T359	50813-50917	Sentence	denotes	Sulfated N-glycans were detected as singly or doubly charged, deprotonated species in negative ion mode.
T360	50918-51014	Sentence	denotes	Peaks for all charge states were deconvoluted by the charge state and summed for quantification.
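The summation across charge states described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the glycan labels and intensity values are hypothetical, and it assumes each peak has already been assigned to a glycan and a charge state.

```python
from collections import defaultdict


def relative_prevalence(peaks):
    """Sum signal over charge states per glycan and return each glycan's percent of total signal.

    peaks: iterable of (glycan_label, charge, intensity) tuples.
    """
    totals = defaultdict(float)
    for label, _charge, intensity in peaks:
        totals[label] += intensity  # collapse all charge states of one glycan
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("no signal to quantify")
    return {label: 100.0 * value / grand_total for label, value in totals.items()}


# Hypothetical peak list: one glycan seen in two charge states, another in one.
example_peaks = [("glycan_A", 1, 300.0), ("glycan_A", 2, 100.0), ("glycan_B", 2, 600.0)]
print(relative_prevalence(example_peaks))
```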
T361	51015-51067	Sentence	denotes	All spectra were manually interpreted and annotated.
T362	51068-51192	Sentence	denotes	The explicit identities of individual monosaccharide residues have been assigned based on known human biosynthetic pathways.
T363	51193-51389	Sentence	denotes	Graphical representations of monosaccharide residues are consistent with the Symbol Nomenclature for Glycans (SNFG), which has been broadly adopted by the glycomics community (Varki et al., 2015).
T364	51390-51578	Sentence	denotes	The MS-based glycomics data generated in these analyses and the associated annotations are presented in accordance with the MIRAGE standards and the Athens Guidelines (Wells et al., 2013).
T365	51579-51794	Sentence	denotes	Data annotation and assignment of glycan accession identifiers were facilitated by GRITS Toolbox, GlyTouCan, GNOme, and GlyGen (Kahsay et al., 2020; Tiemeyer et al., 2017; Weatherly et al., 2019; York et al., 2020).
T366	51796-51857	Sentence	denotes	Analysis of Disulfide Bonds for SARS-CoV-2 S Protein by LC-MS
T367	51858-52043	Sentence	denotes	Two 10-μg aliquots of SARS-CoV-2 S protein were denatured by incubating with 20% acetonitrile at room temperature and alkylated with 13.75 mM iodoacetamide at room temperature in the dark.
T368	52044-52178	Sentence	denotes	The two aliquots of proteins were then digested respectively using alpha lytic protease, or a combination of trypsin, Lys-C and Glu-C.
T369	52179-52254	Sentence	denotes	Following digestion, the proteins were deglycosylated by PNGaseF treatment.
T370	52255-52478	Sentence	denotes	The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T371	52479-52624	Sentence	denotes	The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T372	52625-52722	Sentence	denotes	The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T373	52723-52903	Sentence	denotes	Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.
T374	52904-53044	Sentence	denotes	The raw spectra were analyzed by Byonic (v3.8.13, Protein Metrics Inc.) with mass tolerance set as 20 ppm for both precursors and fragments.
T375	53045-53125	Sentence	denotes	The search output was filtered at 1% false discovery rate and 10 ppm mass error.
T376	53126-53208	Sentence	denotes	The spectra assigned as cross-linked peptides were manually evaluated for Cys0015.
T377	53210-53308	Sentence	denotes	Analysis of Site-Specific N-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T378	53309-53488	Sentence	denotes	Four 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM of dithiothreitol at 56°C and alkylated by 27.5 mM of iodoacetamide at room temperature in the dark.
T379	53489-53664	Sentence	denotes	The four aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, a combination of trypsin and Glu-C, or a combination of Glu-C and AspN.
T380	53665-53836	Sentence	denotes	Three 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.
T381	53837-53980	Sentence	denotes	The three aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, or a combination of trypsin and Lys-C.
T382	53981-54204	Sentence	denotes	The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T383	54205-54350	Sentence	denotes	The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T384	54351-54448	Sentence	denotes	The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T385	54449-54816	Sentence	denotes	Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) were collected in the Orbitrap at 15k resolution. pGlyco v2.2.2 (Liu et al., 2017) was used for database searches with mass tolerance set as 20 ppm for both precursors and fragments.
T386	54817-54925	Sentence	denotes	The database search output was filtered to reach a 1% false discovery rate for glycans and 10% for peptides.
T387	54926-55025	Sentence	denotes	Quantitation was performed by calculating spectral counts for each glycan composition at each site.
T388	55026-55121	Sentence	denotes	Any N-linked glycan compositions identified by only one spectrum were removed from quantitation.
T389	55122-55207	Sentence	denotes	N-linked glycan compositions were categorized into 22 classes (including Unoccupied):
T390	55208-55550	Sentence	denotes	HexNAc(2)Hex(9∼5)Fuc(0∼1) was classified as M9 to M5 respectively; HexNAc(2)Hex(4∼1)Fuc(0∼1) was classified as M1-M4; HexNAc(3∼6)Hex(5∼9)Fuc(0)NeuAc(0∼1) was classified as Hybrid with HexNAc(3∼6)Hex(5∼9)Fuc(1∼2)NeuAc(0∼1) classified as F-Hybrid; Complex-type glycans are classified based on the number of antenna, fucosylation, and sulfation:
T391	55551-56289	Sentence	denotes	HexNAc(3)Hex(3∼4)Fuc(0)NeuAc(0∼1) is assigned as A1 with HexNAc(3)Hex(3∼4)Fuc(1∼2)NeuAc(0∼1) assigned as F-A1; HexNAc(4)Hex(3∼5)Fuc(0)NeuAc(0∼2) is assigned as A2/A1B with HexNAc(4)Hex(3∼5)Fuc(1∼5)NeuAc(0∼2) assigned as F-A2/A1B; HexNAc(5)Hex(3∼6)Fuc(0)NeuAc(0∼3) is assigned as A3/A2B with HexNAc(5)Hex(3∼6)Fuc(1∼3)NeuAc(0∼3) assigned as F-A3/A2B; HexNAc(6)Hex(3∼7)Fuc(0)NeuAc(0∼4) is assigned as A4/A3B with HexNAc(6)Hex(3∼7)Fuc(1∼3)NeuAc(0∼4) assigned as F-A4/A3B; HexNAc(7)Hex(3∼8)Fuc(0)NeuAc(0∼1) is assigned as A5/A4B with HexNAc(7)Hex(3∼8)Fuc(1∼3)NeuAc(0∼1) as F-A5/A4B; HexNAc(8)Hex(3∼9)Fuc(0) is assigned as A6/A5B with HexNAc(8)Hex(3∼9)Fuc(1) assigned as F-A6/A5B; any glycans identified with a sulfate are assigned as Sulfated.
T392	56291-56363	Sentence	denotes	Analysis of Deglycosylated SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T393	56364-56544	Sentence	denotes	Three 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM of dithiothreitol at 56°C and alkylated by 27.5 mM of iodoacetamide at room temperature in the dark.
T394	56545-56661	Sentence	denotes	The three aliquots were then digested respectively using chymotrypsin, Asp-N, or a combination of trypsin and Glu-C.
T395	56662-56831	Sentence	denotes	Two 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.
T396	56832-56939	Sentence	denotes	The two aliquots were then digested respectively using chymotrypsin, or a combination of trypsin and Lys-C.
T397	56940-57074	Sentence	denotes	Following digestion, the proteins were deglycosylated by Endoglycosidase H followed by PNGaseF treatment in the presence of 18O water.
T398	57075-57298	Sentence	denotes	The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T399	57299-57444	Sentence	denotes	The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T400	57445-57542	Sentence	denotes	The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T401	57543-57729	Sentence	denotes	Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following collision-induced dissociation (CID) at 38% collision energy were collected in the ion trap.
T402	57730-57870	Sentence	denotes	The spectra were analyzed using SEQUEST (Proteome Discoverer 1.4) with mass tolerance set as 20 ppm for precursors and 0.5 Da for fragments.
T403	57871-58001	Sentence	denotes	The search output was filtered using ProteoIQ (v2.7) to reach a 1% false discovery rate at protein level and 10% at peptide level.
T404	58002-58220	Sentence	denotes	Occupancy of each N-linked glycosylation site was calculated using spectral counts assigned to the 18O-Asp-containing (PNGaseF-cleaved) and/or HexNAc-modified (EndoH-cleaved) peptides and their unmodified counterparts.
T405	58222-58320	Sentence	denotes	Analysis of Site-Specific O-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS
T406	58321-58538	Sentence	denotes	Three 10-μg aliquots of SARS-CoV-2 S protein and one 10-μg aliquot of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.
T407	58539-58656	Sentence	denotes	The four aliquots were then digested respectively using trypsin, Lys-C, Arg-C, or a combination of trypsin and Lys-C.
T408	58657-58776	Sentence	denotes	Following digestion, the proteins were deglycosylated by PNGaseF treatment and then digested with O-protease OpeRATOR®.
T409	58777-59000	Sentence	denotes	The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.
T410	59001-59146	Sentence	denotes	The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.
T411	59147-59244	Sentence	denotes	The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.
T412	59245-59519	Sentence	denotes	Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) or electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.
T413	59520-59638	Sentence	denotes	The raw spectra were analyzed by Byonic (v3.8.13) with mass tolerance set as 20 ppm for both precursors and fragments.
T414	59639-59740	Sentence	denotes	MS/MS filtering was applied to only allow for spectra where the oxonium ions of HexNAc were observed.
T415	59741-59821	Sentence	denotes	The search output was filtered at 1% false discovery rate and 10 ppm mass error.
T416	59822-59893	Sentence	denotes	The spectra assigned as O-linked glycopeptides were manually evaluated.
T417	59894-59993	Sentence	denotes	Quantitation was performed by calculating spectral counts for each glycan composition at each site.
T418	59994-60089	Sentence	denotes	Any O-linked glycan compositions identified by only one spectrum were removed from quantitation.
T419	60090-60283	Sentence	denotes	Occupancy of each O-linked glycosylation site was calculated using spectral counts assigned to any glycosylated peptides and their unmodified counterparts from searches without MS/MS filtering.
T420	60285-60342	Sentence	denotes	Sequence Analysis of SARS-CoV-2 S and Human ACE2 Proteins
T421	60343-60488	Sentence	denotes	The genomes of SARS-CoV as well as bat and pangolin coronavirus sequences reported to be closely related to SARS-CoV-2 were downloaded from NCBI.
T422	60489-60660	Sentence	denotes	The S protein sequences from all of those genomes were aligned using EMBOSS needle v6.6.0 (Rice et al., 2000) via the EMBL-EBI provided web service (Madeira et al., 2019).
T423	60661-60761	Sentence	denotes	Manual analysis was performed in the regions containing canonical N-glycosylation sequons (N-X-S/T).
T424	60762-61081	Sentence	denotes	For further sequence analysis of SARS-CoV-2 S variants, the genomes of SARS-CoV-2 were downloaded from NCBI and GISAID and further processed using Biopython 1.76 to extract all sequences annotated as “surface glycoprotein” and to remove any incomplete sequence as well as any sequence containing unassigned amino acids.
T425	61082-61305	Sentence	denotes	For sequence analysis of human ACE2 variants, the single nucleotide polymorphisms (SNPs) of ACE2 were extracted from the NCBI dbSNP database and filtered for missense mutation entries with a reported minor allele frequency.
T426	61306-61467	Sentence	denotes	Manual analysis was performed on both SARS-CoV-2 S and human ACE2 variants to further examine the regions containing canonical N-glycosylation sequons (N-X-S/T).
T427	61468-61577	Sentence	denotes	LibreOffice Writer and its macro capabilities were used to shade regions on the linear sequence of S and ACE2.
T428	61579-61688	Sentence	denotes	3D Structural Modeling and Molecular Dynamics Simulation of Glycosylated SARS-CoV-2 S and Human ACE2 Proteins
T429	61689-61991	Sentence	denotes	SARS-CoV-2 Spike (S) protein structure and ACE2 co-complex – A 3D structure of the prefusion form of the S protein (RefSeq: YP_009724390.1, UniProt: P0DTC2 SPIKE_SARS2), based on a Cryo-EM structure (PDB code 6VSB) (Wrapp et al., 2020), was obtained from the SWISS-MODEL server (swissmodel.expasy.org).
T430	61992-62058	Sentence	denotes	The model has 95% coverage (residues 27 to 1146) of the S protein.
T431	62059-62220	Sentence	denotes	The receptor binding domain (RBD) in the “open” conformation was replaced with the RBD from an ACE2 co-complex (PDB code 6M0J) by grafting residues C336 to V524.
T432	62221-62489	Sentence	denotes	Glycoform generation – Glycans (detected by glycomics) were selected for installation on glycosylated S and ACE2 sequons (detected by glycoproteomics) based on three sets of criteria designed to reasonably capture different aspects of glycosylation microheterogeneity.
T433	62490-62829	Sentence	denotes	We denote the first of these glycoform models as “Abundance.” The glycans selected for installation to generate the Abundance model were chosen because they were identified as the most abundant glycan structure (detected by glycomics) that matched the most abundant glycan composition (detected by glycoproteomics) at each individual site.
T434	62830-63215	Sentence	denotes	We denote the second glycoform model as “Oxford Class.” The glycans selected for installation to generate the Oxford Class model were chosen because they were the most abundant glycan structure, (detected by glycomics) that was contained within the most highly represented Oxford classification group (detected by glycoproteomics) at each individual site (Figure S7; Tables S1 and S8).
T435	63216-63650	Sentence	denotes	Finally, we denote the third glycoform model as “Processed.” The glycans selected for installation to generate the Processed model were chosen because they were the most highly trimmed, elaborated, or terminally decorated structure (detected by glycomics) that corresponded to a composition (detected by glycoproteomics) which was present at ≥ 1/3rd of the abundance of the most highly represented composition at each site (Table S1).
T436	63651-63827	Sentence	denotes	3D structures of the three glycoforms (Abundance, Oxford Class, Processed) were generated for the SARS-CoV-2 S protein alone, and in complex with the glycosylated ACE2 protein.
T437	63828-64191	Sentence	denotes	The glycoprotein builder available at GLYCAM-Web (www.glycam.org) was employed together with an in-house program that adjusts the asparagine side chain torsion angles and glycosidic linkages within known low-energy ranges (Nivedha et al., 2014) to relieve any atomic overlaps with the core protein, as described previously (Grant et al., 2016; Peng et al., 2017).
T438	64192-64391	Sentence	denotes	Energy minimization and Molecular dynamics (MD) simulations – Each glycosylated structure was placed in a periodic box of TIP3P water molecules with a 10 Å buffer between the solute and the box edge.
T439	64392-64587	Sentence	denotes	Energy minimization of all atoms was performed for 20,000 steps (10,000 steepest descent, followed by 10,000 conjugate gradient) under constant pressure (1 atm) and temperature (300 K) conditions.
T440	64588-64830	Sentence	denotes	All MD simulations were performed under nPT conditions with the CUDA implementation of the PMEMD (Götz et al., 2012; Salomon-Ferrer et al., 2013) simulation code, as present in the Amber14 software suite (University of California, San Diego).
T441	64831-64999	Sentence	denotes	The GLYCAM06j force field (Kirschner et al., 2008) and Amber14SB force field (Maier et al., 2015) were employed for the carbohydrate and protein moieties, respectively.
T442	65000-65193	Sentence	denotes	A Berendsen barostat with a time constant of 1 ps was employed for pressure regulation, while a Langevin thermostat with a collision frequency of 2 ps⁻¹ was employed for temperature regulation.
T443	65194-65246	Sentence	denotes	A nonbonded interaction cut-off of 8 Å was employed.
T444	65247-65356	Sentence	denotes	Long-range electrostatics were treated with the particle-mesh Ewald (PME) method (Darden and Pedersen, 1993).
T445	65357-65491	Sentence	denotes	Covalent bonds involving hydrogen were constrained with the SHAKE algorithm, allowing an integration time step of 2 fs to be employed.
T446	65492-65605	Sentence	denotes	The energy-minimized coordinates were equilibrated at 300 K over 400 ps with restraints on the solute heavy atoms.
T447	65606-65854	Sentence	denotes	Each system was then equilibrated with restraints on the Cα atoms of the protein for 1 ns, prior to initiating 4 independent 250 ns production MD simulations with random starting seeds for a total time of 1 μs per system, with no restraints applied.
T448	65855-65882	Sentence	denotes	Antigenic surface analysis.
T449	65883-66237	Sentence	denotes	A series of 3D structure snapshots of the simulation were taken at 1 ns intervals and analyzed in terms of their ability to interact with a spherical probe based on the average size of hypervariable loops present in an antibody complementarity determining region (CDR), as described recently (https://www.biorxiv.org/content/10.1101/2020.04.07.030445v2).
T450	66238-66391	Sentence	denotes	The percentage of simulation time each residue was exposed to the AbASA probe was calculated and plotted onto both the 3D structure and primary sequence.
T451	66393-66458	Sentence	denotes	Analysis of SARS-CoV-2 Spike VSV Pseudoparticles (ppVSV-SARS-2-S)
T452	66459-66566	Sentence	denotes	293T cells were transfected with an expression plasmid encoding SARS-CoV-2 Spike (pcDNAintron-SARS-2-SΔ19).
T453	66567-66679	Sentence	denotes	To increase cell surface expression, the last 19 amino acids containing the Golgi retention signal were removed.
T454	66680-66761	Sentence	denotes	Two SΔ19 constructs were compared: one starting with Met1 and the other with Met2.
T455	66762-66895	Sentence	denotes	Twenty-four h following transfection, cells were transduced with ppVSVΔG-VSV-G (particles that were pseudotyped with VSV-G in trans).
T456	66896-66978	Sentence	denotes	One h following transduction cells were extensively washed and media was replaced.
T457	66979-67093	Sentence	denotes	Supernatant containing particles was collected 12-24 h following transduction and cleared through centrifugation.
T458	67094-67149	Sentence	denotes	Cleared supernatant was frozen at −80°C for future use.
T459	67150-67247	Sentence	denotes	Target cells Vero E6 were seeded in 24-well plates (5 × 10^5 cells/mL) at a density of 80% coverage.
T460	67248-67437	Sentence	denotes	The following day, ppVSV-SARS-2-S/GFP particles were transduced into target cells for 60 min; particles pseudotyped with VSV-G, Lassa virus GP, or no glycoprotein were included as controls.
T461	67438-67658	Sentence	denotes	24 h following transduction, transduced cells were released from the plate with trypsin, fixed with 4% formaldehyde, and GFP-positive virus-transduced cells were quantified using flow cytometry (Becton Dickinson BD-LSRII).
T462	67659-67895	Sentence	denotes	To quantify the ability of various SARS-CoV-2 S mutants to mediate fusion, effector cells (HEK293T) were transiently transfected with the indicated pcDNAintron-SARS-2-S expression vector or measles virus H and F (Brindley et al., 2014).
T463	67896-68016	Sentence	denotes	Effector cells were infected with MVA-T7 four h following transduction to produce the T7 polymerase (Paal et al., 2009).
T464	68017-68249	Sentence	denotes	Target cells naturally expressing the receptor ACE2 (Vero) or ACE2 negative cells (HEK293T) were transfected with pTM1-luciferase, which encodes for firefly luciferase under the control of a T7 promoter (Brindley and Plemper, 2010).
T465	68250-68355	Sentence	denotes	24 h following transfection, the target cells were lifted and added to the effector cells at a 1:1 ratio.
T466	68356-68486	Sentence	denotes	4 h following co-cultivation, cells were washed, lysed and luciferase levels were quantified using Promega’s Steady-Glo substrate.
T467	68487-68602	Sentence	denotes	To visualize cell-to-cell fusion, Vero cells were co-transfected with pGFP and the pcDNAintron-SARS-2-S constructs.
T468	68603-68683	Sentence	denotes	24 h following transfection, syncytia were visualized by fluorescence microscopy.
T469	68685-68724	Sentence	denotes	Quantification and Statistical Analysis
T470	68725-68852	Sentence	denotes	Raw glycoproteomic data from the mass spectrometers was searched using Proteome Discoverer v1.4 (SEQUEST), Protein Metrics Inc.
T471	68853-68887	Sentence	denotes	Byonic v3.8.13, and pGlyco v2.2.2.
T472	68888-69020	Sentence	denotes	For data searches using Proteome Discoverer, the results were processed to apply false discovery rate filtering using ProteoIQ v2.7.
T473	69021-69193	Sentence	denotes	For the deglycosylated protein work, search results from SEQUEST were filtered in ProteoIQ with a 1% false discovery rate at the protein level and 10% at the peptide level.
T474	69194-69327	Sentence	denotes	For N-linked glycopeptide analysis, pGlyco was used with false discovery rate of 1% at the glycan level and 10% at the peptide level.
T475	69328-69444	Sentence	denotes	For disulfide bond analysis and O-glycopeptide searches, Byonic was used and the false discovery rate was set to 1%.
T476	69445-69497	Sentence	denotes	All mass spectrometry results were manually curated.
T477	69498-69750	Sentence	denotes	Antigen accessibility simulations were carried out as described in the Method Details section and the mean of four simulations (three of length 350 ns, one of length 200 ns; amounting to 1.25 μs of total molecular dynamics simulation time) was utilized.
T478	69751-70069	Sentence	denotes	Glycan-glycan and glycan-peptide interactions were also calculated based on simulations as a percentage of time residues were in contact and averaged (mean) to produce the corresponding supplemental (colored) sequence figures with the raw numbers for coloring present also in each corresponding supplemental table tab.
T479	70070-70166	Sentence	denotes	3D distances were computed using Rpdb as described in more detail in the Method Details section.
T480	70167-70263	Sentence	denotes	This data is presented using box & whisker plots with all underlying statistics calculated in R.
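The sequon analysis (T423, T426) and the spectral-count occupancy calculations (T404, T419) described above can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the authors' pipeline: the function names `find_sequons` and `occupancy` are hypothetical, the exclusion of proline at the X position follows the standard definition of the N-X-S/T sequon, and occupancy is assumed here to be the fraction of spectra assigned to modified peptides out of all spectra observed for a site.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so overlapping
# sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def occupancy(modified_counts: int, unmodified_counts: int) -> float:
    """Percent occupancy of a site from spectral counts (assumed definition:
    modified spectra divided by modified plus unmodified spectra)."""
    total = modified_counts + unmodified_counts
    if total == 0:
        raise ValueError("no spectra observed for this site")
    return 100.0 * modified_counts / total
```

For example, `find_sequons` could be run on each aligned S or ACE2 sequence to list candidate sites before manual inspection, and `occupancy` applied per site to the spectral counts exported from the search software.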

T63

10592-10678

Sentence

denotes

The canonical N-linked glycosylation sequons are bolded, underlined, and shown in red.

T64

10679-10888

Sentence

denotes

(B and C) Negative stain electron microscopy of the purified trimer (B) and Coomassie G-250-stained reducing SDS-PAGE gels (C) confirmed purity of the SARS-CoV-2 S protein trimer and of the soluble human ACE2.

T65

10889-10919

Sentence

denotes

MWM, molecular weight markers.

T66

10920-11089

Sentence

denotes

(D) A representative Step-HCD fragmentation spectrum from mass-spectrometry analysis of a tryptic digest of S annotated manually based on search results from pGlyco 2.2.

T67

11090-11182

Sentence

denotes

This spectrum defines the N terminus of the mature protein monomer as (pyro-)glutamine 0014.

T68

11183-11344

Sentence

denotes

A representative N-glycan consistent with this annotation and our glycomics data (Figure 2) is overlaid by using the Symbol Nomenclature For Glycans (SNFG) code.

T69

11345-11381

Sentence

denotes

This complex glycan occurs at N0017.

T70

11382-11501

Sentence

denotes

Note that, as expected, the cysteine is carbamidomethylated, and the mass accuracy of the assigned peptide is 0.98 ppm.

T71

11502-11614

Sentence

denotes

On the sequence of the N-terminal peptide and in the spectrum, the assigned b (blue) and y (red) ions are shown.

T72

11615-11768

Sentence

denotes

In the spectrum, purple highlights glycan oxonium ions and green marks intact peptide fragment ions with various partial glycan sequences still attached.

T73

11769-11936

Sentence

denotes

Note that the green-labeled ions allow for limited topology to be extracted including defining that the fucose is on the core and not the antennae of the glycopeptide.

T74

11938-12043

Sentence

denotes

Glycomics-Informed Glycoproteomics Reveals Site-Specific Microheterogeneity of SARS-CoV-2 S Glycosylation

T75

12044-12128

Sentence

denotes

We utilized multiple approaches to examine glycosylation of the SARS-CoV-2 S trimer.

T76

12129-12264

Sentence

denotes

First, the portfolio of glycans linked to SARS-CoV-2 S trimer immunogen was analyzed after their release from the polypeptide backbone.

T77

12265-12391

Sentence

denotes

N-glycans were released from protein by treatment with PNGase F, and O-glycans were subsequently released by beta-elimination.

T78

12392-12588

Sentence

denotes

After permethylation to enhance detection sensitivity and structural characterization, released glycans were analyzed by multi-stage mass spectrometry (MSn) (Aoki et al., 2007; Aoki et al., 2008).

T79

12589-12714

Sentence

denotes

Mass spectra were processed by GRITS Toolbox, and the resulting annotations were validated manually (Weatherly et al., 2019).

T80

12715-12871

Sentence

denotes

Glycan assignments were grouped by type and by additional structural features for relative quantification of profile characteristics (Figure 2 A; Table S3).

T81

12872-13040

Sentence

denotes

This analysis quantified 49 N-glycans and revealed that 55% of the total glycan abundance was of the complex type, 17% was of the hybrid type, and 28% was high mannose.

T82

13041-13190

Sentence

denotes

Among the complex and hybrid N-glycans, we observed a high degree of core fucosylation and significant abundance of bisected and LacDiNAc structures.

T83

13191-13432

Sentence

denotes

We also observed sulfated N-linked glycans by using negative mode MSn analyses (Table S13), although signal intensity was too low in positive ion mode (at least 10-fold lower than any of the non-sulfated glycans) for accurate quantification.

T84

13433-13520

Sentence

denotes

In addition, we detected 15 O-glycans released from the S trimer (Figure S5; Table S4).

T85

13521-13659

Sentence

denotes

Figure 2 Glycomics-Informed Glycoproteomics Reveals Substantial Site-Specific Microheterogeneity of N-linked Glycosylation on SARS-CoV-2 S

T86

13660-13763

Sentence

denotes

(A) Glycans released from SARS-CoV-2 S protein trimer immunogen were permethylated and analyzed by MSn.

T87

13764-13885

Sentence

denotes

Structures were assigned and grouped by type and structural features, and prevalence was determined based on ion current.

T88

13886-13944

Sentence

denotes

The pie chart shows basic division by broad N-glycan type.

T89

13945-14013

Sentence

denotes

The bar graph provides additional detail about the glycans detected.

T90

14014-14187

Sentence

denotes

The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.

T91

14188-14412

Sentence

denotes

(B–E) Glycopeptides were prepared from SARS-CoV-2 S protein trimer immunogen by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.

T92

14413-14535

Sentence

denotes

Four representative sites of N-linked glycosylation with specific features of interest were chosen and are presented here.

T93

14536-14764

Sentence

denotes

N0074 (B) and N0149 (C) are shown that occur in variable insert regions of S compared to SARS-CoV and other related coronaviruses, and there are emerging variants of SARS-CoV-2 that disrupt these two sites of glycosylation in S.

T94

14765-14823

Sentence

denotes

N0234 (D) contains the most high-mannose N-linked glycans.

T95

14824-14974

Sentence

denotes

N0801 (E) is an example of glycosylation in the S2 region of the immunogen and displays a high degree of hybrid glycosylation compared to other sites.

T96

14975-15057

Sentence

denotes

The abundance of each composition is graphed in terms of assigned spectral counts.

T97

15058-15178

Sentence

denotes

Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.

T98

15179-15310

Sentence

denotes

The abbreviations used here and throughout the manuscript are as follows: N, HexNAc; H, hexose; F, fucose; A, Neu5Ac; S, sulfation.

T99

15311-15475

Sentence

denotes

Note that the graphs for the other 18 sites and other graphs grouping the microheterogeneity observed by other properties are presented in Supplemental Information.

T100

15476-15716

Sentence

denotes

To determine occupancy of N-linked glycans at each site, we employed a sequential deglycosylation approach by using Endoglycosidase H and PNGase F in the presence of 18O-H2O after tryptic digestion of S (Wang et al., 2020; Yu et al., 2018).

T101

15717-15848

Sentence

denotes

After LC-MS/MS analyses, the resulting data confirmed that 19 of the canonical sequons had occupancies greater than 95% (Table S5).

T102

15849-16013

Sentence

denotes

One canonical sequence, N0149, had insufficient spectral counts for quantification by this method, but subsequent analyses described below suggested high occupancy.

T103

16014-16119

Sentence

denotes

The two most C-terminal N-linked sites, N1173 and N1194, had reduced occupancy, 52% and 82% respectively.

T104

16120-16299

Sentence

denotes

Reduced occupancy at these sites could reflect hindered en bloc transfer by the oligosaccharyltransferase (OST) due to primary amino acid sequences at or near the N-linked sequon.

T105

16300-16632

Sentence

denotes

Alternatively, this could reflect these two sites being post-translationally modified after release of the protein by the ribosome by a less efficient STT3B-containing OST, either due to activity or initial folding of the polypeptide, as opposed to co-translationally modified by the STT3A-containing OST (Ruiz-Canada et al., 2009).

T106

16633-16959

Sentence

denotes

None of the non-canonical sequons (three N-X-C sites and four N-G-L/I/V sites; Zielinska et al., 2010) showed significant occupancy (>5%), except for N0501, which showed moderate (19%) conversion to 18O-Asp that could be due to deamidation that is facilitated by glycine at the +1 position (Table S5) (Palmisano et al., 2012).

T107

16960-17115

Sentence

denotes

Further analysis of this site (see below) by direct glycopeptide analyses allowed us to determine that N0501 undergoes deamidation but is not glycosylated.

T108

17116-17277

Sentence

denotes

Thus, all, and only the, 22 canonical sequences for N-linked glycosylation (N-X-S/T) are utilized, with only N1173 and N1194 demonstrating occupancies below 95%.
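
The canonical sequon rule referred to above (N-X-S/T) can be sketched as a simple sequence scan. This is an illustrative sketch only, not the authors' analysis pipeline: the function name `find_sequons` and the toy peptide are hypothetical, and the exclusion of proline at the X position is a commonly used convention assumed here rather than something stated in the text.

```python
import re
from typing import List

# Lookahead pattern so that overlapping sequons are all reported.
# Assumption: X may be any residue except proline (common convention;
# the source text states only "N-X-S/T").
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# Toy peptide, invented for illustration (not a fragment of S or ACE2).
toy_peptide = "MANGTQNPSANNSTK"
print(find_sequons(toy_peptide))
```

Applied to a full protein sequence, a scan of this kind lists candidate sites only; whether a site is actually occupied still has to be measured, as in the occupancy experiments described above.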

T109

17278-17440

Sentence

denotes

Next, we applied three different proteolytic digestion strategies to the SARS-CoV-2 S immunogen to maximize glycopeptide coverage by subsequent LC-MS/MS analyses.

T110

17441-17741

Sentence

denotes

Extended gradient nanoflow reverse-phase LC-MS/MS was carried out on a ThermoFisher Lumos Tribrid instrument using Step-HCD fragmentation on each of the samples (see STAR Methods for details, as well as Duan et al., 2018; Escolano et al., 2019; Wang et al., 2020; Yu et al., 2018; Zhou et al., 2017).

T111

17742-18055

Sentence

denotes

After data analyses using pGlyco 2.2.2 (Liu et al., 2017), Byonic (Bern et al., 2012), and manual validation of glycan compositions against our released glycomics findings (Figure 2A; Tables S3 and S13), we were able to determine the microheterogeneity at each of the 22 canonical sites (Figures 2B–2E; Table S6).

T112

18056-18164

Sentence

denotes

Notably, none of the non-canonical consensus sequences, including N0501, displayed any quantifiable glycans.

T113

18165-18292

Sentence

denotes

The N-glycosites N0074 (Figure 2B) and N0149 (Figure 2C) are highly processed and display a typical mammalian N-glycan profile.

T114

18293-18383

Sentence

denotes

N0149 is, however, modified with several hybrid N-glycan structures, whereas N0074 is not.

T115

18384-18574

Sentence

denotes

N0234 (Figure 2D) and N0801 (Figure 2E) have N-glycan profiles more similar to those found on other viruses such as HIV (Watanabe et al., 2019) that are dominated by high-mannose structures.

T116

18575-18729

Sentence

denotes

N0234 (Figure 2D) displays an abundance of Man7-Man9 high-mannose structures, suggesting stalled processing by early-acting ER and cis-Golgi mannosidases.

T117

18730-18927

Sentence

denotes

In contrast, N0801 (Figure 2E) is processed more efficiently to Man5 high-mannose and hybrid structures, suggesting that access to the glycan at this site by MGAT1 and α-Mannosidase II is hindered.

T118

18928-19195

Sentence

denotes

In general, for all 22 sites (Figures 2B–2E; Table S6), we observed underprocessing of complex glycan antennae (i.e., under-galactosylation and under-sialylation) and a high degree of core fucosylation in agreement with released glycan analyses (Figure 2A; Table S3).

T119

19196-19294

Sentence

denotes

We also observed a small percent of sulfated N-linked glycans at several sites (Tables S6 and S8).

T120

19295-19510

Sentence

denotes

Based on the assignments and the spectral counts for each topology, we were able to determine the percent of total N-linked glycan types (high-mannose, hybrid, or complex) present at each site (Figure 3 ; Table S7).

T121

19511-19757

Sentence

denotes

Notably, three of the sites (N0234, N0709, and N0717) displayed more than 50% high-mannose glycans, whereas 11 other sites (N0017, N0074, N0149, N0165, N0282, N0331, N0657, N1134, N1158, N1173, and N1194) were more than 90% complex when occupied.

T122

19758-19824

Sentence

denotes

The other eight sites were distributed between these two extremes.

T123

19825-19955

Sentence

denotes

Notably, only one site (N0717 at 45%), which also had greater than 50% high mannose (55%), had greater than 33% hybrid structures.
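
The per-site percentages discussed above are ratios of spectral counts assigned to each N-glycan type. A minimal sketch of that arithmetic follows; the function name `glycan_type_percentages` and the counts are hypothetical, chosen only so that the output reproduces the 55% high-mannose and 45% hybrid split reported for N0717. The real counts are in the supplemental tables and are not reproduced here.

```python
from typing import Dict


def glycan_type_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert spectral counts per glycan type into percentages of the site total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no spectral counts were assigned to this site")
    return {glycan_type: round(100.0 * n / total, 1) for glycan_type, n in counts.items()}


# Hypothetical spectral counts (illustrative only, not measured values).
example_site = {"high_mannose": 110, "hybrid": 90, "complex": 0}
print(glycan_type_percentages(example_site))
```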

T124

19956-20226

Sentence

denotes

To further evaluate the heterogeneity, we grouped all the topologies into the 20 classes recently described by the Crispin laboratory, adding two categories (sulfated and unoccupied) that we refer to here as the Oxford classification (Table S8) (Watanabe et al., 2020a).

T125

20227-20526

Sentence

denotes

Among other features observed, this classification allowed us to observe that although most sites with high-mannose structures were dominated by the Man5GlcNAc2 structure, N0234 and N0717 were dominated by the higher Man structures of Man8GlcNAc2 and Man7GlcNAc2, respectively (Figure S7; Table S8).

T126

20527-20750

Sentence

denotes

Limited processing at N0234 is in agreement with a recent report suggesting that high-mannose structures at this site help to stabilize the receptor-binding domain of S (www.biorxiv.org/content/10.1101/2020.06.11.146522v1).

T127

20751-21078

Sentence

denotes

Furthermore, applying the Oxford classifications to our dataset clearly demonstrates that the three most C-terminal sites (N1158, N1173, and N1194), dominated by complex-type glycans, were more often further processed (i.e., multiple antennae) and elaborated (i.e., galactosylation and sialylation) than other sites (Table S8).

T128

21079-21173

Sentence

denotes

Figure 3 SARS-CoV-2 S Immunogen N-glycan Sites Are Predominantly Modified by Complex N-glycans

T129

21174-21455

Sentence

denotes

N-glycan topologies were assigned to all 22 sites of the S protomer and the spectral counts for each of the three types of N-glycans (high-mannose, hybrid, and complex), as well as the unoccupied peptide spectral match counts at each site, were summed and visualized as pie charts.

T130

21456-21543

Sentence

denotes

Note that only N1173 and N1194 show an appreciable amount of the unoccupied amino acid.

T131

21544-21827

Sentence

denotes

We also analyzed our generated mass spectrometry data for the presence of O-linked glycans based on our glycomic findings (Figure S5; Table S4) and a recent manuscript suggesting significant levels of O-glycosylation of S1 and S2 when expressed independently (Shajahan et al., 2020).

T132

21828-21964

Sentence

denotes

We were able to confirm sites of O-glycan modification with microheterogeneity observed for the vast majority of these sites (Table S9).

T133

21965-22168

Sentence

denotes

However, occupancy at each site, determined by spectral counts, was observed to be very low (below 4%), except for Thr0323, which had a modestly higher but still low 11% occupancy (Figure S6; Table S10).

T134

22170-22304

Sentence

denotes

3D Structural Modeling of Glycosylated SARS-CoV-2 Trimer Immunogen Enables Predictions of Epitope Accessibility and Other Key Features

T135

22305-22427

Sentence

denotes

A 3D structure of the S trimer was generated by using a homology model of the S trimer described previously (based on PDB:

T136

22428-22454

Sentence

denotes

6VSB; Wrapp et al., 2020).

T137

22455-22758

Sentence

denotes

Onto this 3D structure, we installed explicitly defined glycans at each glycosylated sequon based on one of three separate sets of criteria, thereby generating three different glycoform models for comparison that we denote as “Abundance,” “Oxford Class,” and “Processed” models (STAR Methods; Table S1).

T138

22759-22990

Sentence

denotes

These criteria were chosen in order to generate glycoform models that represent reasonable expectations for glycosylation microheterogeneity and integrate cross-validating glycomic and glycoproteomic characterization of S and ACE2.

T139

22991-23089

Sentence

denotes

The three glycoform models were subjected to multiple all-atom MD simulations with explicit water.

T140

23090-23216

Sentence

denotes

Information from analyses of these structures is presented in Figure 4 A along with the sequence of the SARS-CoV-2 S protomer.

T141

23217-23326

Sentence

denotes

We also determined variants in S that are emerging in the virus that have been sequenced to date (Table S11).

T142

23327-23502

Sentence

denotes

The inter-residue distances were measured between the most α-carbon-distal atoms of the N-glycan sites and Spike glycoprotein population variant sites in 3D space (Figure 4B).
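
The proximity measurement described above reduces to Euclidean distances between pairs of atoms in 3D space. The sketch below shows that calculation under stated assumptions: the coordinates, the site and variant labels used as dictionary keys, and the helper name `proximity_table` are all hypothetical, and selecting the most α-carbon-distal atom of each residue is assumed to have been done beforehand.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]


def proximity_table(
    glycosites: Dict[str, Point], variants: Dict[str, Point]
) -> Dict[Tuple[str, str], float]:
    """Return the Euclidean distance (same units as the input coordinates)
    between every glycosite atom and every variant-site atom."""
    return {
        (site, variant): math.dist(site_xyz, variant_xyz)
        for site, site_xyz in glycosites.items()
        for variant, variant_xyz in variants.items()
    }


# Invented coordinates for illustration; they do not come from PDB 6VSB.
sites = {"site_A": (0.0, 0.0, 0.0)}
variant_positions = {"variant_1": (3.0, 4.0, 0.0), "variant_2": (1.0, 2.0, 2.0)}
for pair, d in proximity_table(sites, variant_positions).items():
    print(pair, round(d, 2))
```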

T143

23503-23726

Sentence

denotes

Notable from this analysis, there are several variants that don’t ablate the N-linked sequon but are sufficiently close in 3D space to N-glycosites, such as D138H, H655Y, S939F, and L1203F, to warrant further investigation.

T144

23727-23877

Sentence

denotes

Figure 4 3D Structural Modeling of Glycosylated SARS-CoV-2 Spike Trimer Immunogen Reveals Predictions for Antigen Accessibility and Other Key Features

T145

23878-24070

Sentence

denotes

Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated SARS-CoV-2 S trimer immunogen.

T146

24071-24178

Sentence

denotes

(A) Sequence of the SARS-CoV-2 S immunogen displaying computed antigen accessibility and other information.

T147

24179-24260

Sentence

denotes

Antigen accessibility is indicated by red shading across the amino acid sequence.

T148

24261-24464

Sentence

denotes

(B) Emerging variants confirmed by independent sequencing experiments were analyzed based on the 3D structure of SARS-CoV-2 S to generate a proximity chart to the determined N-linked glycosylation sites.

T149

24465-24678

Sentence

denotes

(C) SARS-CoV-2 S trimer immunogen model from MD simulation displaying abundance glycoforms and antigen accessibility shaded in red for most accessible, white for partial, and black for inaccessible (see Video S1).

T150

24679-24795

Sentence

denotes

(D) SARS-CoV-2 S trimer immunogen model from MD simulation displaying Oxford Class glycoforms and sequence variants.

T151

24796-24921

Sentence

denotes

Asterisk indicates not visible, whereas the box represents three amino acid variants that are clustered together in 3D space.

T152

24922-25093

Sentence

denotes

(E) SARS-CoV-2 S trimer immunogen model from MD simulation displaying processed glycoforms plus shading of Thr-323 that has O-glycosylation at low stoichiometry in yellow.

T153

25094-25351

Sentence

denotes

The percentage of simulation time that each S protein residue is accessible to a probe that approximates the size of an antibody variable domain was calculated for a model of the S trimer by using the Abundance glycoforms (Table S1) (Ferreira et al., 2018).

T154

25352-25522

Sentence

denotes

The predicted antibody accessibility is visualized across the sequence, as well as mapped onto the 3D surface, via color shading (Figures 4A and 4C; Table S13; Video S1).

T155

25523-25805

Sentence

denotes

Additionally, the Oxford Class glycoforms model (Table S1), which is arguably the most encompassing means for representing glycan microheterogeneity because it captures abundant structural topologies (Table S8), is shown with the sequence variant information (Figure 4D; Table S11).

T156

25806-26128

Sentence

denotes

A substantial number of these variants occur (directly by comparison to Figure 4A or visually by comparison to Figure 4C) in regions of high calculated epitope accessibility (e.g., N74K, T76I, R78M, D138H, H146Y, S151I, D253G, V483A, etc.; Table S14), suggesting potential selective pressure to avoid host immune response.

T157

26129-26477

Sentence

denotes

Also, it is interesting to note that three of the emerging variants would eliminate N-linked sequons in S; N74K and T76I would eliminate N-glycosylation of N74 (found in the insert variable region 1 of CoV-2 S compared to CoV-1 S), and S151I eliminates N-glycosylation of N149 (found in the insert variable region 2) (Figures 4A and S7; Table S11).

T158

26478-26733

Sentence

denotes

Lastly, the SARS-CoV-2 S Processed glycoform model is shown (Table S1), along with marking amino acid T0323 that has a modest (11% occupancy, Figure S6; Table S10) amount of O-glycosylation to represent the most heavily glycosylated form of S (Figure 4E).

T159

26734-26743

Sentence

denotes

Video S1.

T160

26744-26802

Sentence

denotes

Glycosylated S Antigen Accessibility, Related to Figure 4C

T161

26804-26885

Sentence

denotes

Glycomics-Informed Glycoproteomics Reveals Complex N-linked Glycosylation of ACE2

T162

26886-27004

Sentence

denotes

We also analyzed ACE2 glycosylation utilizing the same glycomic and glycoproteomic approaches described for S protein.

T163

27005-27234

Sentence

denotes

Glycomic analyses of released N-linked glycans (Figure 5 A; Table S3) revealed that the majority of glycans on ACE2 are complex with limited high-mannose and hybrid glycans, and we were unable to detect sulfated N-linked glycans.

T164

27235-27441

Sentence

denotes

Glycoproteomic analyses revealed that occupancy was high (> 75%) at all six sites, and significant microheterogeneity dominated by complex N-glycans was observed for each site (Figures 5B–5G; Tables S5–S8).

T165

27442-27702

Sentence

denotes

We also observed, consistent with the O-glycomics (Figure S5; Table S4), that Ser 155 and several S/T residues at the C terminus of ACE2 outside of the peptidase domain were O-glycosylated, but stoichiometry was extremely low (less than 2%; Tables S9 and S10).

T166

27703-27823

Sentence

denotes

Figure 5 Glycomics-Informed Glycoproteomics of Soluble Human ACE2 Reveals High Occupancy, Complex N-linked Glycosylation

T167

27824-27912

Sentence

denotes

(A) Glycans released from soluble, purified ACE2 were permethylated and analyzed by MSn.

T168

27913-28031

Sentence

denotes

Structures were assigned, grouped by type and structural features, and prevalence was determined based on ion current.

T169

28032-28090

Sentence

denotes

The pie chart shows basic division by broad N-glycan type.

T170

28091-28159

Sentence

denotes

The bar graph provides additional detail about the glycans detected.

T171

28160-28333

Sentence

denotes

The most abundant structure with a unique categorization by glycomics for each N-glycan type in the pie chart, or above each feature category in the bar graph, is indicated.

T172

28334-28539

Sentence

denotes

(B–G) Glycopeptides were prepared from soluble human ACE2 by using multiple combinations of proteases, analyzed by LC-MSn, and the resulting data were searched by using several different software packages.

T173

28540-28599

Sentence

denotes

All six sites of N-linked glycosylation are presented here.

T174

28600-28714

Sentence

denotes

Displayed in the bar graphs are the individual compositions observed graphed in terms of assigned spectral counts.

T175

28715-28835

Sentence

denotes

Representative glycans (as determined by glycomics analysis) for several abundant compositions are shown in SNFG format.

T176

28836-28952

Sentence

denotes

The pie chart (analogous to Figure 3 for SARS-CoV-2 S) for each site is displayed in the upper corner of each panel.

T177

28953-28962

Sentence

denotes

(B) N053.

T178

28963-28972

Sentence

denotes

(C) N090.

T179

28973-28982

Sentence

denotes

(D) N103.

T180

28983-28992

Sentence

denotes

(E) N322.

T181

28993-29002

Sentence

denotes

(F) N432.

T182

29003-29066

Sentence

denotes

(G) N546, a site that does not exist in three in 10,000 people.

T183

29068-29161

Sentence

denotes

3D Structural Modeling of Glycosylated, Soluble, ACE2-Highlighting Glycosylation and Variants

T184

29162-29287

Sentence

denotes

We integrated our glycomics, glycoproteomics, and population variant analyses results with a 3D model of ACE2 (based on PDB:

T185

29288-29437

Sentence

denotes

6M0J (Lan et al., 2020; see STAR Methods for details) to generate two versions of the soluble glycosylated ACE2 for visualization and MD simulations.

T186

29438-29656

Sentence

denotes

We visualized the ACE2 glycoprotein with the Abundance glycoform model simulated at each site as well as highlighting the naturally occurring variants observed in the human population (Figure 6 A; Video S2; Table S11).

T187

29657-29777

Sentence

denotes

Note, that the Abundance glycoform model and the Oxford Class glycoform model for ACE2 are identical (Tables S1 and S8).

T188

29778-29965

Sentence

denotes

Notably, one site of N-linked glycosylation (N546) is predicted to not be present in three out of 10,000 humans based on naturally occurring variation in the human population (Table S11).

T189

29966-30035

Sentence

denotes

We also modeled ACE2 using the Processed glycoform model (Figure 6B).

T190

30036-30123

Sentence

denotes

In both models, the interaction domain with S is defined (Figures 6A and 6B; Video S2).

T191

30124-30190

Sentence

denotes

Figure 6 3D Structural Modeling of Glycosylated Soluble Human ACE2

T192

30191-30372

Sentence

denotes

Results from glycomics and glycoproteomics experiments were combined with results from bioinformatics analyses and used to model several versions of glycosylated soluble human ACE2.

T193

30373-30505

Sentence

denotes

(A) Soluble human ACE2 model from MD simulations displaying abundance glycoforms, interaction surface with S, and sequence variants.

T194

30506-30597

Sentence

denotes

N546 variant is boxed that would remove N-linked glycosylation at that site (see Video S2).

T195

30598-30710

Sentence

denotes

(B) Soluble human ACE2 model from MD simulations displaying processed glycoforms and interaction surface with S.

T196

30711-30720

Sentence

denotes

Video S2.

T197

30721-30774

Sentence

denotes

Glycosylated ACE2 with Variants, Related to Figure 6A

T198

30776-30927

Sentence

denotes

MD Simulation of the Glycosylated Trimer Spike of SARS-CoV-2 in Complex with Glycosylated, Soluble, Human ACE2 Reveals Protein and Glycan Interactions

T199

30928-31052

Sentence

denotes

MD simulations were performed to examine the co-complex (generated from a crystal structure of the ACE2-RBD co-complex, PDB:

T200

31053-31235

Sentence

denotes

6M0J; Lan et al., 2020) of glycosylated S with glycosylated ACE2 with the three different glycoforms models (Abundance, Oxford Class, and Processed; Table S1; Videos S5, S6, and S7).

T201

31236-31474

Sentence

denotes

Information from these analyses is laid out along the primary structure (sequence) of the SARS-CoV-2 S protomer and ACE2 highlighting regions of glycan-protein interaction observed in the MD simulations (Table S14; Videos S5, S6, and S7).

T202

31475-31680

Sentence

denotes

Interestingly, two glycans on ACE2 (at N090 and N322), which are highlighted in Figure 7 A and shown in a more close-up view in Figure 7B, are predicted to form interactions with the S protein (Table S15).

T203

31681-31903

Sentence

denotes

The N322 glycan interaction with the S trimer is outside of the receptor-binding domain, and the interaction is observed across multiple simulations and throughout each simulation (Figures 7A and 7B; Video S5, S6, and S7).

T204

31904-32195

Sentence

denotes

The ACE2 glycan at N090 is close enough to the S trimer surface to repeatedly form interactions; however, the glycan arms interact with multiple regions of the surface over the course of the simulations, reflecting the relatively high degree of glycan dynamics (Figures 7A and 7B; Video S3).

T205

32196-32380

Sentence

denotes

Inter-molecule glycan-glycan interactions are also observed repeatedly between the glycan at N546 of ACE2 and those in the S protein at residues N0074 and N0165 (Figure 7D; Table S16).

T206

32381-32555

Sentence

denotes

Finally, a full view of the ACE2-S complex with Oxford class glycoforms on ACE2 illustrates the extensive glycosylation at the interface of the complex (Figure 7C; Video S4).

T207

32556-32713

Sentence

denotes

Figure 7 Interactions of Glycosylated Soluble Human ACE2 and Glycosylated SARS-CoV-2 S Trimer Immunogen Revealed By 3D-Structural Modeling and MD Simulations

T208

32714-32854

Sentence

denotes

(A) MD simulation of glycosylated soluble human ACE2 and glycosylated SARS-CoV-2 S trimer immunogen interaction (see Videos S5, S6, and S7).

T209

32855-32956

Sentence

denotes

ACE2 (top) is colored red with glycans in pink, whereas S is colored white with glycans in dark gray.

T210

32957-33039

Sentence

denotes

Highlighted are ACE2 glycans that interact with S that are magnified to the right.

T211

33040-33229

Sentence

denotes

(B) Magnification of ACE2-S interface highlighting ACE2 glycan interactions by using 3D-SNFG icons (Thieker et al., 2016) with S protein (pink) as well as ACE2-S glycan-glycan interactions.

T212

33230-33350

Sentence

denotes

(C) Magnification of dynamics trajectory of glycans at the interface of soluble human ACE2 and S (see Videos S3 and S4).

T213

33351-33360

Sentence

denotes

Video S3.

T214

33361-33410

Sentence

denotes

Interface of ACE2-S Complex, Related to Figure 7C

T215

33411-33420

Sentence

denotes

Video S4.

T216

33421-33474

Sentence

denotes

The Glycosylated ACE2-S Complex, Related to Figure 7C

T217

33475-33484

Sentence

denotes

Video S5.

T218

33485-33545

Sentence

denotes

Abundance Glycoforms on ACE2-S Complex, Related to Figure 7A

T219

33546-33555

Sentence

denotes

Video S6.

T220

33556-33619

Sentence

denotes

Oxford Class Glycoforms on ACE2-S Complex, Related to Figure 7A

T221

33620-33629

Sentence

denotes

Video S7.

T222

33630-33690

Sentence

denotes

Processed Glycoforms on ACE2-S Complex, Related to Figure 7A

T223

33692-33702

Sentence

denotes

Discussion

T224

33703-34170

Sentence

denotes

We have defined the glycomics-informed, site-specific microheterogeneity of 22 sites of N-linked glycosylation per monomer on a SARS-CoV-2 trimer and the six sites of N-linked glycosylation on a soluble version of its human ACE2 receptor by using a combination of mass spectrometry approaches coupled with evolutionary and variant sequence analyses to provide a detailed understanding of the glycosylation states of these glycoproteins (Figures 1, 2, 3, 4, 5, and 6).

T225

34171-34341

Sentence

denotes

Our results suggest essential roles for glycosylation in mediating receptor binding, antigenic shielding, and potentially the evolution/divergence of these glycoproteins.

T226

34342-34729

Sentence

denotes

The highly glycosylated SARS-CoV-2 Spike protein, unlike several other viral proteins including HIV-1 (Watanabe et al., 2019) but in agreement with another recent report (Watanabe et al., 2020a), presents significantly more processing of N-glycans toward complex glycosylation, suggesting that steric hindrance to processing enzymes is not a major factor at most sites (Figures 2 and 3).

T227

34730-34825

Sentence

denotes

However, the N-glycans still provide considerable shielding of the peptide backbone (Figure 4).

T228

34826-35238

Sentence

denotes

Our glycomics-guided glycoproteomic data are generally in strong agreement with the trimer immunogen data recently published by Crispin (Watanabe et al., 2020a), although we also observed sulfated N-linked glycans; were able to differentiate branching, bisected, and diLacNAc containing structures by glycomics; and observed less occupancy on the two most C-terminal N-linked sites by using a different approach.

T229

35239-35438

Sentence

denotes

Our detection of sulfated N-linked glycans at multiple sites on S is in agreement with a recent manuscript re-analyzing the Crispin data (https://www.biorxiv.org/content/10.1101/2020.05.31.125302v1).

T230

35439-35580

Sentence

denotes

Sulfated N-linked glycans could potentially play key roles in immune regulation and receptor binding as in other viruses (Wang et al., 2009).

T231

35581-35700

Sentence

denotes

This result is especially significant in that sulfated N-glycans were not observed when we performed glycomics on ACE2.

T232

35701-35965

Sentence

denotes

At each individual site, the glycans we observed on our immunogen appear to be slightly more processed, but the major features at each site in our analysis and in the Crispin group’s results (Watanabe et al., 2020a) are nearly superimposable.

T233

35966-36180

Sentence

denotes

This agreement contrasts with the substantial differences seen when comparing our data and Crispin’s data (Watanabe et al., 2020a) with those of the Azadi group (Shajahan et al., 2020), which analyzed S1 and S2 that had been expressed individually.

T234

36181-36457

Sentence

denotes

When S1 and S2 were expressed as two separate polypeptides and not purified for trimers, several unoccupied sites of N-linked glycosylation were observed, and processing at several sites differed significantly (Shajahan et al., 2020) from what we and others (Watanabe et al., 2020a) observed.

T235

36458-36749

Sentence

denotes

Although O-glycosylation has recently been reported for individually expressed S1 and S2 domains of the Spike glycoprotein (Shajahan et al., 2020), in trimeric form the level of O-glycosylation is extremely low, with the highest level of occupancy we observed being 11% at T0323 (Figure 4E).

T236

36750-37019

Sentence

denotes

The low level of O-linked occupancy we observed is in agreement with the Crispin group’s analysis of a Spike Trimer immunogen (Watanabe et al., 2020a) but differs significantly from the Azadi group’s analyses of individually expressed S1 and S2 (Shajahan et al., 2020).

T237

37020-37298

Sentence

denotes

Thus, the context in which the Spike protein is expressed and purified before analysis significantly alters the glycosylation of the protomer, which is reminiscent of previous studies looking at expression of the HIV-1 envelope Spike (Behrens et al., 2017; Watanabe et al., 2019).

T238

37299-37453

Sentence

denotes

The soluble ACE2 protein examined here contains six highly utilized sites of N-linked glycosylation dominated by complex type N-linked glycans (Figure 5).

T239

37454-37558

Sentence

denotes

O-glycans were also present on this glycoprotein but at very low levels of occupancy at all sites (<2%).

T240

37559-37769

Sentence

denotes

Our glycomics-informed glycoproteomics allowed us to assign defined sets of glycans to specific glycosylation sites on 3D-structures of S and ACE2 glycoproteins based on experimental evidence (Figures 4 and 6).

T241

37770-38009

Sentence

denotes

Similar to almost all glycoproteins, microheterogeneity is evident at most glycosylation sites of S and ACE2; each glycosylation site can be modified with one of several glycan structures, generating site-specific glycosylation portfolios.

T242

38010-38104

Sentence

denotes

For modeling purposes, however, explicit structures must be placed at each glycosylation site.

T243

38105-38275

Sentence

denotes

To capture the impact of microheterogeneity on S and ACE2 MD, we chose to generate glycoforms for modeling that represented reasonable portfolios of glycan types.

T244

38276-38557

Sentence

denotes

Using three glycoform models for S (Abundance, Oxford Class, and Processed) and two models for ACE2 (Abundance, which was equivalent to Oxford Class, and Processed), we generated three MD simulations of the co-complexes of these two glycoproteins (Figure 7; Videos S5, S6, and S7).

T245

38558-38726

Sentence

denotes

The observed interactions over time allowed us to evaluate glycan-protein contacts between the two proteins and examine potential glycan-glycan interactions (Figure 7).

T246

38727-38833

Sentence

denotes

We observed glycan-mediated interactions between the S trimer and glycans at N090, N322, and N546 of ACE2.

T247

38834-38985

Sentence

denotes

Thus, variations in glycan occupancy or processing at these sites could alter the affinity of the SARS-CoV-2–ACE2 interaction and modulate infectivity.

T248

38986-39241

Sentence

denotes

It is well established that glycosylation states vary depending on tissue and cell type as well as, in the case of humans, on age (Krištić et al., 2014), underlying disease (Pavić et al., 2018; Rudman et al., 2019), and ethnicity (Gebrehiwot et al., 2018).

T249

39242-39364

Sentence

denotes

Thus, glycosylation portfolios could in part be responsible for tissue tropism and individual susceptibility to infection.

T250

39365-39712

Sentence

denotes

The importance of glycosylation for S binding to ACE2 is even more emphatically demonstrated by the direct glycan-glycan interactions observed (Figure 7) between S glycans (at N0074 and N0165) and an ACE2 receptor glycan (at N546), adding an additional layer of complexity for interpreting the impact of glycosylation on individual susceptibility.

T251

39713-39838

Sentence

denotes

Several emerging variants of the virus appear to be altering N-linked glycosylation occupancy by disrupting N-linked sequons.

T252

39839-40058

Sentence

denotes

Interestingly, the two N-linked sequons in SARS-CoV-2 S directly impacted by variants, N0074 and N0149, are in divergent insert regions 1 and 2, respectively, of SARS-CoV-2 S in comparison with SARS-CoV-1 S (Figure 4A).
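
As a minimal illustration of how a single substitution can remove an N-linked sequon, the sketch below scans a peptide for the canonical N-X-S/T motif, where X is any residue except proline. The function name `find_sequons` and both toy peptides are hypothetical and are not taken from the S protein sequence.

```python
import re

# Canonical N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Toy peptides (hypothetical, for illustration only).
reference = "AANATGGNPSLLNNSTK"
variant = "AANAIGGNPSLLNNSTK"  # T->I at position 5 disrupts the first sequon

print(find_sequons(reference))  # [3, 13, 14]
print(find_sequons(variant))    # [13, 14]
```

Comparing the two position lists shows which sequons a variant has lost; N-P-S at position 8 is skipped in both peptides because proline follows the asparagine.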

T253

40059-40302

Sentence

denotes

The N0074 glycan, in particular, is one of the S glycans that interact directly with an ACE2 glycan (at N546; Figure 7), suggesting that glycan-glycan interactions could contribute to the unique infectivity differences between SARS-CoV-2 and SARS-CoV-1.

T254

40303-40522

Sentence

denotes

These sequon variants will also be important to examine in terms of glycan shielding that could influence immunogenicity and efficacy of neutralizing antibodies, as well as interactions with the host cell receptor ACE2.

T255

40523-40750

Sentence

denotes

Naturally occurring amino acid-changing SNPs in the ACE2 gene generate a number of variants including one variant, with a frequency of three in 10,000 humans, that eliminates a site of N-linked glycosylation at N546 (Figure 6).

T256

40751-41033

Sentence

denotes

Understanding the impact of ACE2 variants on glycosylation and more importantly on S binding, especially for N546S, which impacts the glycan-glycan interaction between S and ACE2 (Figure 7), should be prioritized in light of efforts to develop ACE2 as a potential decoy therapeutic.

T257

41034-41181

Sentence

denotes

Intelligent manipulation of ACE2 glycosylation could lead to more potent biologics capable of acting as better competitive inhibitors of S binding.

T258

41182-41516

Sentence

denotes

The data presented here, and related similar recent findings (Casalino et al., 2020; Watanabe et al., 2020a; Wrobel et al., 2020), provide a framework to facilitate the production of immunogens, vaccines, antibodies, and inhibitors as well as additional information regarding mechanisms by which glycan microheterogeneity is achieved.

T259

41517-41651

Sentence

denotes

However, considerable efforts still remain in order to fully understand the role of glycans in SARS-CoV-2 infection and pathogenicity.

T260

41652-41901

Sentence

denotes

Although HEK-expressed S and ACE2 provide a useful window for understanding human glycosylation of these proteins, glycoproteomic characterization after expression in cell lines of more direct relevance to disease and target tissue is sorely needed.

T261

41902-42099

Sentence

denotes

Although site occupancy could change depending on presentation and cell type (Struwe et al., 2018), processing of N-linked glycans will almost certainly be altered in a cell-type-dependent fashion.

T262

42100-42391

Sentence

denotes

Thus, analyses of the Spike trimer extracted from pseudoviruses, virion-like particles, and ultimately from infectious SARS-CoV-2 virions harvested from airway cells or patients will provide the most accurate view of how trimer immunogens reflect the true glycosylation pattern of the virus.

T263

42392-42599

Sentence

denotes

Detailed analyses of the impact of emerging variants in S and natural and designed-for-biologics variants of ACE2 on glycosylation and binding properties are important next steps for developing therapeutics.

T264

42600-42860

Sentence

denotes

Finally, it will be important to monitor the slow evolution of the virus to determine if existing sites of glycosylation are lost or new sites emerge with selective pressure that might alter the efficacy of vaccines, neutralizing antibodies, and/or inhibitors.

T265

42862-42874

Sentence

denotes

STAR★Methods

T266

42876-42895

Sentence

denotes

Key Resources Table

T267

42896-42933

Sentence

denotes

REAGENT or RESOURCE SOURCE IDENTIFIER

T268

42934-42979

Sentence

denotes

Chemicals, Peptides, and Recombinant Proteins

T269

42980-43015

Sentence

denotes

SARS-CoV-2 S protein This Study N/A

T270

43016-43049

Sentence

denotes

Human ACE2 protein This Study N/A

T271

43050-43095

Sentence

denotes

2x Laemmli sample buffer Bio-Rad Cat#161-0737

T272

43096-43189

Sentence

denotes

Invitrogen NuPAGE 4 to 12%, Bis-Tris, Mini Protein Gel Thermo Fisher Scientific Cat#NP0321PK2

T273

43190-43259

Sentence

denotes

Coomassie Brilliant Blue G-250 Dye Thermo Fisher Scientific Cat#20279

T274

43260-43298

Sentence

denotes

Dithiothreitol Sigma Aldrich Cat#43815

T275

43299-43336

Sentence

denotes

Iodoacetamide Sigma Aldrich Cat#I1149

T276

43337-43362

Sentence

denotes

Trypsin Promega Cat#V5111

T277

43363-43386

Sentence

denotes

Lys-C Promega Cat#V1671

T278

43387-43410

Sentence

denotes

Arg-C Promega Cat#V1881

T279

43411-43434

Sentence

denotes

Glu-C Promega Cat#V1651

T280

43435-43459

Sentence

denotes

Asp-N Promega Cat#VA1160

T281

43460-43495

Sentence

denotes

Endoglycosidase H Promega Cat#V4871

T282

43496-43521

Sentence

denotes

PNGaseF Promega Cat#V4831

T283

43522-43582

Sentence

denotes

Chymotrypsin Athens Research and Technology Cat#16-19-030820

T284

43583-43633

Sentence

denotes

Alpha lytic protease New England BioLabs Cat#P8113

T285

43634-43687

Sentence

denotes

18O water Cambridge Isotope Laboratories OLM-782-10-1

T286

43688-43730

Sentence

denotes

O-protease OpeRATOR Genovis Cat#G1-OP1-020

T287

43731-43745

Sentence

denotes

Deposited Data

T288

43746-43847

Sentence

denotes

MS data for site-specific N-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019937

T289

43848-43949

Sentence

denotes

MS data for site-specific O-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019940

T290

43950-44052

Sentence

denotes

MS data for deglycosylated N-linked glycopeptides for SARS-CoV-2 S and human ACE2 This Study PXD019938

T291

44053-44126

Sentence

denotes

MS data for disulfide bond analysis for SARS-CoV-2 S This Study PXD019939

T292

44127-44202

Sentence

denotes

MS data for N-linked glycomics deposited at GlycoPost This Study GPST000120

T293

44203-44278

Sentence

denotes

MS data for O-linked glycomics deposited at GlycoPost This Study GPST000121

T294

44279-44299

Sentence

denotes

Experimental Models:

T295

44300-44310

Sentence

denotes

Cell Lines

T296

44311-44339

Sentence

denotes

293-F Cells GIBCO Cat#R79007

T297

44340-44365

Sentence

denotes

Vero-6 Cells ATCC CRL1586

T298

44366-44386

Sentence

denotes

Experimental Models:

T299

44387-44404

Sentence

denotes

Organisms/Strains

T300

44405-44440

Sentence

denotes

VSV(G)-Pseudoviruses This Study N/A

T301

44441-44464

Sentence

denotes

Software and Algorithms

T302

44465-44545

Sentence

denotes

pGlyco v2.2.2 Liu et al., 2017 http://pfind.ict.ac.cn/software/pGlyco/index.html

T303

44546-44611

Sentence

denotes

Proteome Discoverer v1.4 Thermo Fisher Scientific CAT#OPTON-30945

T304

44612-44695

Sentence

denotes

Byonic v3.8.13 Protein Metrics Inc. https://www.proteinmetrics.com/products/byonic/

T305

44696-44818

Sentence

denotes

ProteoIQ v2.7 Premier Biosoft (Bern et al., 2012) http://www.premierbiosoft.com/protein_quantification_software/index.html

T306

44819-44890

Sentence

denotes

GRITS Toolbox V1.1 Weatherly et al., 2019 http://www.grits-toolbox.org/

T307

44891-44976

Sentence

denotes

EMBOSS needle v6.6.0 Rice et al., 2000 https://www.ebi.ac.uk/Tools/psa/emboss_needle/

T308

44977-45033

Sentence

denotes

Biopython v1.76 Cock et al., 2009 https://biopython.org/

T309

45034-45081

Sentence

denotes

Rpdb v2.3 Julien Ide https://rdrr.io/cran/Rpdb/

T310

45082-45166

Sentence

denotes

SignalP V5.0 Almagro Armenteros et al., 2019 http://www.cbs.dtu.dk/services/SignalP/

T311

45167-45265

Sentence

denotes

LibreOFFICE Writer v6.4.4.2 The Document Foundation https://www.libreoffice.org/download/download/

T312

45266-45318

Sentence

denotes

GlyGen V1.5 York et al., 2020 https://www.glygen.org

T313

45319-45409

Sentence

denotes

GNOme V1.5.5 OBO Foundry https://github.com/glygen-glycan-data/GNOme/blob/master/README.md

T314

45410-45476

Sentence

denotes

GlyTouCan V3.1.0 Aoki-Kinoshita et al., 2016 https://glytoucan.org

T315

45477-45563

Sentence

denotes

Inkscape V1.0 Inkscape project contributors https://inkscape.org/release/inkscape-1.0/

T316

45564-45617

Sentence

denotes

ffmpeg V3.4 The FFmpeg developers https://ffmpeg.org/

T317

45618-45673

Sentence

denotes

Cygwin V3.1.5 Cygwin developers https://www.cygwin.com/

T318

45675-45696

Sentence

denotes

Resource Availability

T319

45698-45710

Sentence

denotes

Lead Contact

T320

45711-45919

Sentence

denotes

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Lance Wells (lwells@ccrc.uga.edu) or alternatively by Peng Zhao (pengzhao@uga.edu).

T321

45921-45943

Sentence

denotes

Materials Availability

T322

45944-45992

Sentence

denotes

This study did not generate new unique reagents.

T323

45994-46020

Sentence

denotes

Data and Code Availability

T324

46021-46157

Sentence

denotes

The mass spectrometry proteomics data are available via ProteomeXchange with identifiers PXD019937, PXD019940, PXD019938, and PXD019939.

T325

46158-46266

Sentence

denotes

The mass spectrometry glycomics data are available via GlycoPost with identifiers GPST000120 and GPST000121.

T326

46268-46306

Sentence

denotes

Experimental Model and Subject Details

T327

46307-46418

Sentence

denotes

HEK293-F cells (GIBCO) were maintained and passaged in FreeStyle Media (GIBCO) containing 1% Pen Strep (GIBCO).

T328

46419-46580

Sentence

denotes

Vero-6 cells (ATCC) were maintained and passaged in DMEM medium supplemented with 10% fetal bovine serum and 1% Pen Strep (GIBCO) and amphotericin B antibiotics.

T329

46581-46657

Sentence

denotes

All cells were maintained at 37°C with 5% CO2 before and after transfection.

T330

46659-46673

Sentence

denotes

Method Details

T331

46675-46761

Sentence

denotes

Expression, Purification, and Characterization of SARS-CoV-2 S and Human ACE2 Proteins

T332

46762-47189

Sentence

denotes

To express a stabilized ectodomain of Spike protein, a synthetic gene encoding residues 1−1208 of SARS-CoV-2 Spike with the furin cleavage site (residues 682–685) replaced by a “GGSG” sequence, proline substitutions at residues 986 and 987, and a foldon trimerization motif followed by a C-terminal 6xHisTag was created and cloned into the mammalian expression vector pCMV-IRES-puro (Codex BioSolutions, Inc, Gaithersburg, MD).

T333

47190-47319

Sentence

denotes

The expression construct was transiently transfected in HEK293F cells using polyethylenimine (Polysciences, Inc, Warrington, PA).

T334

47320-47564

Sentence

denotes

Protein was purified from cell supernatants using Ni-NTA resin (QIAGEN, Germany); the eluted fractions containing S protein were pooled, concentrated, and further purified by gel filtration chromatography on a Superose 6 column (GE Healthcare).

T335

47565-47662

Sentence

denotes

Negative stain electron microscopy (EM) analysis was performed as described (Shaik et al., 2019).

T336

47663-47956

Sentence

denotes

Briefly, analysis was performed at room temperature with a magnification of 52,000x and a defocus value of 1.5 μm following low-dose procedures, using a Philips Tecnai F20 electron microscope (Thermo Fisher Scientific) equipped with a Gatan US4000 CCD camera and operated at a voltage of 200 kV.

T337

47957-48102

Sentence

denotes

The DNA fragment encoding human ACE2 (1-615) with a 6xHis tag at the C terminus was synthesized by Genscript and cloned into the vector pCMV-IRES-puro.

T338

48103-48184

Sentence

denotes

The expression construct was transfected in HEK293F cells using polyethylenimine.

T339

48185-48261

Sentence

denotes

The medium was discarded and replaced with FreeStyle 293 medium after 6-8 h.

T340

48262-48387

Sentence

denotes

After incubation at 37°C with 5.5% CO2 for 5 days, the supernatant was collected and loaded onto Ni-NTA resin for purification.

T341

48388-48463

Sentence

denotes

The elution was concentrated and further purified by a Superdex 200 column.

T342

48465-48520

Sentence

denotes

In-Gel Analysis of SARS-CoV-2 S and Human ACE2 Proteins

T343

48521-48781

Sentence

denotes

A 3.5-μg aliquot of SARS-CoV-2 S protein as well as a 2-μg aliquot of human ACE2 were combined with Laemmli sample buffer, analyzed on a 4%–12% Invitrogen NuPage Bis-Tris gel using the MES pH 6.5 running buffer, and stained with Coomassie Brilliant Blue G-250.

T344

48783-48875

Sentence

denotes

Analysis of N-linked and O-linked Glycans Released from SARS-CoV-2 S and Human ACE2 Proteins

T345

48876-49030

Sentence

denotes

Aliquots of approximately 25-50 μg of S or ACE2 protein were processed for glycan analysis as previously described (Aoki et al., 2007; Aoki et al., 2008).

T346

49031-49101

Sentence

denotes

For N-linked glycan analysis, the proteins were digested with trypsin.

T347

49102-49234

Sentence

denotes

Following trypsinization, glycopeptides were enriched by C18 Sep-Pak and subjected to PNGaseF digestion to release N-linked glycans.

T348

49235-49372

Sentence

denotes

Following PNGaseF digestion, released glycans were separated from residual glycosylated peptides bearing O-linked glycans by C18 Sep-Pak.

T349

49373-49492

Sentence

denotes

O-glycosylated peptides were eluted from the Sep-Pak and subjected to reductive β-elimination to release the O-glycans.

T350

49493-49610

Sentence

denotes

Another 25-50 μg aliquot of each protein was denatured with SDS and digested with PNGaseF to remove N-linked glycans.

T351

49611-49751

Sentence

denotes

The de-N-glycosylated, intact protein was precipitated with cold ethanol and then subjected to reductive β-elimination to release O-glycans.

T352

49752-49852

Sentence

denotes

The profiles of O-glycans released from peptides or from intact protein were found to be comparable.

T353

49853-50036

Sentence

denotes

N- and O-linked glycans released from glycoproteins were permethylated with methyl iodide according to the method of Anumula and Taylor prior to MS analysis (Anumula and Taylor, 1992).

T354

50037-50158

Sentence

denotes

Glycan structural analysis was performed using an LTQ-Orbitrap instrument (Orbitrap Discovery, Thermo Fisher Scientific).

T355

50159-50469

Sentence

denotes

Detection and relative quantification of the prevalence of individual glycans was accomplished using the total ion mapping (TIM) and neutral loss scan (NL scan) functionality of the Xcalibur software package version 2.0 (Thermo Fisher Scientific) as previously described (Aoki et al., 2007; Aoki et al., 2008).

T356

50470-50583

Sentence

denotes

Mass accuracy and detector response were tuned with a permethylated oligosaccharide standard in positive ion mode.

T357

50584-50705

Sentence

denotes

For fragmentation by collision-induced dissociation (CID in MS2 and MSn), normalized collision energy of 45% was applied.

T358

50706-50812

Sentence

denotes

Most permethylated glycans were identified as singly or doubly charged, sodiated species in positive mode.

T359

50813-50917

Sentence

denotes

Sulfated N-glycans were detected as singly or doubly charged, deprotonated species in negative ion mode.

T360

50918-51014

Sentence

denotes

Peaks for all charge states were deconvoluted by the charge state and summed for quantification.

T361

51015-51067

Sentence

denotes

All spectra were manually interpreted and annotated.

T362

51068-51192

Sentence

denotes

The explicit identities of individual monosaccharide residues have been assigned based on known human biosynthetic pathways.

T363

51193-51389

Sentence

denotes

Graphical representations of monosaccharide residues are consistent with the Symbol Nomenclature for Glycans (SNFG), which has been broadly adopted by the glycomics community (Varki et al., 2015).

T364

51390-51578

Sentence

denotes

The MS-based glycomics data generated in these analyses and the associated annotations are presented in accordance with the MIRAGE standards and the Athens Guidelines (Wells et al., 2013).

T365

51579-51794

Sentence

denotes

Data annotation and assignment of glycan accession identifiers were facilitated by GRITS Toolbox, GlyTouCan, GNOme, and GlyGen (Kahsay et al., 2020; Tiemeyer et al., 2017; Weatherly et al., 2019; York et al., 2020).

T366

51796-51857

Sentence

denotes

Analysis of Disulfide Bonds for SARS-CoV-2 S Protein by LC-MS

T367

51858-52043

Sentence

denotes

Two 10-μg aliquots of SARS-CoV-2 S protein were denatured by incubating with 20% acetonitrile at room temperature and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.

T368

52044-52178

Sentence

denotes

The two aliquots of proteins were then digested respectively using alpha lytic protease, or a combination of trypsin, Lys-C and Glu-C.

T369

52179-52254

Sentence

denotes

Following digestion, the proteins were deglycosylated by PNGaseF treatment.

T370

52255-52478

Sentence

denotes

The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.

T371

52479-52624

Sentence

denotes

The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.

T372

52625-52722

Sentence

denotes

The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.

T373

52723-52903

Sentence

denotes

Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.

T374

52904-53044

Sentence

denotes

The raw spectra were analyzed by Byonic (v3.8.13, Protein Metrics Inc.) with mass tolerance set as 20 ppm for both precursors and fragments.

T375

53045-53125

Sentence

denotes

The search output was filtered at 1% false discovery rate and 10 ppm mass error.
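
The 10 ppm mass-error filter can be illustrated with the standard definition of relative mass error. This is a minimal sketch, not the filtering code used in the study; the function names and the example m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million (signed)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm=10.0):
    """True if the absolute mass error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm

# Hypothetical precursor measurements against a theoretical m/z of 1000.0.
print(within_tolerance(1000.005, 1000.0))  # about 5 ppm: kept
print(within_tolerance(1000.020, 1000.0))  # about 20 ppm: rejected
```

Because the error is relative, the same 10 ppm window corresponds to a wider absolute m/z window for heavier ions.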

T376

53126-53208

Sentence

denotes

The spectra assigned as cross-linked peptides were manually evaluated for Cys0015.

T377

53210-53308

Sentence

denotes

Analysis of Site-Specific N-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS

T378

53309-53488

Sentence

denotes

Four 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM of dithiothreitol at 56°C and alkylated by 27.5 mM of iodoacetamide at room temperature in the dark.

T379

53489-53664

Sentence

denotes

The four aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, a combination of trypsin and Glu-C, or a combination of Glu-C and AspN.

T380

53665-53836

Sentence

denotes

Three 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.

T381

53837-53980

Sentence

denotes

The three aliquots of proteins were then digested respectively using alpha lytic protease, chymotrypsin, or a combination of trypsin and Lys-C.

T382

53981-54204

Sentence

denotes

The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.

T383

54205-54350

Sentence

denotes

The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.

T384

54351-54448

Sentence

denotes

The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.

T385

54449-54816

Sentence

denotes

Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) were collected in the Orbitrap at 15k resolution. pGlyco v2.2.2 (Liu et al., 2017) was used for database searches with mass tolerance set as 20 ppm for both precursors and fragments.

T386

54817-54925

Sentence

denotes

The database search output was filtered to reach a 1% false discovery rate for glycans and 10% for peptides.

T387

54926-55025

Sentence

denotes

Quantitation was performed by calculating spectral counts for each glycan composition at each site.

T388

55026-55121

Sentence

denotes

Any N-linked glycan compositions identified by only one spectrum were removed from quantitation.
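
The spectral-count quantitation and the single-spectrum filter can be sketched as follows. The input layout (one `(site, composition)` pair per identified spectrum), the function name, and the normalization to fractions per site are assumptions made for illustration; they do not reflect the pGlyco output format or the exact procedure used in the study.

```python
from collections import Counter, defaultdict

def quantify_by_spectral_counts(identifications, min_spectra=2):
    """Count spectra per glycan composition at each site and drop
    compositions supported by fewer than `min_spectra` spectra.

    Returns {site: {composition: fraction of retained spectra}}.
    """
    counts = defaultdict(Counter)
    for site, composition in identifications:
        counts[site][composition] += 1

    relative = {}
    for site, per_site in counts.items():
        kept = {c: n for c, n in per_site.items() if n >= min_spectra}
        total = sum(kept.values())
        if total:  # skip sites where nothing survives the filter
            relative[site] = {c: n / total for c, n in kept.items()}
    return relative
```

With three spectra for one composition, two for a second, and one for a third at the same site, the third is discarded and the remaining two are reported as 0.6 and 0.4 of the retained spectra.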

T389

55122-55207

Sentence

denotes

N-linked glycan compositions were categorized into 22 classes (including Unoccupied):

T390

55208-55550

Sentence

denotes

HexNAc(2)Hex(9∼5)Fuc(0∼1) was classified as M9 to M5 respectively; HexNAc(2)Hex(4∼1)Fuc(0∼1) was classified as M1-M4; HexNAc(3∼6)Hex(5∼9)Fuc(0)NeuAc(0∼1) was classified as Hybrid with HexNAc(3∼6)Hex(5∼9)Fuc(1∼2)NeuAc(0∼1) classified as F-Hybrid; Complex-type glycans are classified based on the number of antenna, fucosylation, and sulfation:

T391

55551-56289

Sentence

denotes

HexNAc(3)Hex(3∼4)Fuc(0)NeuAc(0∼1) is assigned as A1 with HexNAc(3)Hex(3∼4)Fuc(1∼2)NeuAc(0∼1) assigned as F-A1; HexNAc(4)Hex(3∼5)Fuc(0)NeuAc(0∼2) is assigned as A2/A1B with HexNAc(4)Hex(3∼5)Fuc(1∼5)NeuAc(0∼2) assigned as F-A2/A1B; HexNAc(5)Hex(3∼6)Fuc(0)NeuAc(0∼3) is assigned as A3/A2B with HexNAc(5)Hex(3∼6)Fuc(1∼3)NeuAc(0∼3) assigned as F-A3/A2B; HexNAc(6)Hex(3∼7)Fuc(0)NeuAc(0∼4) is assigned as A4/A3B with HexNAc(6)Hex(3∼7)Fuc(1∼3)NeuAc(0∼4) assigned as F-A4/A3B; HexNAc(7)Hex(3∼8)Fuc(0)NeuAc(0∼1) is assigned as A5/A4B with HexNAc(7)Hex(3∼8)Fuc(1∼3)NeuAc(0∼1) as F-A5/A4B; HexNAc(8)Hex(3∼9)Fuc(0) is assigned as A6/A5B with HexNAc(8)Hex(3∼9)Fuc(1) assigned as F-A6/A5B; any glycans identified with a sulfate are assigned as Sulfated.
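
The classification rules above can be written as a small lookup. This is a sketch under stated assumptions rather than the authors' code: the function name and the "Unclassified" fallback label are hypothetical, and because the published Hybrid and complex-type ranges overlap (for example HexNAc(4)Hex(5)), the sketch assumes that complex-type rules take precedence, which the text does not state. "Unoccupied" is a site state rather than a composition and is not handled here.

```python
# HexNAc count -> (min Hex, max Hex, max Fuc, max NeuAc, label),
# transcribed from the complex-type rules in the text.
COMPLEX_RULES = {
    3: (3, 4, 2, 1, "A1"),
    4: (3, 5, 5, 2, "A2/A1B"),
    5: (3, 6, 3, 3, "A3/A2B"),
    6: (3, 7, 3, 4, "A4/A3B"),
    7: (3, 8, 3, 1, "A5/A4B"),
    8: (3, 9, 1, 0, "A6/A5B"),
}

def classify_n_glycan(hexnac, hexose, fuc=0, neuac=0, sulfated=False):
    """Map a glycan composition to one of the classes described in the text."""
    if sulfated:  # any sulfated glycan is grouped together
        return "Sulfated"
    # High-mannose: HexNAc(2) with 0-1 fucose and no sialic acid.
    if hexnac == 2 and fuc <= 1 and neuac == 0:
        if 5 <= hexose <= 9:
            return f"M{hexose}"
        if 1 <= hexose <= 4:
            return "M1-M4"
    # Complex-type rules are checked before Hybrid (assumed precedence).
    rule = COMPLEX_RULES.get(hexnac)
    if rule:
        min_hex, max_hex, max_fuc, max_neuac, label = rule
        if min_hex <= hexose <= max_hex and fuc <= max_fuc and neuac <= max_neuac:
            return f"F-{label}" if fuc else label
    # Hybrid: HexNAc(3-6) Hex(5-9) with at most two fucoses and one NeuAc.
    if 3 <= hexnac <= 6 and 5 <= hexose <= 9 and fuc <= 2 and neuac <= 1:
        return "F-Hybrid" if fuc else "Hybrid"
    return "Unclassified"  # hypothetical label for compositions outside the rules
```

Under the assumed precedence, `classify_n_glycan(4, 5, fuc=1, neuac=2)` returns "F-A2/A1B", whereas `classify_n_glycan(4, 6)` falls through to "Hybrid".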

T392

56291-56363

Sentence

denotes

Analysis of Deglycosylated SARS-CoV-2 S and Human ACE2 Proteins by LC-MS

T393

56364-56544

Sentence

denotes

Three 3.5-μg aliquots of SARS-CoV-2 S protein were reduced by incubating with 10 mM of dithiothreitol at 56°C and alkylated by 27.5 mM of iodoacetamide at room temperature in the dark.

T394

56545-56661

Sentence

denotes

The three aliquots were then digested respectively using chymotrypsin, Asp-N, or a combination of trypsin and Glu-C.

T395

56662-56831

Sentence

denotes

Two 10-μg aliquots of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.

T396

56832-56939

Sentence

denotes

The two aliquots were then digested respectively using chymotrypsin, or a combination of trypsin and Lys-C.

T397

56940-57074

Sentence

denotes

Following digestion, the proteins were deglycosylated by Endoglycosidase H followed by PNGaseF treatment in the presence of 18O water.

T398

57075-57298

Sentence

denotes

The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.

T399

57299-57444

Sentence

denotes

The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.

T400

57445-57542

Sentence

denotes

The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.

T401

57543-57729

Sentence

denotes

Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following collision-induced dissociation (CID) at 38% collision energy were collected in the ion trap.

T402

57730-57870

Sentence

denotes

The spectra were analyzed using SEQUEST (Proteome Discoverer 1.4) with mass tolerance set as 20 ppm for precursors and 0.5 Da for fragments.

T403

57871-58001

Sentence

denotes

The search output was filtered using ProteoIQ (v2.7) to reach a 1% false discovery rate at protein level and 10% at peptide level.

T404

58002-58220

Sentence

denotes

Occupancy of each N-linked glycosylation site was calculated using spectral counts assigned to the 18O-Asp-containing (PNGaseF-cleaved) and/or HexNAc-modified (EndoH-cleaved) peptides and their unmodified counterparts.
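
One plausible way to turn these spectral counts into a site occupancy is the ratio of glycosylation-evidence spectra to all spectra for that site. The formula and the function name below are assumptions for illustration; the text states only which spectral counts were used, not the exact expression.

```python
def site_occupancy(o18_asp_count, hexnac_count, unmodified_count):
    """Fraction of spectra at a site that carry evidence of glycosylation.

    o18_asp_count    -- spectra with 18O-Asp (PNGaseF-cleaved) peptides
    hexnac_count     -- spectra with HexNAc-modified (EndoH-cleaved) peptides
    unmodified_count -- spectra with the unmodified counterpart peptide
    """
    occupied = o18_asp_count + hexnac_count
    total = occupied + unmodified_count
    if total == 0:
        raise ValueError("no spectra observed for this site")
    return occupied / total

# Hypothetical counts: 30 PNGaseF-cleaved, 10 EndoH-cleaved, 10 unmodified.
print(site_occupancy(30, 10, 10))  # 0.8
```

Keeping the two cleavage products as separate arguments also allows the EndoH-sensitive fraction to be reported on its own if needed.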

T405

58222-58320

Sentence

denotes

Analysis of Site-Specific O-linked Glycopeptides for SARS-CoV-2 S and Human ACE2 Proteins by LC-MS

T406

58321-58538

Sentence

denotes

Three 10-μg aliquots of SARS-CoV-2 S protein and one 10-μg aliquot of ACE2 protein were reduced by incubating with 5 mM of dithiothreitol at 56°C and alkylated by 13.75 mM of iodoacetamide at room temperature in the dark.

T407

58539-58656

Sentence

denotes

The four aliquots were then digested respectively using trypsin, Lys-C, Arg-C, or a combination of trypsin and Lys-C.

T408

58657-58776

Sentence

denotes

Following digestion, the proteins were deglycosylated by PNGaseF treatment and then digested with O-protease OpeRATOR®.

T409

58777-59000

Sentence

denotes

The resulting peptides were separated on an Acclaim PepMap RSLC C18 column (75 μm x 15 cm) and eluted into the nano-electrospray ion source of an Orbitrap Fusion Lumos Tribrid mass spectrometer at a flow rate of 200 nL/min.

T410

59001-59146

Sentence

denotes

The elution gradient consists of 1%–40% acetonitrile in 0.1% formic acid over 370 min followed by 10 min of 80% acetonitrile in 0.1% formic acid.

T411

59147-59244

Sentence

denotes

The spray voltage was set to 2.2 kV and the temperature of the heated capillary was set to 280°C.

T412

59245-59519

Sentence

denotes

Full MS scans were acquired from m/z 200 to 2000 at 60k resolution, and MS/MS scans following higher-energy collisional dissociation (HCD) with stepped collision energy (15%, 25%, 35%) or electron transfer dissociation (ETD) were collected in the Orbitrap at 15k resolution.

T413

59520-59638

Sentence

denotes

The raw spectra were analyzed by Byonic (v3.8.13) with mass tolerance set as 20 ppm for both precursors and fragments.

T414

59639-59740

Sentence

denotes

MS/MS filtering was applied to only allow for spectra where the oxonium ions of HexNAc were observed.

T415

59741-59821

Sentence

denotes

The search output was filtered at 1% false discovery rate and 10 ppm mass error.
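The 10 ppm mass-error criterion can be made concrete with a short sketch. The field names (`fdr`, `observed_mz`, `theoretical_mz`) and the example values are hypothetical and do not reflect Byonic's output format. Treating the false discovery rate as a per-match q-value is also an assumption.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_filters(match, max_fdr=0.01, max_ppm=10.0):
    """Keep a match only if it meets both the FDR and mass-error cut-offs."""
    error = abs(ppm_error(match["observed_mz"], match["theoretical_mz"]))
    return match["fdr"] <= max_fdr and error <= max_ppm


# Illustrative matches: the first is within 10 ppm, the second is not.
matches = [
    {"fdr": 0.005, "observed_mz": 1000.005, "theoretical_mz": 1000.0},
    {"fdr": 0.005, "observed_mz": 1000.020, "theoretical_mz": 1000.0},
]
kept = [m for m in matches if passes_filters(m)]
print(len(kept), "of", len(matches), "matches retained")
```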

T416

59822-59893

Sentence

denotes

The spectra assigned as O-linked glycopeptides were manually evaluated.

T417

59894-59993

Sentence

denotes

Quantitation was performed by calculating spectral counts for each glycan composition at each site.

T418

59994-60089

Sentence

denotes

Any O-linked glycan compositions identified by only one spectrum were removed from quantitation.

T419

60090-60283

Sentence

denotes

Occupancy of each O-linked glycosylation site was calculated using spectral counts assigned to any glycosylated peptides and their unmodified counterparts from searches without MS/MS filtering.

T420

60285-60342

Sentence

denotes

Sequence Analysis of SARS-CoV-2 S and Human ACE2 Proteins

T421

60343-60488

Sentence

denotes

The genomes of SARS-CoV as well as bat and pangolin coronavirus sequences reported to be closely related to SARS-CoV-2 were downloaded from NCBI.

T422

60489-60660

Sentence

denotes

The S protein sequences from all of those genomes were aligned using EMBOSS needle v6.6.0 (Rice et al., 2000) via the EMBL-EBI provided web service (Madeira et al., 2019).

T423

60661-60761

Sentence

denotes

Manual analysis was performed in the regions containing canonical N-glycosylation sequons (N-X-S/T).
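Although the analysis here was manual, locating candidate sequons in an aligned or raw sequence is easy to automate. The sketch below is an illustration only: the function name and the test sequence are hypothetical. It assumes the usual convention that X is any residue except proline, which the text above does not state explicitly.

```python
import re

# Lookahead so that overlapping sequons (e.g. N-N-T-S) are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of Asn, tripeptide) for each N-X-S/T sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


# Hypothetical peptide, not taken from S or ACE2.
print(find_sequons("MNNTSANPSNGT"))
# → [(2, 'NNT'), (3, 'NTS'), (10, 'NGT')]
```

The N-P-S tripeptide at position 7 is skipped because of the proline exclusion. Dropping `[^P]` in favor of `.` would report it as well.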

T424

60762-61081

Sentence

denotes

For further sequence analysis of SARS-CoV-2 S variants, the genomes of SARS-CoV-2 were downloaded from NCBI and GISAID and further processed using Biopython 1.76 to extract all sequences annotated as “surface glycoprotein” and to remove any incomplete sequence as well as any sequence containing unassigned amino acids.

T425

61082-61305

Sentence

denotes

For sequence analysis of human ACE2 variants, the single nucleotide polymorphisms (SNPs) of ACE2 were extracted from the NCBI dbSNP database and filtered for missense mutation entries with a reported minor allele frequency.

T426

61306-61467

Sentence

denotes

Manual analysis was performed on both SARS-CoV-2 S and human ACE2 variants to further examine the regions containing canonical N-glycosylation sequons (N-X-S/T).

T427

61468-61577

Sentence

denotes

LibreOffice Writer and its macro capabilities were used to shade regions on the linear sequence of S and ACE2.

T428

61579-61688

Sentence

denotes

3D Structural Modeling and Molecular Dynamics Simulation of Glycosylated SARS-CoV-2 S and Human ACE2 Proteins

T429

61689-61991

Sentence

denotes

SARS-CoV-2 Spike (S) protein structure and ACE2 co-complex – A 3D structure of the prefusion form of the S protein (RefSeq: YP_009724390.1, UniProt: P0DTC2 SPIKE_SARS2), based on a Cryo-EM structure (PDB code 6VSB) (Wrapp et al., 2020), was obtained from the SWISS-MODEL server (swissmodel.expasy.org).

T430

61992-62058

Sentence

denotes

The model has 95% coverage (residues 27 to 1146) of the S protein.

T431

62059-62220

Sentence

denotes

The receptor binding domain (RBD) in the “open” conformation was replaced with the RBD from an ACE2 co-complex (PDB code 6M0J) by grafting residues C336 to V524.

T432

62221-62489

Sentence

denotes

Glycoform generation – Glycans (detected by glycomics) were selected for installation on glycosylated S and ACE2 sequons (detected by glycoproteomics) based on three sets of criteria designed to reasonably capture different aspects of glycosylation microheterogeneity.

T433

62490-62829

Sentence

denotes

We denote the first of these glycoform models as “Abundance.” The glycans selected for installation to generate the Abundance model were chosen because they were identified as the most abundant glycan structure (detected by glycomics) that matched the most abundant glycan composition (detected by glycoproteomics) at each individual site.

T434

62830-63215

Sentence

denotes

We denote the second glycoform model as “Oxford Class.” The glycans selected for installation to generate the Oxford Class model were chosen because they were the most abundant glycan structure (detected by glycomics) that was contained within the most highly represented Oxford classification group (detected by glycoproteomics) at each individual site (Figure S7; Tables S1 and S8).

T435

63216-63650

Sentence

denotes

Finally, we denote the third glycoform model as “Processed.” The glycans selected for installation to generate the Processed model were chosen because they were the most highly trimmed, elaborated, or terminally decorated structure (detected by glycomics) that corresponded to a composition (detected by glycoproteomics) which was present at ≥ 1/3rd of the abundance of the most highly represented composition at each site (Table S1).

T436

63651-63827

Sentence

denotes

3D structures of the three glycoforms (Abundance, Oxford Class, Processed) were generated for the SARS-CoV-2 S protein alone, and in complex with the glycosylated ACE2 protein.

T437

63828-64191

Sentence

denotes

The glycoprotein builder available at GLYCAM-Web (www.glycam.org) was employed together with an in-house program that adjusts the asparagine side chain torsion angles and glycosidic linkages within known low-energy ranges (Nivedha et al., 2014) to relieve any atomic overlaps with the core protein, as described previously (Grant et al., 2016; Peng et al., 2017).

T438

64192-64391

Sentence

denotes

Energy minimization and Molecular dynamics (MD) simulations – Each glycosylated structure was placed in a periodic box of TIP3P water molecules with a 10 Å buffer between the solute and the box edge.

T439

64392-64587

Sentence

denotes

Energy minimization of all atoms was performed for 20,000 steps (10,000 steepest descent, followed by 10,000 conjugate gradient) under constant pressure (1 atm) and temperature (300 K) conditions.

T440

64588-64830

Sentence

denotes

All MD simulations were performed under nPT conditions with the CUDA implementation of the PMEMD (Götz et al., 2012; Salomon-Ferrer et al., 2013) simulation code, as present in the Amber14 software suite (University of California, San Diego).

T441

64831-64999

Sentence

denotes

The GLYCAM06j force field (Kirschner et al., 2008) and Amber14SB force field (Maier et al., 2015) were employed for the carbohydrate and protein moieties, respectively.

T442

65000-65193

Sentence

denotes

A Berendsen barostat with a time constant of 1 ps was employed for pressure regulation, while a Langevin thermostat with a collision frequency of 2 ps⁻¹ was employed for temperature regulation.

T443

65194-65246

Sentence

denotes

A nonbonded interaction cut-off of 8 Å was employed.

T444

65247-65356

Sentence

denotes

Long-range electrostatics were treated with the particle-mesh Ewald (PME) method (Darden and Pedersen, 1993).

T445

65357-65491

Sentence

denotes

Covalent bonds involving hydrogen were constrained with the SHAKE algorithm, allowing an integration time step of 2 fs to be employed.

T446

65492-65605

Sentence

denotes

The energy-minimized coordinates were equilibrated at 300 K over 400 ps with restraints on the solute heavy atoms.

T447

65606-65854

Sentence

denotes

Each system was then equilibrated with restraints on the Cα atoms of the protein for 1 ns, prior to initiating 4 independent 250 ns production MD simulations with random starting seeds for a total time of 1 μs per system, with no restraints applied.
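To show how the stated simulation parameters fit together, the sketch below derives the step count for one 250 ns production run and renders a control block using standard Amber `&cntrl` keywords. It is an assumed reconstruction from the values in the text; the authors' actual input files are not given, and output and restart settings are omitted.

```python
def production_mdin(length_ns=250, dt_fs=2):
    """Render a minimal Amber-style production input from the stated settings."""
    steps = length_ns * 1_000_000 // dt_fs  # ns -> fs, divided by the time step
    settings = {
        "imin": 0,            # molecular dynamics, not minimization
        "nstlim": steps,      # number of MD steps
        "dt": dt_fs / 1000,   # time step in ps
        "ntb": 2, "ntp": 1,   # constant pressure, isotropic scaling (nPT)
        "barostat": 1,        # Berendsen barostat
        "taup": 1.0,          # pressure relaxation time, ps
        "ntt": 3,             # Langevin thermostat
        "gamma_ln": 2.0,      # collision frequency, ps^-1
        "temp0": 300.0,       # target temperature, K
        "cut": 8.0,           # nonbonded cut-off, Angstrom
        "ntc": 2, "ntf": 2,   # SHAKE on bonds involving hydrogen
        "ig": -1,             # random seed for each independent run
    }
    body = ",\n".join(f"  {key}={value}" for key, value in settings.items())
    return f"production run (sketch)\n &cntrl\n{body},\n /\n"


mdin_text = production_mdin()
print(mdin_text)
```

With a 2 fs time step, each 250 ns run corresponds to 125,000,000 steps, and four such runs give the 1 μs total per system.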

T448

65855-65882

Sentence

denotes

Antigenic surface analysis.

T449

65883-66237

Sentence

denotes

A series of 3D structure snapshots of the simulation were taken at 1 ns intervals and analyzed in terms of their ability to interact with a spherical probe based on the average size of hypervariable loops present in an antibody complementarity determining region (CDR), as described recently (https://www.biorxiv.org/content/10.1101/2020.04.07.030445v2).

T450

66238-66391

Sentence

denotes

The percentage of simulation time each residue was exposed to the AbASA probe was calculated and plotted onto both the 3D structure and primary sequence.
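The per-residue exposure percentage can be sketched as a count over snapshots. The function name, the input representation (one set of exposed residue numbers per 1 ns snapshot), and the residue numbers are hypothetical. Deciding which residues the probe can reach in each snapshot is outside this sketch.

```python
from collections import Counter


def percent_exposed(snapshots):
    """Map each residue to the percentage of snapshots in which it was exposed.

    snapshots: list of sets, one per snapshot, holding the residue numbers
    that the probe could reach in that snapshot.
    """
    if not snapshots:
        return {}
    counts = Counter()
    for exposed in snapshots:
        counts.update(exposed)
    return {res: 100.0 * n / len(snapshots) for res, n in counts.items()}


# Four illustrative snapshots with made-up residue numbers.
frames = [{101, 102}, {101}, {101}, {102, 103}]
exposure = percent_exposed(frames)
print(exposure[101], exposure[102], exposure[103])  # → 75.0 50.0 25.0
```

Residues that never appear in any snapshot are absent from the result and can be treated as 0% exposed when plotting.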

T451

66393-66458

Sentence

denotes

Analysis of SARS-CoV-2 Spike VSV Pseudoparticles (ppVSV-SARS-2-S)

T452

66459-66566

Sentence

denotes

293T cells were transfected with an expression plasmid encoding SARS-CoV-2 Spike (pcDNAintron-SARS-2-SΔ19).

T453

66567-66679

Sentence

denotes

To increase cell surface expression, the last 19 amino acids containing the Golgi retention signal were removed.

T454

66680-66761

Sentence

denotes

Two SΔ19 constructs were compared, one started with Met1 and the other with Met2.

T455

66762-66895

Sentence

denotes

Twenty-four h following transfection, cells were transduced with ppVSVΔG-VSV-G (particles that were pseudotyped with VSV-G in trans).

T456

66896-66978

Sentence

denotes

One h following transduction, cells were extensively washed and the media was replaced.

T457

66979-67093

Sentence

denotes

Supernatant containing particles was collected 12–24 h following transduction and cleared by centrifugation.

T458

67094-67149

Sentence

denotes

Cleared supernatant was frozen at −80°C for future use.

T459

67150-67247

Sentence

denotes

Vero E6 target cells were seeded in 24-well plates (5 × 10⁵ cells/mL) at a density of 80% coverage.

T460

67248-67437

Sentence

denotes

The following day, ppVSV-SARS-2-S/GFP particles were transduced into target cells for 60 min; particles pseudotyped with VSV-G, Lassa virus GP, or no glycoprotein were included as controls.

T461

67438-67658

Sentence

denotes

24 h following transduction, transduced cells were released from the plate with trypsin and fixed with 4% formaldehyde, and GFP-positive virus-transduced cells were quantified using flow cytometry (Becton Dickinson BD-LSRII).

T462

67659-67895

Sentence

denotes

To quantify the ability of various SARS-CoV-2 S mutants to mediate fusion, effector cells (HEK293T) were transiently transfected with the indicated pcDNAintron-SARS-2-S expression vector or measles virus H and F (Brindley et al., 2014).

T463

67896-68016

Sentence

denotes

Effector cells were infected with MVA-T7 four h following transduction to produce the T7 polymerase (Paal et al., 2009).

T464

68017-68249

Sentence

denotes

Target cells naturally expressing the receptor ACE2 (Vero) or ACE2 negative cells (HEK293T) were transfected with pTM1-luciferase, which encodes for firefly luciferase under the control of a T7 promoter (Brindley and Plemper, 2010).

T465

68250-68355

Sentence

denotes

24 h following transfection, the target cells were lifted and added to the effector cells at a 1:1 ratio.

T466

68356-68486

Sentence

denotes

4 h following co-cultivation, cells were washed, lysed and luciferase levels were quantified using Promega’s Steady-Glo substrate.

T467

68487-68602

Sentence

denotes

To visualize cell-to-cell fusion, Vero cells were co-transfected with pGFP and the pcDNAintron-SARS-2-S constructs.

T468

68603-68683

Sentence

denotes

24 h following transfection, syncytia were visualized by fluorescence microscopy.

T469

68685-68724

Sentence

denotes

Quantification and Statistical Analysis

T470

68725-68852

Sentence

denotes

Raw glycoproteomic data from the mass spectrometers was searched using Proteome Discoverer v1.4 (SEQUEST), Protein Metrics Inc.

T471

68853-68887

Sentence

denotes

Byonic v3.8.13, and pGlyco v2.2.2.

T472

68888-69020

Sentence

denotes

For data searches using Proteome Discoverer, the results were processed to apply false discovery rate filtering using ProteoIQ v2.7.

T473

69021-69193

Sentence

denotes

For the deglycosylated protein work, search results from SEQUEST were filtered in ProteoIQ with a 1% false discovery rate at the protein level and 10% at the peptide level.

T474

69194-69327

Sentence

denotes

For N-linked glycopeptide analysis, pGlyco was used with false discovery rate of 1% at the glycan level and 10% at the peptide level.

T475

69328-69444

Sentence

denotes

For disulfide bond analysis and O-glycopeptide searches, Byonic was used and the false discovery rate was set to 1%.

T476

69445-69497

Sentence

denotes

All mass spectrometry results were manually curated.

T477

69498-69750

Sentence

denotes

Antigen accessibility simulations were carried out as described in the Method Details section and the mean of four simulations (three of length 350 ns, one of length 200 ns; amounting to 1.25 μs of total molecular dynamics simulation time) was utilized.

T478

69751-70069

Sentence

denotes

Glycan-glycan and glycan-peptide interactions were also calculated based on simulations as a percentage of time residues were in contact and averaged (mean) to produce the corresponding supplemental (colored) sequence figures with the raw numbers for coloring present also in each corresponding supplemental table tab.

T479

70070-70166

Sentence

denotes

3D distances were computed using Rpdb as described in more detail in the Method Details section.

T480

70167-70263

Sentence

denotes

This data is presented using box & whisker plots with all underlying statistics calculated in R.

PMC:7443692 / 1853-72116 JSON TXT 15 Projects
