CORD-19:e05ece9c6fdb3e9942598685bb94748b97d88e16

The sequence of human ACE2 is suboptimal for binding the S spike protein of SARS coronavirus 2

Abstract

The rapid and escalating spread of SARS coronavirus 2 (SARS-CoV-2) poses an immediate public health emergency, and no approved therapeutics or vaccines are currently available. The viral spike protein S binds ACE2 on host cells to initiate molecular events that release the viral genome intracellularly. Soluble ACE2 inhibits entry of both SARS and SARS-2 coronaviruses by acting as a decoy for S binding sites, and is a candidate for therapeutic and prophylactic development. Using deep mutagenesis, variants of ACE2 are identified with increased binding to the receptor binding domain of S at a cell surface. Mutations are found across the interface and also at buried sites where they are predicted to enhance folding and presentation of the interaction epitope. The N90-glycan on ACE2 hinders association. The mutational landscape offers a blueprint for engineering high affinity ACE2 receptors to meet this unprecedented challenge.

In December 2019, a novel zoonotic betacoronavirus closely related to bat coronaviruses spilled over to humans at the Huanan Seafood Market in the Chinese city of Wuhan (1, 2). The virus, called SARS-CoV-2 due to its similarities with the severe acute respiratory syndrome (SARS) coronavirus responsible for a smaller outbreak nearly two decades prior (3, 4), has since spread human-to-human rapidly across the world, precipitating extraordinary containment measures from governments (5). Stock markets have fallen, travel restrictions have been imposed, public gatherings canceled, and large numbers of people are quarantined. These events are unlike any experienced in generations. Symptoms of coronavirus disease 2019 (COVID-19) range from mild to dry cough, fever, pneumonia and death, and SARS-CoV-2 is devastating among the elderly and other vulnerable groups (6, 7).

The S spike glycoprotein of SARS-CoV-2 binds angiotensin-converting enzyme 2 (ACE2) on host cells (2, 8-13). S is a trimeric class I viral fusion protein that is proteolytically processed into S1 and S2 subunits that remain noncovalently associated in a prefusion state (8, 11, 14). Upon engagement of ACE2 by a receptor binding domain (RBD) in S1 (15), conformational rearrangements occur that cause S1 shedding, cleavage of S2 by host proteases, and exposure of a fusion peptide adjacent to the S2' proteolysis site (14, 16-18) [...]

[...] Fc region of human immunoglobulin can provide an avidity boost while recruiting immune effector functions and increasing serum stability, an especially desirable quality if intended for prophylaxis (23, 24), and sACE2 has proven safe in healthy human subjects (25) [...]

[...] typically yield no more than one coding variant per cell, providing a tight link between genotype and phenotype (30, 31). Cells were then incubated with a subsaturating dilution of medium containing the RBD of SARS-CoV-2 fused C-terminally to superfolder GFP (sfGFP; 32) (Fig. 1A). Levels of bound RBD-sfGFP correlate with surface expression levels of myc-tagged ACE2 measured by dual color flow cytometry. Compared to cells expressing wild type ACE2 (Fig. 1C), many variants in the ACE2 library failed to bind RBD, while a smaller number of ACE2 variants appeared to have higher binding signals (Fig. 1D). Cells expressing ACE2 variants with high or low binding to RBD were collected by fluorescence-activated cell sorting (FACS), referred to as "nCoV-S-High" and "nCoV-S-Low" sorted populations, respectively. During FACS, fluorescence signal for bound RBD-sfGFP continuously declined, requiring the collection gates to be regularly updated to 'chase' the relevant populations. This is consistent with RBD dissociating over hours during the experiment.
Reported affinities of RBD for ACE2 range from 1 to 15 nM (8, 10).

bioRxiv preprint doi: https://doi.org/10.1101/2020.03.16.994236. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.

Transcripts in the sorted populations were deep sequenced, and frequencies of variants were compared to the naive plasmid library to calculate the enrichment or depletion of all 2,340 coding mutations in the library (Fig. 2). This approach of tracking an in vitro selection or evolution by deep sequencing is known as deep mutagenesis (33). Enrichment ratios (Fig. 3A and 3B) and residue conservation scores (Fig. 3D and 3E) closely agree between two independent sort experiments, giving confidence in the data. For the most part, enrichment ratios (Fig. 3C) and conservation scores (Fig. 3F) in the nCoV-S-High sorts are anticorrelated with the nCoV-S-Low sorts, with the exception of nonsense mutations which were appropriately depleted from both gates. This indicates that most, but not all, nonsynonymous mutations in ACE2 did not eliminate surface expression. The library is biased towards solvent-exposed residues and has few substitutions of buried hydrophobics that might have bigger effects on plasma membrane trafficking (31).

Mapping the experimental conservation scores from the nCoV-S-High sorts to the structure of RBD-bound ACE2 (19) shows that residues buried in the interface tend to be conserved, whereas residues at the interface periphery or in the substrate-binding cleft are mutationally tolerant (Fig. 4A). The region of ACE2 surrounding the C-terminal end of the ACE2 α1 helix and β3-β4 strands has a weak tolerance of polar residues, while amino acids at the N-terminal end of α1 and the C-terminal end of α2 prefer hydrophobics (Fig. 4B), likely in part to preserve hydrophobic packing between α1-α2.
These discrete patches contact the globular RBD fold and a long protruding loop of the RBD, respectively.

Two ACE2 residues, N90 and T92, which together form a consensus N-glycosylation motif, are notable hot spots for enriched mutations (Fig. 2 and 4A). Indeed, all substitutions of N90 and T92, with the exception of T92S which maintains the N-glycan, are highly favorable for RBD binding, and the N90-glycan is thus predicted to partially hinder S/ACE2 interaction.

[...] were also enriched at buried positions where they will change local packing (e.g. A25V, L29F, W69V, F72Y and L351F). The selection of ACE2 variants for high binding signal therefore not only reports on affinity, but also on presentation at the membrane of folded structure recognized by SARS-CoV-2 S. The presence of enriched structural mutations in the sequence landscape is especially notable considering the ACE2 library was biased towards solvent-exposed positions.

Deep mutational scans in human cells have errors (34), and it is unclear how large an effect an enriched mutation in a selection will have when introduced in a purified protein. Mutations of interest for ACE2 engineering will need careful assessment by targeted mutagenesis, as well as considerations on how best to combine mutations for production of conformationally-stable, high affinity sACE2. Other considerations will be whether to fuse sACE2 to Fc of IgG1 or IgA1 to evoke specialized immune effector functions, or to fuse with albumin to boost serum stability without risking an excessive inflammatory response. These are unknowns.

[...] the myc-positive (Alexa 647) population, the top 67% were gated (Fig. 1B).
Of these, the 15% of cells with the highest and 20% of cells with the lowest GFP fluorescence were collected (Fig. 1D) in tubes coated overnight with fetal bovine serum and containing Expi293 Expression Medium. Total RNA was extracted from the collected cells using a GeneJET RNA purification kit (Thermo Scientific), and cDNA was reverse transcribed with high fidelity Accuscript (Agilent) primed with gene-specific oligonucleotides. Diversified regions of ACE2 were PCR amplified as 5 fragments. Flanking sequences on the primers added adapters to the ends of the products for annealing to Illumina sequencing primers, unique barcoding, and for binding the flow cell. Amplicons were sequenced on an Illumina NovaSeq 6000 using a 2×250 nt paired end protocol. Data were analyzed using Enrich (36), and commands are provided in the GEO deposit. Briefly, the frequencies of ACE2 variants in the transcripts of the sorted populations were compared to their frequencies in the naive plasmid library to calculate an enrichment ratio.
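
The comparison of variant frequencies described above can be illustrated with a minimal Python sketch. It is not the Enrich implementation and does not reproduce the commands in the GEO deposit. It assumes that the enrichment ratio is reported as a log2 ratio of frequencies and that a small pseudocount is added so that variants absent from a sorted population remain finite. The function name, the pseudocount value and the read counts are hypothetical and chosen only for illustration.

```python
import math


def enrichment_ratios(sorted_counts, naive_counts, pseudocount=0.5):
    """Return log2(frequency in sorted population / frequency in naive library).

    Both arguments map a variant label to a read count. Only variants present
    in the naive library are scored; the pseudocount is an assumption of this
    sketch, not a documented setting of Enrich.
    """
    variants = list(naive_counts)
    n = len(variants)
    # Totals include the pseudocount once per variant so frequencies sum to 1.
    sorted_total = sum(sorted_counts.get(v, 0) for v in variants) + pseudocount * n
    naive_total = sum(naive_counts.values()) + pseudocount * n

    ratios = {}
    for v in variants:
        f_sorted = (sorted_counts.get(v, 0) + pseudocount) / sorted_total
        f_naive = (naive_counts[v] + pseudocount) / naive_total
        ratios[v] = math.log2(f_sorted / f_naive)
    return ratios


# Invented read counts for three illustrative variant labels.
naive = {"variant_1": 1200, "variant_2": 900, "variant_3": 1000}
high_sort = {"variant_1": 3100, "variant_2": 150, "variant_3": 0}

for variant, score in enrichment_ratios(high_sort, naive).items():
    print(f"{variant}\t{score:+.2f}")
```

Under these assumptions, a positive value marks a variant that is more frequent in the sorted population than in the naive library (enriched), and a negative value marks a depleted one.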
