Phosphorylation of SARS-CoV-2 Viral Proteins by the Host Proteome

Viral protein phosphorylation within the host cell may play a role in sensing and responding to cell state. We detected 25 phosphorylation sites in SARS-CoV-2 viral proteins and combined them with sites from another proteomics dataset (Davidson et al., 2020), for a total of 49 sites across seven viral proteins (Table S2). Of note, this analysis does not distinguish cleaved from uncleaved viral proteins in the assignment of viral phosphorylation sites. The degree of conservation, indicative of functional constraint, was estimated for each residue position (Figure 2A; Ng and Henikoff 2003), and the sites were mapped to positions within structured regions for five proteins, with the majority observed in accessible positions (i.e., loops) (Figure 2B). The top kinase families predicted by sequence to regulate these sites included casein kinase II (CK2), cyclin-dependent kinase (CDK), and protein kinase C (PKC), among others (Figure 2C), suggesting that these kinases may contribute to regulation of viral replication.

Figure 2. Overview of SARS-CoV-2 Viral Protein Phosphorylation Sites (PHs) in the Host Cell
(A) Localization of PHs across viral protein sequences from this study and a previous study (Davidson et al., 2020). Stem height indicates predicted deleteriousness of alanine substitutions. Dot color indicates whether the residue is (true) or is not (false) predicted to form part of an interaction interface based on SPPIDER analysis. Positions with no structural coverage are excluded from interface prediction. (B) Distribution of secondary structure elements in which viral PHs were found, as classified by the Define Secondary Structure of Proteins (DSSP) tool. (C) Distribution of top matching host kinases to viral PHs according to the NetPhorest tool (Horn et al., 2014). (D) Phosphorylation cluster in the C-terminal tail of the M protein (red residues) structure (Heo and Feig 2020) and associated sequence motif.
Asterisks indicate PHs. (E) Alignment of M protein phosphorylation clusters across different coronaviruses. Asterisks indicate PHs. (F) Surface electrostatic potential of non-phosphorylated (left) and phosphorylated (right) RNA-binding domains of the N protein (PDB: 6M3M). Positions of PHs are indicated by arrows. Blue denotes a positive charge potential, and red indicates a negative charge potential. Electrostatic potential was computed with the Adaptive Poisson-Boltzmann Solver (APBS) tool after preparation with the PDB2PQR tool.

Although it is unlikely that all phosphorylation sites on viral proteins play important functional roles, several sites in membrane (M) protein, Nsp9, and nucleocapsid (N) protein (Figures 2D–2F) suggest potential functionality. Five phosphorylation sites were detected in the M protein, clustered within a short C-terminal region of the protein (residues 207–215; Figure 2D). Although these acceptor residues are not predicted to be conserved, several align with negatively charged residues in the M proteins of other related viruses (Figure 2E). This evolutionary pattern suggests that a negative charge in this region may play a functional role, reminiscent of other multi-site phosphorylation events (Serber and Ferrell 2007).

To identify phosphorylation sites that may regulate protein-protein interactions, all sites were mapped to 3D structures, and solvent accessibility-based protein-protein interface identification and recognition (SPPIDER) was used to assess whether sites resided within interface regions (Porollo and Meller 2007; Figure 2A; Table S2). The single phosphorylation site in Nsp9 was predicted to be at an interface region (“True”), which was supported by inspection of the homodimer structure (PDB: 6W4B). Additional phosphorylation sites were predicted to be at interface residues within the S protein (Figure 2A).
However, inspection of S in complex with the ACE2 receptor (Shang et al., 2020; Lan et al., 2020) reveals some of these phosphorylation sites to be near but not at the interface region.

Finally, phosphorylation sites in N protein, a structural protein that binds to and assists with packaging viral RNA, were investigated. Most sites occurred within the N-terminal portion of the protein, at or near the RNA binding region, but avoided the C-terminal dimerization domain. The cluster of phosphorylation sites within an arginine/serine (RS)-dipeptide rich region, C-terminal to the RNA binding region (Figure 2A), is conserved in other coronavirus N proteins. This region is phosphorylated in SARS-CoV by serine-arginine (SR) protein kinases, modulating the role of SARS-CoV N protein in host translation inhibition (Peng et al., 2008). It is likely that phosphorylation of this same region in SARS-CoV-2 plays a similar role. Interestingly, in vitro inhibition of SARS-CoV N protein phosphorylation at the RS-rich region results in reduced viral load and cytopathic effects (Wu et al., 2009), highlighting its importance for viral fitness. In addition, sites were observed across the sequence of the RNA binding domain, which forms a claw-like structure (Kang et al., 2020). Several phosphorylation sites cluster in the structural model and are predicted to affect the surface charge of the so-called acidic wrist region (Figure 2F) but not the positive surface charge of the RNA binding pocket. We hypothesize that this surface charge difference may modulate N protein function, potentially via allosteric regulation of RNA binding capacity.
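
As a rough illustration of why clustered phosphorylation could shift local surface charge, the following Python sketch tallies formal side-chain charges for a peptide segment with and without phosphate groups. It is not the APBS/PDB2PQR electrostatics calculation used for Figure 2F. The toy sequence, the site positions, the function name `net_charge`, and the assumed charge of roughly -2 per phosphate near neutral pH are all illustrative assumptions, not values taken from this study.

```python
# Crude formal-charge bookkeeping; NOT a Poisson-Boltzmann calculation.
# Assumed side-chain charges near neutral pH (histidine treated as neutral).
RESIDUE_CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}

# Assumption: a phosphate group on S/T/Y carries roughly -2 near neutral pH.
PHOSPHATE_CHARGE = -2.0


def net_charge(sequence, phosphosites=()):
    """Return the summed formal side-chain charge of a peptide segment.

    `phosphosites` holds 1-based positions; each must point at S, T, or Y.
    Terminal charges are ignored because the input is treated as a segment.
    """
    charge = sum(RESIDUE_CHARGE.get(aa, 0.0) for aa in sequence)
    for pos in phosphosites:
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        residue = sequence[pos - 1]
        if residue not in "STY":
            raise ValueError(f"position {pos} is {residue}, not S/T/Y")
        charge += PHOSPHATE_CHARGE
    return charge


# Hypothetical RS-rich toy segment; this is NOT the real N protein sequence.
toy_segment = "GSRSRSRSAGE"
toy_sites = [2, 4, 6]  # hypothetical phosphorylated serines

before = net_charge(toy_segment)
after = net_charge(toy_segment, toy_sites)
print(f"unphosphorylated: {before:+.0f}, phosphorylated: {after:+.0f}")
```

For this toy input the tally moves from +2 to -4, showing how a few phosphates can reverse the sign of a locally basic segment. A structure-based electrostatics calculation would still be needed to say anything about an actual protein surface.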