A structural analysis of M protein in coronavirus assembly and morphology HHS Public Access
Abstract
The M protein of coronavirus plays a central role in virus assembly, turning cellular membranes into workshops where virus and host factors come together to make new virus particles. We investigated how M structure and organization are related to virus shape and size using cryo-electron microscopy, tomography and statistical analysis. We present evidence that suggests M can adopt two conformations and that membrane curvature is regulated by one M conformer. Elongated M protein is associated with rigidity, clusters of spikes and a relatively narrow range of membrane curvature. In contrast, compact M protein is associated with flexibility and low spike density. Analysis of several types of virus-like particles and virions revealed that S protein, N protein and genomic RNA each help to regulate virion size and variation, presumably through interactions with M. These findings provide insight into how M protein functions to promote virus assembly.
Keywords
Cryo-electron microscopy; cryo-electron tomography; pleomorphic virus structure; coronavirus; viral matrix protein
Every type of virus architecture has its own structural constraints. In other words, there is a limit to the variation in shape, size or protein configuration that can be realized by a particular set of structural proteins. When this tolerance is exceeded, the result becomes uncertain: it is possible that the assembly process may fail, produce misshapen but otherwise infectious particles, or yield non-infectious particles. Pleomorphic enveloped viruses represent the most extreme cases of natural variation during assembly. Pleomorphic virions can vary considerably in size, as in the case of arenaviruses (Neuman et al., 2005), or shape, as in the case of influenza A virus (Harris et al., 2006). Although this variation has been documented, little is known about how the assembly components shape the overall particle architecture. Our understanding of virus assembly has practical implications: a new generation of HIV-1 assembly inhibitors acts at the level of particle architecture by reducing the fidelity of the assembly process (Tang et al., 2003) or by blocking connections between Gag proteins (Sticht et al., 2005).
In this study we have chosen to analyze the relationship between composition and architecture for three pleomorphic coronaviruses: Mouse hepatitis virus (MHV), Severe acute respiratory syndrome coronavirus (SARS-CoV) and Feline coronavirus (FCoV). Recent electron microscopy studies have confirmed that coronavirus particles vary considerably in size, and so can safely be described as pleomorphic. However, there is disagreement over the extent of variation in virion shape (Barcena et al., 2009; Beniac et al., 2006; Neuman et al., 2006; Risco et al., 1996), although a range of morphologies is represented in each study. These three coronaviruses make an interesting dataset because each is built from a conserved set of components, but amino acid identity between the homologous structural proteins is typically less than 30%.
Four structural proteins are important for coronavirus infectivity: the integral membrane protein M adapts a region of membrane for virus assembly and captures other structural proteins at the budding site; the N protein chaperones and protects the viral RNA genome; spikes consisting of three copies of the S glycoprotein promote receptor binding and membrane fusion; and the small membrane protein E is present in sub-stoichiometric amounts and acts as an enhancer of budding (Hogue and Machamer, 2007). In this study, we will focus on the role of M in assembly and in determining particle morphology.
M proteins from MHV (Klumperman et al., 1994), Infectious bronchitis virus (Machamer and Rose, 1987), Transmissible gastroenteritis virus (TGEV) (Klumperman et al., 1994) and Bovine coronavirus (Nguyen and Hogue, 1997) are targeted to the vicinity of the Golgi apparatus. Reverse genetic studies and VLP assembly studies suggest that M protein promotes assembly by interacting with viral ribonucleoprotein (RNP) and S glycoproteins at the budding site (de Haan et al., 1999; Escors et al., 2001a; Escors et al., 2001b; Kuo and Masters, 2002; Narayanan et al., 2000; Nguyen and Hogue, 1997; Opstelten et al., 1995; Sturman et al., 1980), and by forming a network of M-M interactions that is capable of excluding some host membrane proteins from the viral envelope (de Haan et al., 2000; Neuman et al., 2008b). M proteins interact through both the transmembrane domain and endodomain (de Haan et al., 2000). M can also interact with RNA that carries the genomic packaging signal (Narayanan et al., 2003). Coronavirus assembly is then completed at the membrane of a pre-Golgi compartment, as shown most recently in a tomography study of intracellular structures involved in virus replication and assembly (Knoops et al., 2008). Packets of virions are then shuttled out of the cell along the secretory pathway (reviewed in Hogue and Machamer, 2007).
The minimum requirement for MHV virus-like particle (VLP) production is co-expression of M and E protein (Vennema et al., 1996), although in some expression systems, the additional co-expression of N increases the efficiency of VLP production (Boscarino et al., 2008).
Recent studies have begun to reveal the structure of the coronavirus pre-fusion spike, N protein (Chen et al., 2007; Fan et al., 2005; Huang et al., 2004; Jayaram et al., 2006; Saikatendu et al., 2007; Schutze et al., 2006), the hemagglutinin-esterase protein, which is found on some group 2 coronaviruses (Zeng et al., 2008), and the E protein (Pervushin et al., 2009). Also, transmembrane features have been identified as M on SARS-CoV, MHV, FCoV and TGEV particles using cryo-electron microscopy (Neuman et al., 2006) and cryo-electron tomography (Barcena et al., 2009), but the structure of M remains poorly characterized. The lack of detailed structural and functional information is largely due to its small size, close association with the viral envelope and a tendency to form insoluble aggregates when perturbed (Lee et al., 2005).
In this study we have attempted to provide a better understanding of the structure and function of M protein. First, we have used cryo-EM and tomography to probe the structure of M in the envelope of MHV, SARS-CoV and FCoV virions. Second, we have analyzed the structure of the MHV M protein in VLPs that lack S and RNP. VLPs were included because identification of M in electron micrographs of virions is complicated by the presence of transmembrane regions of spikes and RNP, and, more importantly, because intermolecular interactions could potentially affect coronavirus morphology, including M-M, M-E, M-N, M-S, M-RNA and N-RNA interactions (reviewed in Hogue and Machamer, 2007), palmitin-mediated interactions involving S and E (Boscarino et al., 2008; Lopez et al., 2008; Thorp et al., 2006), and envelope stretching caused by the packaged helical ribonucleoprotein (Barcena et al., 2009). Together, these experiments reveal new facets of the structure and function of M, and demonstrate how pleomorphicity can be harnessed to reveal the function of membrane protein networks.
Growth, purification and imaging of SARS-CoV from Vero-E6 cells, FCoV from AK-D cells and MHV-OBLV60 (Gallagher et al., 1991) from DBT cells have been described previously (Neuman et al., 2006). A small plaque mutant derived from MHV-A59, called MHV-sp1, was isolated by plaque purification from the medium of persistently infected 17clone 1 (17cl-1) cells that had survived infection with MHV-A59 and had been passaged more than forty times (Sawicki, 1987). The MHV-sp1 S glycoprotein is not cleaved at the S1-S2 boundary and is resistant to cleavage by trypsin, unlike the S glycoprotein of MHV-A59. MHV-sp1 was purified by centrifugation. First, the virus from 450 ml of infected cell supernatant (~5×10¹¹ pfu) was pelleted by centrifugation for 3 hours at 24,000 rpm at 4°C using an SW28 rotor. The pellet was allowed to dissolve on ice overnight in buffered saline (0.15M NaCl, 20mM HEPES, pH 6.8). The 6 ml of suspended virus was layered on top of a linear gradient of 40% (w/w) potassium tartrate, 20mM HEPES, pH 7.4 (bottom) to 15% (w/w) glycerol, 20mM HEPES, pH 7.4 (top) and subjected to isopycnic centrifugation in an SW28 rotor (3 hours at 24,000 rpm at 4°C). The resulting milky band of virus was diluted with buffered saline and pelleted by centrifugation for 3 hours at 24,000 rpm at 4°C using an SW28 rotor. The pellet was allowed to dissolve on ice overnight in 1 ml of buffered saline.
VLPs were produced by transfecting HEK-293T cells with pCAGGS expression vectors encoding M, E and N from MHV-A59 (Lokugamage et al., 2008). Briefly, flasks of near-confluent cells comprising approximately 1.5 m² of total culture area were transfected with plasmid DNA using Lipofectamine and "Plus" reagent (Invitrogen) according to an appropriately scaled version of the manufacturer's protocol. Virus-like particles were precipitated from culture medium 48h after transfection using 10% polyethylene glycol and 2.2% w/v NaCl, then further purified by 10%-30% sucrose density gradient ultracentrifugation as described previously (Neuman et al., 2008a).
Production and cryo-EM of unilamellar phospholipid vesicles (SUV 100) created from a blend of 80 parts 1,2-dioleoyl-sn-glycero-3-phosphocholine, 10 parts 1,2-dioleoyl-sn-glycero-3-[phospho-rac-(1-glycerol)], 2 parts 1,2-distearoyl-sn-glycero-3-phosphoethanolamine-N-[biotinyl(polyethylene glycol)2000], and 1 part 3,3'-dioctadecyloxacarbocyanine perchlorate has been described previously (Kunding et al., 2008).
Cryo-EM was done using standard low-dose imaging conditions. Images of MHV-OBLV60, VLPs and FCoV were recorded using Leginon (Suloway et al., 2005). Cryo-electron tomography of MHV-sp1 was done using a JEOL JEM-2200FS microscope with an energy filter, operated at 200 kV and 25,000 times magnification. Specimens were tilted along one axis through 140°, from +70° to −70°, and images were recorded to a 4k×4k CCD at a nominal resolution of 5 angstroms per pixel. Tomographic reconstruction was performed using the IMOD software suite (Kremer et al., 1996) by fiducial alignment of 10 nm gold particles in tilted images. Two-fold binning was performed at the time of imaging to produce a model with a calibrated resolution of 10.6 angstroms per pixel. Details of the equipment and conditions used in conventional cryo-EM are provided in Table 1.
Before analysis, image contrast was inverted so that protein density appeared white rather than black. We corrected for the effects of phase reversal in the contrast transfer function using the EMAN module ctfit (Ludtke et al., 1999) . The images used for two-dimensional reconstructions displayed Thon rings, which indicate the presence of image data, to 8-16 Å resolution. Micrographs were filtered in Fourier space to truncate high-frequency image data beyond the last visible Thon ring using the EMAN module proc2d. We then selected small images showing M using the EMAN module boxer.
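The Fourier-space truncation described above can be illustrated with a minimal sketch. This is not the EMAN proc2d implementation: it is a one-dimensional, pure-Python stand-in that zeroes every Fourier component finer than a chosen resolution. The function name `lowpass_profile`, the 5 Å pixel size and the 16 Å cutoff used in the usage note are illustrative assumptions.

```python
import cmath
import math

def lowpass_profile(samples, pixel_size_A, cutoff_A):
    """Zero all Fourier components finer than cutoff_A (angstroms).

    samples      : real-valued 1-D density profile (hypothetical input)
    pixel_size_A : sampling interval in angstroms per pixel (assumed value)
    cutoff_A     : resolution of the last feature to keep, e.g. the last
                   visible Thon ring
    """
    n = len(samples)
    cutoff_freq = 1.0 / cutoff_A  # cycles per angstrom
    # Forward DFT (naive O(n^2); adequate for a short illustrative profile).
    spectrum = [
        sum(samples[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Indices above n/2 correspond to negative frequencies.
        k_signed = k if k <= n // 2 else k - n
        freq = abs(k_signed) / (n * pixel_size_A)
        if freq > cutoff_freq:
            spectrum[k] = 0.0
    # Inverse DFT; the mask is symmetric, so the result is real.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * x / n)
             for k in range(n)) / n).real
        for x in range(n)
    ]
```

With a 5 Å pixel, a pixel-to-pixel alternation has a 10 Å period and is removed by a 16 Å cutoff, while a slow 80 Å oscillation passes unchanged. A real micrograph would need the two-dimensional equivalent and a fast transform.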
Image clustering and refinement of class averages was performed using the startnrclasses and classalign2 modules. Full contrast transfer function correction was implemented during construction of class averages using classalign2. The clearest, most coherent images of M densities were obtained when correction was applied through 17 Å resolution.
For radial density analysis to assess the location of S, M and the RNP inside the virion, quadrants of particles were selected to minimize the distortion caused by small variations in curvature. Quadrants were selected for inclusion in the final average based on clarity and contrast in the envelope region. For radial density analysis of M COMPACT and M LONG, radial density maps were constructed by selecting and aligning numerous small wedges centered at the particle edge, which were then averaged. Radial density analysis was performed using the SPIDER image analysis suite (Frank et al., 1996), and was normalized to the brightest and darkest datapoints, which were assigned relative brightness values of 100% and 0%, respectively. The signal from the envelope region was used as a fiducial mark for normalization of image intensity and alignment of radial density profiles.
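The averaging and 0% to 100% normalization described above can be sketched as follows. This is a simplified illustration rather than the SPIDER procedure: it assumes the profiles are already aligned and of equal length, and the function names are hypothetical.

```python
def average_profiles(profiles):
    """Element-wise mean of equal-length, pre-aligned radial density profiles."""
    if not profiles:
        raise ValueError("at least one profile is required")
    length = len(profiles[0])
    if any(len(p) != length for p in profiles):
        raise ValueError("profiles must have equal length")
    return [sum(column) / len(profiles) for column in zip(*profiles)]

def normalize_profile(densities):
    """Rescale so the darkest point is 0% and the brightest is 100%."""
    lo, hi = min(densities), max(densities)
    if hi == lo:
        raise ValueError("a flat profile cannot be normalized")
    return [100.0 * (d - lo) / (hi - lo) for d in densities]
```

For example, averaging the made-up profiles `[0, 10, 20]` and `[10, 20, 40]` gives `[5, 15, 30]`, which normalizes to 0%, 40% and 100%.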
Two observers recorded two sets each of orthogonal measurements of the longest and shortest visible diameter of each virion or VLP, so that contributions of observer bias and measurement variation could be factored into our analysis. Average diameter (d AVG) was taken as a measure of particle size, and the ratio of the longest to the shortest visible diameter (d MAX/d MIN) was taken as a measure of particle shape (Supplementary Fig. S1). Diameters were measured to the outer edge of the membrane, leaving out spikes. Particles were excluded if part of the membrane was outside the captured image area, or if the membrane overlapped with the carbon support layer or another particle. Multilamellar, overlapping or tubular SUV 100 particles, and exosomal vesicles that appeared to have inner contents or membrane-embedded proteins were also excluded.
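As a rough illustration of the two descriptors, the sketch below pools repeated readings of the longest and shortest diameter for one particle and returns the size and shape measures. It assumes d AVG is the mean of the pooled longest and shortest diameters, which the text does not state explicitly. The function name and the example readings are hypothetical.

```python
from statistics import mean

def size_and_shape(d_max_readings, d_min_readings):
    """Return (d_avg, d_max / d_min) for one particle.

    Each argument holds repeated readings (e.g. two observers, two sets
    each) of the longest or shortest visible diameter, in nanometres.
    Assumption: d_avg is the mean of the pooled longest and shortest values.
    """
    d_max = mean(d_max_readings)
    d_min = mean(d_min_readings)
    if d_min <= 0:
        raise ValueError("diameters must be positive")
    return (d_max + d_min) / 2.0, d_max / d_min
```

With made-up readings of 90, 92, 91 and 91 nm for the long axis and 84, 86, 85 and 85 nm for the short axis, this gives a size of 88 nm and a shape ratio of about 1.07.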
To assess the precision of our measurements, the distance between lipid bilayer headgroup densities was measured repeatedly by two observers. The distance between the brightest parts of the lipid bilayer on vesicles matched expected results to within 1 nm (Nagle and Tristram-Nagle, 2000). Measurements of the longest and shortest diameter of vesicles were less precise, perhaps due to combined errors in measurement and identification of the maximum and minimum particle diameter (see Supplementary Fig. S2).
In order to assess whether virus size and shape varied from one preparation to another, we compared cryo-EM images of four preparations of MHV-OBLV60 and one preparation of MHV-sp1. MHV-OBLV60 virions were fixed with 1% phosphate-buffered formalin (pH 7.0) before imaging; MHV-sp1 virions were not fixed because of differences in local biosafety regulations between EM facilities. Particle diameter was similar for all five MHV preparations (Supplementary Figure S3) . The shape of particles in two OBLV60 preparations was significantly more elongated than the other three, so these particles were excluded from shape analysis.
Before marking particles as spike-depleted or containing an envelope thickness anomaly, we revised our criteria empirically by repeatedly examining images of SARS-CoV particles and tracking the agreement between observers. The criteria were revised until inter-observer agreement was consistently greater than 80%. The final criteria for categorization were: "Does closely-packed spike decoration extend around at least half of the particle edge?" and, "Is an abnormally thin region of no less than one-twelfth the virion circumference visible in the M density layer?" Statistics describing inter-observer agreement are presented in Supplementary Figure S2 .
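The agreement check used to tune the criteria can be sketched as a simple percent-agreement calculation. The text does not say which agreement statistic was used, so plain percent agreement is an assumption here, and the function names are hypothetical.

```python
def percent_agreement(calls_a, calls_b):
    """Percentage of particles on which two observers made the same call."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty call lists of equal length")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return 100.0 * matches / len(calls_a)

def criteria_accepted(calls_a, calls_b, threshold=80.0):
    """True when agreement is strictly greater than the threshold (80% in the text)."""
    return percent_agreement(calls_a, calls_b) > threshold
```

Two observers who agree on four of five particles reach exactly 80%, which would not yet satisfy a "greater than 80%" rule.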
Complementary DNA was generated by reverse transcriptase PCR, and a fragment encoding amino acids 107-221 of the SARS-CoV-Tor2 M-protein was cloned into pET46Ek/LIC (Novagen, USA). The vector encodes an N-terminal poly-histidine tag and a flanking enterokinase cleavage site for tag removal. Expression of M 107-221 was achieved in BL21 (DE3) E. coli (Invitrogen, USA) and induced at an OD600 of 0.6-0.8 with 1mM IPTG followed by overnight growth at 18°C. The majority of the expressed protein was present as insoluble inclusion bodies which were subsequently refolded.
Bacteria were lysed at 4°C using an EmulsiFlex® C-3 cell disruptor (Avestin, Canada) at 15 kpsi in lysis buffer (20mM Tris-HCl, 100mM sodium chloride, 10mM dithiothreitol, 1% Triton X-100 at pH 7.0). The insoluble fraction containing the inclusion bodies was isolated by low-speed centrifugation. Pellets containing the inclusion bodies were resuspended in lysis buffer and then pelleted by centrifugation three times to remove any soluble material that was present. The pellets of insoluble material were then washed 3 times, as above, with wash buffer (20mM Tris-HCl, 100mM sodium chloride, 10mM dithiothreitol at pH 7.0) to remove residual detergent. The resultant white pellet was dissolved in guanidine buffer (6M guanidine hydrochloride, 50mM Tris-HCl, 10mM dithiothreitol, 1mM EDTA at pH 8.0) to a final concentration of 15 mg/ml as determined by Bradford protein assay (BioRad, USA).
Solubilized M 107-221 inclusion bodies were refolded using the rapid dilution method. Solubilized inclusion body protein was rapidly added dropwise to refolding buffer (50mM HEPES, 200mM sodium chloride, 1M NDSB-201, 10mM beta-mercaptoethanol at pH 7.0, 4°C) to give a final protein concentration of 0.1mg/ml. The protein was allowed to refold for 1-3 h before being applied to a 5 ml nickel affinity column (GE Healthcare, USA). The column was washed with 5 column volumes of binding buffer (50mM HEPES, 200mM sodium chloride, 10 mM beta-mercaptoethanol at pH 7.0) and refolded protein was eluted in binding buffer supplemented with 250 mM imidazole and 1 mM EDTA. Further purification was achieved by passing 2 ml samples of the eluted protein over a pre-equilibrated Superdex™ 75 16/60 size exclusion column (20mM HEPES, 200mM sodium chloride, 5mM dithiothreitol at pH 7.0). The purified monomeric protein was concentrated to 1.8 mg/ml using 0.5 ml Ultrafree Biomax 5kDa concentrators (Millipore, USA). The protein was >95% pure as assessed by SDS-PAGE. Purified protein was stored in 20mM HEPES, 200mM sodium chloride, 5mM dithiothreitol at pH 7.0.
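The rapid-dilution step takes the 15 mg/ml solubilized stock to 0.1 mg/ml, a 150-fold dilution. A minimal sketch of the volume arithmetic follows; the function name and the 150 ml batch size are illustrative assumptions, not values from the protocol.

```python
def dilution_volumes(stock_mg_per_ml, final_mg_per_ml, final_volume_ml):
    """Return (stock volume, buffer volume) in ml for a simple dilution."""
    if not 0 < final_mg_per_ml <= stock_mg_per_ml:
        raise ValueError("final concentration must be positive and not exceed stock")
    stock_volume = final_volume_ml * final_mg_per_ml / stock_mg_per_ml
    return stock_volume, final_volume_ml - stock_volume

# Hypothetical 150 ml refolding batch: about 1 ml of stock into 149 ml of buffer.
stock_ml, buffer_ml = dilution_volumes(15.0, 0.1, 150.0)
```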
The stoichiometry of expressed M endodomain was assessed using perfluoro-octanoic acid polyacrylamide gel electrophoresis (PFO-PAGE; Ramjeesingh et al., 1999). To do this, 20 µl, 10 µl, 5 µl or 2 µl of purified protein was made up to 20 µl total by adding 20mM HEPES, 200mM sodium chloride, 5mM dithiothreitol, pH 7.0. Samples were incubated at either 4°C or 37°C for 1 h then separated by electrophoresis on precast 4-20% acrylamide gels in Tris-glycine buffer containing 0.5% (wt/vol) PFO. Protein was detected by SYPRO Ruby staining (Invitrogen).
The main problem with using cryo-EM to investigate protein structure is limited resolution, which usually does not allow for the structure to be interpreted by an atomic model (Stewart and Grigorieff, 2004). As a consequence, it can be difficult to identify small features such as M in cryo-EM images.
Previous studies have used what is known about protein size, topology and function to infer the structure of spikes and RNP (Barcena et al., 2009; Beniac et al., 2006; Cavanagh, 1983; Davies and Macnaughton, 1979; Neuman et al., 2006; Risco et al., 1996). However, none of these studies produced a clear view of M, which is obscured in micrographs by the viral membrane, the C-termini of S proteins and the RNP. To better understand the structure and organization of M, we used cryo-EM (Fig. 1a-c) and cryo-electron tomography (Figure 1d) to examine coronavirus particles, vesicles, VLPs containing only M and E, and VLPs containing M, E and N proteins. In purified virus preparations we also found a small number of apparently protein-free empty vesicles, which presumably were released from infected cells, and which we refer to in this study as exosomes. Viruses and VLPs had thicker envelope regions than vesicles or exosomes due to the presence of M protein, as previously reported (Barcena et al., 2009; Neuman et al., 2006). The purification process used for MHV-sp1 also produced some free, spike-decorated envelopes. Free envelopes appeared thicker than ordinary phospholipid bilayers, suggesting that M was still present, but did not contain any trace of the RNP core (Figure 1d).
To determine the boundaries of packaged M (Fig. 2a) , we examined radial density maps from groups of 18-22 empty exosomes, EM and EMN VLPs, low-spike and normal virus particles, as shown in Fig. 2b . To identify individual components, we subtracted one set of appropriately scaled and aligned radial density data from another, as shown in Fig. 2c .
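The subtraction of one aligned radial density profile from another can be sketched as below. The sketch assumes both profiles are already scaled to the same 0% to 100% range and that the envelope is the brightest point in each, so that the peaks can serve as the alignment mark. That peak-based alignment and the function name are simplifying assumptions.

```python
def align_and_subtract(profile_a, profile_b):
    """Align two radial density profiles on their brightest point and
    return profile_a minus profile_b over the overlapping bins.

    Assumption: the brightest bin in each profile marks the envelope.
    """
    peak_a = profile_a.index(max(profile_a))
    peak_b = profile_b.index(max(profile_b))
    offset = peak_b - peak_a  # bins by which profile_b is shifted relative to profile_a
    difference = []
    for i, value_a in enumerate(profile_a):
        j = i + offset
        if 0 <= j < len(profile_b):
            difference.append(value_a - profile_b[j])
    return difference
```

With the made-up profiles `[0, 20, 100, 60, 30]` and `[0, 0, 20, 100, 40, 10]`, the envelope peaks align and the remainder `[0, 0, 0, 20, 20]` shows extra density only on the interior side, as one would expect when comparing a protein-containing particle with an empty vesicle.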
From these data we determined that the endodomain of M extended inward ~8 nm from the highest-density point of the outer membrane leaflet, in agreement with published measurements of MHV M protein from cryo-EM and tomography (Barcena et al., 2009; Neuman et al., 2006). The ectodomain of M did not give rise to a detectable radial density signal, suggesting that it may either be disordered or tightly membrane-associated. An area of consistent, high density which extended from 10 nm to 30 nm inside the particle was attributed to RNP, consistent with recent measurements by Barcena et al. (2009). Fig. 2d shows that the variability in radial density was low from the virus exterior through the RNP feature, but innermost parts of the virion were highly variable. Taken together, these data show the extent of M and M-linked RNP inside the particle.
Viral particles were examined closely to identify individual M protein shapes. On most particles, M at the virion edge resembled tightly packed lines crossing the membrane and contacting the RNP (Fig. 3) . Small M-free areas which may represent budding scars were occasionally visible at the edge of virions and VLPs (Fig. 3a) . The observation that M remained tightly packed even on particles with excess M-free membrane provides structural validation of the existence of M-M interactions. The observation that M densities appear to make contact with the RNP core likewise can be taken as evidence of M-N or M-RNA interactions.
On a few particles, some of the M-like protein had a compact, blurred appearance and did not appear to make contact with the RNP. We have called the common form M LONG and the short, blurred form M COMPACT. Viral spikes can be seen on both M LONG and M COMPACT, but not on M-free membranes.
In order to express M conformation numerically, we performed radial density analysis of small sections at the particle edge where M LONG or M COMPACT was visible. The M endodomain density consisted of the tail region, extending 6 to 8 nm into the particle where the differences between M conformations were most apparent, and the denser body region, extending 3.5 to 5.5 nm into the particle. After subtracting the density of the ice outside the particle to correct for differences in background and scaling image brightness to a common mean, the ratio of M tail to body was calculated for each virion or VLP. We found that the density in the "tail" region of M COMPACT was similar to the density of the background ice, and the peak M endodomain density was higher than for M LONG. To express M conformation more simply, we subtracted the background from the tail and body densities and calculated tail to body ratios for M LONG and M COMPACT (Fig. 3i). We found that the difference in tail to body ratios was statistically significant (T test, P = 0.014 for FCoV, P = 2.0 × 10⁻¹⁷ for SARS-CoV and P = 3.0 × 10⁻⁶ for MHV). This suggests that the two forms of M represent different conformations of the same peptide chain, and the distribution of density at the virion edge can be used as an approximate readout for M conformation.
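The tail to body readout can be sketched as a ratio of background-subtracted mean densities in the two depth windows given above (tail 6 to 8 nm, body 3.5 to 5.5 nm). The sketch assumes depth is measured inward from the outer membrane leaflet, and all function names and example densities are hypothetical.

```python
def mean_in_window(depths_nm, densities, lo_nm, hi_nm):
    """Mean density of the samples whose depth lies in [lo_nm, hi_nm]."""
    values = [d for x, d in zip(depths_nm, densities) if lo_nm <= x <= hi_nm]
    if not values:
        raise ValueError("no samples fall inside the requested window")
    return sum(values) / len(values)

def tail_to_body_ratio(depths_nm, densities, ice_density):
    """Background-subtracted tail (6-8 nm) over body (3.5-5.5 nm) density."""
    tail = mean_in_window(depths_nm, densities, 6.0, 8.0) - ice_density
    body = mean_in_window(depths_nm, densities, 3.5, 5.5) - ice_density
    if body == 0:
        raise ValueError("body density equals background; ratio undefined")
    return tail / body
```

A profile whose tail density matches the surrounding ice gives a ratio near zero, which is the behaviour reported for M COMPACT; a profile with substantial tail density gives a clearly positive ratio, as reported for M LONG.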
We next examined M proteins in more detail by classifying, aligning and averaging similar-looking regions from particle edges to produce class averages. M LONG resembles a dagger, with the ectodomain "pommel" resting on the outer membrane leaflet and the tip of the endodomain "blade" extending ~8 nm and contacting the RNP (Fig. 4a-b). The endodomains of adjacent M LONG particles appeared to contact each other, suggesting that M LONG-M LONG interactions are mediated by the endodomain of M. M LONG on EM VLPs appeared similar to M on virions, thus confirming that M LONG is formed by M in the absence of the other high-copy virion proteins S and N (Fig. 4b). Some class averages of M LONG also showed a distinct tilt relative to the membrane. The endodomain of M LONG was resolved into two globular components with crisp borders between adjacent molecules, while the endodomain of M COMPACT appeared as an indistinct ellipsoid and extended only ~6 nm (Fig. 4a). The two globular components of the endodomain were not distinctly observed in class averages from EM VLPs (Fig. 4b), suggesting that either the conformation is only formed in the presence of N or that N forms part of the two-lobed M LONG. M LONG and M COMPACT were also visible in virtual slices through tomographic reconstructions of MHV. The difference in apparent length was most evident on tomographic slices through free envelopes, which showed thick convex M LONG membranes near the center giving way to thinner variably curved M COMPACT membranes near the edges (Fig. 5a-b).
M LONG was spaced 4 to 5 nm apart in unprocessed cryo-EM images and edge view class averages (Fig. 6) . In axial views of M taken from the centers of EM and EMN VLPs, M sometimes appeared to form small ordered regions (Fig. 6a) which could be classified, aligned and averaged to produce class averages (Fig. 6b) . M spacing in axial views was generally consistent with spacing in edge views (Fig. 6c) . While axial class averages suggest that M packing generally approximates a rhombus with sides of 4.0 and 4.5 nm and an interior angle of about 75 degrees, the amount of heterogeneity in the unprocessed images is more consistent with a loosely-ordered network of M, as opposed to a rigid two-dimensional protein lattice.
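For orientation, the area of the approximate packing cell follows from the two side lengths and the interior angle as a·b·sin(θ). The sketch below applies this to the quoted 4.0 nm, 4.5 nm and 75 degrees. It treats the cell as a perfectly regular parallelogram, which is an idealization given the loosely ordered network described above, and the function name is hypothetical.

```python
import math

def unit_cell_area_nm2(side_a_nm, side_b_nm, interior_angle_deg):
    """Area of a parallelogram-shaped packing cell: a * b * sin(angle)."""
    return side_a_nm * side_b_nm * math.sin(math.radians(interior_angle_deg))

# Values quoted in the text for the axial class averages.
area = unit_cell_area_nm2(4.0, 4.5, 75.0)  # about 17.4 nm^2 per M density
```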
Edge view class averages of M LONG from MHV, FCoV and SARS-CoV virions appeared about twice as large as expected for a single copy of the M protein, based on the partial specific volume of folded protein (Harpaz et al., 1994). This can be seen by comparing class averages showing M (Fig. 4b) with projections of ellipsoids with the same spacing as M, and the volume of one, two or three copies of M (Fig. 4c). This suggested that each M LONG and M COMPACT is large enough to contain at least two copies of M protein.
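The expected volume of one or more copies of a protein can be estimated from its mass and partial specific volume as V = n·M·v̄/N_A. The sketch below uses a typical v̄ of 0.73 cm³/g and a round 25 kDa for the M monomer. Both numbers are assumptions for illustration and are not taken from the text, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def protein_volume_nm3(mass_da, copies=1, v_bar_cm3_per_g=0.73):
    """Volume occupied by `copies` protein molecules of the given mass.

    v_bar_cm3_per_g is an assumed typical partial specific volume.
    """
    volume_cm3 = copies * mass_da * v_bar_cm3_per_g / AVOGADRO
    return volume_cm3 * 1e21  # 1 cm^3 = 1e21 nm^3

def equivalent_sphere_diameter_nm(volume_nm3):
    """Diameter of a sphere with the given volume."""
    return (6.0 * volume_nm3 / math.pi) ** (1.0 / 3.0)

# Assumed ~25 kDa monomer: roughly 30 nm^3 for one copy, 61 nm^3 for two.
one_copy = protein_volume_nm3(25000)
two_copies = protein_volume_nm3(25000, copies=2)
```

Under these assumptions a single copy would fill a sphere a little under 4 nm across, which gives a sense of scale for comparing projected densities with one, two or three copies.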
We next attempted to investigate the stoichiometry of purified M protein biochemically. Attempts to express SARS-CoV M protein in E. coli and baculovirus expression systems produced only insoluble aggregates. The largest M construct which we were able to express and purify consisted of residues 107 to 221 from SARS-CoV M, which covers the area of the endodomain from just after the third predicted transmembrane region to the C-terminus. M 107-221 was expressed as an aggregate initially, but became soluble upon refolding.
Since a fusion protein incorporating the endodomain of MHV M was shown to coimmunoprecipitate with full-length M protein (de Haan et al., 2000), and class averages suggested that adjacent M LONG dimers make contact via two parts of the endodomain, we decided to investigate the oligomerization of M 107-221 as a surrogate for full-length M. Soluble endodomain incubated in saline at 37°C was present as monomers and aggregates as well as units of two, four and six, while endodomain incubated at 4°C remained mostly monomeric (Supplementary Figure 4). These observations support the interpretation that M functions as a homodimer.
In solution, the shape of a vesicle is determined by the interplay of opposing forces (Zhongcan and Helfrich, 1987). The ground state for vesicles in solution is spherical. Forces imparted by fluid motion or sample freezing in preparation for cryo-EM can temporarily distort vesicle shape. However, as a particle becomes less spherical the effects of hoop stress will increase, leading either to structural failure of the vesicle wall or driving the vesicle back toward a more spherical, lower-stress shape.
Since enveloped virus particles are essentially protein-decorated vesicles, we reasoned that their shape should be similar to the shape of vesicles unless modified by the effects of viral protein interactions. To test whether viral proteins significantly affected the shape of viral particles, we compared the shape of virions, VLPs and small unilamellar vesicles of the same size (SUV 100) in cryo-EM images.
In our images, vesicles ranged from round to ellipsoidal, with the average particle having a long axis 1.08 times the length of the shortest axis (Supplementary Fig. 1). The shape of MHV and EM VLPs did not differ significantly from the shape of vesicles. However, EMN VLPs were significantly rounder than the combined population of SUV 100 and exosomal vesicles (ANOVA, p=0.005). Although EM particles were rounder on average than EMN particles or vesicles, the difference was not significant due to the small sample size and heterogeneity of the EM particles. This indicates that the protein component of EMN VLPs provides significant resistance to the deformative forces that affected vesicles under cryo-EM conditions.
We next sought an explanation for how E, M and N proteins can make spherical vesicles more rigid. N proteins can dimerize (Chen et al., 2007) and pack helically in the presence (Barcena et al., 2009) and absence (Saikatendu et al., 2007) of RNA, but neither of these interactions alone would be expected to produce a spherical arrangement. E protein is small and not abundant enough in viral particles to be a convincing explanation for overall shape. In contrast, M is abundant and interacts both with other M proteins and with membranes, which are approximately spherical. We therefore hypothesized that particle shape is controlled primarily through M.
A striking difference between M LONG and M COMPACT is that M LONG appears to contact the RNP while M COMPACT does not. In cryo-EM images, the internal RNP appeared to be pulled away from the particle edge where either M COMPACT or M-free membrane is present, as seen in Figure 3. Thus, M COMPACT and M-free membranes can both be viewed as local disruptions of the M-RNP interaction. To test the hypothesis that M-mediated interactions control particle shape, we compared the shape of particles in which M LONG appeared to form a complete ring at the particle edge to the shape of particles in which the ring of M LONG was interrupted by either an M-free membrane or M COMPACT.
Interruption of M LONG at the particle edge was associated with significant particle elongation in SARS-CoV (ANOVA, P = 5×10⁻¹²), FCoV (P = 0.01) and EM (P = 0.03) particles. A similar but non-significant trend was observed for MHV (Fig. 7a). While a relative increase in M anomalies was correlated with particle elongation, we noted that the relative abundance of particles with interrupted M was quite different in the three coronaviruses, whereas the frequency of M anomalies was similar in MHV, EM and EMN particles. This result shows that an uninterrupted layer of M LONG is a marker for spherical morphology, but suggests that there may be inherent differences in the interaction affinities and membrane-bending properties of different coronavirus M proteins.
If only M LONG is associated with coronavirus-like membrane curvature, it is implicit that the ratio of M LONG to M COMPACT should be related to local membrane curvature. To test this, we selected four parts of each particle edge, which were centered on either side of the longest and shortest particle diameter as shown in Figure 7b . Two independent observers counted the number of M dimers present and marked them as M LONG or M COMPACT . The ratio of M LONG to M COMPACT was about the same at all of the virion ends, but the proportion of M marked as M LONG on the flatter sides decreased as particles became more elongated (Fig. 7c) . From these results we conclude that clusters of M LONG mark round membranes.
We noted that some samples of purified MHV appeared quite ellipsoidal while others were mostly round, although virion size was similar in all of the MHV preparations (Supplementary Fig. 3). Purification over sucrose density gradients would at least temporarily expose viral particles to osmotic stress. Likewise, centrifugation would necessarily expose virus particles to some mechanical stress. We suspected that in some cases the purification process might change virion shape.
To investigate whether virion shape could be changed experimentally after purification, one sample of purified virus was resuspended in HEPES-buffered 0.9% saline buffer, pH 7. Half the sample was kept in the resuspension buffer, while the other half was acidified to pH 5 for 5 minutes in order to simulate pH changes that might occur during entry via the endosomal route, and then re-buffered to pH 7. The shape and size of native MHV (Prep 1 in Supplementary Fig. 3) and acid-pulsed MHV (Prep 3) were examined by cryo-EM. Overall particle shape did not change significantly as a result of acidification (ANOVA, P = 0.93). However, the relative proportion of particles which appeared to have at least one flattened edge increased significantly after acid treatment (Fisher's exact test, P = 0.0001; Fig. 8a-c).
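The comparison of flat-edged particle counts before and after acid treatment is a test on a 2×2 contingency table. As an illustration only, the following minimal sketch implements a two-sided Fisher's exact test with the Python standard library. The function name and the counts are hypothetical placeholders, since the per-group particle counts are not given in this section, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention rather than necessarily the one used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. native vs. acid-pulsed), columns are outcomes
    (e.g. flat-edged vs. not flat-edged). Returns the p-value.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed table;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts, for illustration only (NOT data from the study):
# native:      5 flat-edged, 95 not flat-edged
# acid-pulsed: 25 flat-edged, 75 not flat-edged
p_value = fisher_exact_two_sided(5, 95, 25, 75)
print(f"two-sided p = {p_value:.2g}")
```

An exact test suits this kind of comparison because one of the cells (flat-edged native particles) may be small, where chi-squared approximations become unreliable.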
In some flat-edged particles, the interior RNP appeared to have a crisp edge, which was more distant from the membrane than RNP in native particles, and M had the blurred appearance characteristic of M COMPACT (Fig. 8b). Radial density plots and M tail to body ratios demonstrate that the altered M which was more frequently found at flat edges of acid-treated particles strongly resembled M COMPACT from native MHV particles (Fig. 8d- (Fig. 8e ). This suggests that intracellular N plays an important role in the formation or packaging of M LONG.
Previous studies have shown that spikes are dispensable for assembly but essential for infectivity. It was therefore surprising to find that spikeless EM and EMN VLPs were significantly larger than virions ( Supplementary Fig. 1 ) and rare viral particles on which no spikes were visible were significantly larger than spike-decorated virions (Fig. 9a) , suggesting that spike incorporation was linked to envelope size. Since the size of the virus envelope is fixed at the scission stage of the budding process, we inferred that spike incorporation is linked to factors which produce small virions.
To determine which factors were linked to particle size, we measured MHV VLPs in cryo-EM images and MHV virions in tomograms, where spike decoration could be assessed in three dimensions. MHV EM VLPs (93 nm average diameter) were larger than EMN VLPs (91 nm) and spikeless MHV in tomograms (91 nm) and spike-decorated MHV (88 nm). Viral particle size decreased as particles approached the full complement of proteins and RNP ( Supplementary Fig. 1) , suggesting that small virions are produced as a result of the interplay between all viral components during assembly.
A few SARS-CoV particles appeared to have clusters of spikes at one or two spots on the viral envelope. Spike clusters were significantly associated with the curved ends of ellipsoidal particles (Fig. 9b) , which we had previously found to be marked by M LONG . We therefore concluded that M LONG is a marker of spike decoration.
Here we demonstrate that a network of M has intrinsic membrane-bending properties, as recently demonstrated for the HIV-1 Gag polyprotein (Carlson et al., 2008). We report that M is functionally dimeric in viral particles, and the membrane-altering properties of M depend on interactions with other viral components. Our analysis suggests that two types of M-M interactions should be considered in future structural studies: interactions which maintain the M dimer and may occur throughout the protein, and interactions between dimers which are probably mediated by the endodomains, which form a matrix-like layer underneath the membrane. In several ways, the function of coronavirus M appears to be analogous to that of influenza A virus M1 protein, which shows pH-specific differences in membrane-bending (Ruigrok et al., 1992). Crystallography studies have also revealed that M1 proteins have similar structures at acidic and neutral pH, but differ in the way protein monomers interact (Harris et al., 2001).
M of all coronaviruses appears to adopt an N-ecto/C-endo topology (Armstrong et al., 1984), but transmissible gastroenteritis coronavirus M also adopts an alternate N-endo/C-ecto topology (Risco et al., 1995). We initially considered that the difference between M LONG and M COMPACT might be due to altered topology. The N-ecto/C-ecto topology proposed for TGEV M involves part of the endodomain forming a fourth transmembrane span, with part of the C-terminus exposed on the virion surface. Since the pre-transmembrane region of M is much smaller than the post-transmembrane region, an upside-down topology should be visually distinctive. However, the endodomains of M LONG and M COMPACT were of similar size in class averages. Instead, we prefer an interpretation in which M COMPACT and M LONG are conformationally distinct homodimers of N-ecto/C-endo topology, and the difference in appearance is due to a conformational change that either stretches (M LONG) or collapses (M COMPACT) the structure of the endodomain.
Prior to this study, the existence of reduction-sensitive complexes with a molecular weight consistent with approximately 2, 4 and 8 copies of M had been reported for HCoV-229E (Arpin and Talbot, 1990), but formation of a discrete M oligomer had not been demonstrated for any other coronavirus. The coronavirus M dimer would appear to be functionally equivalent to the heterodimeric complex of the triple-spanning membrane proteins GP5 and M which is essential for assembly of Equine arteritis virus (Snijder et al., 2003), a distantly related nidovirus. Further study of nidovirus ultrastructure is needed, but these observations, coupled with the presence of one or more predicted three-transmembrane protein genes in every known coronavirus (M, but also SARS 3A and FCoV 3C), arterivirus (M and GP5), bafinivirus (GP4/M) and torovirus (M) genome, suggest that dimeric complexes of triple-spanning membrane proteins may be a hallmark of nidovirus assembly.
While examining virions, we noticed that particles were most often either entirely spike-decorated or entirely spikeless. Further examination revealed that spikeless particles were significantly larger than spike-decorated particles, demonstrating that spikelessness resulted from a defect in assembly, not from accidental shearing or fusion activation. This led to a closer examination of SARS-CoV particles with patchy spike decoration. Spike patches were most often found at the ends of elongated particles, where M LONG is common, suggesting that M LONG mediates spike incorporation. Figure 10a shows a conceptual model of an elongated SARS-CoV particle with spikes decorating both ends, where M LONG is plentiful. The association of small clusters of S with M LONG was unexpected, but we were unable to determine whether the missing spikes were not incorporated, or were present in an elongated disordered conformation, as would be expected after fusion activation. Another possible explanation for this phenomenon is that incorporated spikes stabilize M LONG.
There is some evidence to link the endodomain, which appears to undergo the most noticeable change between M LONG and M COMPACT , to incorporation of S protein. While the transmembrane region of M is important for M-S interactions, one residue of the M endodomain has been implicated in incorporation of S (de Haan et al., 1999) . Mutation of a conserved tyrosine residue at position 211, near the N interaction site, does not affect VLP production but prevents S incorporation (de Haan et al., 1999) . The C-terminus of the M endodomain shows the most profound structural difference between M LONG and M COMPACT . This suggests that tyrosine-211 may be important for M LONG stability or for conversion between conformers. Further study is needed to determine the structural basis for coronavirus M, N and S protein interactions.
By comparing M proteins on virus particles and VLPs, we identified N, S and genomic RNA as factors that increase the ratio of M LONG to M COMPACT . Properties attributed specifically to M LONG , which include membrane rigidity, uniform curvature and spike incorporation, appear to be geared to assembly of infectious virus, and appear to be opposed by the properties of M COMPACT . We would therefore hypothesize that factors or treatments which increase the relative abundance of M LONG predict structural success, marked by increased virion fitness. Further work is needed to address the relationship between internal virion structure and infectivity.
Protein spacing data presented here and previously (Neuman et al., 2006) suggest that 8 dimeric M densities can accommodate a maximum of four N proteins and one trimeric spike protein, thus forming an 8M₂:4N:1S₃ unit at the virion surface (see the boxed region in Figs. 10b-c). We previously reported that the minimum spacing between spikes on highly decorated SARS-CoV particles was ~14-15 nm, which is about 4-5 nm farther apart than would be expected based on the ~10 nm width of each spike (Neuman et al., 2006). Fourier transformation revealed a ~13-18 nm⁻¹ frequency signal in tomographic projections of spike-decorated MHV, but not spikeless MHV, confirming that spike spacing for MHV-sp1 and SARS-CoV is similar (data not shown).
The diagrams in Figures 10b and 10c show a packing model in which adjacent spikes are incorporated at anchor points across the M LONG network. Each M dimer could potentially interact with a spike, but the closest packing that could be achieved in this model would be ~14-15 nm apart, based on the bulk of each spike and the nearest available anchor point. To test the quality of this model, we counted the number of spikes on three-dimensional reconstructions of MHV particles. These particles showed an average of 74 spikes per particle, which gives an approximate inter-spike spacing of 17 nm. Our model predicts ~90 spikes per particle. Low spike incorporation in areas of M COMPACT may explain some of the difference between predicted and actual spike count.
The protein spacing model in Fig. 10b-c shows an average spacing of 4-5 nm between M protein dimers. Using the M spacing data for each virus (Fig. 6c), this would give ~1100 M₂ molecules per average SARS-CoV, MHV and FCoV particle. We are unable to directly count the number of M features per particle because of the resolution of the reconstructed tomograms. However, the ratio of M to N in purified coronavirus particles has been well studied, and provides an indirect way to test the validity of our proposed M₂ spacing. Estimated ratios of M to N protein in purified coronaviruses range from about 3M:1N (Cavanagh, 1983; Escors et al., 2001b) to 1M:1N (Hogue and Brian, 1986; Liu and Inglis, 1991), giving 730 to 2200 N molecules per virion. That works out to one N protein per 14-40 nucleotides of the genome. By comparison, the nucleoproteins of rabies virus and vesicular stomatitis virus are of similar molecular weight to coronavirus N proteins, and have been demonstrated to bind 9 nucleotides per nucleoprotein (Albertini et al., 2006; Green et al., 2006). If, as suggested by Barcena et al. (2009), M dimers are spaced 6.5 nm apart, results of 60-200 nt/N molecule are obtained.
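The stoichiometry estimate above can be retraced with a few lines of arithmetic. The sketch below is illustrative only: the function names are ours, and the genome length of 30,000 nt is an assumed round figure for a coronavirus genome, not a value stated in this section.

```python
# Assumption: approximate coronavirus genome length in nucleotides.
# This round figure is not given in the text above.
GENOME_NT = 30_000

# From the text: ~1100 M dimers (M2) per average particle.
M_DIMERS_PER_VIRION = 1100


def n_per_virion(m_dimers, m_to_n_ratio):
    """N copies per virion, given the M dimer count and the M:N molar ratio."""
    m_monomers = 2 * m_dimers
    return m_monomers / m_to_n_ratio


def nt_per_n(m_dimers, m_to_n_ratio, genome_nt=GENOME_NT):
    """Genome nucleotides available per N protein."""
    return genome_nt / n_per_virion(m_dimers, m_to_n_ratio)


# The two ends of the published M:N range, 3M:1N and 1M:1N.
for ratio in (3.0, 1.0):
    n_count = n_per_virion(M_DIMERS_PER_VIRION, ratio)
    spacing = nt_per_n(M_DIMERS_PER_VIRION, ratio)
    print(f"{ratio:.0f}M:1N -> ~{n_count:.0f} N per virion, ~{spacing:.0f} nt per N")
```

Under these assumptions the two ratios bracket roughly 730 to 2200 N molecules per virion and about 14 to 41 nucleotides per N, in line with the ranges quoted in the text.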
In the model of coronavirus assembly shown in Fig. 10d, M is shown as a mixed population of M LONG and M COMPACT in the endoplasmic reticulum membrane. M is then transported and interacts with other viral membrane proteins at the site of budding. The arrival of the ribonucleoprotein acts as a catalyst for the conversion of M COMPACT to M LONG. The M LONG densities bend the membrane to form a sphere around the ribonucleoprotein, with the size of the sphere inversely related to the relative abundance of M LONG. Supplementary Figure 5 relates virus particle size to membrane curvature per M dimer. We hypothesize that factors that interact with M such as N, S and genomic RNA could decrease particle size by limiting variation in membrane curvature. After budding and release, environmental stress could convert some regions of M LONG back to M COMPACT, allowing the RNP to form longer helices and causing particle elongation.
The function of E in virogenesis remains poorly understood. It is interesting that the function and packaging of E and S are dependent on palmitin acylation (Boscarino et al., 2008; Lopez et al., 2008; Thorp et al., 2006) , and further research will be needed to test whether M densities may contain one or more palmitin binding regions. Point mutations in E have been shown to result in the assembly of large, elongated, thermolabile MHV particles (Fischer et al., 1998) . In light of the model proposed here, we would speculate that these characteristics can be attributed to an overabundance of packaged M COMPACT , suggesting that E plays a role in promoting M LONG formation or incorporation. This interpretation could be tested by cryo-EM and tomography of viruses carrying mutations in M and E.
In this study we described two functionally distinct forms of coronavirus M protein. Both types of M visible on viral particles were larger than expected for a single M protein, but consistent with the expected size of two M proteins. We demonstrated that M protein endodomains can self-assemble into oligo-dimeric complexes at 37°C. We showed that formation of a convex, rigidified viral envelope is dependent on the presence of one form of M, which we have called M LONG . Statistical evidence suggests that spike incorporation is linked to particle size, and that spikes cluster in regions where M LONG is common. To explain these observations, we proposed a model in which locally-ordered networks of M LONG , stabilized by S, N and possibly E proteins, control particle size and the efficiency of assembly.
Cryo-electron micrographs of small unilamellar vesicles (A), MHV-like particles (B), three coronaviruses (C) and cryo-electron tomography of MHV (D) are shown. The longest and shortest visible diameter of each particle was measured, as shown in panel A. Viral particles were distinguished from empty exosomes by the thickness of the envelope. A spikeless particle (*) and free viral envelopes (e) are marked. Gold particles which were used as fiducial markers in construction of the tomogram are marked with arrows.

shown to illustrate the difference in appearance between M LONG and M COMPACT from three coronaviruses.

Microscopy conditions and particle count.
b Conventional cryo-EM images taken at a defocus of −3.3 µm or closer to focus were used for two-dimensional image analysis.
c Denotes complete tomograms.