Id |
Subject |
Object |
Predicate |
Lexical cue |
TextSentencer_T1 |
0-61 |
Sentence |
denotes |
Modular organization of SARS coronavirus nucleocapsid protein |
TextSentencer_T2 |
63-71 |
Sentence |
denotes |
Abstract |
TextSentencer_T3 |
72-166 |
Sentence |
denotes |
The SARS-CoV nucleocapsid (N) protein is a major antigen in severe acute respiratory syndrome. |
TextSentencer_T4 |
167-237 |
Sentence |
denotes |
It binds to the viral RNA genome and forms the ribonucleoprotein core. |
TextSentencer_T5 |
238-353 |
Sentence |
denotes |
The SARS-CoV N protein has also been suggested to be involved in other important functions in the viral life cycle. |
TextSentencer_T6 |
354-591 |
Sentence |
denotes |
Here we show that the N protein consists of two non-interacting structural domains, the N-terminal RNA-binding domain (RBD) (residues 45-181) and the C-terminal dimerization domain (residues 248-365) (DD), surrounded by flexible linkers. |
TextSentencer_T7 |
592-656 |
Sentence |
denotes |
The C-terminal domain exists exclusively as a dimer in solution. |
TextSentencer_T8 |
657-793 |
Sentence |
denotes |
The flexible linkers are intrinsically disordered and represent potential interaction sites with other protein and protein-RNA partners. |
TextSentencer_T9 |
794-892 |
Sentence |
denotes |
Bioinformatics reveal that other coronavirus N proteins could share the same modular organization. |
TextSentencer_T10 |
893-1093 |
Sentence |
denotes |
This study provides information on the domain structure partition of SARS-CoV N protein and insights into the differing roles of structured and disordered regions in coronavirus nucleocapsid proteins. |
TextSentencer_T11 |
1095-1246 |
Sentence |
denotes |
Coronaviruses are the causative agents of a number of mammalian diseases which often have significant economic and health-related consequences [1, 2] . |
TextSentencer_T12 |
1247-1415 |
Sentence |
denotes |
Diseases such as transmissible gastroenteritis in pigs and avian infectious bronchitis in chicken often have great impact on the agricultural industry of a nation [3] . |
TextSentencer_T13 |
1416-1517 |
Sentence |
denotes |
In humans, coronaviruses are often associated with mild respiratory illnesses, including common cold. |
TextSentencer_T14 |
1518-1671 |
Sentence |
denotes |
However, a novel coronavirus has been identified as the etiological agent of severe acute respiratory syndrome (SARS), which has a case fatality rate of ca. |
TextSentencer_T15 |
1672-1680 |
Sentence |
denotes |
8% [4] . |
TextSentencer_T16 |
1681-1820 |
Sentence |
denotes |
Sequence analysis reveals that SARS-CoV represents either a new coronavirus group or an outlier of group 2 coronaviruses [5] [6] [7] [8] . |
TextSentencer_T17 |
1821-2017 |
Sentence |
denotes |
The SARS-CoV genome contains five major open reading frames that encode the replicase polyprotein, the spike protein (S), envelope (E), membrane glycoprotein (M), and the nucleocapsid protein (N). |
TextSentencer_T18 |
2018-2099 |
Sentence |
denotes |
SARS-CoV is an enveloped virus with S, M and E proteins as the envelope proteins. |
TextSentencer_T19 |
2100-2214 |
Sentence |
denotes |
The N protein binds to the viral RNA genome and forms the ribonucleoprotein core, which is presumed to be helical. |
TextSentencer_T20 |
2215-2326 |
Sentence |
denotes |
The M protein may also be involved in the formation of the nucleocapsid through interaction with the N protein. |
TextSentencer_T21 |
2327-2470 |
Sentence |
denotes |
Upon infection, the N protein enters the host cell with the ribonucleoprotein core and is able to interact with a number of host proteins [9] . |
TextSentencer_T22 |
2471-2638 |
Sentence |
denotes |
The high abundance of the N protein makes it a major antigen, an attribute which has often been used in the development of rapid-diagnosis kits against SARS [10, 11] . |
TextSentencer_T23 |
2639-2773 |
Sentence |
denotes |
The nucleocapsid protein is a 422 amino-acid protein, sharing only 20-30% homology with the N proteins of other coronaviruses [6, 7] . |
TextSentencer_T24 |
2774-2955 |
Sentence |
denotes |
From genetic and bioinformatics studies, the N protein can be divided into three putative regions: an N-terminal domain, an RNA-binding domain (RBD) and a C-terminal domain [12, 13] . |
TextSentencer_T25 |
2956-3047 |
Sentence |
denotes |
The N- and C-terminal domains are believed to play a role in interaction with other proteins. |
TextSentencer_T26 |
3048-3211 |
Sentence |
denotes |
A number of recent studies have shown that part of the C-terminus in the N protein of SARS-CoV is involved in the oligomerization process of the protein [14, 15] . |
TextSentencer_T27 |
3212-3484 |
Sentence |
denotes |
Rather surprisingly, the mid-portion of the protein has been shown to interact with the M protein and hnRNP A1 [16, 17] , and structural studies have identified the region between amino acids 45-181 as the putative RNA-binding region, which is close to the N-terminus [18] . |
TextSentencer_T28 |
3485-3636 |
Sentence |
denotes |
These discrepancies from the putative domain partition necessitate the determination of both the functional and structural organization of the protein. |
TextSentencer_T29 |
3637-3747 |
Sentence |
denotes |
However, the structural organization of coronavirus N proteins in general remains largely unknown to this day. |
TextSentencer_T30 |
3748-3888 |
Sentence |
denotes |
We have employed a blend of experimental techniques and bioinformatics analyses to define the structural organization of SARS-CoV N protein. |
TextSentencer_T31 |
3889-4062 |
Sentence |
denotes |
Through the power of nuclear magnetic resonance (NMR) spectroscopy, we present the first evidence that the SARS-CoV N protein consists of two independent structural domains. |
TextSentencer_T32 |
4063-4161 |
Sentence |
denotes |
The first domain lies inside the putative RNA-binding domain identified in a previous report [18] . |
TextSentencer_T33 |
4162-4268 |
Sentence |
denotes |
The second domain lies in the C-terminal half of the protein and is capable of forming dimers in solution. |
TextSentencer_T34 |
4269-4406 |
Sentence |
denotes |
The rest of the protein is highly accessible to the solvent, and bioinformatics analysis predicts that it is intrinsically disordered. |
TextSentencer_T35 |
4407-4537 |
Sentence |
denotes |
Other coronavirus N proteins share similar features with SARS-CoV N protein at the sequence level, implying functional significance. |
TextSentencer_T36 |
4538-4770 |
Sentence |
denotes |
The elucidation of the modular organization of the SARS-CoV N protein, particularly the boundary between disordered and structured regions, facilitates future studies of this class of proteins at the functional and structural level. |
TextSentencer_T37 |
4771-4839 |
Sentence |
denotes |
Sequence alignment, secondary structure and order-disorder prediction |
TextSentencer_T38 |
4840-5053 |
Sentence |
denotes |
The full-length sequences of SARS and other coronavirus N proteins were aligned using CLUSTALW version 1.83 with the slow algorithm, an identity matrix, a window of 4 amino acids and standard gap penalties [19] . |
TextSentencer_T39 |
5054-5170 |
Sentence |
denotes |
The result was then edited with SeaView based on the position of the known structural domains of SARS-CoV N protein. |
TextSentencer_T40 |
5171-5237 |
Sentence |
denotes |
The JPred server [20] was used for secondary structure prediction. |
TextSentencer_T41 |
5238-5444 |
Sentence |
denotes |
Order-disorder prediction was obtained through sequence submission to the PONDR server (http://www.pondr.com) using the predictor VSL1, which is an implementation of the IST-Zoran predictor [21] [22] [23] . |
TextSentencer_T42 |
5445-5521 |
Sentence |
denotes |
Access to PONDR® was provided by Molecular Kinetics (Indianapolis, IN, USA). |
TextSentencer_T43 |
5522-5713 |
Sentence |
denotes |
We cloned fragments spanning the different ordered and disordered regions of the SARS-CoV N protein (Figure 1b) based on PONDR information (Figure 1a) and reports in the literature [18] . |
TextSentencer_T44 |
5714-5790 |
Sentence |
denotes |
SARS-CoV TW1 strain cDNA sequencing clones were kindly provided to us by Dr. |
TextSentencer_T45 |
5791-5796 |
Sentence |
denotes |
P.-J. |
TextSentencer_T46 |
5797-5847 |
Sentence |
denotes |
Chen of National Taiwan University Hospital [24] . |
TextSentencer_T47 |
5848-6008 |
Sentence |
denotes |
Clones for SARS-CoV N protein fragments were obtained by polymerase chain reaction (PCR) on a RoboCycler Gradient 96 (Stratagene, CA) using appropriate primers. |
TextSentencer_T48 |
6009-6101 |
Sentence |
denotes |
The resulting PCR fragments contained an NcoI site at one end and a BamHI site at the other. |
TextSentencer_T49 |
6102-6203 |
Sentence |
denotes |
After restriction enzyme digestion, the resulting fragments were cloned into pET6H (a gift from Prof. |
TextSentencer_T50 |
6204-6209 |
Sentence |
denotes |
J.-J. |
TextSentencer_T51 |
6210-6289 |
Sentence |
denotes |
Lin, National Yang Ming University, Taiwan) containing a His-tag coding region. |
TextSentencer_T52 |
6290-6438 |
Sentence |
denotes |
Full-length SARS-CoV N protein construct was obtained by sequential ligation of the cloned PCR fragments using appropriate restriction enzyme sites. |
TextSentencer_T53 |
6439-6504 |
Sentence |
denotes |
The sequences of all constructs were confirmed by DNA sequencing. |
TextSentencer_T54 |
6505-6596 |
Sentence |
denotes |
The resultant protein fragments all include an extra MHHHHHHAMG sequence at the N-terminus. |
TextSentencer_T55 |
6597-6750 |
Sentence |
denotes |
For biochemical studies, the SARS-CoV N protein clones were expressed in Escherichia coli BL21(DE3) strain in Luria broth media using standard protocols. |
TextSentencer_T56 |
6751-6922 |
Sentence |
denotes |
To prepare samples suitable for NMR studies, the cells were cultured in standard M9 media supplemented with ¹⁵NH₄Cl (1 g/l) and ¹⁵N-Isogro (0.5 g/l) (Isotec, OH, USA). |
TextSentencer_T57 |
6923-7124 |
Sentence |
denotes |
The cells were then broken with a microfluidizer and the protein purified through a Ni-NTA affinity column (Qiagen, CA, USA) in buffer (50 mM sodium phosphate, 150 mM NaCl, pH 7.4) containing 7 M urea. |
TextSentencer_T58 |
7125-7335 |
Sentence |
denotes |
The protein was then allowed to refold by gradually lowering the denaturant concentration through dialysis in liquid chromatography buffer (50 mM sodium phosphate, 150 mM NaCl, 1 mM EDTA, 0.01% NaN₃, pH 7.4). |
TextSentencer_T59 |
7336-7525 |
Sentence |
denotes |
Renatured protein was loaded onto an AKTA-EXPLORER fast protein liquid chromatography (FPLC) system equipped with a HiLoad 16/60 Superdex 75 column (Amersham Pharmacia Biotech, Sweden). |
TextSentencer_T60 |
7526-7614 |
Sentence |
denotes |
Complete Protease Inhibitor cocktail (Roche, Germany) was added to the purified protein. |
TextSentencer_T61 |
7615-7748 |
Sentence |
denotes |
Protein concentration was determined with the Bio-Rad Protein Assay kit as per instructions from the manufacturer (Bio-Rad, CA, USA). |
TextSentencer_T62 |
7749-7841 |
Sentence |
denotes |
The correct molecular weights of the expressed proteins were confirmed by mass spectrometry. |
TextSentencer_T63 |
7842-8005 |
Sentence |
denotes |
The experiments were conducted using a FPLC System (Pharmacia Biotech, Sweden) with a Hi-Load 16/60 Superdex 75 (prep grade) column at an elution rate of 1 ml/min. |
TextSentencer_T64 |
8006-8154 |
Sentence |
denotes |
The molecular weights of the proteins were estimated from the elution profile calibrated with the LMW Gel Filtration Calibration Kit (Amersham, UK). |
TextSentencer_T65 |
8156-8344 |
Sentence |
denotes |
The homo-bifunctional amine cross-linker disuccinimidyl suberate was purchased from Sigma-Aldrich (MO, USA) and was dissolved in N,N-dimethylformamide (DMF) to a concentration of 25 mg/ml. |
TextSentencer_T66 |
8345-8474 |
Sentence |
denotes |
Reactions were carried out in a final protein concentration of 0.35 mM and a final disuccinimidyl suberate concentration of 5 mM. |
TextSentencer_T67 |
8475-8596 |
Sentence |
denotes |
Mock reactions were set up as controls which contained only the protein solution and DMF without disuccinimidyl suberate. |
TextSentencer_T68 |
8597-8736 |
Sentence |
denotes |
The reaction mixtures in standard buffer were allowed to react for 1 h at 4°C prior to quenching with 100 mM glycine (final concentration). |
TextSentencer_T69 |
8737-8818 |
Sentence |
denotes |
The results were visualized on SDS-PhastGel minigels (Pharmacia Biotech, Sweden). |
TextSentencer_T70 |
8819-8966 |
Sentence |
denotes |
Sedimentation velocity studies were carried out with a Beckman-Coulter XL-A analytical ultracentrifuge with an An60Ti rotor at 20°C and 40,000 rpm. |
TextSentencer_T71 |
8967-9111 |
Sentence |
denotes |
Protein samples were diluted to 0.40-0.75 mg/ml and loaded into standard double sector cells with aluminum or Epon charcoal-filled centerpieces. |
TextSentencer_T72 |
9112-9217 |
Sentence |
denotes |
The UV absorption of the cells was scanned at 280 nm in continuous mode every 10 min for a period of 5 h. |
TextSentencer_T73 |
9218-9266 |
Sentence |
denotes |
The data were analyzed with Sedfit version 8.9d. |
TextSentencer_T74 |
9267-9430 |
Sentence |
denotes |
Collections of 10-15 radial scans were used for analysis, and 200 sedimentation coefficients between 2 and 10 S were employed in calculating the c(S) distribution. |
TextSentencer_T75 |
9431-9549 |
Sentence |
denotes |
The positions of the meniscus and cell bottom were determined by visual inspection, and then refined in the final fit. |
TextSentencer_T76 |
9550-9713 |
Sentence |
denotes |
The partial specific volumes for N45-181, N245-365 and N45-365 were calculated from the amino acid compositions to be 0.7192, 0.7244 and 0.7198 ml/g, respectively. |
TextSentencer_T77 |
9714-9791 |
Sentence |
denotes |
The solvent density and viscosity were calculated with Sednterp version 1.08. |
TextSentencer_T78 |
9792-9910 |
Sentence |
denotes |
All samples were visually checked for clarity after ultracentrifugation, and no indication of precipitation was found. |
TextSentencer_T79 |
9911-10236 |
Sentence |
denotes |
NMR spectroscopy. ¹⁵N-labeled protein samples were extensively exchanged with NMR buffer (100 mM sodium phosphate buffer, pH 6.0, containing 50 mM NaCl, 1 mM EDTA, 1 mM 2,2-dimethyl-2-silapentane-5-sulfonate, 0.01% NaN₃, 10% D₂O and Complete Protease Inhibitor cocktail) using an Amicon-15 concentrator (Amicon, MA, USA). |
TextSentencer_T80 |
10237-10359 |
Sentence |
denotes |
The final concentrations of the samples were between 0.2 and 3 mM, depending on the solubility of the different fragments. |
TextSentencer_T81 |
10360-10559 |
Sentence |
denotes |
All the NMR data were acquired at 27 and 30°C on 500, 600 or 800 MHz Bruker AVANCE spectrometers equipped with a triple resonance (¹H, ¹³C and ¹⁵N) TXI probe with an actively shielded Z-gradient. |
TextSentencer_T82 |
10560-10627 |
Sentence |
denotes |
Experimental parameters were set as described previously [25, 26] . |
TextSentencer_T83 |
10628-10762 |
Sentence |
denotes |
CLEANEX-PM spectra, which only show resonances exchanging rapidly with the solvent (k_ex > 2 Hz), were obtained as described [27, 28] . |
TextSentencer_T84 |
10763-10865 |
Sentence |
denotes |
Data were processed with the XWINNMR suite and AURELIA software (Bruker, Germany) on SGI workstations. |
TextSentencer_T85 |
10866-10955 |
Sentence |
denotes |
The ¹H chemical shift was referenced to 2,2-dimethyl-2-silapentane-5-sulfonate at 0 ppm. |
TextSentencer_T86 |
10956-11043 |
Sentence |
denotes |
The ¹⁵N chemical shift was referenced using the consensus ratio Ξ of 0.101329118 for ¹⁵N/¹H [29] . |
TextSentencer_T87 |
11044-11163 |
Sentence |
denotes |
A series of N protein fragments spanning different regions were constructed based on the PONDR prediction ( Figure 1 ). |
TextSentencer_T88 |
11164-11305 |
Sentence |
denotes |
We used a series of ¹⁵N-HSQC spectra of these fragments to define the position of the structural domains of SARS-CoV N protein ( Figure 2 ). |
TextSentencer_T89 |
11306-11487 |
Sentence |
denotes |
NMR chemical shifts of amide resonances are sensitive to structural changes, and the pattern of the ¹⁵N-HSQC spectrum has been commonly used to monitor the order-disorder state of proteins [30] . |
TextSentencer_T90 |
11488-11677 |
Sentence |
denotes |
Well-dispersed spectra are indicative of structured protein, whilst congested spectra with resonances clustered within a small region of 8.3±0.5 ppm in the proton dimension are indicative of disordered protein. |
TextSentencer_T91 |
11678-11838 |
Sentence |
denotes |
We observed that the resonances from residues N45-181 have good chemical shift dispersion (Figure 2a) , indicating that the fragment has a structured character. |
TextSentencer_T92 |
11839-11982 |
Sentence |
denotes |
The spectrum of N1-181 is a superposition of well-dispersed resonances and a cluster of overlapping resonances around 8.3±0.4 ppm (Figure 2b ). |
TextSentencer_T93 |
11983-12153 |
Sentence |
denotes |
Comparing the spectra of N1-181 and N45-181 revealed that all resonances belonging to N45-181 were present in the spectrum of N1-181 with no change in resonance position. |
TextSentencer_T94 |
12154-12290 |
Sentence |
denotes |
These results indicate that the N-terminal flanking region between amino acids 1-44 does not affect the structure of the N45-181 domain. |
TextSentencer_T95 |
12291-12423 |
Sentence |
denotes |
To assess the structure of the C-terminal region, several C-terminal fragments were prepared for the collection of ¹⁵N-HSQC spectra. |
TextSentencer_T96 |
12424-12551 |
Sentence |
denotes |
We found that the resonances from N248-365 are well-dispersed (Figure 2c ), suggesting that N248-365 forms an ordered structure. |
TextSentencer_T97 |
12552-12652 |
Sentence |
denotes |
To define the structural boundaries we constructed fragments containing N- and C-terminal extensions. |
TextSentencer_T98 |
12653-12734 |
Sentence |
denotes |
Figure 2d shows the ¹⁵N-HSQC spectrum of a uniformly ¹⁵N-labeled N248-422 sample. |
TextSentencer_T99 |
12735-12883 |
Sentence |
denotes |
Comparing the spectrum of N248-422 with that of N248-365 ( Figure 2c ) we found that all resonances due to N248-365 can be identified in Figure 2d . |
TextSentencer_T100 |
12884-12988 |
Sentence |
denotes |
These results indicate that residues from 365 to the C-terminus do not affect the structure of N248-365. |
TextSentencer_T101 |
12989-13195 |
Sentence |
denotes |
Shortening the fragment to span amino acids 274-365 changes the ¹⁵N-HSQC resonance pattern, which indicates that the 248-273 region is important for structure stabilization of this domain (data not shown). |
TextSentencer_T102 |
13196-13546 |
Sentence |
denotes |
To explore the structure of the region between residues 182-247 and its effect on the structure of N45-181 and N248-365, we constructed the fragment N45-365 which contains the two structural domains [...]. The lack of resonance perturbation when the two domains are linked together suggests that interaction between these two domains is weak, if they interact at all. |
TextSentencer_T103 |
13547-13651 |
Sentence |
denotes |
Our results show that SARS-CoV N protein contains two independent structural domains located at a.a. |
TextSentencer_T104 |
13652-13671 |
Sentence |
denotes |
45-181 and 248-365. |
TextSentencer_T105 |
13672-13723 |
Sentence |
denotes |
These results are consistent with PONDR prediction. |
TextSentencer_T106 |
13724-13891 |
Sentence |
denotes |
PONDR predicts three intrinsically disordered regions in SARS-CoV N protein located at the N-terminus, the C-terminus and between the two ordered regions (Figure 1b) . |
TextSentencer_T107 |
13892-14065 |
Sentence |
denotes |
We also observed additional resonances clustered around 8.3±0.5 ppm in the proton dimension whenever the fragment was extended beyond the two structural domains (Figure 2 ). |
TextSentencer_T108 |
14066-14233 |
Sentence |
denotes |
To test whether the residues beyond the structural domains are truly disordered, we employed the CLEANEX-PM experiment to identify solvent-accessible resonances [27] . |
TextSentencer_T109 |
14234-14356 |
Sentence |
denotes |
The ¹⁵N-HSQC spectrum obtained with the CLEANEX-PM pulse sequence contains only resonances from solvent-exposed amide groups. |
TextSentencer_T110 |
14357-14530 |
Sentence |
denotes |
When we compared the CLEANEX-PM spectrum of N1-181 (Figure 3b) with that of N45-181 (Figure 3a) , we observed 40 resonances that appeared only in N1-181 and not in N45-181. |
TextSentencer_T111 |
14531-14709 |
Sentence |
denotes |
This number agrees with that expected for the N-terminal region (5 prolines), indicating that all amide protons in the N-terminus of SARS-CoV N protein are exposed to the solvent. |
TextSentencer_T112 |
14710-14959 |
Sentence |
denotes |
We counted 39 additional peaks in the CLEANEX-PM spectrum of N248-422 (Figure 3d ) compared to that of N248-365 ( Figure 3c ) (51 expected since there are 6 prolines), suggesting that the majority of the C-terminal residues are also solvent-exposed. |
TextSentencer_T113 |
14960-15156 |
Sentence |
denotes |
When we compared the CLEANEX-PM spectra of N45-181 (Figure 3a) , N248-365 ( Figure 3c ) and N45-365 (Figure 3f) , we observed the extra resonances representing the region between residues 182-247. |
TextSentencer_T114 |
15157-15342 |
Sentence |
denotes |
A total of 27 additional peaks can be resolved, compared to 64 expected (2 prolines), indicating that about half of the linker region between residues 182-247 is exposed to the solvent. |
TextSentencer_T115 |
15343-15498 |
Sentence |
denotes |
It should be noted here that due to resonance overlapping the numbers counted should be viewed as a lower limit for the number of solvent-exposed residues. |
TextSentencer_T116 |
15499-15717 |
Sentence |
denotes |
Nevertheless, we can conclude that all N-terminal residues are solvent exposed whilst most of the residues in the C-terminus and in the linker region between the two structural domains are exposed to the solvent as well. |
TextSentencer_T117 |
15718-15934 |
Sentence |
denotes |
In conjunction with the PONDR results and the observation that all additional resonances fall within 8.3±0.5 ppm in the proton dimension, we conclude that amino acids 1-44, 182-247 and 366-422 are disordered. |
TextSentencer_T118 |
15935-16089 |
Sentence |
denotes |
The long disordered linker between the two structural domains is consistent with the observation that there is little interaction between the two domains. |
TextSentencer_T119 |
16090-16343 |
Sentence |
denotes |
However, the number of counted peaks in the CLEANEX-PM spectra of the C-terminus and the linker region is less than that expected, so it is likely that parts of these regions are solvent-protected, possibly through the formation of transient structures. |
TextSentencer_T120 |
16344-16505 |
Sentence |
denotes |
An attempt to obtain a spectrum of the linker region alone was unsuccessful due to the extremely poor protein expression of the clone harboring the linker sequence. |
TextSentencer_T121 |
16506-16559 |
Sentence |
denotes |
N45-181 has been identified as an RNA-binding domain. |
TextSentencer_T122 |
16560-16722 |
Sentence |
denotes |
The function of N248-365 is not clear, but many reports have identified the C-terminal half of SARS-CoV N protein to be involved in oligomerization [14, 15] . |
TextSentencer_T123 |
16723-16931 |
Sentence |
denotes |
To test this possibility, we have applied analytical gel-filtration chromatography, chemical cross-linking and analytical ultracentrifugation to assay the self-association property of the N protein fragments. |
TextSentencer_T124 |
16932-17128 |
Sentence |
denotes |
As shown in Figure 4a , N45-181 elutes out at a molecular weight of 18 kDa and N248-365 elutes out as a 28-kDa molecule, suggesting that N45-181 exists as a monomer and N248-365 exists as a dimer. |
TextSentencer_T125 |
17129-17249 |
Sentence |
denotes |
The self-association between the two N248-365 monomers is very strong, since we could not detect any monomeric fraction. |
TextSentencer_T126 |
17250-17359 |
Sentence |
denotes |
Similarly, N45-365 eluted out at a molecular weight of ~70 kDa, suggesting that N45-365 also exists as a dimer. |
TextSentencer_T127 |
17360-17558 |
Sentence |
denotes |
Furthermore, when N45-181 sample was mixed with N248-365 sample two peaks at 18 and 28 kDa were observed in the elution profile, demonstrating that the two fragments do not interact with each other. |
TextSentencer_T128 |
17559-17659 |
Sentence |
denotes |
Chemical cross-linking (Figure 4b) detected the presence of only monomer for N45-181 and both monomer and dimer for N248-365. |
TextSentencer_T129 |
17660-17785 |
Sentence |
denotes |
The quaternary structures of N45-181, N248-365 and N45-365 fragments were further examined by analytical ultracentrifugation. |
TextSentencer_T130 |
17786-17924 |
Sentence |
denotes |
Only one major peak was detected for each of these three protein fragments, indicating that they are structurally homogeneous in solution. |
TextSentencer_T131 |
17925-18157 |
Sentence |
denotes |
The results of data analysis with Sedfit version 8.9d showed that protein fragments N45-181, N248-365 and N45-365 sediment at 1.4 S, 2.6 S and 3.7 S (Figure 4c ), corresponding to a molecular mass of 10, 36 and 68 kDa, respectively. |
TextSentencer_T132 |
18158-18358 |
Sentence |
denotes |
These results confirmed that N45-181, N248-365 and N45-365 exist as a monomer, dimer and dimer, respectively, in agreement with the results of gel-filtration chromatography and chemical cross-linking. |
TextSentencer_T133 |
18359-18459 |
Sentence |
denotes |
Taken together, all three results indicate that N45-181 exists as a monomer and N248-365 as a dimer. |
TextSentencer_T134 |
18460-18586 |
Sentence |
denotes |
The fact that dimerization occurs through a structural domain strongly suggests that the process is dependent on the structure. |
TextSentencer_T135 |
18587-18685 |
Sentence |
denotes |
A model of the SARS-CoV N protein interaction based on our current results is shown in Figure 4d . |
TextSentencer_T136 |
18686-18883 |
Sentence |
denotes |
It is interesting to note that we did not observe the formation of higher-order multimers in our studies, which may be important for the formation of the ribonucleoprotein complex within the virion. |
TextSentencer_T137 |
18884-19063 |
Sentence |
denotes |
A possible explanation is that multimer formation may require additional factors, such as the presence of RNA or other parts of the N protein that were not present in our samples. |
TextSentencer_T138 |
19064-19201 |
Sentence |
denotes |
Also, we cannot exclude the possibility that multimers do form at much higher protein concentrations than the ones used in these studies. |
TextSentencer_T139 |
19202-19301 |
Sentence |
denotes |
We suggest that the dimeric form represents a basic building block of the nucleocapsid of SARS-CoV. |
TextSentencer_T140 |
19302-19425 |
Sentence |
denotes |
Since coronavirus N proteins belong to the same protein family, it is probable that they share similar structural features. |
TextSentencer_T141 |
19426-19570 |
Sentence |
denotes |
Comparison of the order-disorder profile of these proteins ( Figure 5 ) shows that they all share the same disordered regions (hatched regions). |
TextSentencer_T142 |
19571-19739 |
Sentence |
denotes |
There are two long disordered regions in the middle and at the C-termini of the proteins, whereas the length of the N-terminal disordered region shows more variability. |
TextSentencer_T143 |
19740-19894 |
Sentence |
denotes |
Two ordered regions are located between the disordered regions, and their locations generally match those of the structural domains in SARS-CoV N protein. |
TextSentencer_T144 |
19895-19962 |
Sentence |
denotes |
Disordered regions are often involved in biomolecular interactions. |
TextSentencer_T145 |
19963-20163 |
Sentence |
denotes |
The C-terminus of MHV N protein, which is disordered, has been shown to interact with hnRNP A1 [31] , whereas the disordered region in the middle is responsible for its RNA-binding activity [13, 32] . |
TextSentencer_T146 |
20164-20353 |
Sentence |
denotes |
In SARS-CoV, the disordered region in the middle of the N protein has been implicated in N-protein self-interaction [33] , interaction with the M protein [16] and hnRNP A1 interaction [17] . |
TextSentencer_T147 |
20354-20504 |
Sentence |
denotes |
These experimental observations suggest that disordered regions of coronavirus N proteins are probable interaction sites with functional implications. |
TextSentencer_T148 |
20505-20789 |
Sentence |
denotes |
Ordered regions of coronavirus N proteins share similar secondary structure profiles. Secondary structure alignment of coronavirus N protein sequences based on the two structural domains of SARS-CoV N protein shows that they share very similar secondary structure profiles ( Figure 6 ). |
TextSentencer_T149 |
20790-20902 |
Sentence |
denotes |
The N-terminal domain has three conserved β strands which have been implicated in RNA binding in SARS-CoV [18] . |
TextSentencer_T150 |
20903-21011 |
Sentence |
denotes |
The C-terminal domain is also mostly conserved in terms of secondary structure position within the sequence. |
TextSentencer_T151 |
21012-21190 |
Sentence |
denotes |
The extensive secondary structure and high similarity suggest that the two structural domains observed in SARS-CoV N protein also exist in the N proteins of other coronaviruses. |
TextSentencer_T152 |
21191-21381 |
Sentence |
denotes |
The results from the order-disorder prediction and secondary structure prediction coupled with sequence alignment suggest that coronavirus N proteins all share the same modular organization. |
TextSentencer_T153 |
21382-21506 |
Sentence |
denotes |
The two structural domains are connected by a disordered linker and capped by a disordered N-terminal head and C-terminal tail. |
TextSentencer_T154 |
21507-21589 |
Sentence |
denotes |
The two structural domains of SARS-CoV N protein carry out two distinct functions. |
TextSentencer_T155 |
21590-21693 |
Sentence |
denotes |
The N-terminal domain is able to bind RNA, whereas the C-terminal domain acts as a dimerization domain. |
TextSentencer_T156 |
21694-21779 |
Sentence |
denotes |
The ability of the N-terminal domain to bind RNA is closely related to its structure. |
TextSentencer_T157 |
21780-21910 |
Sentence |
denotes |
Although the structure of the C-terminal domain has not been determined, we suggest that dimerization is also structure-dependent. |
TextSentencer_T158 |
21911-21972 |
Sentence |
denotes |
A number of experimental observations support our hypothesis: |
TextSentencer_T159 |
21973-22228 |
Sentence |
denotes |
First, it has been found that oligomer dissociation and protein unfolding of SARS-CoV N protein occur simultaneously [34] ; second, most self-interaction studies have mapped the oligomerization domain to regions containing the structural domain [14, 15] . |
TextSentencer_T160 |
22229-22288 |
Sentence |
denotes |
The structural domains may also serve additional functions. |
TextSentencer_T161 |
22289-22415 |
Sentence |
denotes |
For example, a putative loop between W302 and P310 in the C-terminal domain has been suggested to bind to cyclophilin A [35] . |
TextSentencer_T162 |
22416-22497 |
Sentence |
denotes |
These additional functions may also be dependent on the structure of the protein. |
TextSentencer_T163 |
22498-22688 |
Sentence |
denotes |
Although the two structural domains do not interact with each other, we cannot discount the possibility that the two domains could act in concert to carry out important biological functions. |
TextSentencer_T164 |
22689-22793 |
Sentence |
denotes |
The long flexible linker between the two domains provides enough freedom to make this scenario possible. |
TextSentencer_T165 |
22794-22909 |
Sentence |
denotes |
Previously, the lack of information on structural organization precluded the study of multiple-domain interactions. |
TextSentencer_T166 |
22910-22982 |
Sentence |
denotes |
Now our findings provide a structural framework to perform such studies. |
TextSentencer_T167 |
22983-23060 |
Sentence |
denotes |
The flexible linker between the two structural domains is largely disordered. |
TextSentencer_T168 |
23061-23162 |
Sentence |
denotes |
This disordered region may enable transient interactions with several structurally distinct partners. |
TextSentencer_T169 |
23163-23260 |
Sentence |
denotes |
It has been shown that the M protein of SARS-CoV binds to this region between a.a. 168-208 [16] . |
TextSentencer_T171 |
23261-23381 |
Sentence |
denotes |
Interestingly, human cellular hnRNP A1 has also been shown to bind to almost the same region between a.a. 161-210 [17] . |
TextSentencer_T173 |
23382-23589 |
Sentence |
denotes |
The disordered state of this region potentially allows it to interact with different partners depending on context, e.g. with the M protein during virus assembly and with hnRNP A1 during host cell infection. |
TextSentencer_T174 |
23590-23775 |
Sentence |
denotes |
The exact mechanism by which this occurs is not known, but it could involve different induced folding pathways, as has been shown for other disordered proteins [23, 36, 37] . |
TextSentencer_T175 |
23776-23840 |
Sentence |
denotes |
The same phenomenon is observed in other coronavirus N proteins. |
TextSentencer_T176 |
23841-23975 |
Sentence |
denotes |
In mouse hepatitis virus (MHV), the region corresponding to the flexible linker in its N protein is involved in RNA binding [13, 32] . |
TextSentencer_T177 |
23976-24060 |
Sentence |
denotes |
The same region has also been shown to bind murine hnRNP A1 in infected cells [31] . |
TextSentencer_T178 |
24061-24294 |
Sentence |
denotes |
Coronavirus N proteins thus appear to share a common theme of using the flexible linker as an interaction "hotspot", exploiting the characteristics of disordered regions to achieve multiple functions within a limited sequence length. |
TextSentencer_T179 |
24295-24395 |
Sentence |
denotes |
Phosphorylation is one of the most important regulatory post-translational modifications in proteins. |
TextSentencer_T180 |
24396-24593 |
Sentence |
denotes |
SARS-CoV N protein has been shown to be serine-phosphorylated by multiple kinases, and phosphorylation has been proposed as a possible mechanism for nucleocytoplasmic shuttling of the N protein [38] . |
TextSentencer_T181 |
24594-24659 |
Sentence |
denotes |
Disordered regions represent potential sites for phosphorylation. |
TextSentencer_T182 |
24660-24776 |
Sentence |
denotes |
The flexible linker of SARS-CoV N protein contains an SR-rich region, which is targeted by a number of kinases [39] . |
TextSentencer_T183 |
24777-24870 |
Sentence |
denotes |
In fact, this region can be phosphorylated in vitro (Dr. W.-Y. Tarn, personal communication). |
TextSentencer_T186 |
24871-25078 |
Sentence |
denotes |
A recent in silico prediction suggested that most of the potential phosphorylation sites fall in the disordered regions, although the exact phosphorylation sites have not been identified experimentally [38] . |
TextSentencer_T187 |
25079-25246 |
Sentence |
denotes |
Although the exact role of phosphorylation has not been elucidated, it could be related to the regulation of functions such as RNA binding and localization within the host cell. |
TextSentencer_T188 |
25247-25368 |
Sentence |
denotes |
In the other coronavirus N proteins studied so far, the phosphorylation sites also fall in the disordered regions. |
TextSentencer_T189 |
25369-25507 |
Sentence |
denotes |
In avian infectious bronchitis virus (IBV), the phosphorylation sites of the N protein have been mapped to a.a. 186-198 and 367-394 [40] . |
TextSentencer_T191 |
25508-25602 |
Sentence |
denotes |
Both regions are located in the disordered regions as predicted by PONDR (Figure 5). |
TextSentencer_T192 |
25603-25783 |
Sentence |
denotes |
Phosphorylation of transmissible gastroenteritis virus (TGEV) N protein has also been mapped to residues 9, 156, 254 and 256, which are at or close to the disordered regions [41] . |
TextSentencer_T193 |
25784-25920 |
Sentence |
denotes |
Phosphorylation in disordered regions of structural proteins is also observed in other virus families, such as in Paramyxovirinae [42] . |
TextSentencer_T194 |
25921-26007 |
Sentence |
denotes |
Coronavirus N proteins thus seem to exploit a widespread property, intrinsic disorder, to allow for post-translational modification. |
TextSentencer_T195 |
26008-26170 |
Sentence |
denotes |
Whether or not such modification affects the folding or structural properties of the protein and how these properties affect its function remain to be determined. |
TextSentencer_T196 |
26171-26293 |
Sentence |
denotes |
Identification of the disordered regions of SARS-CoV N protein provides a blueprint for structural studies of the protein. |
TextSentencer_T197 |
26294-26423 |
Sentence |
denotes |
The structural domains are logical candidates for structure determination through X-ray crystallography or solution NMR studies. |
TextSentencer_T198 |
26424-26572 |
Sentence |
denotes |
However, structure determination of the full-length protein is hindered by the disordered regions, which often interfere with crystallization [43] . |
TextSentencer_T199 |
26573-26720 |
Sentence |
denotes |
The large size of the dimeric protein (ca. 90 kDa) also makes full-length structure determination through NMR extremely difficult due to rapid transverse (T2) relaxation. |
TextSentencer_T201 |
26721-26818 |
Sentence |
denotes |
The fact that the two structural domains do not interact provides a handle to solve this problem. |
TextSentencer_T202 |
26819-26939 |
Sentence |
denotes |
The two structural domains can be solved independently and still provide a fair representation of the full-length protein. |
TextSentencer_T203 |
26940-27021 |
Sentence |
denotes |
The modular organization of SARS-CoV N protein is shared among other coronaviruses. |
TextSentencer_T204 |
27022-27192 |
Sentence |
denotes |
The relative positions of the two structural domains are fairly conserved in all coronavirus N proteins, making them excellent targets for comparative structural studies. |
TextSentencer_T205 |
27193-27396 |
Sentence |
denotes |
The structure of the N-terminal domain would be of special interest, since in SARS-CoV it has been identified as an RNA-binding domain, whereas in other coronaviruses its exact function is not yet known. |
TextSentencer_T206 |
27397-27543 |
Sentence |
denotes |
Of special note is the RNA-binding domain of MHV, which has been mapped to the flexible linker region instead of the N-terminal structural domain. |
TextSentencer_T207 |
27544-27709 |
Sentence |
denotes |
At present the molecular mechanism involving N protein/RNA interaction is still not fully understood and the RNA binding site(s) have not been unequivocally defined. |
TextSentencer_T208 |
27710-27865 |
Sentence |
denotes |
It is possible that the N-terminal structural domain folds into different tertiary structures and plays different roles in different coronavirus N proteins. |
TextSentencer_T209 |
27866-27945 |
Sentence |
denotes |
It is also possible that the linker region is involved in RNA binding. |
TextSentencer_T210 |
27946-28045 |
Sentence |
denotes |
Another interesting point that needs further study is the role of the C-terminal structural domain. |
TextSentencer_T211 |
28046-28196 |
Sentence |
denotes |
It is not yet known whether it plays the same dimerization role in other coronaviruses as in SARS-CoV, although there are hints in the literature [44] . |
TextSentencer_T212 |
28197-28327 |
Sentence |
denotes |
In summary, we have the following conclusions: (1) The N protein of SARS-CoV is a didomain protein connected by a flexible linker. |
TextSentencer_T213 |
28328-28513 |
Sentence |
denotes |
The protein is capped by a disordered N-terminal head and a C-terminal tail. (2) The C-terminal structural domain is sufficient for dimerization, implying a structural role in the process. |