Id |
Subject |
Object |
Predicate |
Lexical cue |
T1 |
238-353 |
Epistemic_statement |
denotes |
The SARS-CoV N protein has also been suggested to be involved in other important functions in the viral life cycle. |
T2 |
794-892 |
Epistemic_statement |
denotes |
Bioinformatics reveal that other coronavirus N proteins could share the same modular organization. |
T3 |
1416-1517 |
Epistemic_statement |
denotes |
In humans, coronaviruses are often associated with mild respiratory illnesses, including the common cold. |
T4 |
1518-1671 |
Epistemic_statement |
denotes |
However, a novel coronavirus has been identified as the etiological agent of severe acute respiratory syndrome (SARS), which has a case fatality rate of ca. |
T5 |
2215-2326 |
Epistemic_statement |
denotes |
The M protein may also be involved in the formation of the nucleocapsid through interaction with the N protein. |
T6 |
2774-2955 |
Epistemic_statement |
denotes |
From genetic and bioinformatics studies, the N protein can be divided into three putative regions: an N-terminal domain, an RNA-binding domain (RBD) and a C-terminal domain [12, 13]. |
T7 |
2956-3047 |
Epistemic_statement |
denotes |
The N- and C-terminal domains are believed to play a role in interaction with other proteins. |
T8 |
3212-3484 |
Epistemic_statement |
denotes |
Rather surprisingly, the mid-portion of the protein has been shown to interact with the M protein and hnRNP A1 [16, 17], and structural studies have identified the region between amino acids 45-181 as the putative RNA-binding region, which is close to the N-terminus [18]. |
T9 |
3637-3747 |
Epistemic_statement |
denotes |
However, the structural organization of coronavirus N proteins in general remains largely unknown to this day. |
T10 |
3889-4062 |
Epistemic_statement |
denotes |
Through the power of nuclear magnetic resonance (NMR) spectroscopy, we present the first evidence that the SARS-CoV N protein consists of two independent structural domains. |
T11 |
4538-4770 |
Epistemic_statement |
denotes |
The elucidation of the modular organization of the SARS-CoV N protein, particularly the boundary between disordered and structured regions, facilitates future studies of this class of proteins at the functional and structural level. |
T12 |
4771-4839 |
Epistemic_statement |
denotes |
Sequence alignment, secondary structure and order-disorder prediction |
T13 |
4840-5053 |
Epistemic_statement |
denotes |
The full-length sequences of SARS and other coronavirus N proteins were aligned using CLUSTALW version 1.83 with the slow algorithm, an identity matrix, a window of 4 amino acids and standard gap penalties [19]. |
T14 |
11678-11838 |
Epistemic_statement |
denotes |
We observed that the resonances from residues N45-181 have good chemical shift dispersion (Figure 2a), indicating that the fragment has a structured character. |
T15 |
12154-12290 |
Epistemic_statement |
denotes |
These results indicate that the N-terminal flanking region between amino acids 1-44 does not affect the structure of the N45-181 domain. |
T16 |
12424-12551 |
Epistemic_statement |
denotes |
We found that the resonances from N248-365 are well-dispersed (Figure 2c), suggesting that N248-365 forms an ordered structure. |
T17 |
12884-12988 |
Epistemic_statement |
denotes |
These results indicate that residues from 365 to the C-terminus do not affect the structure of N248-365. |
T18 |
12989-13195 |
Epistemic_statement |
denotes |
Shortening the fragment to span amino acids 274-365 changes the 15N-HSQC resonance pattern, which indicates that the 248-273 region is important for structure stabilization of this domain (data not shown). |
T19 |
13196-13546 |
Epistemic_statement |
denotes |
To explore the structure of the region between residues 182-247 and their effect on the structure of N45-181 and N248-365, we constructed the fragment N45-365 which contains the two struc- The lack of resonance perturbation when the two domains are linked together suggests that interaction between these two domains is weak, if they interact at all. |
T20 |
13672-13723 |
Epistemic_statement |
denotes |
These results are consistent with PONDR prediction. |
T21 |
14066-14233 |
Epistemic_statement |
denotes |
To test whether the residues beyond the structural domains are truly disordered, we employed the CLEANEX-PM experiment to identify solvent-accessible resonances [27]. |
T22 |
14357-14530 |
Epistemic_statement |
denotes |
When we compared the CLEANEX-PM spectrum of N1-181 (Figure 3b) with that of N45-181 (Figure 3a), we observed 40 resonances that appeared only in N1-181 and not in N45-181. |
T23 |
14531-14709 |
Epistemic_statement |
denotes |
This number agrees with that expected for the N-terminal region (5 prolines), indicating that all amide protons in the N-terminus of SARS-CoV N protein are exposed to the solvent. |
T24 |
14710-14959 |
Epistemic_statement |
denotes |
We counted 39 additional peaks in the CLEANEX-PM spectrum of N248-422 (Figure 3d) compared to that of N248-365 (Figure 3c) (51 expected since there are 6 prolines), suggesting that the majority of the C-terminal residues are also solvent-exposed. |
T25 |
15157-15342 |
Epistemic_statement |
denotes |
A total of 27 additional peaks can be resolved, compared to 64 expected (2 prolines), indicating that about half of the linker region between residues 182-247 is exposed to the solvent. |
T26 |
15343-15498 |
Epistemic_statement |
denotes |
It should be noted here that, due to resonance overlap, the numbers counted should be viewed as a lower limit for the number of solvent-exposed residues. |
T27 |
15499-15717 |
Epistemic_statement |
denotes |
Nevertheless, we can conclude that all N-terminal residues are solvent-exposed, whilst most of the residues in the C-terminus and in the linker region between the two structural domains are exposed to the solvent as well. |
T28 |
15935-16089 |
Epistemic_statement |
denotes |
The long disordered linker between the two structural domains is consistent with the observation that there is little interaction between the two domains. |
T29 |
16090-16343 |
Epistemic_statement |
denotes |
However, the number of counted peaks in the CLEANEX-PM spectra of the C-terminus and the linker region is lower than expected, so it is likely that parts of these regions are solvent-protected, possibly through the formation of transient structures. |
T30 |
16560-16722 |
Epistemic_statement |
denotes |
The function of N248-365 is not clear, but many reports have identified the C-terminal half of SARS-CoV N protein to be involved in oligomerization [14, 15]. |
T31 |
16723-16931 |
Epistemic_statement |
denotes |
To test this possibility, we have applied analytical gel-filtration chromatography, chemical cross-linking and analytical ultracentrifugation to assay the self-association property of the N protein fragments. |
T32 |
16932-17128 |
Epistemic_statement |
denotes |
As shown in Figure 4a, N45-181 elutes at a molecular weight of 18 kDa and N248-365 elutes as a 28-kDa molecule, suggesting that N45-181 exists as a monomer and N248-365 exists as a dimer. |
T33 |
17129-17249 |
Epistemic_statement |
denotes |
The self-association between the two N248-365 monomers is very strong, since we could not detect any monomeric fraction. |
T34 |
17250-17359 |
Epistemic_statement |
denotes |
Similarly, N45-365 eluted at a molecular weight of ~70 kDa, suggesting that N45-365 also exists as a dimer. |
T35 |
17786-17924 |
Epistemic_statement |
denotes |
Only one major peak was detected for each of these three protein fragments, indicating that they are structurally homogeneous in solution. |
T36 |
18359-18459 |
Epistemic_statement |
denotes |
Taken together, all three results indicate that N45-181 exists as a monomer and N248-365 as a dimer. |
T37 |
18460-18586 |
Epistemic_statement |
denotes |
The fact that dimerization occurs through a structural domain strongly suggests that the process is dependent on the structure. |
T38 |
18686-18883 |
Epistemic_statement |
denotes |
It is interesting to note that we did not observe the formation of higher-order multimers in our studies, which may be important for the formation of the ribonucleoprotein complex within the virion. |
T39 |
18884-19063 |
Epistemic_statement |
denotes |
A possible explanation is that multimer formation may require additional factors, such as the presence of RNA or other parts of the N protein that were not present in our samples. |
T40 |
19064-19201 |
Epistemic_statement |
denotes |
Also, we cannot exclude the possibility that multimers do form at much higher protein concentrations than the ones used in these studies. |
T41 |
19202-19301 |
Epistemic_statement |
denotes |
We suggest that the dimeric form represents a basic building block of the nucleocapsid of SARS-CoV. |
T42 |
19302-19425 |
Epistemic_statement |
denotes |
Since coronavirus N proteins belong to the same protein family, it is probable that they share similar structural features. |
T43 |
19963-20163 |
Epistemic_statement |
denotes |
The C-terminus of MHV N protein, which is disordered, has been shown to interact with hnRNP A1 [31], whereas the disordered region in the middle is responsible for its RNA-binding activity [13, 32]. |
T44 |
20164-20353 |
Epistemic_statement |
denotes |
In SARS-CoV, the disordered region in the middle of the N protein has been implicated in N-protein self-interaction [33], interaction with the M protein [16] and hnRNP A1 interaction [17]. |
T45 |
20354-20504 |
Epistemic_statement |
denotes |
These experimental observations suggest that disordered regions of coronavirus N proteins are probable interaction sites with functional implications. |
T46 |
21012-21190 |
Epistemic_statement |
denotes |
The extensive secondary structure and high similarity suggest that the two structural domains observed in SARS-CoV N protein also exist in the N proteins of other coronaviruses. |
T47 |
21191-21381 |
Epistemic_statement |
denotes |
The results from the order-disorder prediction and secondary structure prediction coupled with sequence alignment suggest that coronavirus N proteins all share the same modular organization. |
T48 |
21694-21779 |
Epistemic_statement |
denotes |
The ability of the N-terminal domain to bind RNA is closely related to its structure. |
T49 |
21780-21910 |
Epistemic_statement |
denotes |
Although the structure of the C-terminal domain has not been determined, we suggest that dimerization is also structure-dependent. |
T50 |
21911-22228 |
Epistemic_statement |
denotes |
A number of experimental observations support our hypothesis: first, it has been found that oligomer dissociation and protein unfolding of SARS-CoV N protein occur simultaneously [34]; second, most self-interaction studies have mapped the oligomerization domain to regions containing the structural domain [14, 15]. |
T51 |
22229-22288 |
Epistemic_statement |
denotes |
The structural domains may also serve additional functions. |
T52 |
22289-22415 |
Epistemic_statement |
denotes |
For example, a putative loop between W302 and P310 in the C-terminal domain has been suggested to bind to cyclophilin A [35]. |
T53 |
22416-22497 |
Epistemic_statement |
denotes |
These additional functions may also be dependent on the structure of the protein. |
T54 |
22498-22688 |
Epistemic_statement |
denotes |
Although the two structural domains do not interact with each other, we cannot discount the possibility that the two domains could act in concert to carry out important biological functions. |
T55 |
22689-22793 |
Epistemic_statement |
denotes |
The long flexible linker between the two domains provides enough freedom to make this scenario possible. |
T56 |
22794-22909 |
Epistemic_statement |
denotes |
Previously, the lack of information on structural organization precluded the study of multiple-domain interactions. |
T57 |
23061-23162 |
Epistemic_statement |
denotes |
This disordered region may enable transient interactions with several structurally distinct partners. |
T58 |
23163-23260 |
Epistemic_statement |
denotes |
It has been shown that the M protein of SARS-CoV binds to this region between a.a. 168-208 [16]. |
T59 |
23261-23381 |
Epistemic_statement |
denotes |
Interestingly, human cellular hnRNP A1 has also been shown to bind to almost the same region between a.a. 161-210 [17]. |
T60 |
23382-23502 |
Epistemic_statement |
denotes |
The disordered state of this region potentially allows it to interact with different partners depending on context, e.g. |
T61 |
23590-23775 |
Epistemic_statement |
denotes |
The exact mechanism by which this occurs is not known, but it could involve different induced folding pathways, which have been shown to occur in other disordered proteins [23, 36, 37]. |
T62 |
23976-24060 |
Epistemic_statement |
denotes |
The same region has also been shown to bind murine hnRNP A1 in infected cells [31]. |
T63 |
24061-24294 |
Epistemic_statement |
denotes |
It seems that the coronavirus N proteins share the common theme of using the flexible linker as an interaction "hotspot", and use characteristics of disordered regions to achieve multiple functions within a limited sequence length. |
T64 |
24396-24593 |
Epistemic_statement |
denotes |
SARS-CoV N protein has been shown to be serine-phosphorylated by multiple kinases, and phosphorylation is proposed to be a possible mechanism for nucleocytoplasmic shuttling of the N protein [38]. |
T65 |
24594-24659 |
Epistemic_statement |
denotes |
Disordered regions represent potential sites for phosphorylation. |
T66 |
24777-24839 |
Epistemic_statement |
denotes |
In fact, this region can be phosphorylated in vitro (Dr. W.-Y. |
T67 |
24871-25078 |
Epistemic_statement |
denotes |
Recent in silico prediction suggested that most of the potential phosphorylation sites fall in the disordered regions, although the exact phosphorylation sites have not been identified experimentally [38]. |
T68 |
25079-25246 |
Epistemic_statement |
denotes |
Although the exact role of phosphorylation has not been elucidated, it could be related to the regulation of functions such as RNA binding and localization within the host cell. |
T69 |
25921-26007 |
Epistemic_statement |
denotes |
Coronavirus N proteins seem to employ a widespread property to allow for modification. |
T70 |
26008-26170 |
Epistemic_statement |
denotes |
Whether or not such modification affects the folding or structural properties of the protein and how these properties affect its function remain to be determined. |
T71 |
26424-26572 |
Epistemic_statement |
denotes |
However, structure determination of the full-length protein is hindered by the disordered regions, which often interfere with crystallization [43]. |
T72 |
26616-26720 |
Epistemic_statement |
denotes |
90 kDa) also makes full-length structure determination through NMR extremely difficult due to T2 issues. |
T73 |
26721-26818 |
Epistemic_statement |
denotes |
The fact that the two structural domains do not interact provides a handle to solve this problem. |
T74 |
26819-26939 |
Epistemic_statement |
denotes |
The two structural domains can be solved independently and still provide a fair representation of the full-length protein. |
T75 |
27193-27396 |
Epistemic_statement |
denotes |
The structures of the N-terminal domains would be of special interest, since in SARS-CoV this domain has been identified as an RNA-binding domain, whereas in other coronaviruses the exact function is not yet known. |
T76 |
27544-27709 |
Epistemic_statement |
denotes |
At present the molecular mechanism involving N protein/RNA interaction is still not fully understood and the RNA binding site(s) have not been unequivocally defined. |
T77 |
27710-27865 |
Epistemic_statement |
denotes |
It is possible that the N-terminal structural domain folds into different tertiary structures and plays different roles in different coronavirus N proteins. |
T78 |
27866-27945 |
Epistemic_statement |
denotes |
It is also possible that the linker region is involved in RNA binding. |
T79 |
27946-28045 |
Epistemic_statement |
denotes |
Another interesting point that needs further study is the role of the C-terminal structural domain. |
T80 |
28046-28196 |
Epistemic_statement |
denotes |
It is not yet known whether it plays the same dimerization role in other coronaviruses as in SARS-CoV, although there are hints in the literature [44]. |