PMC:7258756 / 2161-39433

Id Subject Object Predicate Lexical cue
T22 0-126 Sentence denotes The Coronaviridae family encompasses viruses with a single-stranded, positive-sense RNA genome of size approximately 26-32 kb.
T23 127-255 Sentence denotes Initially, these viruses were associated with intestinal and respiratory infections in humans and animals [1,2].
T24 256-424 Sentence denotes In 2002, the severe acute respiratory syndrome (SARS) coronavirus (CoV) outbreak, which claimed many lives in China, raised the alarm about these viruses [2].
T25 425-582 Sentence denotes A decade later, another human pathogenic virus, Middle East respiratory syndrome CoV (MERS-CoV), emerged and affected Middle Eastern countries [2].
T26 583-746 Sentence denotes Current knowledge identifies six virus groups in the Coronaviridae family that can infect humans [3], including SARS-CoV (now termed SARS-CoV-1) and MERS-CoV.
T27 747-866 Sentence denotes In December 2019, China reported cases of pneumonia of unknown aetiology in Wuhan city, Hubei province [4].
T28 867-961 Sentence denotes Further analysis of these cases was carried out to identify the causative agent of the pneumonia [5].
T29 962-1134 Sentence denotes Virus isolation and genomic characterization of the complete virus sequence through next-generation sequencing (NGS) identified it as a novel CoV, named 2019-nCoV [3].
T30 1135-1237 Sentence denotes The virus characterization revealed that it is an enveloped RNA virus with a genome size of 29,903 bp.
T31 1238-1387 Sentence denotes The phylogenetic analysis of the sequence showed that it belonged to the Sarbecovirus subgenus of genus Betacoronavirus and the family Coronaviridae.
T32 1388-1624 Sentence denotes The sequence was closely related (~87.5% sequence similarity) to two bat-derived SARS-like CoV strains (bat-SL-CoVZC45 and bat-SL-CoVZXC21) that are known to infect humans, including the virus which led to the 2003 SARS-CoV-1 outbreak [6].
T33 1625-1667 Sentence denotes The 2019-nCoV is now named SARS-CoV-2 [7].
T34 1668-1837 Sentence denotes Further, based on SimPlot analyses, it was demonstrated that SARS-CoV-2 was more closely related to the BatCoV RaTG13 sequence (~96.3% similarity) throughout the genome.
T35 1838-2073 Sentence denotes The bat-SL-CoVZC45 and bat-SL-CoVZXC21 strains clustered differently from the group formed by SARS-CoV-2 and BatCoV RaTG13 in the region spanning the 3′-end of open reading frame (ORF)1a, the ORF1b and almost half of the spike region [8].
T36 2074-2283 Sentence denotes The receptor-binding domain (RBD) of the spike protein mediates interaction with the host cell receptor [9], and the angiotensin-converting enzyme 2 (ACE2) has been identified as the receptor for the SARS-CoVs [10].
T37 2284-2401 Sentence denotes Specific mutations in the RBD of the SARS-CoV-2 spike glycoprotein were found to enhance binding to ACE2 [11].
T38 2402-2507 Sentence denotes The human-to-human transmission of SARS-CoV-2 raised an alert as the number of cases increased [12].
T39 2508-2634 Sentence denotes The WHO report dated February 28, 2020 confirmed 83,652 cases of SARS-CoV-2, with a total of 2,858 deaths from 52 countries [12].
T40 2635-2818 Sentence denotes After the first report of SARS-CoV-2 from Wuhan, China, the Government of India reviewed and initiated multisectoral measures for the mitigation of this emerging public health crisis.
T41 2819-3005 Sentence denotes These include point-of-entry surveillance at 21 international airports, enhanced State-level surveillance programmes and preparedness for handling clinical cases in designated hospitals.
T42 3006-3285 Sentence denotes To date, the Integrated Disease Surveillance Programme (IDSP), a national health programme of the Government of India, has collected samples from symptomatic travellers in liaison with the State-level Viral Research and Diagnostic Laboratories (VRDLs), Department of Health Research.
T43 3286-3344 Sentence denotes These VRDLs provide timely diagnosis during outbreaks.
T44 3345-3521 Sentence denotes The suspected samples were collected and transported to the Indian Council of Medical Research-National Institute of Virology (ICMR-NIV), Pune, for the diagnosis of SARS-CoV-2.
T45 3522-3791 Sentence denotes The specimens of the positive cases were diagnosed using real-time reverse transcription-polymerase chain reaction (RT-PCR) specific for SARS-CoV-2, following the protocol published by the WHO [13], and were characterized by complete genome sequencing and epitope prediction analyses.
T46 3792-3980 Sentence denotes These sequences were also compared with the available GenBank sequences to monitor mutations and understand their relation to other known SARS-CoV-2 sequences available in the public database.
T47 3981-4074 Sentence denotes Here, we report molecular characterization of SARS-CoV-2 sequences from three positive cases.
T48 4076-4094 Sentence denotes Material & Methods
T49 4095-4223 Sentence denotes The clinical samples were referred by the hospital authorities through the Kerala State Health Services for diagnostic purposes.
T50 4224-4328 Sentence denotes Further samples were received from different parts of India for establishing the presence of SARS-CoV-2.
T51 4329-4374 Sentence denotes Detection of SARS-CoV-2 in suspected samples:
T52 4375-4584 Sentence denotes Blood and throat swab (TS) specimens were collected from the suspected cases that complied with the case definition of SARS-CoV-2 infection as per the guidelines of the Ministry of Health and Family Welfare [14].
T53 4585-4632 Sentence denotes The TS was collected in viral transport medium.
T54 4633-4796 Sentence denotes These samples were referred to the ICMR-NIV, Pune, India (which is the national reference laboratory for India, also referred to as the government's apex laboratory).
T55 4797-4979 Sentence denotes As of February 29, 2020, 881 samples of suspected cases referred from different States, with a travel history to Wuhan, China, and other SARS-CoV-2-affected countries, were screened.
T56 4980-5128 Sentence denotes The viral RNA was extracted from the TS sample using the Magmax RNA extraction kit (Applied Biosystems, USA) as per the manufacturer's instructions.
T57 5129-5325 Sentence denotes The extracted RNA was immediately tested for the presence of SARS-CoV-2 using the real-time RT-PCR protocol published by the WHO [12] for the detection of RdRp (1), RdRp (2), E gene and N gene.
T58 5326-5389 Sentence denotes RNase P gene was used as the internal control for the analysis.
T59 5390-5479 Sentence denotes Confirmatory laboratory tests were performed as per the WHO-recommended test protocols [13].
T60 5480-5582 Sentence denotes These samples were also sequenced using the NGS approach to retrieve the complete genome of the virus.
T61 5583-5667 Sentence denotes NGS of SARS-CoV-2 from India - Phylogenetic analysis and molecular characterization:
T62 5668-5808 Sentence denotes The total RNA of three positive TS specimens from Kerala was extracted from 250-300 μl of the SARS-CoV-2 real-time RT-PCR-positive samples.
T63 5809-5921 Sentence denotes QIAamp Viral RNA extraction kit (QIAGEN, Hilden, Germany) was used according to the manufacturer's instructions.
T64 5922-6020 Sentence denotes The extracted RNA was further quantified using a Qubit RNA High-Sensitivity kit (Invitrogen, USA).
T65 6021-6227 Sentence denotes RNA libraries were prepared as per the earlier-defined protocol and quantified using KAPA Library Quantification Kit (Kapa Biosystems, Roche Diagnostics Corporation, USA) as per the manufacturer's protocol.
T66 6228-6326 Sentence denotes Further, individual libraries were neutralized and loaded on the Miniseq platform (Illumina, USA).
T67 6327-6407 Sentence denotes The detailed protocols for the steps undertaken have been published earlier [15,16].
T68 6408-6523 Sentence denotes The data generated from the machine were analyzed using CLC genomics workbench version 11.0 (CLC, QIAGEN, Germany).
T69 6524-6605 Sentence denotes Reference-based mapping was performed to retrieve the sequence of the SARS-CoV-2.
T70 6606-6827 Sentence denotes Full-length genome sequences of SARS-CoV-2 were downloaded from the GISAID database [17] (Supplementary Table I (available from http://www.ijmr.org.in/articles/2020/151/2/images/IndianJMedRes_2020_151_2_200_281471_sm5.pdf)).
T71 6828-7005 Sentence denotes Multiple sequence alignment was performed using the MEGA software version 7.0 [18] with retrieved sequences from two of the three positive cases and the available GISAID sequences.
T72 7006-7211 Sentence denotes A phylogenetic tree was generated using the neighbour-joining method and the Kimura 2-parameter model as the nucleotide (nt) substitution model with 1000 bootstrap replications as implemented in MEGA software [18].
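As an illustration of the substitution model named above, the following is a minimal sketch of the Kimura 2-parameter distance between two aligned nucleotide sequences. It is not the MEGA implementation; the function name `k2p_distance` is hypothetical, and skipping gapped or ambiguous sites (pairwise deletion) is an assumption, since the gap treatment used in the study is not stated.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions among compared sites.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguous bases are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related genomes the correction is small, so the result stays close to the uncorrected proportion of differing sites.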
T73 7212-7320 Sentence denotes Per cent nucleotide divergence and amino acid (aa) divergence were calculated using the p-distance method [18].
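The p-distance is the proportion of aligned sites at which two sequences differ. The sketch below shows one way to compute it; the names `p_distance` and `percent_divergence` are hypothetical, and excluding gapped or ambiguous sites is an assumption rather than a documented setting of the study.

```python
def p_distance(seq1: str, seq2: str, alphabet: str = "ACGT") -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites outside the alphabet (gaps, N, X) are skipped.
        if a not in alphabet or b not in alphabet:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def percent_divergence(seq1: str, seq2: str, alphabet: str = "ACGT") -> float:
    """p-distance expressed as a percentage."""
    return 100.0 * p_distance(seq1, seq2, alphabet)
```

For amino acid divergence, the same function can be called with the 20-letter alphabet `"ACDEFGHIKLMNPQRSTVWY"`. As a rough sense of scale, a divergence of 0.017 per cent over a genome of 29,903 nt corresponds to about five differing sites.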
T74 7321-7479 Sentence denotes Mutations specific to the Indian SARS-CoV-2 viruses were identified by comparing the coding regions with those of the SARS-CoV-2 reference from Wuhan, China (Wuhan-Hu-1).
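A comparison of this kind can be sketched as a walk over two aligned protein sequences that reports each difference relative to the reference. This is only an illustration, not the procedure used by the authors; the function name `list_substitutions` and the output notation are hypothetical, and the sequences in the usage note are toy strings, not spike sequences.

```python
from typing import List


def list_substitutions(reference: str, query: str) -> List[str]:
    """List differences between two aligned protein sequences.

    Positions are numbered on the ungapped reference (1-based).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    position = 0
    for ref_aa, qry_aa in zip(reference, query):
        if ref_aa != "-":
            position += 1          # advance only on reference residues
        if ref_aa == qry_aa:
            continue
        if ref_aa == "-":
            # Extra residue in the query after reference residue `position`.
            changes.append(f"ins{position}{qry_aa}")
        elif qry_aa == "-":
            changes.append(f"{ref_aa}{position}del")
        else:
            changes.append(f"{ref_aa}{position}{qry_aa}")
    return changes
```

With the toy alignment `list_substitutions("MKRAY", "MKIA-")`, the function reports a substitution at position 3 and a deletion at position 5.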
T75 7480-7553 Sentence denotes Three-dimensional (3D) model of the spike protein and epitope prediction:
T76 7554-7811 Sentence denotes The pre-fusion structure of the Indian case 1 SARS-CoV-2 spike (S) glycoprotein was modelled using the Swiss-Model server (https://swissmodel.expasy.org/interactive) and the corresponding S protein of Wuhan-Hu-1 (6VSB.PDB) as the template (99.97% identity).
T77 7812-7932 Sentence denotes Sequential (linear) B-cell epitopes were predicted using BepiPred-2.0 server (http://www.cbs.dtu.dk/services/BepiPred/).
T78 7933-8081 Sentence denotes The ABCpred prediction tool (http://crdd.osdd.net/raghava/abcpred/) was also used to identify the B-cell epitopes in the Indian SARS-CoV-2 sequence.
T79 8082-8184 Sentence denotes The epitope prediction probability of >0.8 was set to increase the specificity of the peptide stretch.
T80 8185-8298 Sentence denotes The overlapping epitopes predicted by BepiPred-2.0 online server and the ABCpred prediction tool were identified.
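Finding regions shared by two predictors amounts to intersecting two sets of residue ranges. The sketch below shows how this could be automated, although the Results state that the common regions were identified manually. The name `overlapping_regions` is hypothetical, and applying a minimum length of six residues to the shared stretch is an assumption borrowed from the length filter described for the BepiPred output.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive


def overlapping_regions(set_a: List[Interval],
                        set_b: List[Interval],
                        min_length: int = 6) -> List[Interval]:
    """Return residue ranges covered by an epitope in both sets."""
    shared = set()
    for a_start, a_end in set_a:
        for b_start, b_end in set_b:
            start = max(a_start, b_start)
            end = min(a_end, b_end)
            # Keep the intersection only if it is long enough.
            if end - start + 1 >= min_length:
                shared.add((start, end))
    return sorted(shared)
```

For example, with hypothetical coordinates, `overlapping_regions([(243, 260), (10, 12)], [(243, 258), (300, 315)])` keeps only the stretch 243-258.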
T81 8299-8498 Sentence denotes The antigenicity of the shortlisted peptide sequences was further predicted using the VaxiJen online server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) with a default threshold of 0.4.
T82 8499-8772 Sentence denotes Discontinuous epitopes on the modelled structure of the Indian case 1 SARS-CoV-2 spike protein were predicted using the online servers, Ellipro (http://tools.iedb.org/ellipro/) and DiscoTope 2.0 (http://tools.iedb.org/discotope/), integrated in the Immune Epitope Database.
T83 8773-8924 Sentence denotes Ellipro predicts epitopes based on the protrusion index (PI), wherein the protein shape is approximated as an ellipsoid (Ref for Ellipro and DiscoTope).
T84 8925-9059 Sentence denotes An ellipsoid with the PI value of 0.8 indicates that 80 per cent of the residues are within the ellipsoid and 20 per cent are outside.
T85 9060-9141 Sentence denotes All residues that are outside the 80 per cent ellipsoid will have a score of 0.8.
T86 9142-9220 Sentence denotes Residues with larger scores are associated with greater solvent accessibility.
T87 9221-9260 Sentence denotes The PI value was set to a score of 0.8.
T88 9261-9405 Sentence denotes DiscoTope predicts epitopes using 3D structure and half-sphere exposure as a surface measure in a novel spatial neighbourhood definition method.
T89 9406-9537 Sentence denotes Default values were set for sensitivity (0.47) and specificity (0.75) for selecting the amino acids forming discontinuous epitopes.
T90 9538-9763 Sentence denotes A sensitivity of 0.47 means that 47 per cent of the epitope residues are predicted as part of the epitopes, while a specificity of 0.75 means that 25 per cent of the non-epitope residues are predicted as part of the epitopes.
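The two definitions above can be written directly in terms of confusion-matrix counts. The following sketch uses hypothetical function names and made-up counts chosen only to reproduce the stated default values.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of true epitope residues that are predicted as epitope."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-epitope residues that are predicted as non-epitope."""
    return true_neg / (true_neg + false_pos)


# Illustrative counts only: 47 of 100 epitope residues recovered,
# and 25 of 100 non-epitope residues wrongly predicted as epitope.
sens = sensitivity(true_pos=47, false_neg=53)
spec = specificity(true_neg=75, false_pos=25)
false_positive_rate = 1 - spec
```

The false-positive rate, 1 - specificity, is the 25 per cent of non-epitope residues referred to in the text.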
T91 9764-9917 Sentence denotes Outputs from both the methods were combined, and the final regions were mapped on the modelled 3D-structure as the most probable conformational epitopes.
T92 9918-10064 Sentence denotes In addition, we also predicted N-linked glycosylation sites in the S protein using NetNGlyc 1.0 Server (http://www.cbs.dtu.dk/services/NetNGlyc/).
T93 10065-10246 Sentence denotes The spike proteins were also screened for the presence of potential epitopes presented by major histocompatibility complex (MHC) class I molecules to cytotoxic T lymphocytes (CTLs).
T94 10247-10466 Sentence denotes The online NetCTL1.2 server (http://www.cbs.dtu.dk/services/NetCTL/) based on machine learning techniques such as artificial neural network (ANN) and support vector machine (SVM) was used to predict the T-cell epitopes.
T95 10467-10576 Sentence denotes The prediction was made for all the human leucocyte antigen (HLA) supertypes and the available human alleles.
T96 10577-10712 Sentence denotes The C terminal cleavage, weight of transport-associated protein (TAP) efficiency and threshold for identification were kept as default.
T97 10713-10853 Sentence denotes VaxiJen v2.0 tool was used to predict the antigenicity of the predicted epitopes (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html).
T98 10854-10987 Sentence denotes The sequences were further screened to be potential epitopes using the CTLPred online server (http://crdd.osdd.net/raghava/ctlpred/).
T99 10988-11184 Sentence denotes The ability of the predicted linear B-cell and the T-cell epitopes to mount interferon-gamma (IFN-γ) response was assessed using the IFNepitope (http://crdd.osdd.net/raghava/ifnepitope/index.php).
T100 11186-11193 Sentence denotes Results
T101 11194-11239 Sentence denotes Detection of SARS-CoV-2 in suspected samples:
T102 11240-11424 Sentence denotes Three of the 881 TS/nasal swab (NS) specimens from the suspected cases tested positive for SARS-CoV-2 using the real-time RT-PCR specific to E gene, RdRp (1), RdRp (2) and N gene.
T103 11425-11501 Sentence denotes The Ct value of the E gene ranged from 19.8 to 34.5 for the TS/NS specimens.
T104 11502-11632 Sentence denotes Detailed Ct values for the real-time RT-PCRs specific to the above-mentioned genes of the positive specimens are given in Table I.
T105 11633-11692 Sentence denotes Blood samples were found to be negative for the SARS-CoV-2.
T106 11693-11954 Sentence denotes Table I Real-time reverse transcription-polymerase chain reaction (RT-PCR) values for RdRp (1), RdRp (2), E gene and N gene, per cent genome coverage recovered and reads mapped for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) positive cases
T107 11955-12121 Sentence denotes Positive cases Ct values for real-time RT-PCR for the confirmation of SARS-CoV-2 Relevant reads Total reads Genome length recovered (bp) Per cent genome coverage
T108 12122-12182 Sentence denotes RdRp (1) RdRp (2) E gene N gene RNase P internal control
T109 12183-12260 Sentence denotes Case 1 33.33 27.93 34.5 33.90 Positive 20,096 5,615,846 29,854 99.83
T110 12261-12328 Sentence denotes Case 2 24.6 29 19.8 38 Positive 610 8,587,146 16,047 53.66
T111 12329-12549 Sentence denotes Case 3 34.17 32.64 28.98 36.35 Positive 11,296 1,405,038 29,851 99.83
Case 1 travelled from Wuhan, China, reached India on January 23, 2020 and further travelled to the final destination of Kerala on January 24.
T112 12550-12708 Sentence denotes This individual developed cough on January 25 and further experienced a sore throat and mild fever and was admitted to the General Hospital, Thrissur, Kerala.
T113 12709-12832 Sentence denotes The second case travelled from Wuhan and had close contact with case 1 during the travel to the final destination in India.
T114 12833-12988 Sentence denotes Case 2 developed similar symptoms along with fever and diarrhoea on January 26, and the collected TS specimens were referred to the ICMR-NIV on January 28.
T115 12989-13077 Sentence denotes The second case was hospitalized on January 30, in a medical college, Alappuzha, Kerala.
T116 13078-13137 Sentence denotes The clinical sample (TS) was collected on January 31, 2020.
T117 13138-13294 Sentence denotes Case 3 travelled from China to India, developed a runny nose on January 30 and was admitted to the General Hospital, Kasaragod, Kerala, on January 31, 2020.
T118 13295-13343 Sentence denotes TS specimens were collected on January 31, 2020.
T119 13344-13527 Sentence denotes NGS of SARS-CoV-2 from India - Phylogenetic analysis and molecular characterization: NGS analysis from the TS specimens retrieved two complete genome sequences from case 1 and case 3.
T120 13528-13697 Sentence denotes The complete genome sequence for case 2 could not be recovered because of the lower KAPA-quantified library concentration of the sample; hence, it was not included in the analysis.
T121 13698-13857 Sentence denotes The FastQ files were reference mapped with the available Wuhan seafood pneumonia virus (Wuhan-Hu-1) complete SARS-CoV-2 genome (accession number: NC_045512.2).
T122 13858-13979 Sentence denotes The total reads which were mapped and the percentage of the genome recovered for the two cases are summarized in Table I.
T123 13980-14189 Sentence denotes Analysis of the complete genome sequences of SARS-CoV-2 from the positive cases in India revealed that the percentage nt and aa differences between case 1 and case 3 were 0.038 and 0.10 per cent, respectively.
T124 14190-14318 Sentence denotes The sequences of case 1 and case 3 diverged from the Wuhan-Hu-1 sequence by 0.017 per cent nt and 0.041 per cent aa, respectively.
T125 14319-14466 Sentence denotes Indian SARS-CoV-2 clustered with the Sarbecovirus subgenus of the Betacoronavirus genus and was closest to the BatCoV RaTG13 sequence (96.09% nt) [8].
T126 14467-14627 Sentence denotes The phylogenetic comparison showed the clustering of the genome sequences of case 1 and case 3 with the existing SARS-CoV-2 sequences (Fig. 1).
T127 14628-14715 Sentence denotes The phylogeny revealed emerging heterogeneity within the SARS-CoV-2 sequences globally.
T128 14716-14784 Sentence denotes The Indian SARS-CoV-2 viruses were positioned in different clusters.
T129 14785-14894 Sentence denotes Fig. 1 Phylogenetic tree of the complete genomes of severe acute respiratory syndrome coronavirus 2 viruses.
T130 14895-14943 Sentence denotes Indian viruses are shown in magenta font colour.
T131 14944-15076 Sentence denotes Indian SARS-CoV-2 sequences showed two changes, 408 Arg→Ile and 930 Ala→Val, in the spike protein compared to the Wuhan-Hu-1 sequence.
T132 15077-15295 Sentence denotes The mutations were further mapped on the spike protein model of the Indian sequence (Supplementary Fig. 1 (available from http://www.ijmr.org.in/articles/2020/151/2/images/IndianJMedRes_2020_151_2_200_281471_sm6.pdf)).
T133 15296-15500 Sentence denotes Deletion of a three-nucleotide stretch, encoding tyrosine residue at position 144, of the spike gene was also observed in the Indian SARS-CoV-2 from case 1 when compared to the other SARS-CoV-2 sequences.
T134 15501-15709 Sentence denotes As noted in the earlier SARS-CoV-2 sequences, both the Indian sequences possessed the polybasic cleavage site (RRAR) in the spike protein at the junction of S1 and S2, the two subunits of the spike protein [19].
T135 15710-15730 Sentence denotes Epitope predictions:
T136 15731-15904 Sentence denotes Thirty-one linear B-cell epitopes were predicted by BepiPred in the Indian SARS-CoV-2, of which three had a length of <6 amino acids and hence were not considered.
T137 15905-16034 Sentence denotes Linear epitopes were also predicted using the ABCpred prediction tool, which predicted 47 epitopes based on the threshold of 0.8.
T138 16035-16113 Sentence denotes Regions common to both the prediction methods (n=17) were identified manually.
T139 16114-16299 Sentence denotes The 17 epitopes were screened for their antigenicity using the VaxiJen v2.0 tool (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html), and nine of these epitopes were shortlisted.
T140 16300-16430 Sentence denotes These epitopes were further screened for their ability to elicit an IFN-γ response, which was predicted using the IFNepitope tool.
T141 16431-16611 Sentence denotes Finally, five epitopes, four in the S1 domain and one in the S2 domain, were predicted, which could possibly generate an immune response and suppress the IFN-γ response (Table II).
T142 16612-16784 Sentence denotes N-linked glycosylation site prediction revealed that two putative glycosylation sites (with a low value for jury agreement) were present within the epitope stretch 328-344.
T143 16785-16910 Sentence denotes Table II Linear B-cell epitopes predicted on the spike protein of the Indian severe acute respiratory syndrome coronavirus 2
T144 16911-16984 Sentence denotes Peptide Epitope probability VaxiJen score Interferon (IFN)-γ response#
T145 16985-17046 Sentence denotes 243-HRSYLTPGDSSSGWTA-258 0.92 Antigen (0.602) Negative (1)
T146 17047-17113 Sentence denotes 327-FPNITNLCPFGEVFNA-342 0.82 Antigen (0.606) Negative (−0.132)
T147 17114-17175 Sentence denotes 404-EVIQIAPGQTGKIADY-419 0.86 Antigen (1.231) Negative (1)
T148 17176-17243 Sentence denotes 413-TGKIADYNYKLPDDFT-428 0.84 Antigen (0.9642) Negative (−0.334)
T149 17244-17307 Sentence denotes 1204-YEQYIKWPWYIWLGFI-1219 0.89 Antigen (0.951) Negative (1)
T150 17308-17409 Sentence denotes Epitopes were predicted using a combination of the BepiPred server and the ABCpred prediction server.
T151 17410-17469 Sentence denotes The antigenicity was predicted using the VaxiJen v2.0 tool.
T152 17470-17734 Sentence denotes IFN-γ response was predicted using the IFNepitope server. #Values in brackets show the prediction score given by the software.
The discontinuous epitopes in the spike protein of the Indian SARS-CoV-2 were further identified using multiple methods, Ellipro and DiscoTope.
T153 17735-17870 Sentence denotes Conformational epitopes based on these methods were mapped on the pre-fusion structure of the modelled Indian SARS-CoV-2 spike protein.
T154 17871-17996 Sentence denotes The newly released structure of the SARS-CoV-2 spike protein was used as the template for modelling the Indian spike protein.
T155 17997-18182 Sentence denotes Ramachandran plot statistics revealed 83.7 per cent of the residues to be in the core region, 14.4 per cent in the additionally allowed region and 0.5 per cent in the disallowed region.
T156 18183-18394 Sentence denotes Four epitopes were predicted by Ellipro based on the PI threshold of 0.8 (Supplementary Table II (available from http://www.ijmr.org.in/articles/2020/151/2/images/IndianJMedRes_2020_151_2_200_281471_sm7.pdf)).
T157 18395-18579 Sentence denotes The result from the DiscoTope is presented in Supplementary Table III (available from http://www.ijmr.org.in/articles/2020/151/2/images/IndianJMedRes_2020_151_2_200_281471_sm8.pdf).
T158 18580-18640 Sentence denotes The mapped conformational epitopes are depicted in Figure 2.
T159 18641-19132 Sentence denotes For the purpose of comparison, the Indian S protein sequence was also modelled using the pre-fusion structure of SARS-CoV-1 (6ACC.PDB; 87.29% identity), and the results for the conformational epitopes predicted are in Supplementary Table IV (available from http://www.ijmr.org.in/articles/2020/151/2/images/IndianJMedRes_2020_151_2_200_281471_sm9.pdf) and Supplementary Figure 2 (available from http://www.ijmr.org.in/articles/2020/151/2/images/IndianJMedRes_2020_151_2_200_281471_sm10.pdf).
T160 19133-19350 Sentence denotes Supplementary Table II Conformational B-cell epitopes predicted by Ellipro based on the chains A, B and C of the Indian severe acute respiratory syndrome coronavirus 2 spike protein modelled structure (template used:
T161 19351-19361 Sentence denotes 6VSB.PDB).
T162 19362-19414 Sentence denotes Ellipro protrusion index threshold set to 0.8 cut-off
T163 19415-19437 Sentence denotes Chain A Ellipro Score
T164 19438-19502 Sentence denotes 1 D1137, P1138, L1139, Q1140, P1141, E1142, L1143, D1144 0.984
T165 19503-19850 Sentence denotes 2 Y705, S706, N707, N708, S709, T1074, T1075, A1076, P1077, A1078, I1079, C1080, H1081, D1082, G1083, K1084, A1085, H1086, F1087, P1088, R1089, E1090, G1091, F1093, V1094, S1095, N1096, G1097, T1098, H1099, W1100, F1101, V1102, Y1108, E1109, P1110, Q1111, I1112, I1113, T1114, T1115, D1116, N1117, T1118, F1119, V1120, S1121, G1122, N1123, 0.906
T166 19851-20459 Sentence denotes 3 N341, A342, T343, R344, F345, A346, S347, V348, Y349, A350, W351, N352, S397, F398, V399, I400, R401, E404, Q412, T413, G414, K415, I416, A417, D418, Y419, N420, Y421, K422, L423, S436, N437, N438, L439, D440, S441, K442, V443, G444, G445, N446, Y447, N448, Y449, L450, Y451, R452, L453, F454, R455, K456, S457, N458, L459, K460, P461, F462, E463, R464, D465, I466, S467, T468, E469, I470, Y471, Q472, A473, G474, S475, T476, P477, C478, N479, G480, V481, G483, F484, N485, C486, Y487, F488, P489, L490, Q491, S492, Y493, G494, F495, Q496, P497, T498, N499, G500, V501, G502, Y503, Q504, P505, R507 0.887
T167 20460-20774 Sentence denotes 4 I68, H69, V70, S71, G72, T73, N74, G75, T76, K77, R78, S98, I100, C136, D138, F140, G142, Y143, H144, K145, N146, N147, K148, S149, W150, M151, E152, S153, E154, F155, R156, N183, F184, A241, L242, H243, R244, S245, Y246, L247, T248, P249, G250, D251, S252, S253, S254, G255, W256, T257, A258, G259, A260 0.876
T168 20775-20782 Sentence denotes Chain B
T169 20783-21319 Sentence denotes 1 A704, Y705, S706, N707, N708, S709, F1073, T1074, T1075, A1076, P1077, A1078, I1079, C1080, H1081, D1082, G1083, K1084, A1085, H1086, F1087, P1088, R1089, E1090, G1091, V1092, F1093, V1094, S1095, N1096, G1097, T1098, H1099, W1100, F1101, V1102, T1103, Q1104, R1105, F1107, Y1108, E1109, P1110, Q1111, I1112, I1113, T1114, T1115, D1116, N1117, T1118, F1119, V1120, S1121, G1122, N1123, C1124, D1125, V1126, V1127, I1128, G1129, I1130, V1131, N1132, N1133, T1134, V1135, Y1136, D1137, P1138, L1139, Q1140, P1141, E1142, L1143, D1144 0.902
T170 21320-21868 Sentence denotes 2 N341, A342, T343, F345, A346, S347, V348, Y349, A350, W351, V399, R401, G402, T413, G414, K415, D418, Y419, N420, Y421, K422, S436, N437, N438, L439, D440, S441, K442, V443, G444, G445, N446, Y447, N448, Y449, L450, Y451, R452, L453, F454, R455, K456, S457, N458, L459, K460, P461, E463, R464, D465, I466, S467, T468, E469, I470, Y471, Q472, A473, G474, S475, T476, P477, C478, N479, G480, V481, E482, G483, F484, N485, C486, Y487, F488, P489, L490, Q491, S492, Y493, G494, F495, Q496, P497, T498, N499, G500, V501, G502, Y503, Q504, P505 0.886
T171 21869-22251 Sentence denotes 3 A67, I68, H69, V70, S71, G72, T73, N74, G75, T76, K77, R78, E96, K97, S98, N99, I100, R102, N122, A123, T124, N125, C136, N137, D138, P139, F140, L141, G142, Y143, H144, K145, N146, N147, K148, S149, W150, M151, E152, S153, E154, F155, L239, L240, A241, L242, H243, R244, S245, Y246, L247, T248, P249, G250, D251, S252, S253, S254, G255, W256, T257, A258, G259, A260, A261 0.869
T172 22252-22259 Sentence denotes Chain C
T173 22260-22324 Sentence denotes 1 D1137, P1138, L1139, Q1140, P1141, E1142, L1143, D1144 0.984
T174 22325-22762 Sentence denotes 2 Y705, S706, N707, N708, S709, T1074, T1075, A1076, P1077, A1078, I1079, C1080, H1081, D1082, G1083, K1084, A1085, H1086, F1087, P1088, R1089, E1090, G1091, F1093, V1094, S1095, N1096, G1097, T1098, H1099, W1100, F1101, V1102, Y1108, E1109, P1110, Q1111, I1112, I1113, T1114, T1115, D1116, N1117, T1118, F1119, V1120, S1121, G1122, N1123, C1124, D1125, V1126, V1127, I1128, G1129, I1130, V1131, N1132, N1133, T1134, V1135, Y1136 0.906
T175 22763-23371 Sentence denotes 3 N341, A342, T343, R344, F345, A346, S347, V348, Y349, A350, W351, N352, S397, F398, V399, I400, R401, E404, Q412, T413, G414, K415, I416, A417, D418, Y419, N420, Y421, K422, L423, S436, N437, N438, L439, D440, S441, K442, V443, G444, G445, N446, Y447, N448, Y449, L450, Y451, R452, L453, F454, R455, K456, S457, N458, L459, K460, P461, F462, E463, R464, D465, I466, S467, T468, E469, I470, Y471, Q472, A473, G474, S475, T476, P477, C478, N479, G480, V481, G483, F484, N485, C486, Y487, F488, P489, L490, Q491, S492, Y493, G494, F495, Q496, P497, T498, N499, G500, V501, G502, Y503, Q504, P505, R507 0.887
T176 23372-23686 Sentence denotes 4 I68, H69, V70, S71, G72, T73, N74, G75, T76, K77, R78, S98, I100, C136, D138, F140, G142, Y143, H144, K145, N146, N147, K148, S149, W150, M151, E152, S153, E154, F155, R156, N183, F184, A241, L242, H243, R244, S245, Y246, L247, T248, P249, G250, D251, S252, S253, S254, G255, W256, T257, A258, G259, A260 0.876
T177 23687-24107 Sentence denotes Fig. 2 Predicted conformational B-cell epitopes mapped on the pre-fusion structure of the modelled Indian severe acute respiratory syndrome coronavirus 2 spike protein using the pre-fusion structure of severe acute respiratory syndrome-coronavirus-2 (6VSB.PDB) (colour key: blue - epitopes 67-261; green - epitopes 341-507 based on the predicted epitopes as shown in Supplementary Table II). (A) Top view (B) Side view.
T178 24108-24343 Sentence denotes Supplementary Table III Conformational B-cell epitopes predicted by the DiscoTope server based on chains A, B and C of the Indian severe acute respiratory syndrome coronavirus 2 spike protein modelled structure (template used 6VSB.PDB).
T179 24344-24383 Sentence denotes DiscoTope threshold set to -3.7 cut-off
T180 24384-24967 Sentence denotes CHAIN A 71S, 72G, 73T, 74N, 75G, 76T, 148K, 149S, 150W, 151M, 152E, 178E, 179G, 180K, 181Q, 209N, 245S, 247L, 248T, 250G, 252S, 253S, 440D, 441S, 442K, 443V, 444G, 445G, 446N, 447Y, 448N, 452R, 453L, 454F, 455R, 456K, 458N, 460K, 461P, 465D, 468T, 470I, 482E, 487Y, 488F, 489P, 490L, 491Q, 492S, 493Y, 494G, 495F, 496Q, 497P, 498T, 499N, 500G, 501V, 502G, 554N, 556K, 558L, 559P, 560F, 568A, 677N, 678S, 679P, 680R, 681R, 682A, 683R, 701N, 702S, 703V, 791P, 792I, 807P, 808S, 810P, 912N, 915Y, 916E, 1069Q, 1097G, 1099H, 1116D, 1137D, 1138P, 1139L, 1140Q, 1141P, 1142E, 1143L, 1144D
T181 24968-25528 Sentence denotes CHAIN B 70V, 73T, 74N, 75G, 76T, 147N, 148K, 149S, 150W, 151M, 176D, 177L, 178E, 179G, 180K, 181Q, 182G, 183N, 212R, 244R, 245S, 246Y, 247L, 248T, 249P, 250G, 251D, 252S, 253S, 254S, 255G, 256W, 438N, 441S, 442K, 443V, 444G, 445G, 446N, 447Y, 454F, 456K, 457S, 458N, 460K, 467S, 476T, 492S, 494G, 495F, 496Q, 497P, 498T, 499N, 500G, 501V, 502G, 503Y, 554N, 556K, 558L, 559P, 676T, 677N, 678S, 679P, 680R, 701N, 702S, 703V, 714T, 791P, 792I, 807P, 808S, 810P, 912N, 915Y, 916E, 1069Q, 1109E, 1112I, 1116D, 1137D, 1138P, 1139L, 1140Q, 1141P, 1142E, 1143L, 1144D
T182 25529-26040 Sentence denotes CHAIN C 72G, 73T, 74N, 75G, 97K, 98S, 143Y, 144H, 145K, 146N, 147N, 148K, 149S, 150W, 151M, 152E, 153S, 180K, 181Q, 182G, 183N, 184F, 209N, 252S, 253S, 441S, 442K, 443V, 444G, 445G, 446N, 447Y, 454F, 456K, 457S, 458N, 460K, 480G, 492S, 494G, 496Q, 497P, 498T, 499N, 500G, 501V, 502G, 503Y, 554N, 556K, 558L, 559P, 676T, 678S, 679P, 680R, 681R, 682A, 683R, 684S, 685V, 701N, 702S, 714T, 791P, 792I, 807P, 808S, 912N, 915Y, 916E, 1069Q, 1072N, 1098T, 1112I, 1116D, 1138P, 1139L, 1140Q, 1141P, 1142E, 1143L, 1144D
T183 26041-26171 Sentence denotes Supplementary Table IV Conformational B-cell epitopes predicted using the modelled structure of the spike protein (template used:
T184 26172-26282 Sentence denotes 6ACC.PDB; 87.29% identity). (A) Ellipro server (using a protrusion index threshold of 0.9) (B) DiscoTope server
T185 26283-26285 Sentence denotes A.
T186 26286-26312 Sentence denotes Ellipro epitope prediction
T187 26313-26360 Sentence denotes Epitope number Epitope residues Epitope score
T188 26361-26391 Sentence denotes 1 244-RSYLTPGDSSSGW-256 0.95
T189 26392-26495 Sentence denotes 2 347S, 349Y, 419Y, 441-SKVGGNYNYLYRLFR-455, 457S, 465-DISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTN-499 0.943
T190 26496-26592 Sentence denotes 3 1074-TTAPAICHDGKAHFPR-1089, 1094-VSNGTHWFV-1102, 1110-PQIITTDNTFVSGNCDVVIGIVNNTV-1135 0.934
T191 26593-26622 Sentence denotes 4 144-HKNNKSWMESE-154 0.912
T192 26623-26645 Sentence denotes 5 72-GTNGTK-77 0.907
T193 26646-26648 Sentence denotes B.
T194 26649-26677 Sentence denotes DiscoTope epitope prediction
T195 26678-27306 Sentence denotes G72, T73, N74, G75, K145, N146, N147, K148, S149, L174, E178, K180, Q181, G182, 183, V211, S245, Y246, L247, T248, P249, G250, D251, S252, S253, K415, N438, S441, K442, V443, G444, G445, N446, Y447, N448, K456, S457, N458, K460, A473, G474, S475, S492, G494, Q496, P497, T498, N499, G500, V501, Y503, N554, K556, L558, P559, I567, Q675, T676, N677, S678, P679, R680, R681, A682, R683, S702, V703, A704, Y705, T714, P791, P807, S808, K809, P810, E916, Q1069, E1070. T-cell epitope prediction using the NetCTL1.2 software, with a threshold of 0.4, revealed 105 strong binding epitopes capable of binding to different HLA types.
T196 27307-27437 Sentence denotes Twelve of these were shortlisted, considering a binding efficiency of >0.5 nM and the ability to elicit an IFN-γ response (Table III).
T197 27438-27570 Sentence denotes Table III Spike protein peptides capable of binding to major histocompatibility complex (MHC) class I, predicted using the NetCTL server
T198 27571-27658 Sentence denotes Peptide Vaxijen Interferon (IFN)-γ response CTLPred Score (ANN/SVM) MHC restriction
T199 27659-27762 Sentence denotes 89-GVYFASTEK-97 0.711 Positive (1) 0.58/0.986 HLA-A*1101, HLA-A3, HLA-A*3101, HLA-A68.1, HLA-B*2705
T200 27763-27996 Sentence denotes 166-FEYVSQPFL-174 0.632 Positive (0.087) 0.65/0.184 HLA-A2, HLA-A*0201, HLA-A*0205, HLA-A2.1, HLA-B*2702, HLA-B*2705, HLA-B*3701, HLA-B40, HLA-B*4403, HLA-B*5301, HLA-B*5401, HLA-B*51, HLA-B60, HLA-B61, HLA-Cw*0301, H2-Kb, H2-Kk
T201 27997-28163 Sentence denotes 256-WTAGAAAYY-264 0.630 Positive (0.576) 0.82/0.544 HLA-A1, HLA-B*2702, HLA-B*3501, HLA-B*4403, HLA-B*5301, HLA-B*5401, HLA-B*51, HLA-B*5801, HLA-B62, HLA-Cw*0702
T202 28164-28307 Sentence denotes 348-VYAWNRKRI-356 0.500 Positive (0.499) 0.93/0.497 HLA-A24, HLA-B*5101, HLA-B*5102, HLA-B*5103, HLA-B*51, HLA-Cw*0401, H2-Db, H2-Kd, H2-Kk
T203 28308-28567 Sentence denotes 503-YQPYRVVVL-511 0.596 Positive (0.292) 0.40/0.596 HLA-A*0201, HLA-A*0205, HLA-A24, HLA-B14, HLA-B*2702, HLA-B*2705, HLA-B*3902, HLA-B*5201, HLA-B*5301, HLA-B*5401, HLA-B*51, HLA-B60, HLA-B62, HLA-B7, HLA-B8, HLA-Cw*0401, HLA-Cw*0602, H2-Dd, H2-Kb, H2-Ld
T204 28568-28685 Sentence denotes 510-VLSFELLHA-518 1.077 Positive (0.268) 0.86/0.276 HLA-A*0201, HLA-A*0205, HLA-A3, HLA-B*5301, HLA-B*51, HLA-B62
T205 28686-28812 Sentence denotes 825-TLADAGFIK-833 0.578 Positive (0.014) 0.75/0.992 HLA-A1, HLA-A*1101, HLA-A3, HLA-A*3101, HLA-A68.1, HLA-A20, HLA-B*2705
T206 28813-29009 Sentence denotes 1058-VVFLHVTYV-1066 1.512 Positive (1) 0.77/0.779 HLA-A2, HLA-A*0201, HLA-A*0205, HLA-A68.1, HLA-A2.1, HLA-B14, HLA-B*5101, HLA-B*5102, HLA-B*5103, HLA-B*5201, HLA-B*5301, HLA-B*5401, HLA-B*51
T207 29010-29250 Sentence denotes 1210-WPWYIWLGF-1218 1.495 Positive (0.221) 0.68/0.0695 HLA-B*2702, HLA-B*2705, HLA-B*3501, HLA-B*3801, HLA-B*5101, HLA-B*5102, HLA-B*5201, HLA-B*5301, HLA-B*5401, HLA-B*51, HLA-B*5801, HLA-B62, HLA-B*0702, HLA-Cw*0401, HLA-Cw*0702, H2-Ld
T208 29251-29325 Sentence denotes A threshold of >0.7 nM was used for increased specificity of the prediction.
T209 29326-29402 Sentence denotes The peptides were reconfirmed with the CTLPred server using default parameters.
T210 29403-29526 Sentence denotes The peptides that were classified as epitopes were further checked for their antigenicity score using the VaxiJen v2.0 tool.
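The peptide labels in Table III encode a start position, a sequence and an end position, so a quick consistency check is possible before the rows are reused. The sketch below assumes the `start-SEQUENCE-end` label format shown in the table. The function name `label_is_consistent` is hypothetical and is not part of NetCTL, CTLPred or VaxiJen.

```python
import re

# Matches labels such as '89-GVYFASTEK-97'.
LABEL = re.compile(r"^(\d+)-([A-Z]+)-(\d+)$")


def label_is_consistent(label: str) -> bool:
    """Return True when end - start + 1 equals the peptide length."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised peptide label: {label!r}")
    start, sequence, end = int(match.group(1)), match.group(2), int(match.group(3))
    return end - start + 1 == len(sequence)


# Peptide labels copied from Table III.
labels = [
    "89-GVYFASTEK-97",
    "166-FEYVSQPFL-174",
    "256-WTAGAAAYY-264",
    "348-VYAWNRKRI-356",
    "503-YQPYRVVVL-511",
    "510-VLSFELLHA-518",
    "825-TLADAGFIK-833",
    "1058-VVFLHVTYV-1066",
    "1210-WPWYIWLGF-1218",
]

results = {label: label_is_consistent(label) for label in labels}
for label, ok in results.items():
    print(label, "ok" if ok else "MISMATCH")
```

A mismatch would point to a transcription error in either the coordinates or the sequence, not to a problem with the prediction itself.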
T211 29528-29538 Sentence denotes Discussion
T212 29539-29673 Sentence denotes As of February 29, 2020, three positive cases of SARS-CoV-2 had been reported from India among 881 suspected cases tested at ICMR-NIV, Pune.
T213 29674-29754 Sentence denotes All three cases had a history of travel from Wuhan, China, during January 2020.
T214 29755-29909 Sentence denotes Although NGS was performed on the specimens from all three positive cases, the complete genome sequence could be retrieved only from case 1 and case 3.
T215 29910-30079 Sentence denotes All three patients recovered after hospitalization and were home quarantined as per the guidelines of the Ministry of Health and Family Welfare, Government of India14.
T216 30080-30256 Sentence denotes The low viral copy number of the TS specimen from case 2 could be the reason that fewer viral reads were retrieved during the NGS run, leading to a fragmented genome.
T217 30257-30398 Sentence denotes A recent study from China followed two patients through serial sampling (TSs, sputum, urine and stool) over days 3-12 and days 4-15 post onset20.
T218 30399-30599 Sentence denotes An N gene-specific real-time RT-PCR assay showed that the viral loads in TS and sputum samples peaked at around 5-6 days after symptom onset, ranging from around 10^4 to 10^7 copies per ml during this time20.
T219 30600-30755 Sentence denotes In another study, the virus was detected in the saliva specimens of 11 of the 12 patients, and serial saliva testing showed declines of viral RNA levels21.
T220 30756-30957 Sentence denotes The two Indian SARS-CoV-2 sequences were found to be non-identical (0.04% nt divergence), and the result of phylogenetic analysis indicated that there were two different introductions into the country.
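A percentage nucleotide divergence of this kind can be illustrated as the proportion of differing sites between two aligned sequences. The sketch below is one simple way to compute it, assuming pre-aligned, equal-length sequences and ignoring gaps and ambiguous bases. It is not necessarily the method used in the study, and the function name and toy sequences are hypothetical.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percentage of differing sites among positions where both sequences
    carry an unambiguous base (A, C, G or T)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "ACGT" and base_b in "ACGT":
            compared += 1
            if base_a != base_b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


# Toy aligned sequences, not real genome data.
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
print(percent_divergence(toy_a, toy_b))
```

For scale, a divergence of 0.04 per cent would correspond to roughly a dozen differing positions if close to the full genome of about 29.9 kb were compared.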
T221 30958-31169 Sentence denotes A recent study using 52 published GenBank sequences showed evidence of substantial genetic heterogeneity and estimated the time to the most recent common ancestor to be December 5, 2019 (95% confidence interval:
T222 31170-31204 Sentence denotes November 6 - December 13, 2019)22.
T223 31205-31380 Sentence denotes Continuous monitoring and analysis of the sequences from the affected countries would be vital to understand the genetic evolution and rates of substitution of SARS-CoV-2.
T224 31381-31572 Sentence denotes The comparison of the amino acid sequences of the non-structural (nsp1-nsp16) and structural polyproteins was undertaken with reference to the Wuhan-Hu1 strain for molecular characterization.
T225 31573-31755 Sentence denotes Some human Betacoronaviruses, including HCoV-HKU1 (lineage A), have a polybasic cleavage site as well as predicted O-linked glycans near the S1/S2 cleavage site of the spike protein.
T226 31756-31957 Sentence denotes As published recently, the polybasic cleavage site, which has not previously been observed in related lineage B Betacoronaviruses and is a unique feature of SARS-CoV-2, was also noted in the Indian SARS-CoV-2 sequences.
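As an illustration, a polybasic stretch can be located with a simple pattern search. The sketch below uses a minimal R-X-X-R pattern as an assumed stand-in for a furin-like polybasic motif. The fragment is assembled from the residue listings earlier in this section (676T through 685V), and the function name `find_polybasic` is hypothetical.

```python
import re

# Minimal furin-like pattern: arginine, any two residues, arginine.
# The lookahead lets overlapping matches be reported as well.
POLYBASIC = re.compile(r"(?=(R..R))")


def find_polybasic(fragment: str, first_position: int) -> list[tuple[int, str]]:
    """Return (position, motif) pairs, numbered from first_position."""
    return [
        (first_position + match.start(), match.group(1))
        for match in POLYBASIC.finditer(fragment)
    ]


# Residues 676-685 as they appear in the chain listings of this section.
fragment = "TNSPRRARSV"
hits = find_polybasic(fragment, first_position=676)
print(hits)
```

The numbering follows the modelled structure used for the listings, so positions may be offset from those of a reference sequence.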
T227 31958-32102 Sentence denotes The mutation Arg408Ile in the spike protein of one of the Indian sequences lies in the RBD, whereas Ala930Val is located in the S2 domain.
T228 32103-32171 Sentence denotes However, both are away from the ACE2 receptor-binding interface19,23.
T229 32172-32349 Sentence denotes Mutations observed so far in the spike protein sequences of SARS-CoV-2 are localized over the S1 and S2 domains and have not been found in the ACE2-binding interface.
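Substitutions written in the Arg408Ile style can be derived by walking a reference and a query protein sequence in parallel. The sketch below assumes the two sequences are already aligned, with `-` marking gaps. The function name, the toy fragments and the choice to skip insertions and deletions are illustrative assumptions, not the workflow used in the study.

```python
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def list_substitutions(reference: str, query: str, first_position: int = 1) -> list[str]:
    """Report substitutions relative to the reference, numbered by reference position."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    position = first_position
    for ref_aa, qry_aa in zip(reference, query):
        if ref_aa == "-":
            continue  # insertion in the query: no reference position consumed
        if qry_aa != "-" and qry_aa != ref_aa:
            ref_name = THREE_LETTER.get(ref_aa, "Xaa")
            qry_name = THREE_LETTER.get(qry_aa, "Xaa")
            substitutions.append(f"{ref_name}{position}{qry_name}")
        position += 1
    return substitutions


# Toy five-residue fragments numbered from position 406 (not real spike sequence).
print(list_substitutions("GGRGG", "GGIGG", first_position=406))
```

Each reported substitution could then be compared against domain boundaries or interface residues to judge whether it falls in a functionally sensitive region.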
T230 32350-32736 Sentence denotes From the alignment of the spike protein sequences of SARS CoV-1 and SARS-CoV-2 (Wuhan-Hu1 and India), it can be observed that the three-nucleotide deletion in the case 1 SARS-CoV-2 sequence from India is located close to the insert 1 region of SARS CoV-1 (Supplementary Fig. 3 (available from http://www.ijmr.org.in/articles/2020/151/2/images/IndianJMedRes_2020_151_2_200_281471_sm11.pdf)).
T231 32737-32942 Sentence denotes Notably, case 1 and case 2 were in close contact while travelling to India, but due to the absence of the complete genome of case 2, the genetic relatedness and source of infection could not be pinpointed.
T232 32943-33052 Sentence denotes Among the SARS-CoV structural proteins, the spike protein has been found to elicit neutralizing antibodies24.
T233 33053-33216 Sentence denotes In this study, of the five predicted linear B-cell epitopes, four were present in the S1 domain and one in the S2 domain.
T234 33217-33538 Sentence denotes Prediction of conformational B-cell epitopes revealed that one of these (residue positions 341-505) in the spike protein incorporates two of the predicted linear epitopes (327-342 and 404-419) having good antigenicity along with a favourable IFN-γ response that enables differentiation and proliferation of the B-cells25.
T235 33539-33662 Sentence denotes Notably, an equivalent epitope (347-499) is predicted for the model generated using the SARS-CoV-1 S protein as a template.
T236 33663-33712 Sentence denotes In both cases, this epitope lies within the RBD6.
T237 33713-33887 Sentence denotes Although the epitope has two putative N-linked glycosylation sites within it at positions 330 and 332, the probability of these sites being actually glycosylated is very low.
T238 33888-33983 Sentence denotes A major immuno-dominant epitope spanning residues 441 to 700 has been reported from SARS-CoV26.
T239 33984-34126 Sentence denotes Hence, the predicted B-cell conformational epitope identified in the present study may play an important role in initiating a B-cell response.
T240 34127-34262 Sentence denotes Among the five linear epitopes predicted in this study, epitopes 327-342 and 1204-1219 are conserved between SARS-CoV-2 and SARS-CoV-1.
T241 34263-34330 Sentence denotes Epitopes 243-258, 404-419 and 413-428 are found to have variations.
T242 34331-34451 Sentence denotes The spike protein of SARS-CoV has also been reported to be immunogenic and elicit high IFN-γ-specific T-cell response26.
T243 34452-34617 Sentence denotes The prediction results in this study revealed that nine possible CTL epitopes possessing good antigenicity and inducing IFN-γ response were present in the S protein.
T244 34618-34741 Sentence denotes A recent report27 also predicted T-cell epitopes in the S protein based on a similar ANN/SVM method and antigenicity score.
T245 34742-34872 Sentence denotes Although the IFN-γ response was not considered by these authors, two of the predictions were common to both studies.
T246 34873-35124 Sentence denotes Among the T-cell epitopes predicted in the present study, four epitopes (89-97 and 256-264 in the S1 domain, and 825-833 and 1058-1066 in the S2 domain) were found to have good CTL prediction scores with a broad HLA allele coverage of A and B supertypes.
T247 35125-35286 Sentence denotes As these HLA supertypes are predominant in the Indian population, the predicted epitopes may be considered suitable for future experiments towards vaccine design.
T248 35287-35476 Sentence denotes To conclude, the prompt intervention by the Government of India and the health authorities of the State of Kerala ensured that the said cases did not become secondary foci of transmission.
T249 35477-35700 Sentence denotes Further, the timely identification of SARS-CoV-2 in these suspected cases by the ICMR-NIV, Pune, has helped in the isolation of the patients, containment of the virus, enhanced surveillance and restriction of its movement.
T250 35701-35883 Sentence denotes The availability of the genomic sequences of the identified cases will contribute to the public repositories and help towards the development of diagnostics, vaccines and antivirals.
T251 35884-36000 Sentence denotes The sequence data would also help in tracking the origin of the virus and its evolution as it is transmitted over time.
T252 36001-36022 Sentence denotes Availability of data:
T253 36023-36124 Sentence denotes Sequences are deposited in the GISAID database, with accession numbers EPI_ISL_413522 and EPI_ISL_413523.
T254 36126-36254 Sentence denotes Supplementary Table I Acknowledgement for the list of the sequences downloaded from GISAID database that were used in the study
T255 36255-36471 Sentence denotes Supplementary Fig. 1 Receptor-binding domain and mutations within the ‘S’ protein of the Indian severe acute respiratory syndrome coronavirus 2 mapped on the respective modelled structure (A) Side view (B) Top view.
T256 36472-36844 Sentence denotes Supplementary Fig. 2 Conformational B-cell epitopes predicted on the S protein of the Indian severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) using the pre-fusion structure of SARS-CoV-1 (6ACC.PDB; 87.29% identity) (colour key: blue - epitope 1; green - epitope 2; yellow - epitope 4; pink - epitope 5 as indicated in Supplementary Table III) (A) Top view (B) Side view.
T257 36845-37077 Sentence denotes Supplementary Fig. 3 Alignment of Wuhan severe acute respiratory syndrome coronavirus 2, Indian severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and SARS-CoV-1 (accession number: NC_004718) spike protein using CLC Genomics workbench.
T258 37078-37193 Sentence denotes Boxes represent the different insertions in the spike protein region of Indian SARS-CoV-2 (deletions in SARS-CoV-1).
T259 37194-37272 Sentence denotes S1 domain is highlighted in light red colour and the S2 domain in blue colour.