
PMC:7111504


Id Subject Object Predicate Lexical cue
T1 1-58 Sentence denotes In silico identification of vaccine targets for 2019-nCoV
T2 59-95 Sentence denotes [version 2; peer review: 3 approved]
T4 97-105 Sentence denotes Abstract
T5 106-117 Sentence denotes Background:
T6 118-211 Sentence denotes The newly identified coronavirus known as 2019-nCoV has posed a serious global health threat.
T7 212-365 Sentence denotes According to the latest report (18-February-2020), it has infected more than 72,000 people globally and led to deaths of more than 1,016 people in China.
T8 366-374 Sentence denotes Methods:
T9 375-475 Sentence denotes The 2019 novel coronavirus proteome was aligned to a curated database of viral immunogenic peptides.
T10 476-627 Sentence denotes The immunogenicity of detected peptides and their binding potential to HLA alleles was predicted by immunogenicity predictive models and NetMHCpan 4.0.
T11 628-636 Sentence denotes Results:
T12 637-817 Sentence denotes We report in silico identification of a comprehensive list of immunogenic peptides that can be used as potential targets for 2019 novel coronavirus (2019-nCoV) vaccine development.
T13 818-998 Sentence denotes First, we found 28 nCoV peptides identical to Severe acute respiratory syndrome-related coronavirus (SARS CoV) peptides that have previously been characterized as immunogenic by T cell assays.
T14 999-1147 Sentence denotes Second, we identified 48 nCoV peptides having a high degree of similarity with immunogenic peptides deposited in The Immune Epitope Database (IEDB).
T15 1148-1502 Sentence denotes Lastly, we conducted a de novo search of 2019-nCoV 9-mer peptides that i) bind to common HLA alleles in Chinese and European populations and ii) have T Cell Receptor (TCR) recognition potential by positional weight matrices and a recently developed immunogenicity algorithm, iPred, and identified in total 63 peptides with a high immunogenicity potential.
T16 1503-1515 Sentence denotes Conclusions:
T17 1516-1719 Sentence denotes Given the limited time and resources to develop vaccines and treatments for 2019-nCoV, our work provides a shortlist of candidates for experimental validation and thus can accelerate the development pipeline.
T18 1721-1733 Sentence denotes Introduction
T19 1734-1908 Sentence denotes The emergence and rapid spread of the recent novel coronavirus known as 2019-nCoV has posed a serious global health threat 1 and has already caused a huge financial burden 2.
T20 1909-2084 Sentence denotes It has further challenged the scientific and industrial community for quick control practices, and equally importantly to develop effective vaccines to prevent its recurrence.
T21 2085-2359 Sentence denotes In facing a rapid epidemic outbreak of a novel and unknown pathogen, a key bottleneck for a proper and deep investigation, which is fundamental for vaccine development, is the limited (to almost no) access of the scientific community to samples from infected subjects.
T22 2360-2555 Sentence denotes As such, in silico predictions of targets for vaccines are of high importance and can serve as a guidance to medical and experimental experts for the best and timely use of the limited resources.
T23 2556-2686 Sentence denotes In this regard, we report our recent effort to computationally identify immunogenic and/or cross-reactive peptides from 2019-nCoV.
T24 2687-3017 Sentence denotes We provide a detailed screen of candidate peptides based on comparison with immunogenic peptides deposited in the Immune Epitope Database and Analysis Resource (IEDB) database including those derived from Severe acute respiratory syndrome-related coronavirus (SARS CoV) along with de novo prediction from 2019-nCoV 9-mer peptides.
T25 3018-3440 Sentence denotes Here, we found i) 28 SARS-derived peptides having exact matches in 2019-nCoV proteome previously characterized to be immunogenic by in vitro T cell assays, ii) 22 nCoV peptides having a high sequence similarity with immunogenic peptides but with a greater predicted immunogenicity score, and iii) 44 + 19 nCoV peptides predicted to be immunogenic by the iPred algorithm and 1G4 TCR positional weight matrices respectively.
T26 3442-3449 Sentence denotes Results
T27 3451-3562 Sentence denotes Identification of 28 exact matches to SARS CoV immunogenic peptides by screening all epitopes deposited in IEDB
T28 3563-3691 Sentence denotes We collected all peptides in IEDB ( 3, as of 13-02-2020) reported positive in T cell assays and have human as the host organism.
T29 3692-3837 Sentence denotes We then conducted a local sequence alignment of 10 2019-nCoV open reading frames (ORFs) against 35,225 IEDB peptides, and found 28 exact matches.
T30 3838-3974 Sentence denotes Surprisingly, all identical hits (against peptides having sequence length greater than 3) were from SARS-CoV ( Table 1, Data Table 1 4).
T31 3975-4175 Sentence denotes These peptides have been shown to bind various HLA alleles from both class I and class II, although with a higher tendency towards HLA-A*02:01, and can be targets for CD8+ and CD4+ T cells respectively.
T32 4176-4184 Sentence denotes Table 1.
T33 4186-4264 Sentence denotes 28 2019-nCoV peptides having exact matches with immunogenic SARS-CoV peptides.
T34 4265-4323 Sentence denotes IEDB.peptide 2019-nCoV.pattern Antigen.Name Allele.Name
T35 4324-4383 Sentence denotes TLACFVLAAV TLACFVLAAV Membrane glycoprotein HLA-A *02:01
T36 4384-4433 Sentence denotes AFFGMSRIGMEVTPSGTW AFFGMSRIGMEVTPSGTW N protein
T37 4434-4483 Sentence denotes ALNTPKDHI ALNTPKDHI Nucleoprotein HLA-A *02:01
T38 4484-4552 Sentence denotes AQFAPSASAFFGMSR AQFAPSASAFFGMSR nucleocapsid protein HLA class II
T39 4553-4602 Sentence denotes AQFAPSASAFFGMSRIGM AQFAPSASAFFGMSRIGM N protein
T40 4603-4652 Sentence denotes GMSRIGMEV GMSRIGMEV Nucleoprotein HLA-A *02:01
T41 4653-4702 Sentence denotes ILLNKHIDA ILLNKHIDA Nucleoprotein HLA-A *02:01
T42 4703-4750 Sentence denotes IRQGTDYKHWPQIAQFA IRQGTDYKHWPQIAQFA N protein
T43 4751-4798 Sentence denotes KHWPQIAQFAPSASAFF KHWPQIAQFAPSASAFF N protein
T44 4799-4848 Sentence denotes LALLLLDRL LALLLLDRL Nucleoprotein HLA-A *02:01
T45 4849-4898 Sentence denotes LLLDRLNQL LLLDRLNQL Nucleoprotein HLA-A *02:01
T46 4899-4948 Sentence denotes LLNKHIDAYKTFPPTEPK LLNKHIDAYKTFPPTEPK N protein
T47 4949-4998 Sentence denotes LQLPQGTTL LQLPQGTTL Nucleoprotein HLA-A *02:01
T48 4999-5066 Sentence denotes RRPQGLPNNTASWFT RRPQGLPNNTASWFT nucleocapsid protein HLA class I
T49 5067-5112 Sentence denotes YKTFPPTEPKKDKKKK YKTFPPTEPKKDKKKK N protein
T50 5113-5160 Sentence denotes ILLNKHID ILLNKHID Nucleoprotein HLA-A *02:01
T51 5161-5219 Sentence denotes MEVTPSGTWL MEVTPSGTWL nucleocapsid protein HLA-B *40:01
T52 5220-5265 Sentence denotes ALNTLVKQL ALNTLVKQL S protein HLA-A *02:01
T53 5266-5324 Sentence denotes FIAGLIAIV FIAGLIAIV Spike glycoprotein precursor HLA-A2
T54 5325-5383 Sentence denotes LITGRLQSL LITGRLQSL Spike glycoprotein precursor HLA-A2
T55 5384-5429 Sentence denotes NLNESLIDL NLNESLIDL S protein HLA-A *02:01
T56 5430-5494 Sentence denotes QALNTLVKQLSSNFGAI QALNTLVKQLSSNFGAI S protein HLA-DRB1 *04:01
T57 5495-5559 Sentence denotes RLNEVAKNL RLNEVAKNL Spike glycoprotein precursor HLA-A *02:01
T58 5560-5605 Sentence denotes VLNDILSRL VLNDILSRL S protein HLA-A *02:01
T59 5606-5670 Sentence denotes VVFLHVTYV VVFLHVTYV Spike glycoprotein precursor HLA-A *02:01
T60 5671-5744 Sentence denotes GAALQIPFAMQMAYRF GAALQIPFAMQMAYRF S protein HLA-DRA *01:01/DRB1 *07:01
T61 5745-5807 Sentence denotes MAYRFNGIGVTQNVLY MAYRFNGIGVTQNVLY S protein HLA-DRB1 *04:01
T62 5808-5874 Sentence denotes QLIRAAEIRASANLAATK QLIRAAEIRASANLAATK S protein HLA-DRB1 *04:01
T63 5875-5931 Sentence denotes *SARS-CoV: Severe acute respiratory syndrome coronavirus
T65 5933-6053 Sentence denotes Identification of 22 2019-nCoV peptides with high degree of similarity to previously reported immunogenic viral peptides
T66 6054-6249 Sentence denotes In addition to 28 identical hits against SARS CoV, we observed a long tail in distribution of normalized alignment scores between 10 2019-nCoV ORFs and 35,225 IEDB peptides ( Figure 1A, Methods).
T67 6250-6351 Sentence denotes We therefore set out to further investigate potential vaccine targets among highly similar sequences.
T68 6352-6361 Sentence denotes Figure 1.
T69 6363-6444 Sentence denotes 2019-nCoV peptides with high sequence similarity to immunogenic peptides in IEDB.
T70 6445-6447 Sentence denotes A.
T71 6448-6544 Sentence denotes Comparison of normalized sequence alignment score for peptides with exact and non-exact matches.
T72 6545-6547 Sentence denotes B.
T73 6548-6607 Sentence denotes Number of target peptides grouped by their source organism.
T74 6608-6730 Sentence denotes The peptides having an exact sequence alignment with epitopes in IEDB had normalized alignment scores ranging from 4 to 6.
T75 6731-6869 Sentence denotes Taking the normalized alignment score of exact matches as a reference, we extracted 2019-nCoV peptides having score greater or equal to 4.
T76 6870-7008 Sentence denotes As illustrated in Figure 1A, we observed 45 and 11 peptides having normalized alignment score ≥ 4 and ≥ 5 respectively ( Figure 1A inset).
T77 7009-7159 Sentence denotes The target peptides originated from 10 different sources ( Figure 1B), where a total of 36 peptides were derived from strains associated with SARS CoV.
T78 7160-7258 Sentence denotes Of interest, we also observed 7 hits having high sequence similarity to targets from Homo sapiens.
T79 7259-7579 Sentence denotes In order to investigate the extent to which the difference between the source (2019-nCoV) and target (IEDB) peptides influences the immunogenicity of the source peptides we used a recently published immunogenicity model 5 to predict and compare the immunogenicity between the source and target peptides (Data Table 2 4).
T80 7580-7755 Sentence denotes We observed similar (close to identical) immunogenicity scores for a number of IEDB and 2019-nCoV peptides, especially for those with high immunogenicity scores ( Figure 2).
T81 7756-7883 Sentence denotes While all 48 can be potential targets, of particular interest were those having higher immunogenicity score than IEDB peptides.
T82 7884-8033 Sentence denotes Here, we list 22 out of 48 2019-nCoV peptides that scored higher compared to their targets that have been characterized to be immunogenic ( Table 2).
T83 8034-8183 Sentence denotes In this list, 15 (68%) 2019-nCoV peptides have a score higher than 0.5, whereas only 11 (50%) of the IEDB peptides have an immunogenicity score greater than 0.5.
T84 8184-8192 Sentence denotes Table 2.
T85 8194-8298 Sentence denotes List of 22 2019-nCoV peptides having a higher predicted immunogenicity score than their target peptides.
T86 8299-8352 Sentence denotes IEDB.peptide 2019-nCoV.pattern IEDB.prob nCoV.prob
T87 8353-8390 Sentence denotes WYMWLGARY WYIWLG 0.999249 0.999441
T88 8391-8431 Sentence denotes GLMWLSYFV GLMWLSYFI 0.995073 0.998216
T89 8432-8471 Sentence denotes GLVFLCLQY GIVFMCVEY 0.98123 0.984127
T90 8472-8544 Sentence denotes TWLTYHGAIKLDDKDPQFKDNVILL TWLTYTGAIKLDDKDPNFKDQVILL 0.925862 0.975242
T91 8545-8596 Sentence denotes IGMEVTPSGTWLTYH IGMEVTPSGTWLTY 0.903518 0.919184
T92 8597-8649 Sentence denotes GETALALLLLDRLNQ GDAALALLLLDRLNQ 0.853114 0.900655
T93 8650-8702 Sentence denotes TPSGTWLTYHGAIKL TPSGTWLTYTGAIKL 0.620894 0.662417
T94 8703-8743 Sentence denotes SIVAYTMSL SIIAYTMSL 0.589694 0.693763
T95 8744-8796 Sentence denotes RRPQGLPNNIASWFT RRPQGLPNNTASWFT 0.533253 0.584355
T96 8797-8831 Sentence denotes YNLKWN YNL-WN 0.520244 0.765309
T97 8832-8889 Sentence denotes AGCLIGAEHVDTSYECDI AGCLIGAEHVNNSYECDI 0.503905 0.56813
T98 8890-8948 Sentence denotes GFMKQYGECLGDINARDL GFIKQYGDCLGDIAARDL 0.471939 0.506817
T99 8949-9001 Sentence denotes ANKEGIVWVATEGAL ANKDGIIWVATEGAL 0.367723 0.404796
T100 9002-9036 Sentence denotes WNPDDY WNADLY 0.355018 0.584726
T101 9037-9071 Sentence denotes PDDYGG PDDFTG 0.334887 0.527287
T102 9072-9129 Sentence denotes TWLTYHGAIKLDDKDPQF TWLTYTGAIKLDDKDPNF 0.27017 0.529675
T103 9130-9163 Sentence denotes DEVNQI DEVRQI 0.18504 0.187797
T104 9164-9216 Sentence denotes SSKRFQPFQQFGRDV SNKKFLPFQQFGRDI 0.098384 0.119472
T105 9217-9256 Sentence denotes NHDSPDAEL NHTSPDVDL 0.067808 0.17889
T106 9257-9299 Sentence denotes TKQYNVTQAF TKAYNVTQAF 0.054818 0.171488
T107 9300-9358 Sentence denotes VKQMYKTPTLKYFGGFNF VKQIYKTPPIKDFGGFNF 0.018685 0.135681
T108 9359-9411 Sentence denotes QKRTATKQYNVTQAF QKRTATKAYNVTQAF 0.004891 0.037776
T109 9412-9421 Sentence denotes Figure 2.
T110 9423-9492 Sentence denotes Predicted immunogenicity for IEDB immunogenic vs. 2019-nCoV peptides.
T111 9493-9656 Sentence denotes 2019-nCoV peptides having a high sequence similarity to immunogenic peptides and their targets were analysed for their immunogenicity potential by iPred algorithm.
T112 9657-9890 Sentence denotes It is worth noting that, in general, predicting the immunogenicity of a given peptide is challenging and not a fully solved problem, and therefore current models for predicting immunogenicity are suboptimal. iPred is no exception.
T113 9891-10056 Sentence denotes In fact, we could see that a substantial number of IEDB immunogenic peptides were scored < 0.5 (the threshold score used to classify immunogenic vs non-immunogenic).
T114 10057-10162 Sentence denotes This led us to ask whether we can gather any other evidence of either immunogenicity or cross-reactivity.
T115 10164-10224 Sentence denotes De novo search of immunogenic peptides in 2019-nCoV proteome
T116 10225-10359 Sentence denotes As a complementary reciprocal approach, we conducted a de novo search of immunogenic peptides against the 2019-nCoV proteome sequence.
T117 10360-10481 Sentence denotes We scanned 9-mers from 2019-nCoV proteome with a window of 9 amino acids and step length of 1 amino acid (9613 in total).
T118 10482-10630 Sentence denotes The immunogenicity of 9-mer peptides was predicted using iPred, and MHC presentation scores were gauged using NetMHCpan 4.0 6 for various HLA types.
T119 10631-10820 Sentence denotes In this task, we focused on haplotypes common in Chinese and European populations, which include HLA-A*02:01, HLA-A*01:01, HLA-B*07:02, HLA-B*40:01 and HLA-C*07:02 alleles (Data Table 3 4).
T120 10821-10999 Sentence denotes Based on MHC presentation and immunogenicity prediction, we detected 5 peptides predicted to bind 4 different HLA alleles of which 2 had strong immunogenicity scores ( Figure 3).
T121 11000-11100 Sentence denotes Of the 65 strong binders to 3 different HLA types, 39 had immunogenicity scores ≥ 0.5 ( Table 3).
T122 11101-11214 Sentence denotes Collectively this analysis suggests a number of 9-mer immunogenic candidates for further experimental validation.
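The 9-mer scan described in this section (window of 9 amino acids, step length of 1) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: the function name `sliding_kmers` is hypothetical, and the toy sequence is simply two Table 3 peptides joined end to end, not a real 2019-nCoV ORF.

```python
def sliding_kmers(sequence, k=9, step=1):
    """Return every k-mer obtained by sliding a window of width k
    along the sequence, advancing by `step` residues each time."""
    last_start = len(sequence) - k
    return [sequence[i:i + k] for i in range(0, last_start + 1, step)]


# Toy input (assumption): two peptides from Table 3 concatenated.
toy_sequence = "MADQAMTQM" + "LEAPFLYLY"
kmers = sliding_kmers(toy_sequence)

# A sequence of length n yields n - k + 1 windows when step = 1.
print(len(kmers))   # number of 9-mers in the toy sequence
print(kmers[0], kmers[-1])
```

Applied to each ORF in turn, a scan of this kind produces the pool of candidate 9-mers that are then passed to the immunogenicity and MHC presentation predictors.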
T123 11215-11223 Sentence denotes Table 3.
T124 11225-11394 Sentence denotes 2019-nCoV 9-mer peptides predicted to bind 4 different HLA alleles by NetMHCpan 4.0, and those predicted to bind ≥ 3 HLA alleles and immunogenicity score ≥ 0.9 by iPred.
T125 11395-11496 Sentence denotes For different alleles, 0 denotes non-binding and 1 denotes binding predicted for specific HLA allele.
T126 11497-11591 Sentence denotes Antigen.epitope Imm.prob A0101.NB A0201.NB B0702.NB B4001.NB C0702.NB Total binding HLA
T127 11592-11629 Sentence denotes VQMAPISAM 0.893938 0 1 1 1 1 4
T128 11630-11667 Sentence denotes AMYTPHTVL 0.862427 0 1 1 1 1 4
T129 11668-11705 Sentence denotes TLDSKTQSL 0.254998 1 1 1 0 1 4
T130 11706-11743 Sentence denotes KVDGVVQQL 0.191786 1 1 1 0 1 4
T131 11744-11780 Sentence denotes KVDGVDVEL 0.18632 1 1 1 0 1 4
T132 11781-11818 Sentence denotes MADQAMTQM 0.991227 1 0 1 0 1 3
T133 11819-11856 Sentence denotes LEAPFLYLY 0.983072 1 0 0 1 1 3
T134 11857-11894 Sentence denotes RTAPHGHVM 0.972153 1 0 1 0 1 3
T135 11895-11932 Sentence denotes IPFAMQMAY 0.961569 1 0 1 0 1 3
T136 11933-11970 Sentence denotes FLTENLLLY 0.951715 1 1 0 0 1 3
T137 11971-12008 Sentence denotes YLQPRTFLL 0.947743 1 1 0 0 1 3
T138 12009-12046 Sentence denotes MMISAGFSL 0.941318 0 1 1 0 1 3
T139 12047-12084 Sentence denotes ATLPKGIMM 0.926603 1 0 1 0 1 3
T140 12085-12094 Sentence denotes Figure 3.
T141 12096-12190 Sentence denotes De novo search of 9-mer 2019-nCoV peptides with MHC presentation and immunogenicity potential.
T142 12191-12359 Sentence denotes The MHC binding was predicted for HLA-A*02:01, HLA-A*01:01, HLA-B*07:02, HLA-B*40:01 and HLA-C*07:02 alleles by NetMHCpan 4.0 and immunogenicity was predicted by iPred.
T143 12361-12422 Sentence denotes Immunogenicity of 2019-nCoV peptides to 1G4 CD8+ TCR molecule
T144 12423-12584 Sentence denotes While our de novo candidates are appealing shortlisted targets for experimental validation, they do not provide information about target T cell receptors (TCRs).
T145 12585-12683 Sentence denotes We therefore set out to interrogate the possibility of cross reactivity with one well-studied TCR.
T146 12684-12805 Sentence denotes T cell cross-reactivity has been instrumental for the T cell immunity against both tumor antigens and external pathogens.
T147 12806-12979 Sentence denotes In that regard, a number of T cells have been extensively characterized including 1G4 CD8+ TCR, which is known to recognize the ‘SLLMWITQC’ peptide presented by HLA-A*02:01.
T148 12980-13141 Sentence denotes We therefore set out to leverage the data from a recently published study 7 and exploit the possibility of cross reactivity of this TCR to any 2019-nCoV peptide.
T149 13142-13428 Sentence denotes Here, we scanned all 9-mers from the 2019-nCoV proteome (9613 peptides) with Binding, Activating and Killing Position Weight Matrices (PWM, see the method section) and associated each peptide with the geometric mean of these three assays as a measure of immunogenicity (Data Table 4 4).
T150 13429-13574 Sentence denotes The distributions of binding, activation and killing scores along with their multiplicative score and geometric mean are illustrated in Figure 4.
T151 13575-13676 Sentence denotes Based on geometric mean, we observed 20 2019-nCoV peptides with a score > 0.8 and 516 peptides > 0.7.
T152 13677-13805 Sentence denotes The 9-mer peptides with geometric mean > 0.7 and positive HLA-A*02:01 binding prediction by NetMHCpan 4.0 are listed in Table 4.
T153 13806-13815 Sentence denotes Figure 4.
T154 13817-13896 Sentence denotes Distribution of 1G4 TCR positional weight matrix scores for 2019-nCoV peptides.
T155 13897-14048 Sentence denotes The positional weight matrices were obtained from ref. 7, and the 9613 9-mers generated from 10 2019-nCoV ORFs were scored for their TCR recognition potential.
T156 14049-14057 Sentence denotes Table 4.
T157 14059-14230 Sentence denotes 2019-nCoV 9-mer peptides with geometric mean ≥ 0.7 by 1G4 TCR positional weight matrix and predicted positive to bind HLA-A*02:01 by NetMHCpan 4.0 (Rank = NetMHCpan rank).
T158 14231-14309 Sentence denotes Peptide Binding score Activation score Killing score geoMean Rank Binder
T159 14310-14374 Sentence denotes RIMTWLDMV 0.866377428 0.853995 0.776303 0.831249 0.3481 SB
T160 14375-14438 Sentence denotes ALNTLVKQL 0.802453741 0.75073 0.785957 0.779413 0.6159 WB
T161 14439-14501 Sentence denotes LLLDRLNQL 0.809895414 0.7752 0.741096 0.774888 0.0423 SB
T162 14502-14566 Sentence denotes MIAQYTSAL 0.766262499 0.789511 0.749477 0.768242 0.9238 WB
T163 14567-14630 Sentence denotes VLSTFISAA 0.799672451 0.756117 0.687278 0.746239 0.536 WB
T164 14631-14695 Sentence denotes NVLAWLYAA 0.761297552 0.686117 0.739944 0.728423 1.4457 WB
T165 14696-14760 Sentence denotes RLANECAQV 0.783161706 0.719705 0.680504 0.726572 0.2049 SB
T166 14761-14825 Sentence denotes KLLKSIAAT 0.748896679 0.708996 0.697463 0.718118 1.0923 WB
T167 14826-14889 Sentence denotes QLSLPVLQV 0.70128376 0.715259 0.708405 0.708293 0.4768 SB
T168 14890-14954 Sentence denotes VQMAPISAM 0.729320768 0.698514 0.689612 0.705612 1.4677 WB
T169 14955-15017 Sentence denotes LLLTILTSL 0.7131709 0.715194 0.680064 0.702623 0.2712 SB
T170 15018-15081 Sentence denotes SVLLFLAFV 0.736972762 0.690855 0.679534 0.70202 1.1449 WB
T171 15082-15353 Sentence denotes LMWLIINLV 0.727847374 0.681119 0.694007 0.700717 1.304 WB We further analysed the MHC binding propensities and gathered peptides not only predicted positive by NetMHCpan but also to have leucine (L) and valine (V) in anchor positions 2 (P2) and 9 (P9) respectively.
T172 15354-15534 Sentence denotes Previous studies have shown that, for MHC-I HLA-A*02:01-specific peptides, 9-mer peptides with leucine at P2 and valine at P9 are preferentially presented by HLA-A*02:01 8.
T173 15535-15682 Sentence denotes Looking at the LV peptides, we identified 44 2019-nCoV peptides, of which 2 had an immunogenicity score > 0.7 and 12 had a score > 0.6 ( Table 5).
T174 15683-15816 Sentence denotes Thus, here we provide the list of peptides that are potential targets for 1G4 TCR recognition for subjects with the HLA-A*02:01 haplotype.
T175 15817-15825 Sentence denotes Table 5.
T176 15827-15894 Sentence denotes 2019-nCoV 9-mer peptides having leucine-valine in anchor positions.
T177 15895-16100 Sentence denotes Peptides have geometric mean ≥ 0.6 and ≤ 0.7 (for those ≥ 0.7, refer to Table 4) by 1G4 TCR positional weight matrix and predicted positive for HLA-A*02:01 binding by NetMHCpan 4.0 (Rank = NetMHCpan rank).
T178 16101-16179 Sentence denotes Peptide Binding score Activation score Killing score geoMean Rank Binder
T179 16180-16241 Sentence denotes TLMNVLTLV 0.723687 0.658986 0.652178 0.677534 0.0444 SB
T180 16242-16303 Sentence denotes QLEMELTPV 0.711291 0.651003 0.608605 0.655625 1.6769 WB
T181 16304-16364 Sentence denotes MLAKALRKV 0.668756 0.610664 0.65968 0.645854 0.3524 SB
T182 16365-16426 Sentence denotes GLFKDCSKV 0.675952 0.632375 0.594753 0.633494 0.2677 SB
T183 16427-16488 Sentence denotes ALSKGVHFV 0.652549 0.604952 0.586236 0.613954 0.0425 SB
T184 16489-16550 Sentence denotes YLNTLTLAV 0.624147 0.610826 0.575445 0.603119 0.0453 SB
T185 16552-16562 Sentence denotes Discussion
T186 16563-16724 Sentence denotes In this study we provide a profile of computationally predicted immunogenic peptides from 2019-nCoV for functional validation and potential vaccine developments.
T187 16725-16859 Sentence denotes We are fully aware that an effective vaccine development will require a very thorough investigation of immune correlates to 2019-nCoV.
T188 16860-17028 Sentence denotes However, due to the emergency and severity of the outbreak as well as the lack of access to samples from infected subjects, such approaches would not serve the urgency.
T189 17029-17235 Sentence denotes Therefore, computational prediction is instrumental for guiding biologists towards a quick and cost-effective solution to prevent the spread and ultimately help eliminate the infection from the individuals.
T190 17236-17376 Sentence denotes With a rising global concern of novel coronavirus outbreak, numerous research groups have started to investigate and publish their findings.
T191 17377-17533 Sentence denotes At the time of preparing this manuscript, we became aware of a similar study comparing the 2019-nCoV proteome with SARS CoV immunogenic peptides 9.
T192 17534-17770 Sentence denotes Our in silico approach takes the search beyond presenting only the immunogenic peptides common to SARS and 2019-nCoV and provides the experimental community with a more comprehensive list including de novo and cross-reactive candidates.
T193 17771-17972 Sentence denotes On the other hand, considering the fact that two studies have been accomplished independently with distinct approaches, this serves to demonstrate a high level of confidence in reproducing the results.
T194 17973-18123 Sentence denotes Reproducibility of computational prediction is always of high importance and becomes even more significant under urgent scenarios such as this outbreak.
T195 18124-18279 Sentence denotes Our study also suggests the need for further efforts to develop accurate predictive models and algorithms for the characterization of immunogenic peptides.
T196 18280-18627 Sentence denotes In this study, we provide potential immunogenic peptides from 2019-nCoV for vaccine targets that i) have been characterized as immunogenic by previous studies on SARS CoV, ii) have a high degree of similarity with immunogenic SARS CoV peptides and iii) are predicted to be immunogenic by a combination of NetMHCpan and iPred/1G4 TCR positional weight matrices.
T197 18628-18751 Sentence denotes Given the limited time and resources, our work serves as a guide to save time and cost for further experimental validation.
T198 18753-18759 Sentence denotes Method
T199 18761-18777 Sentence denotes Data acquisition
T200 18778-18857 Sentence denotes 2019-nCoV open reading frame sequences were downloaded from NCBI ( MN908947.3).
T201 18858-18930 Sentence denotes All sequences subjected to analysis are deposited in the GitHub repository.
T202 18932-18945 Sentence denotes Data analysis
T203 18946-19001 Sentence denotes All subsequent analyses have been conducted in R 3.6.1.
T204 19003-19033 Sentence denotes Sequence similarity comparison
T205 19034-19255 Sentence denotes The sequence similarity between 2019-nCoV open reading frames and previously characterized immunogenic peptides in IEDB was analysed by local alignment using the R ‘pairwiseAlignment’ function from the Biostrings v2.40.2 package.
T206 19256-19340 Sentence denotes The local alignment utilized the BLOSUM62 matrix, a gapOpening of 5 and a gapExtension of 5.
T207 19341-19405 Sentence denotes The alignment score was normalized by the length of the target peptide.
T208 19406-19565 Sentence denotes In extracting peptides with the exact sequence alignment with epitope sequences in IEDB, only peptides with more than 3 amino acids in length were shortlisted.
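The idea of a length-normalized local alignment score can be illustrated with a small sketch. This is not the authors' pipeline, which used `pairwiseAlignment` from Biostrings with BLOSUM62 in R. The Python code below is a simplified stand-in that makes several assumptions: a flat match/mismatch score replaces BLOSUM62, a single linear gap penalty replaces the gapOpening/gapExtension pair (and may not match how Biostrings combines them), and the function names and toy ORF string are hypothetical.

```python
def local_alignment_score(query, target, match=4, mismatch=-1, gap=5):
    """Best Smith-Waterman local alignment score between two sequences,
    using a flat match/mismatch score and a linear gap penalty."""
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] - gap        # gap in the target
            left = curr[j - 1] - gap  # gap in the query
            cell = max(0, diag, up, left)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def normalized_score(orf, peptide, **scoring):
    """Local alignment score divided by the length of the target peptide."""
    return local_alignment_score(orf, peptide, **scoring) / len(peptide)


# Toy ORF (assumption): a Table 1 peptide with made-up flanking residues.
toy_orf = "MKT" + "LLLDRLNQL" + "GAS"
print(normalized_score(toy_orf, "LLLDRLNQL"))  # exact match
print(normalized_score(toy_orf, "LLLDRLNQV"))  # one substitution
```

With this toy scoring, an exact match reaches the maximum per-residue score and any substitution lowers the normalized value, which is the property used to rank near-identical peptides. The absolute numbers are not comparable to the BLOSUM62-based scores reported in the Results.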
T209 19567-19592 Sentence denotes Immunogenicity prediction
T210 19593-19662 Sentence denotes We have used iPred 5 to predict immunogenicity of each given peptide.
T211 19663-19880 Sentence denotes Briefly, iPred employs peptides’ length and physicochemical properties of amino acids modelled by sums of ten Kidera factors and associates a score to each peptide reflecting its likelihood of recognition by a T cell.
T212 19882-19913 Sentence denotes Predicting presentation by MHCs
T213 19914-19980 Sentence denotes In order to predict peptide binding to MHC we used NetMHCpan V4 6.
T214 19981-20192 Sentence denotes This version of NetMHCpan, which comes with a number of improvements, incorporates both eluted ligand and peptide binding affinity data into a neural network model to predict MHC presentation of each given peptide.
T215 20194-20232 Sentence denotes Predicting cross reactivity to 1G4 TCR
T216 20233-20369 Sentence denotes To gauge the level of 1G4 TCR cross-reactivity to 2019-nCoV peptides, we leveraged the data from a recently published study 7.
T217 20370-20518 Sentence denotes 1G4, an NY-ESO-1-specific TCR, is a very well-studied and clinically efficacious TCR which recognizes the peptide ‘SLLMWITQC’ presented by HLA-A*02:01.
T218 20519-20725 Sentence denotes Karapetyan et al. have recently provided data from three experimental assays reflecting Binding, Activating and Killing upon each mutation at each position of all possible 9-mers using these three datasets.
T219 20726-20881 Sentence denotes In a similar way to the original paper, we trained three Position Weight Matrices, named B, A and K, from the Binding, Activating and Killing assays respectively.
T220 20882-20982 Sentence denotes We defined the cross-reactivity score of a given 9-mer sequence as the geometric mean of B, A and K.
T221 20983-21126 Sentence denotes We then scanned 2019-nCoV virus protein sequence with each of B, A and K PWMs and associated each of 9613 9-mers with a cross reactivity score.
T222 21127-21215 Sentence denotes At the same time, we utilized NetMHCpan and associated each 9-mer with its presentation score.
T223 21216-21414 Sentence denotes Our final list of cross-reactive candidate peptides comprised those with a cross-reactivity score ≥ 0.8 that were reported as strong binders by NetMHCpan and have ‘L’ and ‘V’ amino acids at the anchor positions.
T224 21415-21502 Sentence denotes The custom R code is accessible from the GitHub repository (see software availability 4).
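The cross-reactivity score defined above (the geometric mean of the B, A and K scores) and the leucine/valine anchor check can be sketched as follows. This is a minimal Python illustration rather than the authors' R implementation: the function names are hypothetical, and the step that turns a peptide into B, A and K scores via the position weight matrices is not reproduced. The input values are copied from the Binding, Activation and Killing columns of Tables 4 and 5.

```python
def cross_reactivity_score(binding, activation, killing):
    """Geometric mean of the three PWM-derived scores (B, A and K)."""
    return (binding * activation * killing) ** (1.0 / 3.0)


def has_lv_anchors(peptide):
    """True if a 9-mer has leucine at position 2 and valine at position 9."""
    return len(peptide) == 9 and peptide[1] == "L" and peptide[8] == "V"


# (B, A, K) scores as listed in Tables 4 and 5 of this document.
rows = {
    "RIMTWLDMV": (0.866377428, 0.853995, 0.776303),
    "LLLDRLNQL": (0.809895414, 0.7752, 0.741096),
    "TLMNVLTLV": (0.723687, 0.658986, 0.652178),
}

for peptide, (b, a, k) in rows.items():
    score = cross_reactivity_score(b, a, k)
    print(peptide, round(score, 6), has_lv_anchors(peptide))
```

The computed values should agree, to within rounding, with the geoMean column of the corresponding table rows. Of the three peptides shown, only TLMNVLTLV carries both anchor residues.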
T225 21504-21525 Sentence denotes Software availability
T226 21526-21603 Sentence denotes Replication code: https://github.com/ChloeHJ/Vaccine-target-for-2019-nCoV.git
T227 21604-21688 Sentence denotes Archived source code at time of publication: http://doi.org/10.5281/zenodo.3676908 4
T228 21689-21697 Sentence denotes License:
T229 21698-21744 Sentence denotes Creative Commons Attribution 4.0 International
T230 21746-21763 Sentence denotes Data availability
T231 21765-21776 Sentence denotes Source data
T232 21777-21856 Sentence denotes 2019-nCoV open reading frame sequences were downloaded from NCBI ( MN908947.3).
T233 21858-21873 Sentence denotes Underlying data
T234 21874-21881 Sentence denotes Zenodo:
T235 21882-21995 Sentence denotes In silico identification of vaccine targets for 2019-nCoV (Data tables). http://doi.org/10.5281/zenodo.3676886 10
T236 21996-22209 Sentence denotes This project contains the following underlying data: – Table1 nCoV peptides having exact match with immunogenic SARS CoV peptides.xlsx (Table of nCoV peptides having exact match with immunogenic SARS CoV peptides)
T237 22210-22378 Sentence denotes – Table2 nCoV peptides with high sequence similarity with immunogenic IEDB peptides.csv (Table of peptides with high sequence similarity with immunogenic IEDB peptides)
T238 22379-22567 Sentence denotes – Table3 de novo search on 9-mer nCoV for immunogenic peptides by NetMHCpan and iPred.csv (Table of results of de novo search on 9-mer nCoV for immunogenic peptides by NetMHCpan and iPred)
T239 22568-22753 Sentence denotes – Table4 de novo search on 9-mer nCoV for immunogenic peptides by NetMHCpan and PWM.xlsx (Table of results of de novo search on 9-mer nCoV for immunogenic peptides by NetMHCpan and PWM)
T240 22754-22863 Sentence denotes Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
T241 22865-22881 Sentence denotes Acknowledgements
T242 22882-23048 Sentence denotes We appreciate assistance and computing support from Human Immunology Unit and WIMM Centre for Computational Biology at MRC Weatherall Institute of Molecular Medicine.
T243 23049-23130 Sentence denotes We thank G. Napolitani and M. Salio for insightful discussions about the project.