PMC:7189872
Epitope based vaccine prediction for SARS-COV-2 by deploying immuno-informatics approach
Abstract
A new virus termed SARS-COV-2 (the cause of COVID-19) can have a progressive, fatal impact on individuals. The World Health Organization (WHO) has declared the spread of the virus a global pandemic. Currently, there are over 1 million cases and over 100,000 confirmed deaths due to the virus. Hence, prophylactic and therapeutic strategies are urgently needed. In this study we report an epitope, ITLCFTLKR, that is biochemically fit to HLA allelic proteins. We propose that it could be used as a potential vaccine candidate against SARS-COV-2. The selected putative epitope and HLA-allelic complexes show not only favorable binding scores, but also RMSD values in the range of 0–1 Å. The epitope was found to have 99.8% structural favorability as per Ramachandran-plot analysis. Similarly, a suitable range of IC50 values and population coverage was obtained, which further supports the T-cell epitope analysis. Stability analysis using MDWeb and half-life analysis using the ProtParam tool confirmed that this epitope is well selected. This methodology of epitope-based vaccine prediction is fundamental, fast to apply, and can be economically beneficial and viable.
1 Introduction
The outbreak of COVID-19 in the city of Wuhan, in the Hubei province of China [37], has resulted in a difficult situation for the global populace and for the World Health Organization (WHO). On 30 Jan 2020, WHO declared an emergency regarding COVID-19 spread, prevention, and control [28,29]. As of 20 March 2020, there were over 240,000 cases and over 10,000 confirmed deaths, affecting 181 countries [9]. More recent updates indicate over 1 million confirmed cases and over 100,000 deaths worldwide [39] (information obtained from https://www.worldometers.info/coronavirus/). The coronavirus is a positive-sense RNA virus belonging to the family Coronaviridae in the order Nidovirales, and is found among members of the class Mammalia, including the order Primates and specifically humans [31]. Severe acute respiratory syndrome coronavirus (SARS-CoV) [11,21,22] and Middle East respiratory syndrome coronavirus (MERS-CoV) [7,40], both classified as β-coronaviruses, are related to the novel SARS-COV-2. Until now, there has been no efficacious therapy to control its spread [24]. Given its spread across continents and the vulnerability of the population, there is an urgent need to craft vaccines that reinforce immune defense against the SARS-COV-2 virus. One strategy to combat it is the development of vaccines that can initiate an adaptive immune response in humans. Here an attempt has been made to design an epitope-based vaccine directed at SARS-COV-2 by analyzing the proteome of the virus with immuno-informatics tools. We used various bioinformatics servers and immuno-informatics tools to identify T-cell epitopes from an intensive study of the available protein sequences and structures related to SARS-COV-2.
These epitope stretches can interact with MHC Class I and Class II HLA alleles; the epitopes were further validated by Ramachandran plot analysis, evaluation of antigenicity parameters, toxicity analysis, population coverage, molecular dynamics, and ProtParam analysis [19]. This approach is well suited to modern vaccine design, as it provides a lead over the classical trial-and-error methods of wet labs [23]. We tried to identify T-cell epitopes that can elicit a robust immune response in the global human population and act as potential vaccine candidates. However, the ability of these epitopes to act as vaccine candidates needs to be confirmed in molecular biology lab studies. Our investigation can open new dimensions in crafting peptide-based vaccine regimens for the novel SARS-COV-2. The greatest decline in virus expansion was noted following ORF3a removal. ORF7a encodes a 122-amino-acid type I transmembrane protein, and structural studies disclose a packed seven-stranded β sandwich comparable in fold and topology to members of the immunoglobulin superfamily. SARS-CoV is an enveloped, positive-stranded RNA virus with a genome of around 29,700 bases. The genome incorporates at least 14 open reading frames (ORFs) that encode 28 proteins in three distinct classes: two large polyproteins, P1a and P1ab, that are cleaved into 16 non-structural proteins (nsp1–nsp16) during viral RNA synthesis; four structural proteins (S, E, M and N) that are necessary for viral entry and assembly; and eight accessory proteins that are assumed to be dispensable for viral replication, but may facilitate viral assembly and take part in virulence and pathogenesis (Fig. 1).
Fig. 1 Genome organization and viral proteins of SARS-CoV [44].
In our study, two of the five proteins, namely ORF-3a and ORF-7a, which are specific to SARS-COV-2, were found to be putative T-cell epitope determinants that provide useful distinguishing information; these proteins are also important for viral replication and growth [41]. Both proteins may influence viral pathogenesis and disease spread, although the literature lacks unanimity [43]. B-cell epitope prediction is still considered less trustworthy than T-cell epitope prediction for both linear and conformational epitopes. Furthermore, the B-cell epitopes do not elicit a strong antibody response. For this reason, only T-cell epitopes are considered in the present study. T-cell epitopes are capable of producing CD4+ and CD8+ T-cells with a long-lasting response [45]. In one recent study, epitopes were designed, but the authors focused on a single protein (the spike protein) to generate multiple epitopes: 13 for MHC I and 3 for MHC II [46]. We have analyzed multiple proteins and screened for effective epitopes using various in-silico filters, in order to provide the most appropriate and authentic epitopes, which can be further tested in a wet lab.
2 Methodology
2.1 Protein retrieval and allergenicity analysis
Five SARS-COV-2 protein sequences, listed in Table 1, were selected from NCBI-GenBank based on their allergenicity, which was assessed with the Tanimoto similarity index score produced by AllergenFP 1.0 [8]. The selected proteins were the envelope protein, ORF3a protein, nucleocapsid phosphoprotein, ORF7a protein, and membrane glycoprotein, which are crucial for the structural integrity and functionality of the virus [38].
Table 1 List of proteins selected for SARS-COV-2 with allergenicity.
Protein GenBank accession no. | Protein name | AllergenFP score (Tanimoto similarity index) | Allergen/Non-allergen
QHD43418.1 | Envelope protein | 0.87 | Non-allergen
QHD43417.1 | ORF3a protein | 0.84 | Non-allergen
QHD43423.2 | Nucleocapsid phosphoprotein | 0.85 | Non-allergen
QHD43421.1 | ORF7a protein | 0.80 | Non-allergen
QHD43419.1 | Membrane glycoprotein | 0.83 | Non-allergen
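The Tanimoto similarity index named in Table 1 can be illustrated with a minimal sketch. It assumes each protein has already been reduced to a binary fingerprint, represented here as the set of its "on" bit positions; the fingerprints below are hypothetical and are not AllergenFP output, and the function name is chosen for illustration only.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two binary fingerprints.

    Each fingerprint is given as a set of 'on' bit positions.
    The coefficient is |A & B| / |A | B| and ranges from 0 to 1.
    """
    union = bits_a | bits_b
    if not union:
        # Two empty fingerprints share nothing; define similarity as 0.
        return 0.0
    return len(bits_a & bits_b) / len(union)


# Hypothetical fingerprints, for illustration only.
query_fp = {1, 4, 7, 9, 12}
reference_fp = {1, 4, 8, 9, 12, 15}

print(round(tanimoto(query_fp, reference_fp), 3))
```

A score near 1 means the query fingerprint closely resembles the reference it was compared with; how that score maps to an allergen or non-allergen call depends on the reference set used by the server.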
2.2 T-cell epitope prediction for MHC HLA alleles
IEDB (Immune Epitope Database) [20], along with the NetMHCIIpan 3.2 and NetMHC 4.0 servers [18], was used to find putative peptide sequences predicted to interact with MHC Class II and Class I HLA alleles, respectively; these servers were chosen for their efficient algorithms based on artificial neural networks. The VaxiJen score was determined with the VaxiJen online tool [10] to screen for the most antigenic epitopes, using a threshold of ≥ 1.0 with virus selected as the target organism.
2.3 Structural prediction: Putative epitopes and MHC HLA alleles
The 3D structures of the epitopes were predicted using the PEP-FOLD-3.5 server [25,33,34], and the tertiary (3D) structures of the MHC HLA allelic proteins were obtained from the RCSB-PDB database [4].
2.4 Molecular docking analysis
The selected epitopes and HLA complexes were docked for calculating refined interactions and binding energies, along with atomic contact energy (ACE), by using two docking web-servers: DINC 2.0 [2] and PatchDock [32].
2.5 Molecular dynamics-simulation analysis of docked complex
A molecular dynamics study was conducted to analyze RMSD values and atomic fluctuations for all amino acids over a 100 ps time frame using the MDWeb server. MDWeb was used to run coarse-grained Brownian dynamics (C-alpha) with the following specifications: time = 100 ps, output frequency (steps) = 10, force constant = 40 kcal/mol·Å², and distance between alpha carbon atoms = 3.8 Å for both interacting epitopes. The setup was based on GROMACS MD with solvation using the Amber-99sb* force field [17].
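To make the RMSD quantity concrete, the following minimal sketch computes the root-mean-square deviation between a reference set of C-alpha coordinates and one trajectory frame. It is an illustration under stated assumptions, not the MDWeb implementation: the coordinates are hypothetical, no structural superposition is performed before the comparison, and the function name is invented for this example.

```python
import math


def rmsd(reference, frame):
    """Root-mean-square deviation (in the units of the input, e.g. Å)
    between two equally sized lists of (x, y, z) coordinates.

    No fitting or superposition is applied; the structures are assumed
    to be in the same frame of reference already.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(reference, frame):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(reference))


# Hypothetical C-alpha trace with 3.8 Å spacing along x.
reference_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
# Hypothetical frame in which every atom moved 0.5 Å along y.
frame_ca = [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 0.5, 0.0)]

print(rmsd(reference_ca, frame_ca))
```

Applied per frame, a function of this kind yields the RMSD-versus-time trace that is then compared against a chosen acceptance range.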
2.6 Toxicity, Ramachandran-plot, and population coverage analysis
The ToxinPred server [16] was used to score the toxicity of the epitopes so that non-toxic ones could be selected; Ramachandran plot analysis was performed with the MolProbity 4.2 server [6] to quantify the residues present in the favorable region.
The population coverage tool of the Immune Epitope Database (IEDB) resource was used to predict the population coverage of the MHC II and MHC I alleles that interact with the screened epitopes, based on its restriction database [5]. The MHCPred web server was used for quantitative prediction of the selected epitopes interacting with HLA alleles of MHC II and MHC I [15]. Thereafter, the ProtParam tool [42] of the ExPASy server was used to screen the final stable epitopes based on the instability index and half-life.
3 Results
3.1 T-cell epitopes prediction and VaxiJen scoring
The non-allergen proteins selected on the basis of allergenicity scores from AllergenFP 1.0 (Table 1) were submitted to the NetMHCIIpan 3.2 and NetMHC 4.0 servers to determine the 1-log50k value and affinity values, in order to select the best possible pairs of epitopes and corresponding HLA alleles. Table 2 and Table 3 present the results for epitopes paired with MHC Class I and MHC Class II HLA alleles, respectively, along with their VaxiJen scores, from which the putative epitopes were obtained. Fig. 2 summarizes the peptides selected for docking based on their interaction with MHC Class I and Class II HLA alleles and their antigenicity.
Table 2 Probable antigenic epitopes and MHC I allele interaction based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.)
NCBI-GenBank ID | MHC I allele | POS | Core peptide | 1-log50k | Affinity (nM) | VaxiJen score | Antigenicity
QHD43417.1 | HLA-A*68:01 | 7 | FTIGTVTLK | 0.853 | 4.9 | 2.0317 | ANTIGEN
QHD43417.1 | HLA-A*31:01 | 125 | RLWLCWKCR | 0.74 | 16.59 | 1.1604 | ANTIGEN
QHD43418.1 | – | – | – | – | – | – | NO INTERACTION
QHD43419.1 | HLA-A*11:01 | 5 | GTITVEELK | 0.719 | 20.98 | 1.0976 | ANTIGEN
QHD43421.1 | HLA-A*11:01 | 109 | ITLCFTLKR | 0.71 | 22.97 | 2.0208 | ANTIGEN
QHD43421.1 | HLA-A*68:01 | 109 | ITLCFTLKR | 0.695 | 27.24 | 2.0208 | ANTIGEN
QHD43421.1 | HLA-A*23:01 | 107 | VFITLCFTL | 0.621 | 60.42 | 1.2490 | ANTIGEN
QHD43423.2 | – | – | – | – | – | – | NO INTERACTION
Table 3 Probable antigenic epitopes and MHC II allele interaction based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.)
NCBI-GenBank ID | MHC II allele | POS | Core peptide | 1-log50k | Affinity (nM) | VaxiJen score | Antigenicity
QHD43417.1 | – | – | – | – | – | – | NO INTERACTION
QHD43418.1 | – | – | – | – | – | – | NO INTERACTION
QHD43419.1 | HLA-DRB1*04:01 | 55 | WLLWPVTLA | 0.282 | 2375.78 | 1.0631 | ANTIGEN
QHD43421.1 | HLA-DRB1*01:01 | 74 | VYQLRARSV | 0.484 | 267.05 | 1.3108 | ANTIGEN
QHD43421.1 | HLA-DRB1*07:01 | 74 | VYQLRARSV | 0.342 | 1235.08 | 1.3108 | ANTIGEN
QHD43423.2 | – | – | – | – | – | – | NO INTERACTION
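The 1-log50k and Affinity (nM) columns in Tables 2 and 3 carry the same information on two scales. A minimal sketch of the conversion follows, assuming the usual definition score = 1 − log(affinity)/log(50000); the function names are chosen for illustration and are not part of any server's interface.

```python
import math

# Upper bound of the affinity scale, in nM, used by the 1-log50k transform.
MAX_AFFINITY_NM = 50000.0


def score_to_affinity_nm(score):
    """Convert a 1-log50k score (0..1) to a predicted affinity in nM."""
    return MAX_AFFINITY_NM ** (1.0 - score)


def affinity_nm_to_score(affinity_nm):
    """Convert a predicted affinity in nM to a 1-log50k score."""
    return 1.0 - math.log(affinity_nm) / math.log(MAX_AFFINITY_NM)


# FTIGTVTLK with HLA-A*68:01 (Table 2): score 0.853, reported 4.9 nM.
print(round(score_to_affinity_nm(0.853), 2))
```

Because the tabulated scores are rounded to three decimals, affinities recomputed from them agree with the tabulated affinities only to within about one percent. A higher score corresponds to a lower IC50-like affinity value, that is, to stronger predicted binding.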
Fig. 2 Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their Antigenicity.
3.2 Structural findings of epitope and MHC HLA-Alleles
Epitope structures were obtained using the PEP-FOLD-3.5 web server; the HLA allele structures were retrieved from the RCSB-PDB database. Table 4 provides the crystal structure/model details with the reference PDB ID.
Table 4 Listing of MHC HLA alleles and their respective crystal structures/models with the PDB ID.
Allele name | Template structure (PDB ID) | Crystal structure/Model
HLA-A*11:01 | 2HN7 | CRYSTAL STRUCTURE
HLA-A*23:01 | 3I6L | CRYSTAL STRUCTURE
HLA-A*31:01 | 3RL1 | CRYSTAL STRUCTURE
HLA-A*68:01 | 6PBH | CRYSTAL STRUCTURE
HLA-DRB1*01:01 | 4AH2 | CRYSTAL STRUCTURE
HLA-DRB1*04:01 | 5LAX | CRYSTAL STRUCTURE
HLA-DRB1*07:01 | 6BIJ | CRYSTAL STRUCTURE
3.3 Molecular docking analysis
The FTIGTVTLK and ITLCFTLKR epitopes interacted with MHC Class I HLA alleles, and the VYQLRARSV epitope interacted with MHC Class II HLA alleles, with favorable binding scores and ACE values as shown in Table 5. The ITLCFTLKR epitope of the ORF-7a protein binds 2 HLA alleles (HLA-A*11:01, HLA-A*68:01) of MHC Class I, while the FTIGTVTLK epitope of the ORF-3a protein interacts with 1 HLA allele (HLA-A*68:01) of MHC Class I. The VYQLRARSV epitope of the ORF-7a protein interacts with 2 HLA alleles (HLA-DRB1*01:01, HLA-DRB1*07:01) of MHC Class II.
Table 5 Binding energies and ACE values for docked complexes of putative epitopes, based on DINC server and PatchDock analysis.
Epitope | HLA allele | Binding score (kcal/mol) | PatchDock score | ACE | Selection
FTIGTVTLK | HLA-A*68:01 | −8.80 | 8066 | 178.91 | Selected
RLWLCWKCR | HLA-A*31:01 | −4.80 | 8916 | 117.53 | Rejected
GTITVEELK | HLA-A*11:01 | −3.80 | 8040 | 78.78 | Rejected
ITLCFTLKR | HLA-A*11:01 | −3.70 | 8206 | −25.88 | Selected
ITLCFTLKR | HLA-A*68:01 | −7.60 | 8136 | 184.55 | Selected
VFITLCFTL | HLA-A*23:01 | −4.40 | 7706 | −134.91 | Rejected
WLLWPVTLA | HLA-DRB1*04:01 | −8.60 | 9432 | −150.96 | Rejected
VYQLRARSV | HLA-DRB1*01:01 | −6.20 | 6874 | −168.74 | Selected
VYQLRARSV | HLA-DRB1*07:01 | −6.20 | 6842 | 262.74 | Selected
Fig. 3, Fig. 4, and Fig. 5 depict the interactions between the three selected T-cell epitopes and their respective MHC Class I and II HLA alleles via hydrogen bond formation and van der Waals interactions. After positive docking results, these epitopes were subjected to molecular dynamics simulation and assessment of biochemical parameters. Fig. 6 shows a graphical plot of binding scores for the epitopes interacting with HLA alleles.
Fig. 3 FTIGTVTLK epitope interaction with the antigen-binding pocket of HLA-A*68:01, an MHC I HLA allele. Here, threonine at the 2nd, 5th, and 7th positions in the epitope preferentially forms hydrogen bonds due to the presence of partially charged positive and negative atoms, and also 4th position Glutamic acid and lysine at 8th position side chains can form a salt bridge, while the other amino acids contribute van der Waals interactions.
Fig. 4 ITLCFTLKR epitope interaction with the antigen-binding pocket of HLA-A*68:01, an MHC I HLA allele. Here, threonine at the 2nd and 6th positions, as well as cysteine at the 4th position in the epitope, preferentially form hydrogen bonds due to the presence of partially charged positive and negative atoms, while the other amino acids contribute van der Waals interactions.
Fig. 5 VYQLRARSV epitope interaction with the antigen-binding pocket of HLA-DRB1*07:01, an MHC II HLA allele. Here, tyrosine at the 2nd position, glutamine at the 3rd position, and serine at the 8th position in the epitope preferentially form hydrogen bonds due to the presence of partially charged positive and negative atoms, and the side chains of the arginine residues at the 5th and 7th positions can form a salt bridge, while the other amino acids contribute van der Waals interactions.
Fig. 6 Binding energy graphical plot for selected Epitope and HLA-Allelic pair.
3.4 Molecular dynamics and simulation analysis
RMSD values and atomic fluctuation per amino acid residue were obtained for the epitopes interacting with the HLA allele structures; this analysis supports the selection and validation of the best pairs. Only two epitopes, ITLCFTLKR and VYQLRARSV, were identified as probable T-cell epitopes and as putative vaccine candidates. Fig. 7 shows the RMSD plot and atomic fluctuation per residue for the ITLCFTLKR-HLA-A*68:01 complex, and the RMSD plot and atomic fluctuation per residue for the VYQLRARSV-HLA-DRB1*07:01 complex. Both results were positive, since the best-interacting protein-ligand docked complexes should have RMSD values in the preferred range of 0 to 1.0 Å [13].
Fig. 7 A. RMSD Plot for ITLCFTLKR- HLA-A*68:01 complex, for each amino acid residue by Molecular dynamics analysis, B. B-Factor (atomic fluctuation) values per amino acid residue for Epitope ITLCFTLKR- HLA-A*68:01 docked complex, C. RMSD Plot for VYQLRARSV- HLA-DRB1*07:01, for each amino acid residue by Molecular dynamics analysis, D. B-Factor (atomic fluctuation) values per amino acid residue for Epitope VYQLRARSV- HLA-DRB1*07:01 docked complex.
3.5 Toxicity analysis, Ramachandran Plot analysis, and population coverage results
ToxinPred 4.0 server results (Table 6) indicate that the finalized T-cell epitopes were non-toxic from a biochemical perspective.
Table 6 Results of ToxinPred on probable antigens.
Peptide/Probable antigen | SVM score | Hydrophilicity | Molecular weight | Toxicity
FTIGTVTLK | −1.36 | −1.23 | 979.32 | NON-TOXIN
GTITVEELK | −0.98 | 0.34 | 989.27 | NON-TOXIN
ITLCFTLKR | −1.32 | −0.41 | 1094.51 | NON-TOXIN
VFITLCFTL | −1.21 | −1.52 | 1056.46 | NON-TOXIN
WLLWPVTLA | −1.18 | −1.62 | 1098.49 | NON-TOXIN
VYQLRARSV | −1.07 | −0.12 | 1091.40 | NON-TOXIN
Ramachandran plot analysis (Fig. 8A and B) suggests that most of the residues lie in the allowed and favored regions; this gives more confidence in the structural conformation of the targeted T-cell epitopes.
Fig. 8 A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored region under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored region under Ramachandran plot analysis.
MHCPred results (Table 7) give a quantitative estimate of IC50 values for both MHC I and MHC II alleles with their respective epitopes; these values point to elicitation of an immune response when the data are used in a population coverage analysis.
Table 7 MHCPred results showing IC50 values for HLA alleles and the confidence of the prediction.
HLA allele | Amino acid groups | Predicted −logIC50 (M) | Predicted IC50 value (nM) | Confidence of prediction (max = 1)
HLA-A*68:01 | FTIGTVTLK | 7.116 | 76.56 | 1.00
HLA-A*11:01 | ITLCFTLKR | 7.028 | 93.76 | 1.00
HLA-A*68:01 | ITLCFTLKR | 6.282 | 522.40 | 0.78
HLA-DRB1*01:01 | VYQLRARSV | 7.624 | 23.77 | 0.89
HLA-DRB1*07:01 | VYQLRARSV | 6.734 | 184.50 | 0.89
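The two IC50 columns of Table 7 are related by a simple change of units: the predicted −logIC50 is expressed in molar, and the IC50 value in nanomolar. The short sketch below reproduces that conversion; the function name is chosen for illustration only.

```python
def neg_log_ic50_to_nm(neg_log_ic50_molar):
    """Convert -log10(IC50 in M) to IC50 in nM.

    IC50 [M] = 10 ** (-p), and 1 M = 1e9 nM, so IC50 [nM] = 10 ** (9 - p).
    """
    return 10.0 ** (9.0 - neg_log_ic50_molar)


# Rows of Table 7: (allele, epitope, predicted -logIC50 in M).
rows = [
    ("HLA-A*68:01", "FTIGTVTLK", 7.116),
    ("HLA-A*11:01", "ITLCFTLKR", 7.028),
    ("HLA-A*68:01", "ITLCFTLKR", 6.282),
    ("HLA-DRB1*01:01", "VYQLRARSV", 7.624),
    ("HLA-DRB1*07:01", "VYQLRARSV", 6.734),
]

for allele, epitope, p in rows:
    print(allele, epitope, round(neg_log_ic50_to_nm(p), 2))
```

A larger −logIC50 therefore means a smaller IC50 in nM, which corresponds to stronger predicted binding.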
IEDB population coverage analysis suggests that the ITLCFTLKR and VYQLRARSV epitopes exhibit suitable population coverage, as depicted graphically in Fig. 9A and B. This leaves only two probable epitopes for the final selection in vaccine crafting.
Fig. 9 A. Graphical representation of population conservancy analysis of ITLCFTLKR Epitope. B. Graphical representation of population conservancy analysis of VYQLRARSV Epitope.
In Table 8, ProtParam analysis further reveals the stability of the considered epitopes, and ITLCFTLKR emerges as the single epitope finally selected. This epitope has an instability index of 35.68, a grand average of hydropathicity (GRAVY) of 0.844, and an estimated half-life of 20 h in mammalian reticulocytes.
Table 8 ProtParam analysis for selected epitopes.
Selected epitope | GRAVY score | Instability index (indication) | Estimated half-life (mammalian reticulocytes) | Theoretical pI | Aliphatic index
ITLCFTLKR | 0.844 | 35.68 (Stable) | 20 hours | 9.51 | 130.00
VYQLRARSV | −0.067 | 70.73 (Unstable) | 100 hours | 10.83 | 118.89
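Two of the columns in Table 8 can be recomputed directly from the peptide sequence. The sketch below is an illustration rather than the ProtParam code itself: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and the standard aliphatic index formula 100 × (x_Ala + 2.9·x_Val + 3.9·(x_Ile + x_Leu)), where x is the mole fraction of each residue. The stability helper applies the conventional cut-off of 40 on the instability index, which is taken as an input and not computed here. All function names are chosen for illustration.

```python
# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from the mole fractions of Ala, Val, Ile and Leu."""
    n = len(sequence)
    x_ala = sequence.count("A") / n
    x_val = sequence.count("V") / n
    x_ile_leu = (sequence.count("I") + sequence.count("L")) / n
    return 100.0 * (x_ala + 2.9 * x_val + 3.9 * x_ile_leu)


def is_predicted_stable(instability_index):
    """Conventional reading: an instability index below 40 is 'stable'."""
    return instability_index < 40.0


for peptide in ("ITLCFTLKR", "VYQLRARSV"):
    print(peptide, round(gravy(peptide), 3), round(aliphatic_index(peptide), 2))
```

Run on the two epitopes, these functions reproduce the GRAVY and aliphatic index values listed in Table 8, and the cut-off of 40 matches the stable and unstable labels assigned to instability indices of 35.68 and 70.73.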
4 Discussion
In this study, SARS-COV-2 virus proteins were analyzed using in-silico methods; the results can be carried forward into vaccine trials, following earlier successes in similar SARS-COV studies and the later development of polyclonal antibodies [27]. We obtained two epitopes, ITLCFTLKR and VYQLRARSV, after successful docking and molecular dynamics simulation; these two epitopes were then subjected to population coverage and toxicity analysis. Similarly, in another study on MERS-COV, nucleocapsid peptides were used for T-cell epitope prediction and were found to be successful [36]. The IEDB and NCBI-GenBank databases were used to analyze sequence homology and to predict targets for SARS-COV-2 through viral protein identification, as in related studies [14]; ViPR (Virus Pathogen Database and Analysis Resource) also depends primarily on IEDB and GenBank [30]. We analyzed five different proteins of SARS-COV-2 in the present study (because of their availability in the NCBI-GenBank database and their importance in a structural role in SARS-COV-2 [14]) and finally identified T-cell epitopes that can be taken forward to the wet lab, saving time. In a very recent study, different epitopes were found for SARS-COV-2 using in-silico approaches focused only on the surface glycoprotein [3]; our study differs in that we analyzed a different group of SARS-COV-2 proteins to identify short T-cell epitopes specific to diverse MHC I as well as MHC II HLA alleles.
For SARS-CoV, HLA-B*4601, HLA-B*0703, and HLA-DRB1*1202 are reported to be activated [26], whereas the epitopes in our study interact with different MHC I and II allelic forms, namely HLA-A*11:01, HLA-A*68:01, HLA-DRB1*01:01 and HLA-DRB1*07:01. Based on prior literature, CD4+ and CD8+ memory T cells are anticipated to persist for four years, as in SARS-CoV recovered individuals, and to show T-cell proliferation, DTH response, and production of IFN-γ [12]. We surmise that our screen can be more effective and useful. Molecular docking initially revealed three epitopes, but molecular dynamics simulations showed the best interactions for two of them, ITLCFTLKR and VYQLRARSV, with acceptable stability as analyzed with MDWeb; these were identified using the best available tools and easy-to-apply methods. One recent study focused on developing monoclonal antibodies such as CR-3022 against the spike protein of SARS-COV-2, which interacts with angiotensin-converting enzyme 2 (ACE2) of the human respiratory epithelium and requires complex neutralizing mechanisms for several binding domains [35]; in our study, by contrast, the putative T-cell epitopes can interact directly with MHC allelic sets, which can be useful for developing immunization against SARS-COV-2. ProtParam [42] analysis further reveals the stability of the considered epitopes, and ITLCFTLKR emerges as the single epitope finally selected. This epitope has an instability index of 35.68, a grand average of hydropathicity (GRAVY) of 0.844, and an estimated half-life of 20 h in mammalian reticulocytes.
Satisfactory population coverage was observed for the targeted epitope-HLA allelic complexes at the worldwide, South Asia, and India levels. The biochemical integrity of the epitope structure was further evident from Ramachandran plot analysis. Both epitopes were non-toxic and non-allergenic, and possess good antigenicity. In a similar study presenting a preliminary analysis of COVID-19 vaccine targets [1], the investigators used spike protein and nucleocapsid protein sequences of SARS-COV, which are homologous to some extent with SARS-COV-2 proteins, to determine multiple different epitopes for vaccine prediction. In our study, two of the five proteins, namely ORF-3a and ORF-7a, which are specific to SARS-COV-2, were found to be putative T-cell epitope determinants that provide useful information; these proteins are also important for viral replication [41]. Both the biochemical parameters and the advanced HMM- and ANN-based algorithms in the selected immuno-informatics tools were very useful in presenting a clear picture of the predicted epitopes for crafting a vaccine against SARS-COV-2. The only limitation, which can be considered future scope, is that these easily synthesized peptides should be tested in in-vitro studies for more practical validation.
5 Conclusion
The ITLCFTLKR epitope was selected for crafting and designing a vaccine against SARS-COV-2. This epitope has good antigenicity, exhibits active binding with MHC HLA alleles, and has maximum population coverage for different geographical regions. Therefore, this peptide can be further used in vaccine design against SARS-COV-2 after wet lab verification. This approach can also help life science research groups reduce time, monetary expenditure, and physical trial-and-error effort.
Ethical approval
I confirm that the authors did not perform any experiments on humans or animals.
Declaration of competing interest
I confirm that the authors hereby declare that they have no conflict of interest.
Appendix A Supplementary data
The following is the Supplementary data to this article: Multimedia component 1
Acknowledgement
We acknowledge the support provided by the Computational Lab of Lovely Professional University. Supplementary data to this article can be found online at https://doi.org/10.1016/j.imu.2020.100338.
|
Document structure show
article-title | Epitope based vaccine prediction for SARS-COV-2 by deploying immuno-informatics approach |
abstract | A new virus termed SARS-COV-2 (causing COVID-19 disease) can exhibit a progressive, fatal impact on individuals. The World Health Organization (WHO) has declared the spread of the virus to be a global pandemic. Currently, there are over 1 million cases and over 100,000 confirmed deaths due to the virus. Hence, prophylactic and therapeutic strategies are promptly needed. In this study we report an epitope, ITLCFTLKR, which is biochemically fit to HLA allelic proteins. We propose that this could be used as a potential vaccine candidate against SARS-COV-2. A selected putative epitope and HLA-allelic complexes show not only better binding scores, but also RMSD values in the range of 0–1 Å. This epitope was found to have a 99.8% structural favorability as per Ramachandran-plot analysis. Similarly, a suitable range of IC50 values and population coverage was obtained to represent greater validation of T-cell epitope analysis. Stability analysis using MDWeb and half-life analysis using the ProtParam tool has confirmed that this epitope is well-selected. This new methodology of epitope-based vaccine prediction is fundamental and fast in application, ad can be economically beneficial and viable. |
p | A new virus termed SARS-COV-2 (causing COVID-19 disease) can exhibit a progressive, fatal impact on individuals. The World Health Organization (WHO) has declared the spread of the virus to be a global pandemic. Currently, there are over 1 million cases and over 100,000 confirmed deaths due to the virus. Hence, prophylactic and therapeutic strategies are promptly needed. In this study we report an epitope, ITLCFTLKR, which is biochemically fit to HLA allelic proteins. We propose that this could be used as a potential vaccine candidate against SARS-COV-2. A selected putative epitope and HLA-allelic complexes show not only better binding scores, but also RMSD values in the range of 0–1 Å. This epitope was found to have a 99.8% structural favorability as per Ramachandran-plot analysis. Similarly, a suitable range of IC50 values and population coverage was obtained to represent greater validation of T-cell epitope analysis. Stability analysis using MDWeb and half-life analysis using the ProtParam tool has confirmed that this epitope is well-selected. This new methodology of epitope-based vaccine prediction is fundamental and fast in application, ad can be economically beneficial and viable. |
body | 1 Introduction The outbreak of COVID-19 in the Hubei region of the Chinese city of Wuhan [37] has resulted in a difficult situation for the global populace and for the World Health Organization (WHO). On 30 Jan 2020, WHO declared an emergency regarding COVID-19 spread, prevention, and control [28,29]. As of 20 March 2020, there are over 240,000 cases and over 10,000 confirmed deaths, affecting 181 countries [9]. Present updates indicate that there are over 1 million confirmed cases and over 100,000 deaths worldwide [39] (information is obtained from https://www.worldometers.info/coronavirus/). The coronavirus is an RNA type virus with positive sense strand feature and is associated with the Coronaviridae family under order Nidovirales, and is found to be dispersed among Primate order, members of class Mammalia, and specifically in humans [31]. Severe acute respiratory syndrome coronavirus (SARS-CoV) [11,21,22] and Middle East respiratory disorder coronavirus (MERS-CoV) [7,40], classified as β-coronaviruses, are found to be related to the novel SARS-COV-2. Until now, there has been no efficacious therapy to regulate its spread [24]. Investigating its spread across continents and the vulnerability, there is an urgent need to craft vaccines for reinforcing immune defense against the SARS-COV-2 virus. One of the strategies to combat it is the development of vaccines that can initiate an adaptive immune response in humans. Here an attempt has been made to design an epitope based vaccine directed at SARS-COV-2, by analyzing the proteome of the virus by using Immuno-informatics tools. In this study, we deployed the use of various bioinformatics servers and Immuno-informatics tools for identifying and recognizing the T-cell epitopes from the intensive study of available protein sequences and structures that are related to SARS-COV-2. 
These epitope stretches can interact with MHC Class I and Class II HLA alleles; further validation of epitopes was analyzed by Ramachandran Plot analysis, Antigenicity parameters evaluation, Toxicity analysis, Population coverage, Molecular dynamics, and ProtParam analysis [19]. This approach is an excellent method in modern vaccine design, as it provides a lead over classical trial and error methods of wet labs [23]. We tried to identify T-cell epitopes that can elicit a robust immune response in the global human population and act as potential vaccine candidates. However, the ability of these epitopes to act as a vaccine candidate needs to be analyzed in Molecular biology lab studies. Our investigation can open new dimensions in crafting peptide-based vaccine regimens for Novel SARS-COV2. The greatest decline in virus expansion was noted following ORF3a removal. ORF7a encodes a 122-amino-acid type I transmembrane protein and structural studies disclose a packed seven-stranded β sandwich comparable in fold and topology to members of the immunoglobulin super family. SARS-CoV is an enveloped, positive-stranded RNA virus with a genome of around 29,700 bases. The genome incorporates at least 14 open reading frames (ORFs) that encode 28 proteins in three distinct classes: two large polyproteins P1a and P1ab that are cleaved into 16 non-structural proteins (nsp1–nsp16) during viral RNA synthesis; four structural proteins (S, E, M and N) that are necessary for viral entrance and gathering; and eight accessory proteins that are assumed to be dispensable for viral replication, but may facilitate viral assembly and take part in virulence and pathogenesis (Fig. 1 ). Fig. 1 Genome organization and viral proteins of SARS-CoV [44]. 
In our study, two of the five proteins analyzed, namely ORF3a and ORF7a of SARS-COV-2, were found to harbor putative T-cell epitope determinants; these proteins are also important for viral replication and growth [41]. Both proteins may influence viral pathogenesis and disease spread, although the literature is not unanimous [43]. B-cell epitope prediction is still considered less reliable than T-cell epitope prediction, for both linear and conformational epitopes. Furthermore, the B-cell epitopes do not elicit a strong antibody response. For this reason, only T-cell epitopes are considered in the present study. A T-cell epitope-based approach can generate CD4+ and CD8+ T-cells with a long-lasting response [45]. In one recent study, epitopes were designed from a single protein (the spike protein), yielding 13 epitopes for MHC I and 3 for MHC II [46]. We analyzed multiple proteins and applied several in-silico filters to retain only the most promising epitopes, which can then be tested in a wet lab.
2 Methodology
2.1 Protein retrieval and allergenicity analysis
Five SARS-COV-2 protein sequences were selected from NCBI-GenBank (Table 1) on the basis of their allergenicity, assessed with the Tanimoto similarity index score produced by AllergenFP 1.0 [8]. The selected proteins were the envelope protein, ORF3a protein, nucleocapsid phosphoprotein, ORF7a protein, and membrane glycoprotein, which are crucial for the structural integrity and functionality of the virus [38].
Table 1 List of proteins selected for SARS-COV-2 with allergenicity.
Protein GenBank accession no. | Protein name | AllergenFP score (Tanimoto similarity index) | Allergen/Non-allergen
QHD43418.1 | Envelope protein | 0.87 | Non-allergen
QHD43417.1 | ORF3a protein | 0.84 | Non-allergen
QHD43423.2 | Nucleocapsid phosphoprotein | 0.85 | Non-allergen
QHD43421.1 | ORF7a protein | 0.80 | Non-allergen
QHD43419.1 | Membrane glycoprotein | 0.83 | Non-allergen
2.2 T-cell epitope prediction for MHC HLA alleles
IEDB (Immune Epitope Database) [20], together with the NetMHCIIpan 3.2 and NetMHC 4.0 servers [18], was used to find putative peptide sequences predicted to interact with MHC Class II and Class I HLA alleles, respectively; these servers were chosen for their efficient algorithms based on artificial neural networks. The VaxiJen online tool [10] was then used to screen for the most antigenic epitopes, with a threshold of ≥ 1.0 for the virus model.
2.3 Structural prediction: Putative epitopes and MHC HLA alleles
The 3D structures of the epitopes were predicted with the PEP-FOLD-3.5 server [25,33,34], and the tertiary (3D) structures of the MHC HLA allelic proteins were obtained from the RCSB-PDB database [4].
2.4 Molecular docking analysis
The selected epitopes were docked with their HLA alleles to calculate refined interactions and binding energies, along with atomic contact energy (ACE), using two docking web servers: DINC 2.0 [2] and PatchDock [32].
2.5 Molecular dynamics-simulation analysis of docked complex
A molecular dynamics study was conducted on the MDWeb server to analyze RMSD values and atomic fluctuations for all amino acid residues over a 100 ps time frame. MDWeb was used to run coarse-grained Brownian dynamics (C-alpha) with the following settings for both interacting epitopes: time = 100 ps, output frequency (steps) = 10, force constant = 40 kcal/mol·Å², and distance between alpha-carbon atoms = 3.8 Å. The setup was based on GROMACS MD with solvation and the Amber-99sb* force field [17].
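The NetMHC-family servers named in Section 2.2 report each predicted affinity both in nM and as a transformed score, 1 − log(affinity)/log(50,000), which appears as the "1-LOG50K" column in Tables 2 and 3. The following is a minimal Python sketch of that transform and its inverse. The function names and the 50 nM / 500 nM binder cut-offs are illustrative assumptions (commonly quoted conventions), not values taken from this study.

```python
import math

# Upper affinity bound (nM) used by the 1 - log50k transform.
MAX_AFFINITY_NM = 50_000.0


def score_to_nm(score: float) -> float:
    """Convert a 1-log50k score (0..1) back to an affinity in nM."""
    return MAX_AFFINITY_NM ** (1.0 - score)


def nm_to_score(affinity_nm: float) -> float:
    """Convert an affinity in nM to the 1-log50k score."""
    return 1.0 - math.log(affinity_nm) / math.log(MAX_AFFINITY_NM)


def binder_class(affinity_nm: float,
                 strong_nm: float = 50.0,
                 weak_nm: float = 500.0) -> str:
    """Label a peptide by affinity; the cut-offs are assumed defaults."""
    if affinity_nm <= strong_nm:
        return "strong"
    if affinity_nm <= weak_nm:
        return "weak"
    return "non-binder"


if __name__ == "__main__":
    # Score reported for FTIGTVTLK / HLA-A*68:01 in Table 2.
    nm = score_to_nm(0.853)
    print(f"{nm:.2f} nM -> {binder_class(nm)}")
```

Under this transform a score of 0.853 maps back to roughly 4.9 nM, which agrees with the FTIGTVTLK row of Table 2; small differences in other rows are expected from rounding of the reported scores.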
2.6 Toxicity, Ramachandran-plot, and population coverage analysis
The ToxinPred server [16] was used to score the toxicity of the epitopes so that only non-toxic ones were retained; Ramachandran plot analysis was performed with the MolProbity 4.2 server [6] to quantify the residues in the favored region. The population coverage tool of the Immune Epitope Database (IEDB) was used to predict population coverage for the MHC II and MHC I alleles that interact with the screened epitopes, based on their restriction data [5]. The MHCPred web server was used for quantitative prediction of the selected epitopes interacting with HLA alleles of MHC II and MHC I [15]. Finally, the ProtParam tool [42] of the ExPASy server was used to screen for the final stable epitopes based on instability index and half-life.
3 Results
3.1 T-cell epitopes prediction and VaxiJen scoring
The non-allergen proteins selected on the basis of allergenicity scores (Table 1; AllergenFP 1.0) were submitted to the NetMHCIIpan 3.2 and NetMHC 4.0 servers to determine 1-log50k and affinity values, in order to select the best pairs of epitopes and corresponding HLA alleles. Table 2 and Table 3 list the putative epitopes paired with MHC Class I and MHC Class II HLA alleles, respectively, together with their VaxiJen scores. Fig. 2 summarizes the peptides selected for docking, based on their interaction with MHC Class I and Class II HLA alleles and their antigenicity.
Table 2 Probable antigenic epitopes and MHC I allele interaction based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (The – sign indicates no interaction with any HLA allele under consideration.)
NCBI-GenBank ID | MHC I allele | POS | Core peptide | 1-LOG50K | Affinity (nM) | VaxiJen score | Antigenicity
QHD43417.1 | HLA-A*68:01 | 7 | FTIGTVTLK | 0.853 | 4.9 | 2.0317 | ANTIGEN
QHD43417.1 | HLA-A*31:01 | 125 | RLWLCWKCR | 0.74 | 16.59 | 1.1604 | ANTIGEN
QHD43418.1 | – | – | – | – | – | – | NO INTERACTION
QHD43419.1 | HLA-A*11:01 | 5 | GTITVEELK | 0.719 | 20.98 | 1.0976 | ANTIGEN
QHD43421.1 | HLA-A*11:01 | 109 | ITLCFTLKR | 0.71 | 22.97 | 2.0208 | ANTIGEN
QHD43421.1 | HLA-A*68:01 | 109 | ITLCFTLKR | 0.695 | 27.24 | 2.0208 | ANTIGEN
QHD43421.1 | HLA-A*23:01 | 107 | VFITLCFTL | 0.621 | 60.42 | 1.2490 | ANTIGEN
QHD43423.2 | – | – | – | – | – | – | NO INTERACTION
Table 3 Probable antigenic epitopes and MHC II allele interaction based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (The – sign indicates no interaction with any HLA allele under consideration.)
NCBI-GenBank ID | MHC II allele | POS | Core peptide | 1-LOG50K | Affinity (nM) | VaxiJen score | Antigenicity
QHD43417.1 | – | – | – | – | – | – | NO INTERACTION
QHD43418.1 | – | – | – | – | – | – | NO INTERACTION
QHD43419.1 | HLA-DRB1*04:01 | 55 | WLLWPVTLA | 0.282 | 2375.78 | 1.0631 | ANTIGEN
QHD43421.1 | HLA-DRB1*01:01 | 74 | VYQLRARSV | 0.484 | 267.05 | 1.3108 | ANTIGEN
QHD43421.1 | HLA-DRB1*07:01 | 74 | VYQLRARSV | 0.342 | 1235.08 | 1.3108 | ANTIGEN
QHD43423.2 | – | – | – | – | – | – | NO INTERACTION
Fig. 2 Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their antigenicity.
3.2 Structural findings of epitope and MHC HLA-Alleles
Epitope structures were obtained with the PEP-FOLD-3.5 web server; the HLA allele structures were retrieved from the RCSB-PDB database. Table 4 gives the crystal structure or model details with the reference PDB ID.
Table 4 Listing of MHC HLA alleles and their respective crystal structures/models with the PDB ID.
Allele name | Template structure (PDB ID) | Crystal structure/Model
HLA-A*11:01 | 2HN7 | CRYSTAL STRUCTURE
HLA-A*23:01 | 3I6L | CRYSTAL STRUCTURE
HLA-A*31:01 | 3RL1 | CRYSTAL STRUCTURE
HLA-A*68:01 | 6PBH | CRYSTAL STRUCTURE
HLA-DRB1*01:01 | 4AH2 | CRYSTAL STRUCTURE
HLA-DRB1*04:01 | 5LAX | CRYSTAL STRUCTURE
HLA-DRB1*07:01 | 6BIJ | CRYSTAL STRUCTURE
3.3 Molecular docking analysis
The FTIGTVTLK and ITLCFTLKR epitopes were found to interact with MHC Class I HLA alleles, and the VYQLRARSV epitope with MHC Class II HLA alleles, with favorable binding scores and ACE values, as shown in Table 5. The ITLCFTLKR epitope of the ORF7a protein binds 2 HLA alleles (HLA-A*11:01, HLA-A*68:01) of MHC Class I, while the FTIGTVTLK epitope of the ORF3a protein interacts with 1 HLA allele (HLA-A*68:01) of MHC Class I. The VYQLRARSV epitope of the ORF7a protein interacts with 2 HLA alleles (HLA-DRB1*01:01, HLA-DRB1*07:01) of MHC Class II.
Table 5 Binding energies and ACE values for docked complexes, based on DINC server and PatchDock analysis, for putative epitopes.
Epitope | HLA allele | Binding score (kcal/mol) | PatchDock score | ACE | Selection
FTIGTVTLK | HLA-A*68:01 | −8.80 | 8066 | 178.91 | Selected
RLWLCWKCR | HLA-A*31:01 | −4.80 | 8916 | 117.53 | Rejected
GTITVEELK | HLA-A*11:01 | −3.80 | 8040 | 78.78 | Rejected
ITLCFTLKR | HLA-A*11:01 | −3.70 | 8206 | −25.88 | Selected
ITLCFTLKR | HLA-A*68:01 | −7.60 | 8136 | 184.55 | Selected
VFITLCFTL | HLA-A*23:01 | −4.40 | 7706 | −134.91 | Rejected
WLLWPVTLA | HLA-DRB1*04:01 | −8.60 | 9432 | −150.96 | Rejected
VYQLRARSV | HLA-DRB1*01:01 | −6.20 | 6874 | −168.74 | Selected
VYQLRARSV | HLA-DRB1*07:01 | −6.20 | 6842 | 262.74 | Selected
Figs. 3, 4, and 5 depict the interactions between the three selected T-cell epitopes and their respective MHC Class I and II HLA alleles via hydrogen bonds and van der Waals interactions. After positive docking results, these epitopes were subjected to molecular dynamics simulation and assessment of biochemical parameters. Fig. 6 is a graphical plot of the binding scores for epitopes interacting with HLA alleles.
Fig. 3 FTIGTVTLK epitope interaction with the antigen-binding pocket of HLA-A*68:01, an MHC I HLA allele. Here, threonine at the 2nd, 5th, and 7th positions of the epitope preferentially forms hydrogen bonds owing to the presence of partially charged positive and negative atoms; the side chains of glutamic acid at the 4th position and lysine at the 8th position can form a salt bridge, while the other amino acids take part in van der Waals interactions.
Fig. 4 ITLCFTLKR epitope interaction with the antigen-binding pocket of HLA-A*68:01, an MHC I HLA allele. Here, threonine at the 2nd and 6th positions, as well as cysteine at the 4th position of the epitope, preferentially forms hydrogen bonds owing to the presence of partially charged positive and negative atoms, while the other amino acids take part in van der Waals interactions.
Fig. 5 VYQLRARSV epitope interaction with the antigen-binding pocket of HLA-DRB1*07:01, an MHC II HLA allele. Here, tyrosine at the 2nd position, glutamine at the 3rd position, and serine at the 8th position of the epitope preferentially form hydrogen bonds owing to the presence of partially charged positive and negative atoms; the side chains of the arginine residues at the 5th and 7th positions can form a salt bridge, while the other amino acids take part in van der Waals interactions.
Fig. 6 Binding energy graphical plot for selected epitope and HLA allele pairs.
3.4 Molecular dynamics and simulation analysis
RMSD values and atomic fluctuation per amino acid residue were obtained for the epitopes interacting with the HLA allele structures; this analysis supports the selection and validation of the best pairs. Only two epitopes, ITLCFTLKR and VYQLRARSV, were identified as probable T-cell epitopes and putative vaccine candidates. Fig. 7 shows the RMSD plot and atomic fluctuation per residue for the ITLCFTLKR–HLA-A*68:01 complex and for the VYQLRARSV–HLA-DRB1*07:01 complex.
Both results were favorable, since protein-ligand docked complexes should preferably have RMSD values in the range of 0 to 1.0 Å [13].
Fig. 7 A. RMSD plot for the ITLCFTLKR–HLA-A*68:01 complex, for each amino acid residue, from molecular dynamics analysis. B. B-factor (atomic fluctuation) values per amino acid residue for the ITLCFTLKR–HLA-A*68:01 docked complex. C. RMSD plot for the VYQLRARSV–HLA-DRB1*07:01 complex, for each amino acid residue, from molecular dynamics analysis. D. B-factor (atomic fluctuation) values per amino acid residue for the VYQLRARSV–HLA-DRB1*07:01 docked complex.
3.5 Toxicity analysis, Ramachandran Plot analysis, and population coverage results
ToxinPred 4.0 server results (Table 6) show that the finalized T-cell epitopes were non-toxic from the biochemical perspective.
Table 6 Results of ToxinPred on probable antigens.
Peptide/Probable antigen | SVM score | Hydrophilicity | Molecular weight | Toxicity
FTIGTVTLK | −1.36 | −1.23 | 979.32 | NON-TOXIN
GTITVEELK | −0.98 | 0.34 | 989.27 | NON-TOXIN
ITLCFTLKR | −1.32 | −0.41 | 1094.51 | NON-TOXIN
VFITLCFTL | −1.21 | −1.52 | 1056.46 | NON-TOXIN
WLLWPVTLA | −1.18 | −1.62 | 1098.49 | NON-TOXIN
VYQLRARSV | −1.07 | −0.12 | 1091.40 | NON-TOXIN
Ramachandran plot analysis (Fig. 8 A and B) suggests that most of the residues lie in the allowed and favored regions; this gives more confidence in the structural conformation of the targeted T-cell epitopes.
Fig. 8 A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored region under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored region under Ramachandran plot analysis.
MHCPred results (Table 7) give quantitative estimates of IC50 values for both MHC I and MHC II alleles for the respective epitopes; these values indicate elicitation of an immune response when the data are used in a population coverage analysis.
Table 7 MHCPred results showing IC50 values for HLA alleles and confidence of the prediction.
HLA allele | Amino acid groups | Predicted −logIC50 (M) | Predicted IC50 value (nM) | Confidence of prediction (max = 1)
HLA-A*68:01 | FTIGTVTLK | 7.116 | 76.56 | 1.00
HLA-A*11:01 | ITLCFTLKR | 7.028 | 93.76 | 1.00
HLA-A*68:01 | ITLCFTLKR | 6.282 | 522.40 | 0.78
HLA-DRB1*01:01 | VYQLRARSV | 7.624 | 23.77 | 0.89
HLA-DRB1*07:01 | VYQLRARSV | 6.734 | 184.50 | 0.89
IEDB population coverage analysis suggests that the ITLCFTLKR and VYQLRARSV epitopes exhibit suitable population coverage, as depicted in Fig. 9 A and B. This leaves only two probable epitopes for final selection in vaccine design.
Fig. 9 A. Graphical representation of population conservancy analysis of the ITLCFTLKR epitope. B. Graphical representation of population conservancy analysis of the VYQLRARSV epitope.
ProtParam analysis (Table 8) further reveals the stability of the considered epitopes and leads to the final selection of one epitope, ITLCFTLKR. This epitope has an instability index of 35.68, a grand average of hydropathicity (GRAVY) of 0.844, and an estimated half-life of 20 h in mammalian reticulocytes.
Table 8 ProtParam analysis for selected epitopes.
Selected epitope | GRAVY score | Instability index (indication) | Estimated half-life (mammalian reticulocytes) | Theoretical pI | Aliphatic index
ITLCFTLKR | 0.844 | 35.68 (Stable) | 20 hours | 9.51 | 130.00
VYQLRARSV | −0.067 | 70.73 (Unstable) | 100 hours | 10.83 | 118.89
4 Discussion
In this study, SARS-COV-2 proteins were analyzed using in-silico methods; the results can be carried forward to vaccine trials, following earlier successes in similar SARS-COV studies and later in the development of polyclonal antibodies [27]. We obtained two epitopes, ITLCFTLKR and VYQLRARSV, after successful docking and molecular dynamics simulation; these two epitopes were then subjected to population coverage and toxicity analysis.
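Two of the ProtParam quantities in Table 8, the GRAVY score and the aliphatic index, depend only on amino acid composition and can be recomputed from the peptide sequences. The sketch below is an illustration, not the ProtParam implementation: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)). The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    for peptide in ("ITLCFTLKR", "VYQLRARSV"):
        print(peptide, round(gravy(peptide), 3),
              round(aliphatic_index(peptide), 2))
```

For the two peptides in Table 8, these formulas give GRAVY values of about 0.844 and −0.067 and aliphatic indices of about 130.00 and 118.89, matching the tabulated figures. The instability index and half-life use separate lookup tables and are not covered here.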
Similarly, in another study on MERS-COV, nucleocapsid peptides were used for T-cell epitope prediction and found to be successful [36]. The IEDB and NCBI-GenBank databases were used to analyze sequence homology and to predict targets for SARS-COV-2 through viral protein identification, as in related studies [14]; ViPR (Virus Pathogen Database and Analysis Resource) also depends primarily on IEDB and GenBank [30]. We analyzed five different SARS-COV-2 proteins in the present study (because of their availability in the NCBI-GenBank database and their important structural role in SARS-COV-2 [14]) and identified T-cell epitopes that can be taken forward for wet-lab testing, saving time. In a very recent study, different epitopes were found for SARS-COV-2 using in-silico approaches focused only on the surface glycoprotein [3]; our study differs in that we analyzed a different group of SARS-COV-2 proteins to identify short T-cell epitopes specific to diverse MHC I as well as MHC II HLA alleles. For SARS-CoV, HLA-B*4601, HLA-B*0703, and HLA-DRB1*1202 have been reported to be activated [26], whereas our epitopes interact with different MHC I and II allelic forms, namely HLA-A*11:01, HLA-A*68:01, HLA-DRB1*01:01, and HLA-DRB1*07:01. Based on prior literature, CD4+ and CD8+ memory T cells are anticipated to persist for four years, as in SARS-CoV-recovered individuals, and to show T-cell proliferation, DTH response, and production of IFN-γ [12]. We surmise that our screen can be more effective and useful. Molecular docking initially revealed three epitopes, but molecular dynamics simulation showed the best interactions for two of them, ITLCFTLKR and VYQLRARSV, with acceptable stability as analyzed with MDWeb, using the best available tools and easy-to-apply methods.
One recent study focused on developing monoclonal antibodies such as CR-3022 against the spike protein of SARS-COV-2, which also interacts with the ACE (angiotensin-converting enzyme) of the human respiratory epithelium and requires complex neutralizing mechanisms for several binding domains [35], whereas in our study the putative T-cell epitopes can interact directly with MHC allelic sets, which can be useful for developing immunization against SARS-COV-2. ProtParam [42] analysis further reveals the stability of the considered epitopes and leads to the final selection of one epitope, ITLCFTLKR. This epitope shows an instability index of 35.68, a grand average of hydropathicity (GRAVY) of 0.844, and an estimated half-life of 20 h in mammalian reticulocytes. Satisfactory population coverage was observed for the targeted epitope–HLA allelic complexes at the worldwide, South Asia, and India levels. The biochemical integrity of the epitope structure was further supported by Ramachandran plot analysis. Both epitopes were non-toxic, non-allergenic, and possessed good antigenicity. In a similar preliminary analysis of COVID-19 vaccine targets [1], the investigators used spike protein and nucleocapsid protein sequences of SARS-COV that are homologous to some extent with SARS-COV-2 proteins to determine multiple epitopes for vaccine prediction; in our study, two of the five proteins, namely ORF3a and ORF7a of SARS-COV-2, were found to harbor putative T-cell epitope determinants, and these proteins are also important for viral replication [41]. Both the biochemical parameters and the advanced HMM- and ANN-based algorithms in the selected immuno-informatics tools were very useful in presenting a clear picture of the predicted epitopes for designing a vaccine against SARS-COV-2.
The only limitation, which can be considered as future scope, is that these easily synthesized peptides should be tested in in-vitro studies for more practical validation.
5 Conclusion
The ITLCFTLKR epitope was selected for designing a vaccine against SARS-COV-2. This epitope has good antigenicity, exhibits active binding with MHC HLA alleles, and has maximum population coverage across different geographical regions. Therefore, this peptide can be used in vaccine design against SARS-COV-2 after wet-lab verification. This approach can also help life science research groups reduce time, monetary expenditure, and physical trial-and-error effort.
Ethical approval
I confirm that the authors did not perform any experiments on humans or animals.
Declaration of competing interest
I confirm that the authors hereby declare that they have no conflict of interest.
sec | 1 Introduction The outbreak of COVID-19 in the Hubei region of the Chinese city of Wuhan [37] has resulted in a difficult situation for the global populace and for the World Health Organization (WHO). On 30 Jan 2020, WHO declared an emergency regarding COVID-19 spread, prevention, and control [28,29]. As of 20 March 2020, there are over 240,000 cases and over 10,000 confirmed deaths, affecting 181 countries [9]. Present updates indicate that there are over 1 million confirmed cases and over 100,000 deaths worldwide [39] (information is obtained from https://www.worldometers.info/coronavirus/). The coronavirus is an RNA type virus with positive sense strand feature and is associated with the Coronaviridae family under order Nidovirales, and is found to be dispersed among Primate order, members of class Mammalia, and specifically in humans [31]. Severe acute respiratory syndrome coronavirus (SARS-CoV) [11,21,22] and Middle East respiratory disorder coronavirus (MERS-CoV) [7,40], classified as β-coronaviruses, are found to be related to the novel SARS-COV-2. Until now, there has been no efficacious therapy to regulate its spread [24]. Investigating its spread across continents and the vulnerability, there is an urgent need to craft vaccines for reinforcing immune defense against the SARS-COV-2 virus. One of the strategies to combat it is the development of vaccines that can initiate an adaptive immune response in humans. Here an attempt has been made to design an epitope based vaccine directed at SARS-COV-2, by analyzing the proteome of the virus by using Immuno-informatics tools. In this study, we deployed the use of various bioinformatics servers and Immuno-informatics tools for identifying and recognizing the T-cell epitopes from the intensive study of available protein sequences and structures that are related to SARS-COV-2. 
These epitope stretches can interact with MHC Class I and Class II HLA alleles; further validation of epitopes was analyzed by Ramachandran Plot analysis, Antigenicity parameters evaluation, Toxicity analysis, Population coverage, Molecular dynamics, and ProtParam analysis [19]. This approach is an excellent method in modern vaccine design, as it provides a lead over classical trial and error methods of wet labs [23]. We tried to identify T-cell epitopes that can elicit a robust immune response in the global human population and act as potential vaccine candidates. However, the ability of these epitopes to act as a vaccine candidate needs to be analyzed in Molecular biology lab studies. Our investigation can open new dimensions in crafting peptide-based vaccine regimens for Novel SARS-COV2. The greatest decline in virus expansion was noted following ORF3a removal. ORF7a encodes a 122-amino-acid type I transmembrane protein and structural studies disclose a packed seven-stranded β sandwich comparable in fold and topology to members of the immunoglobulin super family. SARS-CoV is an enveloped, positive-stranded RNA virus with a genome of around 29,700 bases. The genome incorporates at least 14 open reading frames (ORFs) that encode 28 proteins in three distinct classes: two large polyproteins P1a and P1ab that are cleaved into 16 non-structural proteins (nsp1–nsp16) during viral RNA synthesis; four structural proteins (S, E, M and N) that are necessary for viral entrance and gathering; and eight accessory proteins that are assumed to be dispensable for viral replication, but may facilitate viral assembly and take part in virulence and pathogenesis (Fig. 1 ). Fig. 1 Genome organization and viral proteins of SARS-CoV [44]. 
In our investigatory study, out of five, two proteins namely ORF-3a and ORF-7a specific to SARS-COV-2 were found to be putative T-cell epitope determinants that create useful information to distinguish, and these proteins are also important for viral replication and growth [41]. Both of these proteins may influence in viral pathogenesis and disease spread, although the literature lacks unanimity [43]. B-cell epitopes prediction is still considered to be untrustworthy for both linear and conformational epitopes as compared to T-cell epitopes. Furthermore, the B-cell epitopes do not elicit a strong antibody response. For this reason, only T-cell epitopes are considered in the present study. It is capable to produce CD4+ and CD8+ T-cells with long-lasting response [45]. . In one of the recent studies, epitopes were designed, but they have focused on the single protein (i.e. SPIKE protein) to generate multiple epitopes like 13 for MHC I and 3 for MHC II [46]. We have analyzed multiple proteins to screen only effective epitopes based on various in-silico filters, to provide the most appropriate and authentic epitopes, which can be further tested in a wet lab. |
label | 1 |
title | Introduction |
p | The outbreak of COVID-19 in the Hubei region of the Chinese city of Wuhan [37] has resulted in a difficult situation for the global populace and for the World Health Organization (WHO). On 30 Jan 2020, WHO declared an emergency regarding COVID-19 spread, prevention, and control [28,29]. As of 20 March 2020, there are over 240,000 cases and over 10,000 confirmed deaths, affecting 181 countries [9]. Present updates indicate that there are over 1 million confirmed cases and over 100,000 deaths worldwide [39] (information is obtained from https://www.worldometers.info/coronavirus/). The coronavirus is an RNA type virus with positive sense strand feature and is associated with the Coronaviridae family under order Nidovirales, and is found to be dispersed among Primate order, members of class Mammalia, and specifically in humans [31]. Severe acute respiratory syndrome coronavirus (SARS-CoV) [11,21,22] and Middle East respiratory disorder coronavirus (MERS-CoV) [7,40], classified as β-coronaviruses, are found to be related to the novel SARS-COV-2. Until now, there has been no efficacious therapy to regulate its spread [24]. Investigating its spread across continents and the vulnerability, there is an urgent need to craft vaccines for reinforcing immune defense against the SARS-COV-2 virus. One of the strategies to combat it is the development of vaccines that can initiate an adaptive immune response in humans. Here an attempt has been made to design an epitope based vaccine directed at SARS-COV-2, by analyzing the proteome of the virus by using Immuno-informatics tools. In this study, we deployed the use of various bioinformatics servers and Immuno-informatics tools for identifying and recognizing the T-cell epitopes from the intensive study of available protein sequences and structures that are related to SARS-COV-2. 
These epitope stretches can interact with MHC Class I and Class II HLA alleles; further validation of epitopes was analyzed by Ramachandran Plot analysis, Antigenicity parameters evaluation, Toxicity analysis, Population coverage, Molecular dynamics, and ProtParam analysis [19]. This approach is an excellent method in modern vaccine design, as it provides a lead over classical trial and error methods of wet labs [23]. We tried to identify T-cell epitopes that can elicit a robust immune response in the global human population and act as potential vaccine candidates. However, the ability of these epitopes to act as a vaccine candidate needs to be analyzed in Molecular biology lab studies. Our investigation can open new dimensions in crafting peptide-based vaccine regimens for Novel SARS-COV2. The greatest decline in virus expansion was noted following ORF3a removal. ORF7a encodes a 122-amino-acid type I transmembrane protein and structural studies disclose a packed seven-stranded β sandwich comparable in fold and topology to members of the immunoglobulin super family. SARS-CoV is an enveloped, positive-stranded RNA virus with a genome of around 29,700 bases. The genome incorporates at least 14 open reading frames (ORFs) that encode 28 proteins in three distinct classes: two large polyproteins P1a and P1ab that are cleaved into 16 non-structural proteins (nsp1–nsp16) during viral RNA synthesis; four structural proteins (S, E, M and N) that are necessary for viral entrance and gathering; and eight accessory proteins that are assumed to be dispensable for viral replication, but may facilitate viral assembly and take part in virulence and pathogenesis (Fig. 1 ). Fig. 1 Genome organization and viral proteins of SARS-CoV [44]. |
figure | Fig. 1 Genome organization and viral proteins of SARS-CoV [44]. |
label | Fig. 1 |
caption | Genome organization and viral proteins of SARS-CoV [44]. |
p | Genome organization and viral proteins of SARS-CoV [44]. |
p | In our investigatory study, out of five, two proteins namely ORF-3a and ORF-7a specific to SARS-COV-2 were found to be putative T-cell epitope determinants that create useful information to distinguish, and these proteins are also important for viral replication and growth [41]. Both of these proteins may influence in viral pathogenesis and disease spread, although the literature lacks unanimity [43]. B-cell epitopes prediction is still considered to be untrustworthy for both linear and conformational epitopes as compared to T-cell epitopes. Furthermore, the B-cell epitopes do not elicit a strong antibody response. For this reason, only T-cell epitopes are considered in the present study. It is capable to produce CD4+ and CD8+ T-cells with long-lasting response [45]. . In one of the recent studies, epitopes were designed, but they have focused on the single protein (i.e. SPIKE protein) to generate multiple epitopes like 13 for MHC I and 3 for MHC II [46]. We have analyzed multiple proteins to screen only effective epitopes based on various in-silico filters, to provide the most appropriate and authentic epitopes, which can be further tested in a wet lab. |
sec | 2 Methodology 2.1 Protein retrieval and allergenicity analysis Five protein sequences were selected from NCBI-GenBank, listed in Table 1 , for SARS-COV-2 based on their Allergenicity that relies on the Tanimoto similarity index score produced by AllergenFP 1.0 [8]. The selected proteins were the envelope protein, ORF3a protein, nucleocapsid phosphoprotein, ORF7a protein, and membrane glycoprotein, which are crucial for the structural integrity and functionality of the virus [38]. Table 1 List of proteins selected for SARS COV-2 with allergenicity. Protein Gene Bank accession no. Protein name Allergen FP Score (Tanimoto similarity Index) Allergen/Non-Allergen QHD43418.1 Envelop Protein 0.87 Non-Allergen QHD43417.1 ORF3a Protein 0.84 Non-Allergen QHD43423.2 Nulceocapsid Phosphoprotein 0.85 Non-Allergen QHD43421.1 ORF7a protein 0.80 Non-Allergen QHD43419.1 Membrane Glycoprotein 0.83 Non-Allergen 2.2 T-cell epitope prediction for MHC HLA alleles IEDB (Immune epitope database) [20] along with NetMHCII PAN 3.2 and NETMHC 4.0 servers [18] were effectively used for finding putative peptide sequences that were aimed to interact with the MHC Class II and I HLA alleles, respectively (because of efficient algorithms based on artificial neural networks). The VaxiJen score is determined for screening the best antigenic epitopes using the VaxiJen online tool [10] with a threshold ≥ 1.0 for the viruses’ domain. 2.3 Structural prediction: Putative epitopes and MHC HLA alleles The epitopes 3D structural findings were conducted by using the PEP-FOLD-3.5 server [25,33,34] and MHC HLA Allelic peptides tertiary or 3D structure were obtained from the RCSB-PDB database [4]. 2.4 Molecular docking analysis The selected epitopes and HLA complexes were docked for calculating refined interactions and binding energies, along with atomic contact energy (ACE), by using two docking web-servers: DINC 2.0 [2] and PatchDock [32]. 
2.5 Molecular dynamics-simulation analysis of docked complex Molecular dynamics study was conducted to analyze RMSD values and atomic fluctuations for all amino acids under the 100 ps time frame by deploying the MDWeb server. MDWeb server was deployed to analyze Coarse grained MD Brownian dynamics (C-alpha) with specifications → Time: 100 ps, output frequency (steps) = 10, force constant (kcal/mol Ǻ2) = 40, distance between alpha carbon atoms(Ǻ) = 3.8 for both the interacting epitopes, and it was based on a GROMACS MD setup with solvation using an Amber-99sb* force-field [17]. 2.6 Toxicity, Ramachandran-plot, and population coverage analysis The ToxinPred server [16] is utilized for determining the toxicity scoring of Epitopes for selecting non-toxic ones; also, the Ramachandran plot analysis was deployed by using the MolProbity 4.2 server [6] to analyze the quantitative presence of residues in the favorable region. The Immune Epitope Database (IED) resource web-server of population coverage was used to predict population coverage of the MHC II and MHC I alleles that interact with screened out epitopes based on their restriction database [5]. The MHCPred web-server was effectively used in quantitative prediction of sorted out epitopes interacting with HLA alleles of MHC II and MHC I [15]. Thereafter the ProtParam tool [42] of the ExPASy server was used to screen final stable epitopes based on the instability index and half-life. |
label | 2 |
title | Methodology |
sec | 2.1 Protein retrieval and allergenicity analysis Five protein sequences were selected from NCBI-GenBank, listed in Table 1 , for SARS-COV-2 based on their Allergenicity that relies on the Tanimoto similarity index score produced by AllergenFP 1.0 [8]. The selected proteins were the envelope protein, ORF3a protein, nucleocapsid phosphoprotein, ORF7a protein, and membrane glycoprotein, which are crucial for the structural integrity and functionality of the virus [38]. Table 1 List of proteins selected for SARS COV-2 with allergenicity. Protein Gene Bank accession no. Protein name Allergen FP Score (Tanimoto similarity Index) Allergen/Non-Allergen QHD43418.1 Envelop Protein 0.87 Non-Allergen QHD43417.1 ORF3a Protein 0.84 Non-Allergen QHD43423.2 Nulceocapsid Phosphoprotein 0.85 Non-Allergen QHD43421.1 ORF7a protein 0.80 Non-Allergen QHD43419.1 Membrane Glycoprotein 0.83 Non-Allergen |
label | 2.1 |
title | Protein retrieval and allergenicity analysis |
p | Five SARS-COV-2 protein sequences, listed in Table 1, were retrieved from NCBI-GenBank and selected on the basis of their allergenicity, which was assessed with the Tanimoto similarity index score produced by AllergenFP 1.0 [8]. The selected proteins were the envelope protein, ORF3a protein, nucleocapsid phosphoprotein, ORF7a protein, and membrane glycoprotein, which are crucial for the structural integrity and functionality of the virus [38]. Table 1 List of proteins selected for SARS-COV-2 with allergenicity. Protein Gene Bank accession no. Protein name Allergen FP Score (Tanimoto similarity Index) Allergen/Non-Allergen QHD43418.1 Envelope Protein 0.87 Non-Allergen QHD43417.1 ORF3a Protein 0.84 Non-Allergen QHD43423.2 Nucleocapsid Phosphoprotein 0.85 Non-Allergen QHD43421.1 ORF7a protein 0.80 Non-Allergen QHD43419.1 Membrane Glycoprotein 0.83 Non-Allergen |
table-wrap | Table 1 List of proteins selected for SARS-COV-2 with allergenicity. Protein Gene Bank accession no. Protein name Allergen FP Score (Tanimoto similarity Index) Allergen/Non-Allergen QHD43418.1 Envelope Protein 0.87 Non-Allergen QHD43417.1 ORF3a Protein 0.84 Non-Allergen QHD43423.2 Nucleocapsid Phosphoprotein 0.85 Non-Allergen QHD43421.1 ORF7a protein 0.80 Non-Allergen QHD43419.1 Membrane Glycoprotein 0.83 Non-Allergen |
label | Table 1 |
caption | List of proteins selected for SARS-COV-2 with allergenicity. |
p | List of proteins selected for SARS-COV-2 with allergenicity. |
table | Protein Gene Bank accession no. Protein name Allergen FP Score (Tanimoto similarity Index) Allergen/Non-Allergen QHD43418.1 Envelope Protein 0.87 Non-Allergen QHD43417.1 ORF3a Protein 0.84 Non-Allergen QHD43423.2 Nucleocapsid Phosphoprotein 0.85 Non-Allergen QHD43421.1 ORF7a protein 0.80 Non-Allergen QHD43419.1 Membrane Glycoprotein 0.83 Non-Allergen |
tr | Protein Gene Bank accession no. Protein name Allergen FP Score (Tanimoto similarity Index) Allergen/Non-Allergen |
th | Protein Gene Bank accession no. |
th | Protein name |
th | Allergen FP Score (Tanimoto similarity Index) |
th | Allergen/Non-Allergen |
tr | QHD43418.1 Envelope Protein 0.87 Non-Allergen |
td | QHD43418.1 |
td | Envelope Protein |
td | 0.87 |
td | Non-Allergen |
tr | QHD43417.1 ORF3a Protein 0.84 Non-Allergen |
td | QHD43417.1 |
td | ORF3a Protein |
td | 0.84 |
td | Non-Allergen |
tr | QHD43423.2 Nucleocapsid Phosphoprotein 0.85 Non-Allergen |
td | QHD43423.2 |
td | Nucleocapsid Phosphoprotein |
td | 0.85 |
td | Non-Allergen |
tr | QHD43421.1 ORF7a protein 0.80 Non-Allergen |
td | QHD43421.1 |
td | ORF7a protein |
td | 0.80 |
td | Non-Allergen |
tr | QHD43419.1 Membrane Glycoprotein 0.83 Non-Allergen |
td | QHD43419.1 |
td | Membrane Glycoprotein |
td | 0.83 |
td | Non-Allergen |
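The Tanimoto similarity index reported in Table 1 compares two binary fingerprints. The sketch below only illustrates how that coefficient is computed; the function name and the example fingerprints are hypothetical and are not AllergenFP's actual descriptors.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    # Bits set in both fingerprints, and bits set in at least one of them.
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


# Hypothetical 8-bit fingerprints, for illustration only.
fp_query = [1, 0, 1, 1, 0, 1, 0, 0]
fp_known = [1, 0, 1, 0, 0, 1, 1, 0]
print(round(tanimoto(fp_query, fp_known), 2))
```

A score of 1.0 means the two fingerprints are identical, and a score of 0.0 means they share no set bits.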
sec | 2.2 T-cell epitope prediction for MHC HLA alleles The IEDB (Immune Epitope Database) [20], together with the NetMHCIIpan 3.2 and NetMHC 4.0 servers [18], was used to find putative peptide sequences predicted to interact with MHC class II and class I HLA alleles, respectively; these servers were chosen for their efficient algorithms based on artificial neural networks. The VaxiJen score was determined with the VaxiJen online tool [10] to screen for the most antigenic epitopes, using a threshold of ≥ 1.0 for the virus domain. |
label | 2.2 |
title | T-cell epitope prediction for MHC HLA alleles |
p | The IEDB (Immune Epitope Database) [20], together with the NetMHCIIpan 3.2 and NetMHC 4.0 servers [18], was used to find putative peptide sequences predicted to interact with MHC class II and class I HLA alleles, respectively; these servers were chosen for their efficient algorithms based on artificial neural networks. The VaxiJen score was determined with the VaxiJen online tool [10] to screen for the most antigenic epitopes, using a threshold of ≥ 1.0 for the virus domain. |
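NetMHC reports binding both as an affinity in nM and as a 1-log50k score, where score = 1 - log(affinity)/log(50000). The minimal sketch below converts between the two and applies the VaxiJen cut-off of 1.0 described above; the function names are illustrative and are not part of any server's interface.

```python
import math

MAX_AFFINITY_NM = 50000.0  # reference affinity used by the 1-log50k transform


def score_to_affinity(score):
    """Convert a 1-log50k score into a predicted affinity in nM."""
    return MAX_AFFINITY_NM ** (1.0 - score)


def affinity_to_score(affinity_nm):
    """Convert an affinity in nM into a 1-log50k score."""
    return 1.0 - math.log(affinity_nm) / math.log(MAX_AFFINITY_NM)


def is_antigenic(vaxijen_score, threshold=1.0):
    """Apply the VaxiJen threshold used in this study (>= 1.0)."""
    return vaxijen_score >= threshold


# A score of 0.853 corresponds to an affinity of roughly 4.9 nM.
print(round(score_to_affinity(0.853), 1))
```

Higher scores therefore correspond to tighter predicted binding, with a score of 1.0 mapping to 1 nM and a score of 0.0 mapping to 50,000 nM.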
sec | 2.3 Structural prediction: Putative epitopes and MHC HLA alleles The 3D structures of the epitopes were predicted with the PEP-FOLD-3.5 server [25,33,34], and the tertiary (3D) structures of the MHC HLA allele proteins were obtained from the RCSB-PDB database [4]. |
label | 2.3 |
title | Structural prediction: Putative epitopes and MHC HLA alleles |
p | The 3D structures of the epitopes were predicted with the PEP-FOLD-3.5 server [25,33,34], and the tertiary (3D) structures of the MHC HLA allele proteins were obtained from the RCSB-PDB database [4]. |
sec | 2.4 Molecular docking analysis The selected epitopes were docked with their HLA alleles to calculate refined interactions and binding energies, along with the atomic contact energy (ACE), using two docking web servers: DINC 2.0 [2] and PatchDock [32]. |
label | 2.4 |
title | Molecular docking analysis |
p | The selected epitopes were docked with their HLA alleles to calculate refined interactions and binding energies, along with the atomic contact energy (ACE), using two docking web servers: DINC 2.0 [2] and PatchDock [32]. |
sec | 2.5 Molecular dynamics-simulation analysis of docked complex A molecular dynamics study was conducted with the MDWeb server to analyze RMSD values and atomic fluctuations for all amino acids over a 100 ps time frame. Coarse-grained Brownian dynamics (C-alpha) was run for both interacting epitopes with the following settings: time = 100 ps, output frequency (steps) = 10, force constant = 40 kcal/mol Å², and distance between alpha-carbon atoms = 3.8 Å. The setup was based on a GROMACS MD configuration with solvation using the Amber-99sb* force field [17]. |
label | 2.5 |
title | Molecular dynamics-simulation analysis of docked complex |
p | A molecular dynamics study was conducted with the MDWeb server to analyze RMSD values and atomic fluctuations for all amino acids over a 100 ps time frame. Coarse-grained Brownian dynamics (C-alpha) was run for both interacting epitopes with the following settings: time = 100 ps, output frequency (steps) = 10, force constant = 40 kcal/mol Å², and distance between alpha-carbon atoms = 3.8 Å. The setup was based on a GROMACS MD configuration with solvation using the Amber-99sb* force field [17]. |
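For readers unfamiliar with the RMSD metric used here, the following minimal sketch computes it for two sets of C-alpha coordinates. It assumes the two frames are already superimposed and have the same atom order; the coordinates are made-up illustrative values, not MDWeb output.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units) between two
    equally sized lists of (x, y, z) coordinates, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha positions (in Å) spaced 3.8 Å apart, and a perturbed frame.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
frame = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]
print(round(rmsd(reference, frame), 3))
```

In practice a trajectory frame is first aligned to the reference structure, so the value reflects internal motion rather than overall translation or rotation.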
sec | 2.6 Toxicity, Ramachandran-plot, and population coverage analysis The ToxinPred server [16] was used to score the toxicity of the epitopes so that only non-toxic ones were retained; Ramachandran plot analysis was carried out with the MolProbity 4.2 server [6] to quantify the residues lying in the favored region. The population coverage tool of the Immune Epitope Database (IEDB) resource was used to predict the population coverage of the MHC II and MHC I alleles that interact with the screened epitopes, based on their restriction data [5]. The MHCPred web server was used for quantitative prediction of the binding of the shortlisted epitopes to HLA alleles of MHC II and MHC I [15]. Finally, the ProtParam tool [42] of the ExPASy server was used to screen for stable epitopes based on the instability index and half-life. |
label | 2.6 |
title | Toxicity, Ramachandran-plot, and population coverage analysis |
p | The ToxinPred server [16] was used to score the toxicity of the epitopes so that only non-toxic ones were retained; Ramachandran plot analysis was carried out with the MolProbity 4.2 server [6] to quantify the residues lying in the favored region. |
p | The population coverage tool of the Immune Epitope Database (IEDB) resource was used to predict the population coverage of the MHC II and MHC I alleles that interact with the screened epitopes, based on their restriction data [5]. The MHCPred web server was used for quantitative prediction of the binding of the shortlisted epitopes to HLA alleles of MHC II and MHC I [15]. Finally, the ProtParam tool [42] of the ExPASy server was used to screen for stable epitopes based on the instability index and half-life. |
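Two of the ProtParam quantities reported for the selected epitopes, GRAVY and the aliphatic index, are simple functions of amino acid composition. The sketch below is an independent re-implementation for illustration, assuming the Kyte-Doolittle hydropathy scale and the standard aliphatic index coefficients (2.9 for Val, 3.9 for Ile and Leu); it is not ProtParam's own code and does not cover the instability index or half-life.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index: 100 * (x_Ala + 2.9 * x_Val + 3.9 * (x_Ile + x_Leu)),
    where x is the mole fraction of each residue."""
    weighted = (
        sequence.count("A")
        + 2.9 * sequence.count("V")
        + 3.9 * (sequence.count("I") + sequence.count("L"))
    )
    return 100.0 * weighted / len(sequence)


for epitope in ("ITLCFTLKR", "VYQLRARSV"):
    print(epitope, round(gravy(epitope), 3), round(aliphatic_index(epitope), 2))
```

For ITLCFTLKR and VYQLRARSV this gives GRAVY values of 0.844 and -0.067 and aliphatic indices of 130.00 and 118.89, in agreement with the ProtParam values in Table 8.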
sec | 3 Results 3.1 T-cell epitopes prediction and VaxiJen scoring The non-allergen proteins selected on the basis of their AllergenFP 1.0 allergenicity scores (Table 1) were submitted to the NetMHCIIpan 3.2 and NetMHC 4.0 servers to determine 1-log50k and affinity values, from which the best pairs of epitopes and corresponding HLA alleles were selected. Table 2 and Table 3 present the resulting epitopes paired with MHC class I and MHC class II HLA alleles, respectively, along with their VaxiJen scores. Fig. 2 summarizes the peptides selected for docking on the basis of their interaction with MHC class I and class II HLA alleles and their antigenicity. Table 2 Probable antigenic epitopes and MHC I allele interactions based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) NCBI-GenBank ID MHC I Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 HLA-A*68:01 7 FTIGTVTLK 0.853 4.9 2.0317 ANTIGEN HLA-A*31:01 125 RLWLCWKCR 0.74 16.59 1.1604 ANTIGEN QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-A*11:01 5 GTITVEELK 0.719 20.98 1.0976 ANTIGEN QHD43421.1 HLA-A*11:01 109 ITLCFTLKR 0.71 22.97 2.0208 ANTIGEN HLA-A*68:01 109 ITLCFTLKR 0.695 27.24 2.0208 ANTIGEN HLA-A*23:01 107 VFITLCFTL 0.621 60.42 1.2490 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION Table 3 Probable antigenic epitopes and MHC II allele interactions based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) 
NCBI-GenBank ID MHC II Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 – – – – – – NO INTERACTION QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-DRB1*04:01 55 WLLWPVTLA 0.282 2375.78 1.0631 ANTIGEN QHD43421.1 HLA-DRB1*01:01 74 VYQLRARSV 0.484 267.05 1.3108 ANTIGEN HLA-DRB1*07:01 74 VYQLRARSV 0.342 1235.08 1.3108 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION Fig. 2 Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their Antigenicity. 3.2 Structural findings of epitope and MHC HLA-Alleles Epitope structures were obtained with the PEP-FOLD-3.5 web server; the HLA allele structures were retrieved from the RCSB-PDB database. Table 4 lists the crystal or model structure used for each allele, with its reference PDB ID. Table 4 Listing of MHC HLA-Alleles respective Crystal structures/Models with the PDB ID. Allele Name Template structure (PDB-ID) Crystal structure/Model HLA-A*11:01 2HN7 CRYSTAL STRUCTURE HLA-A*23:01 3I6L CRYSTAL STRUCTURE HLA-A*31:01 3RL1 CRYSTAL STRUCTURE HLA-A*68:01 6PBH CRYSTAL STRUCTURE HLA-DRB1*01:01 4AH2 CRYSTAL STRUCTURE HLA-DRB1*04:01 5LAX CRYSTAL STRUCTURE HLA-DRB1*07:01 6BIJ CRYSTAL STRUCTURE 3.3 Molecular docking analysis The FTIGTVTLK and ITLCFTLKR epitopes interacted with MHC class I HLA alleles, and the VYQLRARSV epitope interacted with MHC class II HLA alleles, with favorable binding scores and ACE values (Table 5). The ITLCFTLKR epitope of the ORF7a protein binds two MHC class I HLA alleles (HLA-A*11:01 and HLA-A*68:01), while the FTIGTVTLK epitope of the ORF3a protein interacts with one MHC class I HLA allele (HLA-A*68:01). The VYQLRARSV epitope of the ORF7a protein interacts with two MHC class II HLA alleles (HLA-DRB1*01:01 and HLA-DRB1*07:01). Table 5 Binding energies and ACE values for docked complexes of putative epitopes, based on DINC server and PatchDock analysis. 
Epitope HLA Allele Binding score (kcal/mol) Patch dock score ACE Selection FTIGTVTLK HLA-A*68:01 −8.80 8066 178.91 Selected RLWLCWKCR HLA-A*31:01 −4.80 8916 117.53 Rejected GTITVEELK HLA-A*11:01 −3.80 8040 78.78 Rejected ITLCFTLKR HLA-A*11:01 −3.70 8206 −25.88 Selected ITLCFTLKR HLA-A*68:01 −7.60 8136 184.55 Selected VFITLCFTL HLA-A*23:01 −4.40 7706 −134.91 Rejected WLLWPVTLA HLA-DRB1*04:01 −8.60 9432 −150.96 Rejected VYQLRARSV HLA-DRB1*01:01 −6.20 6874 −168.74 Selected VYQLRARSV HLA-DRB1*07:01 −6.20 6842 262.74 Selected Fig. 3, Fig. 4, and Fig. 5 depict the interactions of the three selected T-cell epitopes with their respective MHC class I and II HLA alleles via hydrogen bonds and van der Waals interactions. Following these favorable docking results, the epitopes were subjected to molecular dynamics simulation and assessment of biochemical parameters. Fig. 6 plots the binding scores for the epitopes interacting with the HLA alleles. Fig. 3 Interaction of the FTIGTVTLK epitope with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Here, the threonine residues at the 2nd, 5th, and 7th positions of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, the side chains of glutamic acid at the 4th position and lysine at the 8th position can form a salt bridge, and the other amino acids contribute van der Waals interactions. Fig. 4 Interaction of the ITLCFTLKR epitope with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Here, the threonine residues at the 2nd and 6th positions and the cysteine at the 4th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, and the other amino acids contribute van der Waals interactions. Fig. 
5 Interaction of the VYQLRARSV epitope with the antigen-binding pocket of the MHC class II allele HLA-DRB1*07:01. Here, the tyrosine at the 2nd position, glutamine at the 3rd position, and serine at the 8th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, the side chains of the arginine residues at the 5th and 7th positions can form salt bridges, and the other amino acids contribute van der Waals interactions. Fig. 6 Binding energy plot for the selected epitope and HLA allele pairs. 3.4 Molecular dynamics and simulation analysis RMSD values and atomic fluctuations per amino acid residue were obtained for the epitopes interacting with the HLA allele structures; this analysis supports the selection and validation of the best pairs. Only two epitopes, ITLCFTLKR and VYQLRARSV, were identified as probable T-cell epitopes and putative vaccine candidates. Fig. 7 shows the RMSD plot and atomic fluctuation per residue for the ITLCFTLKR-HLA-A*68:01 complex and for the VYQLRARSV-HLA-DRB1*07:01 complex. Both results were favorable, since well-docked protein-ligand complexes are expected to have RMSD values in the preferred range of 0 to 1.0 Å [13]. Fig. 7 A. RMSD plot for the ITLCFTLKR-HLA-A*68:01 complex, for each amino acid residue, from molecular dynamics analysis. B. B-factor (atomic fluctuation) values per amino acid residue for the ITLCFTLKR-HLA-A*68:01 docked complex. C. RMSD plot for the VYQLRARSV-HLA-DRB1*07:01 complex, for each amino acid residue, from molecular dynamics analysis. D. B-factor (atomic fluctuation) values per amino acid residue for the VYQLRARSV-HLA-DRB1*07:01 docked complex. 3.5 Toxicity analysis, Ramachandran Plot analysis, and population coverage results The ToxinPred 4.0 server results (Table 6) indicate that the finalized T-cell epitopes are non-toxic. Table 6 Results of ToxinPred on probable antigens. 
Peptide/Probable antigen SVM score Hydrophilicity Molecular weight Toxicity FTIGTVTLK −1.36 −1.23 979.32 NON-TOXIN GTITVEELK −0.98 0.34 989.27 NON-TOXIN ITLCFTLKR −1.32 −0.41 1094.51 NON-TOXIN VFITLCFTL −1.21 −1.52 1056.46 NON-TOXIN WLLWPVTLA −1.18 −1.62 1098.49 NON-TOXIN VYQLRARSV −1.07 −0.12 1091.40 NON-TOXIN Ramachandran plot analysis (Fig. 8 A and B) suggests that most of the residues lie in the allowed and favored regions, which gives more confidence in the structural conformation of the targeted T-cell epitopes. Fig. 8 A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored regions under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored regions under Ramachandran plot analysis. MHCPred results (Table 7) provide quantitative IC50 estimates for the MHC I and MHC II alleles with their respective epitopes; these estimates point to elicitation of an immune response when the data are used in the population coverage analysis. Table 7 MHCPred-predicted IC50 values for HLA alleles and confidence of the prediction. HLA Alleles Amino acid groups Predicted -logIC50 (M) Predicted IC50 Value (nM) Confidence of prediction (Max = 1) HLA-A*68:01 FTIGTVTLK 7.116 76.56 1.00 HLA-A*11:01 ITLCFTLKR 7.028 93.76 1.00 HLA-A*68:01 ITLCFTLKR 6.282 522.40 0.78 HLA-DRB1*01:01 VYQLRARSV 7.624 23.77 0.89 HLA-DRB1*07:01 VYQLRARSV 6.734 184.50 0.89 IEDB population coverage analysis suggests that the ITLCFTLKR and VYQLRARSV epitopes have suitable population coverage, as shown in Fig. 9 A and B. This leaves only two probable epitopes for final selection in vaccine design. Fig. 9 A. Graphical representation of population conservancy analysis of ITLCFTLKR Epitope. B. Graphical representation of population conservancy analysis of VYQLRARSV Epitope. ProtParam analysis (Table 8) further assesses the stability of the considered epitopes and leads to the final selection of a single epitope, ITLCFTLKR. 
This epitope has an instability index of 35.68 and a grand average of hydropathicity (GRAVY) of 0.844, and its estimated half-life is 20 h in mammalian reticulocytes. Table 8 ProtParam analysis for selected epitopes. Selected Epitope GRAVY Score Instability Index (Indication) Estimated Half-Life(Mammalian reticulocytes) Theoretical pI Aliphatic Index ITLCFTLKR 0.844 35.68(Stable) 20 Hours 9.51 130.00 VYQLRARSV −0.067 70.73(Unstable) 100 Hours 10.83 118.89 |
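The two MHCPred columns in Table 7 are related by a unit conversion: a predicted -logIC50 in molar units corresponds to IC50 (nM) = 10^(9 - (-logIC50)). The short sketch below illustrates that arithmetic; the function name is illustrative and is not part of MHCPred.

```python
def pic50_to_nm(pic50):
    """Convert -log10(IC50 in mol/L) to an IC50 expressed in nM."""
    return 10.0 ** (9.0 - pic50)


# -logIC50 values taken from Table 7.
for allele, epitope, pic50 in [
    ("HLA-A*68:01", "FTIGTVTLK", 7.116),
    ("HLA-A*11:01", "ITLCFTLKR", 7.028),
    ("HLA-DRB1*01:01", "VYQLRARSV", 7.624),
]:
    print(allele, epitope, round(pic50_to_nm(pic50), 2))
```

Because the scale is logarithmic, a difference of one unit in -logIC50 corresponds to a tenfold difference in IC50.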
label | 3 |
title | Results |
sec | 3.1 T-cell epitopes prediction and VaxiJen scoring The non-allergen proteins selected on the basis of their AllergenFP 1.0 allergenicity scores (Table 1) were submitted to the NetMHCIIpan 3.2 and NetMHC 4.0 servers to determine 1-log50k and affinity values, from which the best pairs of epitopes and corresponding HLA alleles were selected. Table 2 and Table 3 present the resulting epitopes paired with MHC class I and MHC class II HLA alleles, respectively, along with their VaxiJen scores. Fig. 2 summarizes the peptides selected for docking on the basis of their interaction with MHC class I and class II HLA alleles and their antigenicity. Table 2 Probable antigenic epitopes and MHC I allele interactions based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) NCBI-GenBank ID MHC I Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 HLA-A*68:01 7 FTIGTVTLK 0.853 4.9 2.0317 ANTIGEN HLA-A*31:01 125 RLWLCWKCR 0.74 16.59 1.1604 ANTIGEN QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-A*11:01 5 GTITVEELK 0.719 20.98 1.0976 ANTIGEN QHD43421.1 HLA-A*11:01 109 ITLCFTLKR 0.71 22.97 2.0208 ANTIGEN HLA-A*68:01 109 ITLCFTLKR 0.695 27.24 2.0208 ANTIGEN HLA-A*23:01 107 VFITLCFTL 0.621 60.42 1.2490 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION Table 3 Probable antigenic epitopes and MHC II allele interactions based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) 
NCBI-GenBank ID MHC II Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 – – – – – – NO INTERACTION QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-DRB1*04:01 55 WLLWPVTLA 0.282 2375.78 1.0631 ANTIGEN QHD43421.1 HLA-DRB1*01:01 74 VYQLRARSV 0.484 267.05 1.3108 ANTIGEN HLA-DRB1*07:01 74 VYQLRARSV 0.342 1235.08 1.3108 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION Fig. 2 Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their Antigenicity. |
label | 3.1 |
title | T-cell epitopes prediction and VaxiJen scoring |
p | The non-allergen proteins selected on the basis of their AllergenFP 1.0 allergenicity scores (Table 1) were submitted to the NetMHCIIpan 3.2 and NetMHC 4.0 servers to determine 1-log50k and affinity values, from which the best pairs of epitopes and corresponding HLA alleles were selected. Table 2 and Table 3 present the resulting epitopes paired with MHC class I and MHC class II HLA alleles, respectively, along with their VaxiJen scores. Fig. 2 summarizes the peptides selected for docking on the basis of their interaction with MHC class I and class II HLA alleles and their antigenicity. Table 2 Probable antigenic epitopes and MHC I allele interactions based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) NCBI-GenBank ID MHC I Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 HLA-A*68:01 7 FTIGTVTLK 0.853 4.9 2.0317 ANTIGEN HLA-A*31:01 125 RLWLCWKCR 0.74 16.59 1.1604 ANTIGEN QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-A*11:01 5 GTITVEELK 0.719 20.98 1.0976 ANTIGEN QHD43421.1 HLA-A*11:01 109 ITLCFTLKR 0.71 22.97 2.0208 ANTIGEN HLA-A*68:01 109 ITLCFTLKR 0.695 27.24 2.0208 ANTIGEN HLA-A*23:01 107 VFITLCFTL 0.621 60.42 1.2490 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION Table 3 Probable antigenic epitopes and MHC II allele interactions based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) 
NCBI-GenBank ID MHC II Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 – – – – – – NO INTERACTION QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-DRB1*04:01 55 WLLWPVTLA 0.282 2375.78 1.0631 ANTIGEN QHD43421.1 HLA-DRB1*01:01 74 VYQLRARSV 0.484 267.05 1.3108 ANTIGEN HLA-DRB1*07:01 74 VYQLRARSV 0.342 1235.08 1.3108 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION Fig. 2 Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their Antigenicity. |
table-wrap | Table 2 Probable antigenic epitopes and MHC I allele interactions based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) NCBI-GenBank ID MHC I Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 HLA-A*68:01 7 FTIGTVTLK 0.853 4.9 2.0317 ANTIGEN HLA-A*31:01 125 RLWLCWKCR 0.74 16.59 1.1604 ANTIGEN QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-A*11:01 5 GTITVEELK 0.719 20.98 1.0976 ANTIGEN QHD43421.1 HLA-A*11:01 109 ITLCFTLKR 0.71 22.97 2.0208 ANTIGEN HLA-A*68:01 109 ITLCFTLKR 0.695 27.24 2.0208 ANTIGEN HLA-A*23:01 107 VFITLCFTL 0.621 60.42 1.2490 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION |
label | Table 2 |
caption | Probable antigenic epitopes and MHC I allele interactions based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) |
p | Probable antigenic epitopes and MHC I allele interactions based on the NetMHC 4.0 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) |
table | NCBI-GenBank ID MHC I Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 HLA-A*68:01 7 FTIGTVTLK 0.853 4.9 2.0317 ANTIGEN HLA-A*31:01 125 RLWLCWKCR 0.74 16.59 1.1604 ANTIGEN QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-A*11:01 5 GTITVEELK 0.719 20.98 1.0976 ANTIGEN QHD43421.1 HLA-A*11:01 109 ITLCFTLKR 0.71 22.97 2.0208 ANTIGEN HLA-A*68:01 109 ITLCFTLKR 0.695 27.24 2.0208 ANTIGEN HLA-A*23:01 107 VFITLCFTL 0.621 60.42 1.2490 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION |
tr | NCBI-GenBank ID MHC I Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity |
th | NCBI-GenBank ID |
th | MHC I Allele |
th | POS |
th | Core peptide |
th | 1-LOG50K |
th | Affinity(nM) |
th | Vaxijen score |
th | Antigenicity |
tr | QHD43417.1 HLA-A*68:01 7 FTIGTVTLK 0.853 4.9 2.0317 ANTIGEN |
td | QHD43417.1 |
td | HLA-A*68:01 |
td | 7 |
td | FTIGTVTLK |
td | 0.853 |
td | 4.9 |
td | 2.0317 |
td | ANTIGEN |
tr | HLA-A*31:01 125 RLWLCWKCR 0.74 16.59 1.1604 ANTIGEN |
td | HLA-A*31:01 |
td | 125 |
td | RLWLCWKCR |
td | 0.74 |
td | 16.59 |
td | 1.1604 |
td | ANTIGEN |
tr | QHD43418.1 – – – – – – NO INTERACTION |
td | QHD43418.1 |
td | – |
td | – |
td | – |
td | – |
td | – |
td | – |
td | NO INTERACTION |
tr | QHD43419.1 HLA-A*11:01 5 GTITVEELK 0.719 20.98 1.0976 ANTIGEN |
td | QHD43419.1 |
td | HLA-A*11:01 |
td | 5 |
td | GTITVEELK |
td | 0.719 |
td | 20.98 |
td | 1.0976 |
td | ANTIGEN |
tr | QHD43421.1 HLA-A*11:01 109 ITLCFTLKR 0.71 22.97 2.0208 ANTIGEN |
td | QHD43421.1 |
td | HLA-A*11:01 |
td | 109 |
td | ITLCFTLKR |
td | 0.71 |
td | 22.97 |
td | 2.0208 |
td | ANTIGEN |
tr | HLA-A*68:01 109 ITLCFTLKR 0.695 27.24 2.0208 ANTIGEN |
td | HLA-A*68:01 |
td | 109 |
td | ITLCFTLKR |
td | 0.695 |
td | 27.24 |
td | 2.0208 |
td | ANTIGEN |
tr | HLA-A*23:01 107 VFITLCFTL 0.621 60.42 1.2490 ANTIGEN |
td | HLA-A*23:01 |
td | 107 |
td | VFITLCFTL |
td | 0.621 |
td | 60.42 |
td | 1.2490 |
td | ANTIGEN |
tr | QHD43423.2 – – – – – – NO INTERACTION |
td | QHD43423.2 |
td | – |
td | – |
td | – |
td | – |
td | – |
td | – |
td | NO INTERACTION |
table-wrap | Table 3 Probable antigenic epitopes and MHC II allele interactions based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) NCBI-GenBank ID MHC II Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 – – – – – – NO INTERACTION QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-DRB1*04:01 55 WLLWPVTLA 0.282 2375.78 1.0631 ANTIGEN QHD43421.1 HLA-DRB1*01:01 74 VYQLRARSV 0.484 267.05 1.3108 ANTIGEN HLA-DRB1*07:01 74 VYQLRARSV 0.342 1235.08 1.3108 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION |
label | Table 3 |
caption | Probable antigenic epitopes and MHC II allele interactions based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) |
p | Probable antigenic epitopes and MHC II allele interactions based on the NetMHCIIpan 3.2 server and VaxiJen 2.0 scores (≥1.0) for antigenicity prediction. (A – sign indicates no interaction with any HLA allele under consideration.) |
table | NCBI-GenBank ID MHC II Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity QHD43417.1 – – – – – – NO INTERACTION QHD43418.1 – – – – – – NO INTERACTION QHD43419.1 HLA-DRB1*04:01 55 WLLWPVTLA 0.282 2375.78 1.0631 ANTIGEN QHD43421.1 HLA-DRB1*01:01 74 VYQLRARSV 0.484 267.05 1.3108 ANTIGEN HLA-DRB1*07:01 74 VYQLRARSV 0.342 1235.08 1.3108 ANTIGEN QHD43423.2 – – – – – – NO INTERACTION |
tr | NCBI-GenBank ID MHC II Allele POS Core peptide 1-LOG50K Affinity(nM) Vaxijen score Antigenicity |
th | NCBI-GenBank ID |
th | MHC II Allele |
th | POS |
th | Core peptide |
th | 1-LOG50K |
th | Affinity(nM) |
th | Vaxijen score |
th | Antigenicity |
tr | QHD43417.1 – – – – – – NO INTERACTION |
td | QHD43417.1 |
td | – |
td | – |
td | – |
td | – |
td | – |
td | – |
td | NO INTERACTION |
tr | QHD43418.1 – – – – – – NO INTERACTION |
td | QHD43418.1 |
td | – |
td | – |
td | – |
td | – |
td | – |
td | – |
td | NO INTERACTION |
tr | QHD43419.1 HLA-DRB1*04:01 55 WLLWPVTLA 0.282 2375.78 1.0631 ANTIGEN |
td | QHD43419.1 |
td | HLA-DRB1*04:01 |
td | 55 |
td | WLLWPVTLA |
td | 0.282 |
td | 2375.78 |
td | 1.0631 |
td | ANTIGEN |
tr | QHD43421.1 HLA-DRB1*01:01 74 VYQLRARSV 0.484 267.05 1.3108 ANTIGEN |
td | QHD43421.1 |
td | HLA-DRB1*01:01 |
td | 74 |
td | VYQLRARSV |
td | 0.484 |
td | 267.05 |
td | 1.3108 |
td | ANTIGEN |
tr | HLA-DRB1*07:01 74 VYQLRARSV 0.342 1235.08 1.3108 ANTIGEN |
td | HLA-DRB1*07:01 |
td | 74 |
td | VYQLRARSV |
td | 0.342 |
td | 1235.08 |
td | 1.3108 |
td | ANTIGEN |
tr | QHD43423.2 – – – – – – NO INTERACTION |
td | QHD43423.2 |
td | – |
td | – |
td | – |
td | – |
td | – |
td | – |
td | NO INTERACTION |
figure | Fig. 2 Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their Antigenicity. |
label | Fig. 2 |
caption | Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their Antigenicity. |
p | Graphical representation of selected peptides for docking based on their interaction with MHC Class I and Class II HLA alleles along with their Antigenicity. |
sec | 3.2 Structural findings of epitope and MHC HLA-Alleles Epitope structures were obtained with the PEP-FOLD-3.5 web server; the HLA allele structures were retrieved from the RCSB-PDB database. Table 4 lists the crystal or model structure used for each allele, with its reference PDB ID. Table 4 Listing of MHC HLA-Alleles respective Crystal structures/Models with the PDB ID. Allele Name Template structure (PDB-ID) Crystal structure/Model HLA-A*11:01 2HN7 CRYSTAL STRUCTURE HLA-A*23:01 3I6L CRYSTAL STRUCTURE HLA-A*31:01 3RL1 CRYSTAL STRUCTURE HLA-A*68:01 6PBH CRYSTAL STRUCTURE HLA-DRB1*01:01 4AH2 CRYSTAL STRUCTURE HLA-DRB1*04:01 5LAX CRYSTAL STRUCTURE HLA-DRB1*07:01 6BIJ CRYSTAL STRUCTURE |
label | 3.2 |
title | Structural findings of epitope and MHC HLA-Alleles |
p | Epitope structures were obtained with the PEP-FOLD-3.5 web server; the HLA allele structures were retrieved from the RCSB-PDB database. Table 4 lists the crystal or model structure used for each allele, with its reference PDB ID. Table 4 Listing of MHC HLA-Alleles respective Crystal structures/Models with the PDB ID. Allele Name Template structure (PDB-ID) Crystal structure/Model HLA-A*11:01 2HN7 CRYSTAL STRUCTURE HLA-A*23:01 3I6L CRYSTAL STRUCTURE HLA-A*31:01 3RL1 CRYSTAL STRUCTURE HLA-A*68:01 6PBH CRYSTAL STRUCTURE HLA-DRB1*01:01 4AH2 CRYSTAL STRUCTURE HLA-DRB1*04:01 5LAX CRYSTAL STRUCTURE HLA-DRB1*07:01 6BIJ CRYSTAL STRUCTURE |
table-wrap | Table 4 Listing of MHC HLA-Alleles respective Crystal structures/Models with the PDB ID. Allele Name Template structure (PDB-ID) Crystal structure/Model HLA-A*11:01 2HN7 CRYSTAL STRUCTURE HLA-A*23:01 3I6L CRYSTAL STRUCTURE HLA-A*31:01 3RL1 CRYSTAL STRUCTURE HLA-A*68:01 6PBH CRYSTAL STRUCTURE HLA-DRB1*01:01 4AH2 CRYSTAL STRUCTURE HLA-DRB1*04:01 5LAX CRYSTAL STRUCTURE HLA-DRB1*07:01 6BIJ CRYSTAL STRUCTURE |
label | Table 4 |
caption | Listing of MHC HLA-Alleles respective Crystal structures/Models with the PDB ID. |
p | Listing of MHC HLA-Alleles respective Crystal structures/Models with the PDB ID. |
table | Allele Name Template structure (PDB-ID) Crystal structure/Model HLA-A*11:01 2HN7 CRYSTAL STRUCTURE HLA-A*23:01 3I6L CRYSTAL STRUCTURE HLA-A*31:01 3RL1 CRYSTAL STRUCTURE HLA-A*68:01 6PBH CRYSTAL STRUCTURE HLA-DRB1*01:01 4AH2 CRYSTAL STRUCTURE HLA-DRB1*04:01 5LAX CRYSTAL STRUCTURE HLA-DRB1*07:01 6BIJ CRYSTAL STRUCTURE |
tr | Allele Name Template structure (PDB-ID) Crystal structure/Model |
th | Allele Name |
th | Template structure (PDB-ID) |
th | Crystal structure/Model |
tr | HLA-A*11:01 2HN7 CRYSTAL STRUCTURE |
td | HLA-A*11:01 |
td | 2HN7 |
td | CRYSTAL STRUCTURE |
tr | HLA-A*23:01 3I6L CRYSTAL STRUCTURE |
td | HLA-A*23:01 |
td | 3I6L |
td | CRYSTAL STRUCTURE |
tr | HLA-A*31:01 3RL1 CRYSTAL STRUCTURE |
td | HLA-A*31:01 |
td | 3RL1 |
td | CRYSTAL STRUCTURE |
tr | HLA-A*68:01 6PBH CRYSTAL STRUCTURE |
td | HLA-A*68:01 |
td | 6PBH |
td | CRYSTAL STRUCTURE |
tr | HLA-DRB1*01:01 4AH2 CRYSTAL STRUCTURE |
td | HLA-DRB1*01:01 |
td | 4AH2 |
td | CRYSTAL STRUCTURE |
tr | HLA-DRB1*04:01 5LAX CRYSTAL STRUCTURE |
td | HLA-DRB1*04:01 |
td | 5LAX |
td | CRYSTAL STRUCTURE |
tr | HLA-DRB1*07:01 6BIJ CRYSTAL STRUCTURE |
td | HLA-DRB1*07:01 |
td | 6BIJ |
td | CRYSTAL STRUCTURE |
sec | 3.3 Molecular docking analysis It was found that the FTIGTVTLK and ITLCFTLKR epitopes interacted with MHC class I HLA alleles, and the VYQLRARSV epitope interacted with MHC class II HLA alleles, with favorable binding scores and atomic contact energy (ACE) values, as shown in Table 5. The ITLCFTLKR epitope of the ORF-7a protein binds two MHC class I HLA alleles (HLA-A*11:01 and HLA-A*68:01), while the FTIGTVTLK epitope of the ORF-3a protein interacts with one MHC class I HLA allele (HLA-A*68:01). The VYQLRARSV epitope of the ORF-7a protein interacts with two MHC class II HLA alleles (HLA-DRB1*01:01 and HLA-DRB1*07:01). Table 5 Binding Energies, Ace Values for Docked Complexes based on DINC Server and PatchDock Analysis for Putative Epitopes. Epitope HLA Allele Binding score (kcal/mol) Patch dock score ACE Selection FTIGTVTLK HLA-A*68:01 −8.80 8066 178.91 Selected RLWLCWKCR HLA-A*31:01 −4.80 8916 117.53 Rejected GTITVEELK HLA-A*11:01 −3.80 8040 78.78 Rejected ITLCFTLKR HLA-A*11:01 −3.70 8206 −25.88 Selected ITLCFTLKR HLA-A*68:01 −7.60 8136 184.55 Selected VFITLCFTL HLA-A*23:01 −4.40 7706 −134.91 Rejected WLLWPVTLA HLA-DRB1*04:01 −8.60 9432 −150.96 Rejected VYQLRARSV HLA-DRB1*01:01 −6.20 6874 −168.74 Selected VYQLRARSV HLA-DRB1*07:01 −6.20 6842 262.74 Selected Fig. 3, Fig. 4, and Fig. 5 depict the interactions of the three selected T-cell epitopes with their respective MHC class I and II HLA alleles through hydrogen bonds and van der Waals interactions. Following these favorable docking results, the epitopes were subjected to molecular dynamics simulation and assessment of biochemical parameters. Fig. 6 plots the binding scores of the epitopes interacting with the HLA alleles. Fig. 3 FTIGTVTLK epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. 
Threonine at the 2nd, 5th, and 7th positions of the epitope preferentially forms hydrogen bonds owing to its partially charged atoms; the side chains of glutamic acid at the 4th position and lysine at the 8th position can form a salt bridge, while the other amino acids contribute van der Waals interactions. Fig. 4 ITLCFTLKR epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd and 6th positions and cysteine at the 4th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, while the other amino acids contribute van der Waals interactions. Fig. 5 VYQLRARSV epitope interacting with the antigen-binding pocket of the MHC class II allele HLA-DRB1*07:01. Tyrosine at the 2nd, glutamine at the 3rd, and serine at the 8th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms; the arginine side chains at the 5th and 7th positions can form salt bridges, while the other amino acids contribute van der Waals interactions. Fig. 6 Binding energy graphical plot for selected Epitope and HLA-Allelic pair. |
label | 3.3 |
title | Molecular docking analysis |
p | It was found that the FTIGTVTLK and ITLCFTLKR epitopes interacted with MHC class I HLA alleles, and the VYQLRARSV epitope interacted with MHC class II HLA alleles, with favorable binding scores and atomic contact energy (ACE) values, as shown in Table 5. The ITLCFTLKR epitope of the ORF-7a protein binds two MHC class I HLA alleles (HLA-A*11:01 and HLA-A*68:01), while the FTIGTVTLK epitope of the ORF-3a protein interacts with one MHC class I HLA allele (HLA-A*68:01). The VYQLRARSV epitope of the ORF-7a protein interacts with two MHC class II HLA alleles (HLA-DRB1*01:01 and HLA-DRB1*07:01). Table 5 Binding Energies, Ace Values for Docked Complexes based on DINC Server and PatchDock Analysis for Putative Epitopes. Epitope HLA Allele Binding score (kcal/mol) Patch dock score ACE Selection FTIGTVTLK HLA-A*68:01 −8.80 8066 178.91 Selected RLWLCWKCR HLA-A*31:01 −4.80 8916 117.53 Rejected GTITVEELK HLA-A*11:01 −3.80 8040 78.78 Rejected ITLCFTLKR HLA-A*11:01 −3.70 8206 −25.88 Selected ITLCFTLKR HLA-A*68:01 −7.60 8136 184.55 Selected VFITLCFTL HLA-A*23:01 −4.40 7706 −134.91 Rejected WLLWPVTLA HLA-DRB1*04:01 −8.60 9432 −150.96 Rejected VYQLRARSV HLA-DRB1*01:01 −6.20 6874 −168.74 Selected VYQLRARSV HLA-DRB1*07:01 −6.20 6842 262.74 Selected |
table-wrap | Table 5 Binding Energies, Ace Values for Docked Complexes based on DINC Server and PatchDock Analysis for Putative Epitopes. Epitope HLA Allele Binding score (kcal/mol) Patch dock score ACE Selection FTIGTVTLK HLA-A*68:01 −8.80 8066 178.91 Selected RLWLCWKCR HLA-A*31:01 −4.80 8916 117.53 Rejected GTITVEELK HLA-A*11:01 −3.80 8040 78.78 Rejected ITLCFTLKR HLA-A*11:01 −3.70 8206 −25.88 Selected ITLCFTLKR HLA-A*68:01 −7.60 8136 184.55 Selected VFITLCFTL HLA-A*23:01 −4.40 7706 −134.91 Rejected WLLWPVTLA HLA-DRB1*04:01 −8.60 9432 −150.96 Rejected VYQLRARSV HLA-DRB1*01:01 −6.20 6874 −168.74 Selected VYQLRARSV HLA-DRB1*07:01 −6.20 6842 262.74 Selected |
label | Table 5 |
caption | Binding Energies, Ace Values for Docked Complexes based on DINC Server and PatchDock Analysis for Putative Epitopes. |
p | Binding Energies, Ace Values for Docked Complexes based on DINC Server and PatchDock Analysis for Putative Epitopes. |
table | Epitope HLA Allele Binding score (kcal/mol) Patch dock score ACE Selection FTIGTVTLK HLA-A*68:01 −8.80 8066 178.91 Selected RLWLCWKCR HLA-A*31:01 −4.80 8916 117.53 Rejected GTITVEELK HLA-A*11:01 −3.80 8040 78.78 Rejected ITLCFTLKR HLA-A*11:01 −3.70 8206 −25.88 Selected ITLCFTLKR HLA-A*68:01 −7.60 8136 184.55 Selected VFITLCFTL HLA-A*23:01 −4.40 7706 −134.91 Rejected WLLWPVTLA HLA-DRB1*04:01 −8.60 9432 −150.96 Rejected VYQLRARSV HLA-DRB1*01:01 −6.20 6874 −168.74 Selected VYQLRARSV HLA-DRB1*07:01 −6.20 6842 262.74 Selected |
tr | Epitope HLA Allele Binding score (kcal/mol) Patch dock score ACE Selection |
th | Epitope |
th | HLA Allele |
th | Binding score (kcal/mol) |
th | Patch dock score |
th | ACE |
th | Selection |
tr | FTIGTVTLK HLA-A*68:01 −8.80 8066 178.91 Selected |
td | FTIGTVTLK |
td | HLA-A*68:01 |
td | −8.80 |
td | 8066 |
td | 178.91 |
td | Selected |
tr | RLWLCWKCR HLA-A*31:01 −4.80 8916 117.53 Rejected |
td | RLWLCWKCR |
td | HLA-A*31:01 |
td | −4.80 |
td | 8916 |
td | 117.53 |
td | Rejected |
tr | GTITVEELK HLA-A*11:01 −3.80 8040 78.78 Rejected |
td | GTITVEELK |
td | HLA-A*11:01 |
td | −3.80 |
td | 8040 |
td | 78.78 |
td | Rejected |
tr | ITLCFTLKR HLA-A*11:01 −3.70 8206 −25.88 Selected |
td | ITLCFTLKR |
td | HLA-A*11:01 |
td | −3.70 |
td | 8206 |
td | −25.88 |
td | Selected |
tr | ITLCFTLKR HLA-A*68:01 −7.60 8136 184.55 Selected |
td | ITLCFTLKR |
td | HLA-A*68:01 |
td | −7.60 |
td | 8136 |
td | 184.55 |
td | Selected |
tr | VFITLCFTL HLA-A*23:01 −4.40 7706 −134.91 Rejected |
td | VFITLCFTL |
td | HLA-A*23:01 |
td | −4.40 |
td | 7706 |
td | −134.91 |
td | Rejected |
tr | WLLWPVTLA HLA-DRB1*04:01 −8.60 9432 −150.96 Rejected |
td | WLLWPVTLA |
td | HLA-DRB1*04:01 |
td | −8.60 |
td | 9432 |
td | −150.96 |
td | Rejected |
tr | VYQLRARSV HLA-DRB1*01:01 −6.20 6874 −168.74 Selected |
td | VYQLRARSV |
td | HLA-DRB1*01:01 |
td | −6.20 |
td | 6874 |
td | −168.74 |
td | Selected |
tr | VYQLRARSV HLA-DRB1*07:01 −6.20 6842 262.74 Selected |
td | VYQLRARSV |
td | HLA-DRB1*07:01 |
td | −6.20 |
td | 6842 |
td | 262.74 |
td | Selected |
p | Fig. 3, Fig. 4, and Fig. 5 depict the interactions of the three selected T-cell epitopes with their respective MHC class I and II HLA alleles through hydrogen bonds and van der Waals interactions. Following these favorable docking results, the epitopes were subjected to molecular dynamics simulation and assessment of biochemical parameters. Fig. 6 plots the binding scores of the epitopes interacting with the HLA alleles. Fig. 3 FTIGTVTLK epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd, 5th, and 7th positions of the epitope preferentially forms hydrogen bonds owing to its partially charged atoms; the side chains of glutamic acid at the 4th position and lysine at the 8th position can form a salt bridge, while the other amino acids contribute van der Waals interactions. Fig. 4 ITLCFTLKR epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd and 6th positions and cysteine at the 4th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, while the other amino acids contribute van der Waals interactions. Fig. 5 VYQLRARSV epitope interacting with the antigen-binding pocket of the MHC class II allele HLA-DRB1*07:01. Tyrosine at the 2nd, glutamine at the 3rd, and serine at the 8th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms; the arginine side chains at the 5th and 7th positions can form salt bridges, while the other amino acids contribute van der Waals interactions. Fig. 6 Binding energy graphical plot for selected Epitope and HLA-Allelic pair. |
figure | Fig. 3 FTIGTVTLK epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd, 5th, and 7th positions of the epitope preferentially forms hydrogen bonds owing to its partially charged atoms; the side chains of glutamic acid at the 4th position and lysine at the 8th position can form a salt bridge, while the other amino acids contribute van der Waals interactions. |
label | Fig. 3 |
caption | FTIGTVTLK epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd, 5th, and 7th positions of the epitope preferentially forms hydrogen bonds owing to its partially charged atoms; the side chains of glutamic acid at the 4th position and lysine at the 8th position can form a salt bridge, while the other amino acids contribute van der Waals interactions. |
p | FTIGTVTLK epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd, 5th, and 7th positions of the epitope preferentially forms hydrogen bonds owing to its partially charged atoms; the side chains of glutamic acid at the 4th position and lysine at the 8th position can form a salt bridge, while the other amino acids contribute van der Waals interactions. |
figure | Fig. 4 ITLCFTLKR epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd and 6th positions and cysteine at the 4th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, while the other amino acids contribute van der Waals interactions. |
label | Fig. 4 |
caption | ITLCFTLKR epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd and 6th positions and cysteine at the 4th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, while the other amino acids contribute van der Waals interactions. |
p | ITLCFTLKR epitope interacting with the antigen-binding pocket of the MHC class I allele HLA-A*68:01. Threonine at the 2nd and 6th positions and cysteine at the 4th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms, while the other amino acids contribute van der Waals interactions. |
figure | Fig. 5 VYQLRARSV epitope interacting with the antigen-binding pocket of the MHC class II allele HLA-DRB1*07:01. Tyrosine at the 2nd, glutamine at the 3rd, and serine at the 8th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms; the arginine side chains at the 5th and 7th positions can form salt bridges, while the other amino acids contribute van der Waals interactions. |
label | Fig. 5 |
caption | VYQLRARSV epitope interacting with the antigen-binding pocket of the MHC class II allele HLA-DRB1*07:01. Tyrosine at the 2nd, glutamine at the 3rd, and serine at the 8th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms; the arginine side chains at the 5th and 7th positions can form salt bridges, while the other amino acids contribute van der Waals interactions. |
p | VYQLRARSV epitope interacting with the antigen-binding pocket of the MHC class II allele HLA-DRB1*07:01. Tyrosine at the 2nd, glutamine at the 3rd, and serine at the 8th position of the epitope preferentially form hydrogen bonds owing to their partially charged atoms; the arginine side chains at the 5th and 7th positions can form salt bridges, while the other amino acids contribute van der Waals interactions. |
figure | Fig. 6 Binding energy graphical plot for selected Epitope and HLA-Allelic pair. |
label | Fig. 6 |
caption | Binding energy graphical plot for selected Epitope and HLA-Allelic pair. |
p | Binding energy graphical plot for selected Epitope and HLA-Allelic pair. |
sec | 3.4 Molecular dynamics and simulation analysis RMSD values and atomic fluctuations per amino acid residue were obtained for the epitopes interacting with the HLA allele structures; this analysis supports the selection and validation of the best epitope-allele pairs. Only two epitopes, ITLCFTLKR and VYQLRARSV, were identified as probable T-cell epitopes and putative vaccine candidates. Fig. 7 shows the RMSD plot and the atomic fluctuation per residue for the ITLCFTLKR-HLA-A*68:01 complex and for the VYQLRARSV-HLA-DRB1*07:01 complex. Both results were favorable, since well-docked protein-ligand complexes are expected to have RMSD values in the preferred range of 0 to 1.0 Å [13]. Fig. 7 A. RMSD plot for each amino acid residue of the ITLCFTLKR-HLA-A*68:01 complex from molecular dynamics analysis. B. B-factor (atomic fluctuation) values per amino acid residue for the ITLCFTLKR-HLA-A*68:01 docked complex. C. RMSD plot for each amino acid residue of the VYQLRARSV-HLA-DRB1*07:01 complex from molecular dynamics analysis. D. B-factor (atomic fluctuation) values per amino acid residue for the VYQLRARSV-HLA-DRB1*07:01 docked complex. |
label | 3.4 |
title | Molecular dynamics and simulation analysis |
p | RMSD values and atomic fluctuations per amino acid residue were obtained for the epitopes interacting with the HLA allele structures; this analysis supports the selection and validation of the best epitope-allele pairs. Only two epitopes, ITLCFTLKR and VYQLRARSV, were identified as probable T-cell epitopes and putative vaccine candidates. Fig. 7 shows the RMSD plot and the atomic fluctuation per residue for the ITLCFTLKR-HLA-A*68:01 complex and for the VYQLRARSV-HLA-DRB1*07:01 complex. Both results were favorable, since well-docked protein-ligand complexes are expected to have RMSD values in the preferred range of 0 to 1.0 Å [13]. Fig. 7 A. RMSD plot for each amino acid residue of the ITLCFTLKR-HLA-A*68:01 complex from molecular dynamics analysis. B. B-factor (atomic fluctuation) values per amino acid residue for the ITLCFTLKR-HLA-A*68:01 docked complex. C. RMSD plot for each amino acid residue of the VYQLRARSV-HLA-DRB1*07:01 complex from molecular dynamics analysis. D. B-factor (atomic fluctuation) values per amino acid residue for the VYQLRARSV-HLA-DRB1*07:01 docked complex. |
figure | Fig. 7 A. RMSD plot for each amino acid residue of the ITLCFTLKR-HLA-A*68:01 complex from molecular dynamics analysis. B. B-factor (atomic fluctuation) values per amino acid residue for the ITLCFTLKR-HLA-A*68:01 docked complex. C. RMSD plot for each amino acid residue of the VYQLRARSV-HLA-DRB1*07:01 complex from molecular dynamics analysis. D. B-factor (atomic fluctuation) values per amino acid residue for the VYQLRARSV-HLA-DRB1*07:01 docked complex. |
label | Fig. 7 |
caption | A. RMSD plot for each amino acid residue of the ITLCFTLKR-HLA-A*68:01 complex from molecular dynamics analysis. B. B-factor (atomic fluctuation) values per amino acid residue for the ITLCFTLKR-HLA-A*68:01 docked complex. C. RMSD plot for each amino acid residue of the VYQLRARSV-HLA-DRB1*07:01 complex from molecular dynamics analysis. D. B-factor (atomic fluctuation) values per amino acid residue for the VYQLRARSV-HLA-DRB1*07:01 docked complex. |
p | A. RMSD plot for each amino acid residue of the ITLCFTLKR-HLA-A*68:01 complex from molecular dynamics analysis. B. B-factor (atomic fluctuation) values per amino acid residue for the ITLCFTLKR-HLA-A*68:01 docked complex. C. RMSD plot for each amino acid residue of the VYQLRARSV-HLA-DRB1*07:01 complex from molecular dynamics analysis. D. B-factor (atomic fluctuation) values per amino acid residue for the VYQLRARSV-HLA-DRB1*07:01 docked complex. |
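The RMSD criterion used in Section 3.4 can be illustrated with a minimal sketch. It assumes the two coordinate sets are already superimposed and matched atom-for-atom; the coordinates below are hypothetical and are not taken from the MDWeb output of this study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. Angstrom)
    between two equally sized lists of (x, y, z) points.
    Assumes the structures are already aligned; no superposition is done here."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates for three atoms (illustration only).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
snapshot = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]

value = rmsd(reference, snapshot)
# The text treats 0 to 1.0 Angstrom as the preferred range.
within_preferred_range = 0.0 <= value <= 1.0
print(round(value, 3), within_preferred_range)
```

In practice a trajectory tool would first superimpose each frame on the reference before computing this quantity; the sketch shows only the final averaging step.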
sec | 3.5 Toxicity analysis, Ramachandran Plot analysis, and population coverage results ToxinPred 4.0 server results (Table 6) indicate that the finalized T-cell epitopes are non-toxic from a biochemical perspective. Table 6 Results of ToxinPred on probable antigens. Peptide/Probable antigen SVM score Hydrophilicity Molecular weight Toxicity FTIGTVTLK −1.36 −1.23 979.32 NON-TOXIN GTITVEELK −0.98 0.34 989.27 NON-TOXIN ITLCFTLKR −1.32 −0.41 1094.51 NON-TOXIN VFITLCFTL −1.21 −1.52 1056.46 NON-TOXIN WLLWPVTLA −1.18 −1.62 1098.49 NON-TOXIN VYQLRARSV −1.07 −0.12 1091.40 NON-TOXIN Ramachandran plot analysis (Fig. 8 A and B) suggests that most residues fall in the allowed and favored regions, which gives more confidence in the structural conformation of the targeted T-cell epitopes. Fig. 8 A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored regions under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored regions under Ramachandran plot analysis. MHCPred results (Table 7) provide quantitative IC50 estimates for the MHC I and MHC II alleles with their respective epitopes; these values suggest elicitation of an immune response when the data are used in a population coverage analysis. Table 7 MHCPred results depict IC50 Values for HLA Alleles and confidence of the prediction. HLA Alleles Amino acid groups Predicted -logIC50 (M) Predicted IC50 Value (nM) Confidence of prediction (Max = 1) HLA-A*68:01 FTIGTVTLK 7.116 76.56 1.00 HLA-A*11:01 ITLCFTLKR 7.028 93.76 1.00 HLA-A*68:01 ITLCFTLKR 6.282 522.40 0.78 HLA-DRB1*01:01 VYQLRARSV 7.624 23.77 0.89 HLA-DRB1*07:01 VYQLRARSV 6.734 184.50 0.89 IEDB population coverage analysis suggests that the ITLCFTLKR and VYQLRARSV epitopes exhibit suitable population coverage, as depicted in Fig. 9 A and B. This leaves only two probable epitopes for final selection in vaccine design. Fig. 9 A. 
Graphical representation of population conservancy analysis of ITLCFTLKR Epitope. B. Graphical representation of population conservancy analysis of VYQLRARSV Epitope. ProtParam analysis (Table 8) further assesses the stability of the considered epitopes and singles out one epitope, ITLCFTLKR, as the final candidate. This epitope has an instability index of 35.68 and a grand average of hydropathicity (GRAVY) of 0.844, and its estimated half-life is 20 h in mammalian reticulocytes. Table 8 ProtParam analysis for selected epitopes. Selected Epitope GRAVY Score Instability Index (Indication) Estimated Half-Life(Mammalian reticulocytes) Theoretical pI Aliphatic Index ITLCFTLKR 0.844 35.68(Stable) 20 Hours 9.51 130.00 VYQLRARSV −0.067 70.73(Unstable) 100 Hours 10.83 118.89 |
label | 3.5 |
title | Toxicity analysis, Ramachandran Plot analysis, and population coverage results |
p | ToxinPred 4.0 server results (Table 6) indicate that the finalized T-cell epitopes are non-toxic from a biochemical perspective. Table 6 Results of ToxinPred on probable antigens. Peptide/Probable antigen SVM score Hydrophilicity Molecular weight Toxicity FTIGTVTLK −1.36 −1.23 979.32 NON-TOXIN GTITVEELK −0.98 0.34 989.27 NON-TOXIN ITLCFTLKR −1.32 −0.41 1094.51 NON-TOXIN VFITLCFTL −1.21 −1.52 1056.46 NON-TOXIN WLLWPVTLA −1.18 −1.62 1098.49 NON-TOXIN VYQLRARSV −1.07 −0.12 1091.40 NON-TOXIN |
table-wrap | Table 6 Results of ToxinPred on probable antigens. Peptide/Probable antigen SVM score Hydrophilicity Molecular weight Toxicity FTIGTVTLK −1.36 −1.23 979.32 NON-TOXIN GTITVEELK −0.98 0.34 989.27 NON-TOXIN ITLCFTLKR −1.32 −0.41 1094.51 NON-TOXIN VFITLCFTL −1.21 −1.52 1056.46 NON-TOXIN WLLWPVTLA −1.18 −1.62 1098.49 NON-TOXIN VYQLRARSV −1.07 −0.12 1091.40 NON-TOXIN |
label | Table 6 |
caption | Results of ToxinPred on probable antigens. |
p | Results of ToxinPred on probable antigens. |
table | Peptide/Probable antigen SVM score Hydrophilicity Molecular weight Toxicity FTIGTVTLK −1.36 −1.23 979.32 NON-TOXIN GTITVEELK −0.98 0.34 989.27 NON-TOXIN ITLCFTLKR −1.32 −0.41 1094.51 NON-TOXIN VFITLCFTL −1.21 −1.52 1056.46 NON-TOXIN WLLWPVTLA −1.18 −1.62 1098.49 NON-TOXIN VYQLRARSV −1.07 −0.12 1091.40 NON-TOXIN |
tr | Peptide/Probable antigen SVM score Hydrophilicity Molecular weight Toxicity |
th | Peptide/Probable antigen |
th | SVM score |
th | Hydrophilicity |
th | Molecular weight |
th | Toxicity |
tr | FTIGTVTLK −1.36 −1.23 979.32 NON-TOXIN |
td | FTIGTVTLK |
td | −1.36 |
td | −1.23 |
td | 979.32 |
td | NON-TOXIN |
tr | GTITVEELK −0.98 0.34 989.27 NON-TOXIN |
td | GTITVEELK |
td | −0.98 |
td | 0.34 |
td | 989.27 |
td | NON-TOXIN |
tr | ITLCFTLKR −1.32 −0.41 1094.51 NON-TOXIN |
td | ITLCFTLKR |
td | −1.32 |
td | −0.41 |
td | 1094.51 |
td | NON-TOXIN |
tr | VFITLCFTL −1.21 −1.52 1056.46 NON-TOXIN |
td | VFITLCFTL |
td | −1.21 |
td | −1.52 |
td | 1056.46 |
td | NON-TOXIN |
tr | WLLWPVTLA −1.18 −1.62 1098.49 NON-TOXIN |
td | WLLWPVTLA |
td | −1.18 |
td | −1.62 |
td | 1098.49 |
td | NON-TOXIN |
tr | VYQLRARSV −1.07 −0.12 1091.40 NON-TOXIN |
td | VYQLRARSV |
td | −1.07 |
td | −0.12 |
td | 1091.40 |
td | NON-TOXIN |
p | Ramachandran plot analysis (Fig. 8 A and B) suggests that most residues fall in the allowed and favored regions, which gives more confidence in the structural conformation of the targeted T-cell epitopes. Fig. 8 A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored regions under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored regions under Ramachandran plot analysis. |
figure | Fig. 8 A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored regions under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored regions under Ramachandran plot analysis. |
label | Fig. 8 |
caption | A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored regions under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored regions under Ramachandran plot analysis. |
p | A. 99.8% of the residues of the ITLCFTLKR epitope were in the allowed and favored regions under Ramachandran plot analysis. B. 99.8% of the residues of the VYQLRARSV epitope were in the allowed and favored regions under Ramachandran plot analysis. |
p | MHCPred results (Table 7) provide quantitative IC50 estimates for the MHC I and MHC II alleles with their respective epitopes; these values suggest elicitation of an immune response when the data are used in a population coverage analysis. Table 7 MHCPred results depict IC50 Values for HLA Alleles and confidence of the prediction. HLA Alleles Amino acid groups Predicted -logIC50 (M) Predicted IC50 Value (nM) Confidence of prediction (Max = 1) HLA-A*68:01 FTIGTVTLK 7.116 76.56 1.00 HLA-A*11:01 ITLCFTLKR 7.028 93.76 1.00 HLA-A*68:01 ITLCFTLKR 6.282 522.40 0.78 HLA-DRB1*01:01 VYQLRARSV 7.624 23.77 0.89 HLA-DRB1*07:01 VYQLRARSV 6.734 184.50 0.89 |
table-wrap | Table 7 MHCPred results depict IC50 Values for HLA Alleles and confidence of the prediction. HLA Alleles Amino acid groups Predicted -logIC50 (M) Predicted IC50 Value (nM) Confidence of prediction (Max = 1) HLA-A*68:01 FTIGTVTLK 7.116 76.56 1.00 HLA-A*11:01 ITLCFTLKR 7.028 93.76 1.00 HLA-A*68:01 ITLCFTLKR 6.282 522.40 0.78 HLA-DRB1*01:01 VYQLRARSV 7.624 23.77 0.89 HLA-DRB1*07:01 VYQLRARSV 6.734 184.50 0.89 |
label | Table 7 |
caption | MHCPred results depict IC50 Values for HLA Alleles and confidence of the prediction. |
p | MHCPred results depict IC50 Values for HLA Alleles and confidence of the prediction. |
table | HLA Alleles Amino acid groups Predicted -logIC50 (M) Predicted IC50 Value (nM) Confidence of prediction (Max = 1) HLA-A*68:01 FTIGTVTLK 7.116 76.56 1.00 HLA-A*11:01 ITLCFTLKR 7.028 93.76 1.00 HLA-A*68:01 ITLCFTLKR 6.282 522.40 0.78 HLA-DRB1*01:01 VYQLRARSV 7.624 23.77 0.89 HLA-DRB1*07:01 VYQLRARSV 6.734 184.50 0.89 |
tr | HLA Alleles Amino acid groups Predicted -logIC50 (M) Predicted IC50 Value (nM) Confidence of prediction (Max = 1) |
th | HLA Alleles |
th | Amino acid groups |
th | Predicted -logIC50 (M) |
th | Predicted IC50 Value (nM) |
th | Confidence of prediction (Max = 1) |
tr | HLA-A*68:01 FTIGTVTLK 7.116 76.56 1.00 |
td | HLA-A*68:01 |
td | FTIGTVTLK |
td | 7.116 |
td | 76.56 |
td | 1.00 |
tr | HLA-A*11:01 ITLCFTLKR 7.028 93.76 1.00 |
td | HLA-A*11:01 |
td | ITLCFTLKR |
td | 7.028 |
td | 93.76 |
td | 1.00 |
tr | HLA-A*68:01 ITLCFTLKR 6.282 522.40 0.78 |
td | HLA-A*68:01 |
td | ITLCFTLKR |
td | 6.282 |
td | 522.40 |
td | 0.78 |
tr | HLA-DRB1*01:01 VYQLRARSV 7.624 23.77 0.89 |
td | HLA-DRB1*01:01 |
td | VYQLRARSV |
td | 7.624 |
td | 23.77 |
td | 0.89 |
tr | HLA-DRB1*07:01 VYQLRARSV 6.734 184.50 0.89 |
td | HLA-DRB1*07:01 |
td | VYQLRARSV |
td | 6.734 |
td | 184.50 |
td | 0.89 |
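The two numeric columns of Table 7 are related by a simple unit conversion: a predicted -logIC50 expressed in molar units corresponds to an IC50 in nM of 10^(9 - value). The sketch below applies that conversion to the Table 7 rows; the function and variable names are illustrative and are not part of MHCPred.

```python
def neg_log_ic50_to_nm(neg_log_ic50_molar):
    """Convert -log10(IC50 in mol/L) to IC50 in nmol/L.
    IC50 [M] = 10 ** (-p); multiplying by 1e9 gives nM."""
    return 10 ** (9 - neg_log_ic50_molar)

# (allele, epitope, predicted -logIC50 in M), copied from Table 7.
table7 = [
    ("HLA-A*68:01", "FTIGTVTLK", 7.116),
    ("HLA-A*11:01", "ITLCFTLKR", 7.028),
    ("HLA-A*68:01", "ITLCFTLKR", 6.282),
    ("HLA-DRB1*01:01", "VYQLRARSV", 7.624),
    ("HLA-DRB1*07:01", "VYQLRARSV", 6.734),
]

for allele, epitope, neg_log in table7:
    print(allele, epitope, round(neg_log_ic50_to_nm(neg_log), 2))
```

Lower IC50 values indicate stronger predicted binding, so within Table 7 the VYQLRARSV and HLA-DRB1*01:01 pair has the strongest predicted affinity.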
p | IEDB population coverage analysis suggests that the ITLCFTLKR and VYQLRARSV epitopes exhibit suitable population coverage, as depicted in Fig. 9 A and B. This leaves only two probable epitopes for final selection in vaccine design. Fig. 9 A. Graphical representation of population conservancy analysis of ITLCFTLKR Epitope. B. Graphical representation of population conservancy analysis of VYQLRARSV Epitope. |
figure | Fig. 9 A. Graphical representation of population conservancy analysis of ITLCFTLKR Epitope. B. Graphical representation of population conservancy analysis of VYQLRARSV Epitope. |
label | Fig. 9 |
caption | A. Graphical representation of population conservancy analysis of ITLCFTLKR Epitope. B. Graphical representation of population conservancy analysis of VYQLRARSV Epitope. |
p | A. Graphical representation of population conservancy analysis of ITLCFTLKR Epitope. B. Graphical representation of population conservancy analysis of VYQLRARSV Epitope. |
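As a rough illustration of what a population coverage figure means, the sketch below estimates the fraction of individuals carrying at least one epitope-binding allele at a single HLA locus under Hardy-Weinberg proportions. This is a simplified textbook calculation, not the IEDB tool's implementation, and the allele frequencies are hypothetical placeholders rather than values from this study.

```python
def single_locus_coverage(binding_allele_freqs):
    """Fraction of diploid individuals carrying at least one binding allele
    at one locus, assuming Hardy-Weinberg proportions.
    If F is the summed frequency of binding alleles, a person lacks all of
    them with probability (1 - F) ** 2."""
    total = sum(binding_allele_freqs)
    if not 0.0 <= total <= 1.0:
        raise ValueError("summed allele frequency must lie in [0, 1]")
    return 1.0 - (1.0 - total) ** 2

# Hypothetical frequencies for two alleles that bind the same epitope.
hypothetical_freqs = {"HLA-A*11:01": 0.05, "HLA-A*68:01": 0.03}

coverage = single_locus_coverage(hypothetical_freqs.values())
print(f"{coverage:.4f}")
```

Real coverage estimates depend on population-specific allele frequency tables and on combining several loci, which this single-locus sketch leaves out.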
p | ProtParam analysis (Table 8) further assesses the stability of the considered epitopes and singles out one epitope, ITLCFTLKR, as the final candidate. This epitope has an instability index of 35.68 and a grand average of hydropathicity (GRAVY) of 0.844, and its estimated half-life is 20 h in mammalian reticulocytes. Table 8 ProtParam analysis for selected epitopes. Selected Epitope GRAVY Score Instability Index (Indication) Estimated Half-Life(Mammalian reticulocytes) Theoretical pI Aliphatic Index ITLCFTLKR 0.844 35.68(Stable) 20 Hours 9.51 130.00 VYQLRARSV −0.067 70.73(Unstable) 100 Hours 10.83 118.89 |
table-wrap | Table 8 ProtParam analysis for selected epitopes. Selected Epitope GRAVY Score Instability Index (Indication) Estimated Half-Life(Mammalian reticulocytes) Theoretical pI Aliphatic Index ITLCFTLKR 0.844 35.68(Stable) 20 Hours 9.51 130.00 VYQLRARSV −0.067 70.73(Unstable) 100 Hours 10.83 118.89 |
label | Table 8 |
caption | ProtParam analysis for selected epitopes. |
p | ProtParam analysis for selected epitopes. |
table | Selected Epitope GRAVY Score Instability Index (Indication) Estimated Half-Life(Mammalian reticulocytes) Theoretical pI Aliphatic Index ITLCFTLKR 0.844 35.68(Stable) 20 Hours 9.51 130.00 VYQLRARSV −0.067 70.73(Unstable) 100 Hours 10.83 118.89 |
tr | Selected Epitope GRAVY Score Instability Index (Indication) Estimated Half-Life(Mammalian reticulocytes) Theoretical pI Aliphatic Index |
th | Selected Epitope |
th | GRAVY Score |
th | Instability Index (Indication) |
th | Estimated Half-Life(Mammalian reticulocytes) |
th | Theoretical pI |
th | Aliphatic Index |
tr | ITLCFTLKR 0.844 35.68(Stable) 20 Hours 9.51 130.00 |
td | ITLCFTLKR |
td | 0.844 |
td | 35.68(Stable) |
td | 20 Hours |
td | 9.51 |
td | 130.00 |
tr | VYQLRARSV −0.067 70.73(Unstable) 100 Hours 10.83 118.89 |
td | VYQLRARSV |
td | −0.067 |
td | 70.73(Unstable) |
td | 100 Hours |
td | 10.83 |
td | 118.89 |
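Two of the Table 8 columns can be recomputed by hand, which makes the table easier to check. The sketch below is an independent re-implementation, not ProtParam's code: it computes GRAVY as the mean Kyte-Doolittle hydropathy per residue, and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is the mole percent of each residue. The instability index and pI are not computed here because they need larger lookup tables.

```python
# Kyte-Doolittle hydropathy scale (one value per standard amino acid).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(peptide):
    """Grand average of hydropathicity: summed hydropathy / sequence length."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)

def aliphatic_index(peptide):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    def mole_percent(aa):
        return 100.0 * peptide.count(aa) / len(peptide)
    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))

for epitope in ("ITLCFTLKR", "VYQLRARSV"):
    print(epitope, round(gravy(epitope), 3), round(aliphatic_index(epitope), 2))
```

For both epitopes the rounded results agree with the GRAVY and aliphatic index columns of Table 8. A positive GRAVY, as for ITLCFTLKR, indicates a predominantly hydrophobic peptide.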
sec | 4 Discussion In this study, SARS-COV-2 virus proteins were analyzed using in-silico methods; the results can be carried forward to vaccine trials, following earlier successes in similar SARS-COV studies and later in the development of polyclonal antibodies [27]. We obtained two epitopes, ITLCFTLKR and VYQLRARSV, after successful docking and molecular dynamics simulation; these two epitopes were then subjected to population coverage and toxicity analysis. In another study, on MERS-COV, nucleocapsid peptides were likewise used successfully for T-cell epitope prediction [36]. The IEDB and NCBI-GenBank databases were used to analyze sequence homology and to predict targets for SARS-COV-2 through viral protein identification, as in related studies [14]; ViPR (Virus Pathogen Database and Analysis Resource) also depends primarily on IEDB and GenBank [30]. We analyzed five different proteins of SARS-COV-2 in the present study (because of their availability in the NCBI-GenBank database and their important structural role in SARS-COV-2 [14]) and identified T-cell epitopes that can be taken forward to wet-lab work, saving time. A very recent study reported different epitopes for SARS-COV-2 using in-silico approaches focused only on the surface glycoprotein [3]; our study differs in that we analyzed a different group of SARS-COV-2 proteins to identify short T-cell epitopes specific to diverse MHC I and MHC II HLA alleles. For SARS-CoV, HLA-B*4601, HLA-B*0703, and HLA-DR B1*1202 are reported to be activated [26], whereas the present study found interactions with different MHC I and II allelic forms, namely HLA-A*11:01, HLA-A*68:01, HLA-DRB1*01:01, and HLA-DRB1*07:01. 
Based on prior literature, CD4+ and CD8+ memory T cells are anticipated to persist for four years, as in individuals recovered from SARS-CoV, and to show T-cell proliferation, DTH response, and production of IFN-γ [12]. We surmise that our screen can be more effective and useful. Molecular docking initially revealed three epitopes, but molecular dynamics simulation showed the best interactions for two of them, ITLCFTLKR and VYQLRARSV, with acceptable stability as analyzed with MDWeb; these were identified using the best available tools and easy-to-apply methods. One recent study focused on developing monoclonal antibodies such as CR-3022 against the spike protein of SARS-COV-2, which also interacts with the angiotensin-converting enzyme (ACE) of the human respiratory epithelium and requires complex neutralizing mechanisms for several binding domains [35]; in our study, by contrast, the putative T-cell epitopes can interact directly with MHC allelic sets, which can be useful for developing immunization against SARS-COV-2. ProtParam [42] analysis further assesses the stability of the considered epitopes and singles out one epitope, ITLCFTLKR, as the final candidate. This epitope has an instability index of 35.68 and a grand average of hydropathicity (GRAVY) of 0.844, and its estimated half-life is 20 h in mammalian reticulocytes. Satisfactory population coverage was observed for the targeted epitope-HLA allele complexes at the worldwide, South Asia, and India levels. The biochemical integrity of the epitope structures was further supported by Ramachandran plot analysis. Both epitopes were non-toxic and non-allergenic, and showed good antigenicity. 
In a similar preliminary analysis of COVID-19 vaccine targets [1], the investigators used the spike and nucleocapsid protein sequences of SARS-COV, which are homologous to some extent with SARS-COV-2 proteins, to determine multiple epitopes for vaccine prediction. In our study, two of the five proteins, ORF-3a and ORF-7a, which are specific to SARS-COV-2, were found to contain putative T-cell epitope determinants; these proteins are also important for viral replication [41]. The biochemical parameters, together with the advanced HMM- and ANN-based algorithms in the selected immuno-informatics tools, were very useful in presenting a clear picture of the predicted epitopes for designing a vaccine against SARS-COV-2. The main limitation, which also defines the future scope, is that these easily synthesized peptides still need to be tested in vitro for practical validation. |
label | 4 |
title | Discussion |
p | In this study, SARS-COV-2 proteins were analyzed using in-silico methods. The resulting predictions can be carried forward to vaccine trials, following earlier successes with similar SARS-COV studies that were later borne out in the development of polyclonal antibodies [27]. We obtained two epitopes, ITLCFTLKR and VYQLRARSV, after docking and molecular dynamics simulation, and then subjected both to population coverage and toxicity analysis. In a comparable study on MERS-COV, nucleocapsid peptides were used successfully for T-cell epitope prediction [36]. The IEDB and NCBI-GenBank databases were used to analyze sequence homology and to identify viral protein targets for SARS-COV-2, as in related studies [14]; ViPR (Virus Pathogen Database and Analysis Resource) also depends primarily on IEDB and GenBank [30]. We analyzed five SARS-COV-2 proteins, chosen for their availability in the NCBI-GenBank database and their structural importance in SARS-COV-2 [14], and identified T-cell epitopes that can be taken to wet-lab testing, saving time. A very recent in-silico study reported different epitopes for SARS-COV-2 but focused only on the surface glycoprotein [3]; in contrast, we analyzed a different group of SARS-COV-2 proteins to identify short T-cell epitopes specific to diverse MHC I and MHC II HLA alleles. |
p | For SARS-CoV, HLA-B*4601, HLA-B*0703, and HLA-DR B1*1202 are reported to be activated [26]; the epitopes in the present study interact with different MHC I and II allelic forms, namely HLA-A*11:01, HLA-A*68:01, HLA-DRB1*01:01, and HLA-DRB1*07:01. Based on prior literature, CD4+ and CD8+ memory T cells are anticipated to persist for four years, as in SARS-CoV-recovered individuals, and to show T-cell proliferation, DTH response, and production of IFN-γ [12]. We surmise that our screen can be more effective and useful. Molecular docking initially identified three epitopes, but molecular dynamics simulation showed the best interactions for two of them, ITLCFTLKR and VYQLRARSV, with acceptable stability as assessed using MDWeb; both were identified using the best available tools and easy-to-apply methods. One recent study focused on developing monoclonal antibodies such as CR-3022 against the spike protein of SARS-COV-2, which also interacts with ACE2 (angiotensin-converting enzyme 2) of the human respiratory epithelium and requires complex neutralizing mechanisms for several binding domains [35]; in our study, by contrast, the putative T-cell epitopes can interact directly with sets of MHC alleles, which can be useful for developing immunization against SARS-COV-2. ProtParam [42] analysis further assessed the stability of the candidate epitopes, and one epitope, ITLCFTLKR, was selected as the final candidate. This epitope has an instability index of 35.68, below the threshold of 40 above which ProtParam classifies a protein as unstable, and a grand average of hydropathicity (GRAVY) of 0.844; its estimated half-life is 20 h in mammalian reticulocytes. |
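The GRAVY value reported for ITLCFTLKR can be cross-checked independently of the web server. The following is a minimal sketch, assuming GRAVY is the mean Kyte-Doolittle hydropathy over all residues (the definition given in the ProtParam documentation); the function name `gravy` and the dictionary name are illustrative and are not part of any tool used in the study.

```python
# Kyte-Doolittle hydropathy scale (standard published values per residue).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Return the grand average of hydropathicity of a peptide.

    Assumption: GRAVY = (sum of Kyte-Doolittle values) / (sequence length).
    """
    seq = peptide.strip().upper()
    if not seq:
        raise ValueError("peptide sequence is empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    # The two epitopes named in the Discussion.
    for epitope in ("ITLCFTLKR", "VYQLRARSV"):
        print(epitope, round(gravy(epitope), 3))
```

For ITLCFTLKR the residue values sum to 7.6 over nine residues, giving about 0.844, which agrees with the figure quoted from ProtParam. The instability index and half-life are not reproduced here because they depend on lookup tables that this sketch does not include.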
p | Satisfactory population coverage was observed for the targeted epitope-HLA allele complexes at the worldwide, South Asia, and India levels. The structural integrity of the epitopes was further supported by Ramachandran plot analysis. Both epitopes were non-toxic and non-allergenic and showed good antigenicity. In a similar preliminary analysis of COVID-19 vaccine targets [1], the investigators used spike and nucleocapsid protein sequences of SARS-COV, which are homologous to some extent with SARS-COV-2 proteins, to determine multiple epitopes for vaccine prediction. In our study, two of the five proteins analyzed, ORF-3a and ORF-7a, both specific to SARS-COV-2, were found to contain putative T-cell epitope determinants; these proteins are also important for viral replication [41]. Both the biochemical parameters and the advanced HMM- and ANN-based algorithms in the selected immuno-informatics tools helped give a clear picture of the predicted epitopes for designing a vaccine against SARS-COV-2. The main limitation, which also defines future work, is that these easily synthesized peptides still need to be tested in vitro for practical validation. |
label | 5 |
title | Conclusion |
p | The ITLCFTLKR epitope was selected for designing a vaccine against SARS-COV-2. This epitope has good antigenicity, binds actively to MHC HLA alleles, and has maximum population coverage across different geographical regions. Therefore, this peptide can be used in vaccine design against SARS-COV-2 after wet-lab verification. This novel approach can also help life science research groups reduce the time, cost, and physical trial-and-error effort involved. |
title | Ethical approval |
p | I confirm that the authors did not perform any experiments on humans or animals. |
title | Declaration of competing interest |
p | I confirm that the authors declare that they have no conflict of interest. |
label | Appendix A |
title | Supplementary data |
p | The following is the Supplementary data to this article: Multimedia component 1 |
caption | Multimedia component 1 |
title | Multimedia component 1 |
title | Acknowledgement |
p | We acknowledge the support provided by the Computational Lab of Lovely Professional University. |
label | Appendix A |
p | Supplementary data to this article can be found online at https://doi.org/10.1016/j.imu.2020.100338. |