Results and discussion

Virtual screening and bioavailability analysis

The present study began by retrieving compounds structurally similar to crizotinib from the PubChem database. Crizotinib was used as the query molecule, and a similarity cutoff of about 99 % was applied. The search yielded a total of 63 compounds, which were used for further study. The Molinspiration program was used to predict the bioavailability of crizotinib and the lead compounds. Initially, the properties of crizotinib were calculated with Molinspiration (Fig. 1) and used as a control for screening the other lead compounds. The result is shown in Table 1. It is clear from the table that three compounds, CID: 11656144, CID: 11502981 and CID: 58659185, each showed one violation of the rule of five. The remaining 60 compounds have zero violations of the rule of five. This leads to the conclusion that these 60 compounds are predicted to have favorable oral bioavailability within our dataset.

Fig. 1 Molinspiration property explorer showing molecular properties of crizotinib

Table 1 Calculations of molecular properties of crizotinib and lead compounds using Molinspiration

S. no  Compound       miLogP  TPSA    MW       nON  nOHNH  nviolations  Volume
1      Crizotinib     4.006   78.002  450.345  6    3      0            375.175
2      CID: 11597571  4.006   78.002  450.345  6    3      0            375.175
3      CID: 11626560  4.006   78.002  450.345  6    3      0            375.175
4      CID: 53234260  4.006   78.002  450.345  6    3      0            375.175
5      CID: 53234326  4.006   78.002  450.345  6    3      0            375.175
6      CID: 56671814  4.006   78.002  450.345  6    3      0            375.175
7      CID: 60197531  4.006   78.002  450.345  6    3      0            375.175
8      CID: 60197626  4.006   78.002  450.345  6    3      0            375.175
9      CID: 60198523  4.006   78.002  450.345  6    3      0            375.175
10     CID: 60198524  4.006   78.002  450.345  6    3      0            375.175
11     CID: 60198525  4.006   78.002  450.345  6    3      0            375.175
12     CID: 60199015  4.006   78.002  450.345  6    3      0            375.175
13     CID: 60199016  4.006   78.002  450.345  6    3      0            375.175
14     CID: 60199073  4.006   78.002  450.345  6    3      0            375.175
15     CID: 60199075  4.006   78.002  450.345  6    3      0            375.175
16     CID: 60199076  4.006   78.002  450.345  6    3      0            375.175
17     CID: 60199077  4.006   78.002  450.345  6    3      0            375.175
18     CID: 62705017  4.006   78.002  450.345  6    3      0            375.175
19     CID: 68625002  4.752   78.002  478.399  6    3      0            408.564
20     CID: 54613769  4.006   78.002  450.345  6    3      0            375.175
21     CID: 11662380  4.006   78.002  450.345  6    3      0            375.175
22     CID: 11626823  4.389   78.002  464.372  6    3      0            391.977
23     CID: 58659191  4.098   78.002  468.335  6    3      0            380.107
24     CID: 44560358  3.643   78.002  436.318  6    3      0            358.589
25     CID: 71239831  4.479   78.002  490.41   6    3      0            414.441
26     CID: 71239833  4.479   78.002  490.41   6    3      0            414.441
27     CID: 71240010  4.479   78.002  490.41   6    3      0            414.441
28     CID: 71240011  4.479   78.002  490.41   6    3      0            414.441
29     CID: 11496366  4.602   69.213  464.372  6    2      0            392.118
30     CID: 11562021  4.978   69.213  478.399  6    2      0            408.92
31     CID: 11626824  4.602   69.213  464.372  6    2      0            392.118
32     CID: 11656144  5.275   69.213  492.426  6    2      1            425.507
33     CID: 11598102  4.734   78.002  476.383  6    3      0            397.989
34     CID: 11641497  3.508   81.24   479.387  7    3      0            404.735
35     CID: 11690598  3.492   78.002  433.89   6    3      0            366.571
36     CID: 68563708  3.492   78.002  433.89   6    3      0            366.571
37     CID: 11562217  4.387   93.005  489.382  7    2      0            409.218
38     CID: 11612136  4.556   75.209  451.329  6    2      0            371.758
39     CID: 58659130  3.492   78.002  433.89   6    3      0            366.571
40     CID: 11625675  4.921   65.975  409.292  5    2      0            339.53
41     CID: 67084493  4.58    78.002  476.383  6    3      0            398.204
42     CID: 11676204  3.967   78.002  424.307  6    3      0            352.147
43     CID: 11684380  4.985   69.213  478.399  6    2      0            408.92
44     CID: 58659192  4.825   78.002  494.373  6    3      0            402.92
45     CID: 59599446  3.445   98.230  480.371  7    4      0            399.671
46     CID: 11503318  4.357   78.002  450.345  6    3      0            375.175
47     CID: 11510387  4.086   78.002  436.318  6    3      0            358.374
48     CID: 11568619  4.357   78.002  450.345  6    3      0            375.175
49     CID: 11575401  3.816   78.002  422.291  6    3      0            341.572
50     CID: 11647760  4.086   78.002  436.318  6    3      0            358.374
51     CID: 58659136  4.086   78.002  436.318  6    3      0            358.374
52     CID: 58659189  4.291   78.002  446.382  6    3      0            386.805
53     CID: 72986690  4.357   78.002  450.345  6    3      0            375.175
54     CID: 11502981  5.581   65.975  435.33   5    2      1            362.773
55     CID: 11676140  4.842   65.975  421.303  5    2      0            345.971
56     CID: 58659141  4.939   75.209  465.356  6    2      0            388.56
57     CID: 11705849  4.978   69.213  490.41   6    2      0            414.932
58     CID: 11719356  3.956   78.002  450.345  6    3      0            375.175
59     CID: 11647759  4.199   78.002  436.318  6    3      0            358.374
60     CID: 21110753  4.058   78.447  480.371  7    2      0            401.318
61     CID: 58659185  5.304   65.975  423.319  5    2      1            356.331
62     CID: 21110757  4.182   65.975  381.238  5    2      0            306.141
63     CID: 73386634  4.182   65.975  381.238  5    2      0            306.141
64     CID: 11647795  4.285   75.209  437.302  6    2      0            354.956

Bold indicates ADME screened compounds based on Lipinski rule of five

It is known that, to pass oral bioavailability criteria, the number of rotatable bonds should be <10 (Oprea 2000). Therefore, we further refined these hits by restricting the number of rotatable bonds to fewer than 10. The result is presented in Table 2. It is clear from Table 2 that the 60 compounds retained after the ADME analysis possess a reasonable number of rotatable bonds (<10). This result indicates that these compounds have the potential to become lead compounds.
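The screening logic described above can be sketched in a few lines of Python. This is only an illustration, not the Molinspiration implementation: it assumes that nviolations counts the standard Lipinski limits (miLogP ≤ 5, MW ≤ 500, nON ≤ 10, nOHNH ≤ 5), the class and function names are hypothetical, and only a few rows of Table 1 are reproduced.

```python
from typing import NamedTuple


class Descriptors(NamedTuple):
    """Molinspiration-style descriptors for one compound (values from Table 1)."""
    name: str
    milogp: float  # predicted octanol/water partition coefficient
    mw: float      # molecular weight
    n_on: int      # hydrogen-bond acceptors (N and O atoms)
    n_ohnh: int    # hydrogen-bond donors (OH and NH groups)


def rule_of_five_violations(d: Descriptors) -> int:
    """Count violated Lipinski criteria (assumed standard thresholds)."""
    checks = [d.milogp > 5.0, d.mw > 500.0, d.n_on > 10, d.n_ohnh > 5]
    return sum(checks)


def passes_rotatable_bond_filter(nrotb: int, limit: int = 10) -> bool:
    """Keep compounds with fewer than `limit` rotatable bonds."""
    return nrotb < limit


# A small subset of Table 1, copied for illustration only.
compounds = [
    Descriptors("Crizotinib", 4.006, 450.345, 6, 3),
    Descriptors("CID: 11562217", 4.387, 489.382, 7, 2),
    Descriptors("CID: 11656144", 5.275, 492.426, 6, 2),
    Descriptors("CID: 11502981", 5.581, 435.33, 5, 2),
    Descriptors("CID: 58659185", 5.304, 423.319, 5, 2),
]

violators = [c.name for c in compounds if rule_of_five_violations(c) > 0]
retained = [c.name for c in compounds if rule_of_five_violations(c) == 0]

for c in compounds:
    print(f"{c.name}: {rule_of_five_violations(c)} violation(s)")
```

In this subset, the three flagged compounds are the ones whose miLogP exceeds 5, which agrees with the nviolations column of Table 1.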
However, toxicity is also an important issue that should be addressed for all the lead compounds before selection.

Table 2 Details of number of rotatable bonds

S. no  Compound       nrotb
1      Crizotinib     5
2      CID: 11597571  5
3      CID: 11626560  5
4      CID: 53234260  5
5      CID: 53234326  5
6      CID: 56671814  5
7      CID: 60197531  5
8      CID: 60197626  5
9      CID: 60198523  5
10     CID: 60198524  5
11     CID: 60198525  5
12     CID: 60199015  5
13     CID: 60199016  5
14     CID: 60199073  5
15     CID: 60199075  5
16     CID: 60199076  5
17     CID: 60199077  5
18     CID: 62705017  5
19     CID: 68625002  6
20     CID: 54613769  5
21     CID: 11662380  5
22     CID: 11626823  6
23     CID: 58659191  5
24     CID: 44560358  5
25     CID: 71239831  5
26     CID: 71239833  5
27     CID: 71240010  5
28     CID: 71240011  5
29     CID: 11496366  5
30     CID: 11562021  6
31     CID: 11626824  5
32     CID: 11598102  5
33     CID: 11641497  7
34     CID: 11690598  5
35     CID: 68563708  5
36     CID: 11562217  5
37     CID: 11612136  5
38     CID: 58659130  5
39     CID: 11625675  5
40     CID: 67084493  6
41     CID: 11676204  7
42     CID: 11684380  6
43     CID: 58659192  5
44     CID: 58659228  5
45     CID: 11503318  6
46     CID: 11510387  5
47     CID: 11568619  6
48     CID: 11575401  5
49     CID: 11647760  5
50     CID: 58659136  5
51     CID: 58659189  5
52     CID: 72986690  6
53     CID: 11676140  5
54     CID: 58659141  6
55     CID: 11705849  5
56     CID: 11719356  5
57     CID: 11647759  6
58     CID: 21110753  7
59     CID: 21110757  4
60     CID: 73386634  4
61     CID: 11647795  5

Number of rotatable bonds <10

Toxicity analysis

The primary reason behind the failure of the majority of compounds in the drug discovery process is issues related to pharmacokinetics and toxicity. In the present investigation, these issues were addressed with the help of the OSIRIS property explorer program. The pharmacokinetic properties of a lead compound can be investigated using parameters such as cLogP and logS. The result is shown in Table 3. cLogP is a well-established measure of a compound's hydrophilicity. High logP values may cause poor absorption because of the compound's low hydrophilicity.
It has been demonstrated that for compounds to have a reasonable probability of being well absorbed, their logP value must not be greater than 5.0. It is clear from the table that the logP values of all 60 compounds were within the acceptable range.

Table 3 Toxicity risks and physicochemical properties of crizotinib and virtual compounds predicted by OSIRIS property explorer

S. no  Compound ID    Mutagenic  Tumorigenic  Reproductive effective  cLogP  Solubility  Drug likeness  Drug score
1      Crizotinib     No         No           No                      3.54   −5.26       3.12           0.52
2      CID: 11597571  No         No           No                      3.54   −5.26       3.12           0.52
3      CID: 11626560  No         No           No                      3.54   −5.26       3.12           0.52
4      CID: 53234260  No         No           No                      3.54   −5.26       3.12           0.52
5      CID: 53234326  No         No           No                      3.54   −5.26       3.12           0.52
6      CID: 56671814  No         No           No                      3.54   −5.26       3.12           0.52
7      CID: 60197531  No         No           No                      3.54   −5.26       3.12           0.52
8      CID: 60197626  No         No           No                      3.54   −5.26       3.12           0.52
9      CID: 60198523  No         No           No                      3.54   −5.26       3.12           0.52
10     CID: 60198524  No         No           No                      3.54   −5.26       3.12           0.52
11     CID: 60198525  No         No           No                      3.54   −5.26       3.12           0.52
12     CID: 60199015  No         No           No                      3.54   −5.26       3.12           0.52
13     CID: 60199016  No         No           No                      3.54   −5.26       3.12           0.52
14     CID: 60199073  No         No           No                      3.54   −5.26       3.12           0.52
15     CID: 60199075  No         No           No                      3.54   −5.26       3.12           0.52
16     CID: 60199076  No         No           No                      3.54   −5.26       3.12           0.52
17     CID: 60199077  No         No           No                      3.54   −5.26       3.12           0.52
18     CID: 62705017  No         No           No                      3.54   −5.26       3.12           0.52
19     CID: 68625002  No         No           No                      3.78   −5.69       3.68           0.46
20     CID: 54613769  No         No           No                      3.54   −5.26       3.22           0.53
21     CID: 11662380  No         No           Yes                     3.54   −5.26       2.78           0.42
22     CID: 11626823  No         No           No                      3.29   −5.78       3.45           0.48
23     CID: 58659191  No         Yes          No                      3.64   −5.58       3.17           0.29
24     CID: 44560358  No         No           No                      3.25   −5.19       2.42           0.54
25     CID: 71239831  No         No           No                      4.19   −5.96       1.79           0.38
26     CID: 71239833  No         No           No                      4.19   −5.96       1.45           0.37
27     CID: 71240010  No         No           No                      4.19   −5.96       1.79           0.38
28     CID: 71240011  No         No           No                      4.19   −5.96       1.79           0.38
29     CID: 11496366  No         No           No                      3.79   −4.90       7.62           0.54
30     CID: 11562021  No         No           No                      4.2    −5.22       7.51           0.48
31     CID: 11626824  No         No           No                      3.79   −4.90       7.62           0.54
32     CID: 11598102  No         No           No                      3.89   −6.11       2.11           0.41
33     CID: 11641497  No         No           No                      2.38   −4.53       4.34           0.49
34     CID: 11690598  No         No           No                      3.03   −4.84       3.12           0.6
35     CID: 68563708  No         No           No                      3.03   −4.84       3.12           0.6
36     CID: 11562217  No         No           No                      3.44   −5.35       2.82           0.29
37     CID: 11612136  No         No           No                      3.68   −5.4        −0.93          0.33
38     CID: 58659130  No         No           No                      3.03   −4.84       3.22           0.60
39     CID: 11625675  No         No           No                      3.75   −5.39       2.56           0.53
40     CID: 67084493  No         No           No                      4.04   −6.15       1.21           0.37
41     CID: 11676204  No         No           No                      2.28   −4.96       3.76           0.62
42     CID: 11684380  No         No           No                      3.55   −5.42       7.62           0.54
43     CID: 58659192  No         Yes          No                      4      −6.42       2.17           0.22
44     CID: 59599446  No         No           No                      2.47   −5.33       4.07           0.53
45     CID: 11503318  No         No           No                      3.01   −5.73       0.63           0.42
46     CID: 11510387  No         No           No                      3.30   −6.39       3.37           0.47
47     CID: 11568619  No         No           No                      3.01   −5.73       0.63           0.42
48     CID: 11575401  No         No           No                      2.85   −4.72       3.34           0.63
49     CID: 11647760  No         No           No                      3.20   −4.99       3.81           0.58
50     CID: 58659136  No         No           No                      3.20   −4.99       3.81           0.48
51     CID: 58659189  Yes        No           No                      3.78   −5.29       4.46           0.31
52     CID: 72986690  No         No           No                      3.01   −5.73       0.63           0.42
53     CID: 11676140  No         No           No                      4.16   −5.75       1.66           0.44
54     CID: 58659141  No         No           No                      3.44   −5.91       −0.32          0.33
55     CID: 11705849  No         No           No                      4.15   −5.74       2.96           0.42
56     CID: 11719356  No         No           No                      3.63   −4.96       2.31           0.53
57     CID: 11647759  No         No           No                      2.61   −5.24       3.45           0.57
58     CID: 21110753  No         No           No                      2.52   −4.67       3.67           0.59
59     CID: 21110757  No         No           No                      3.00   −5.47       2.35           0.56
60     CID: 73386634  No         No           No                      3.00   −5.47       2.35           0.56
61     CID: 11647795  No         No           No                      3.34   −5.13       0.85           0.48

Drug solubility normally affects the absorption and distribution characteristics of a compound. In fact, insufficient solubility of a drug can lead to poor absorption (Lipinski et al. 1997). The estimated logS value is a unit-stripped logarithm (base 10) of a compound's solubility measured in mol/liter. More than 80 % of the drugs on the market have an (estimated) logS value greater than −4.
It is clear from Table 3 that the solubility of the 60 lead compounds was comparable with that of standard drugs, fulfilling the solubility requirements, so these compounds could be regarded as candidate drugs for oral absorption.

Drug likeness

Drug likeness is an important parameter because drug-like molecules exhibit favorable absorption, distribution, metabolism, excretion and toxicological (ADMET) parameters (Tetko 2005). In this study, the Osiris program was used to calculate the drug likeness of crizotinib and the other virtually screened compounds (Sander 2001). The drug-likeness values of the 60 lead compounds were found to be within acceptable criteria.

Drug score and toxicity

The information in Table 3 shows that 57 lead compounds are predicted to be non-mutagenic and non-tumorigenic when run through the toxicity risk assessment, comparable with the standard drugs used. Four compounds, CID: 11662380, CID: 58659189, CID: 58659191 and CID: 58659192, failed the Osiris assessment and showed mutagenic, tumorigenic or reproductive effects. We also analyzed the overall drug score (DS) of all the lead compounds and compared it with that of crizotinib. The score combines drug likeness, cLogP, logS, molecular weight and toxicity risks. The DS is therefore an important parameter for judging a compound's overall potential to qualify as a drug. The result is shown in Table 3. The reported lead compounds showed moderate to good DS compared with the standard drug crizotinib. In our dataset, 17 lead compounds showed a drug score similar to that of crizotinib. Five compounds, CID: 11690598, CID: 68563708, CID: 58659130, CID: 11676204 and CID: 11575401, showed a drug score of 0.6 or above. Therefore, further examination was carried out with 57 compounds.
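The toxicity and drug-score screening described in this section amounts to a simple filter over the rows of Table 3. The following Python sketch illustrates it on a subset of rows; the record type and function names are hypothetical, and treating any single "Yes" flag as a failure is an assumption made here for illustration rather than a documented Osiris rule.

```python
from typing import NamedTuple


class OsirisRow(NamedTuple):
    """One row of Table 3, reduced to the toxicity flags and the drug score."""
    name: str
    mutagenic: bool
    tumorigenic: bool
    reproductive: bool
    drug_score: float


def has_toxicity_risk(row: OsirisRow) -> bool:
    """Assumed criterion: any predicted toxicity flag excludes the compound."""
    return row.mutagenic or row.tumorigenic or row.reproductive


# A subset of Table 3, copied for illustration only.
rows = [
    OsirisRow("Crizotinib", False, False, False, 0.52),
    OsirisRow("CID: 11662380", False, False, True, 0.42),
    OsirisRow("CID: 58659191", False, True, False, 0.29),
    OsirisRow("CID: 58659189", True, False, False, 0.31),
    OsirisRow("CID: 58659192", False, True, False, 0.22),
    OsirisRow("CID: 11690598", False, False, False, 0.60),
    OsirisRow("CID: 11676204", False, False, False, 0.62),
    OsirisRow("CID: 11575401", False, False, False, 0.63),
    OsirisRow("CID: 11562217", False, False, False, 0.29),
]

flagged = [r.name for r in rows if has_toxicity_risk(r)]
safe = [r for r in rows if not has_toxicity_risk(r)]
high_score = [r.name for r in safe if r.drug_score >= 0.6]

print("Excluded for toxicity risk:", flagged)
print("Drug score of 0.6 or above:", high_score)
```

On this subset the filter excludes the same four compounds named in the text and recovers three of the five compounds with a drug score of 0.6 or above.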
Molecular docking

A molecular docking program was employed to determine the binding affinity of the lead compounds with the target protein. Docking analysis was performed twice to eliminate false positives. The docking results are shown in Table 4. The docking score of the native-type ALK-crizotinib complex was found to be 5312 and that of the mutant-type ALK-crizotinib complex was found to be 4602. The lower docking score of the mutant complex indicates that the double mutation (L1196M and G1269A) significantly affects the binding of crizotinib to the ALK structure. A potential lead compound is expected to have a higher docking score than the existing drug molecule, crizotinib. Therefore, we examined the docking scores of all 57 hits with both the native-type and the mutant-type ALK systems. Sixteen hits showed a higher docking score only with mutant-type ALK rather than native-type ALK, and 17 more hits from our dataset showed a docking score similar to that of crizotinib. Most importantly, 10 hits from our dataset showed a higher score with both native-type and mutant-type ALK. Among these 10 hits, CID 11562217 showed the highest docking score. The docking score of the native-type ALK-CID 11562217 complex was found to be 5662 and that of the mutant-type ALK-CID 11562217 complex was found to be 5908. This result indicates that CID 11562217 has a better binding affinity than crizotinib, not only with native-type but also with mutant-type ALK.

Table 4 Docking score of crizotinib and the lead compounds obtained from the PubChem database against the target structures

S. no  Compound ID    Score (2XP2)  Score (4ANS)
1      Crizotinib     5312          5226
2      CID: 11597571  5312          5226
3      CID: 11626560  5312          5226
4      CID: 53234260  5312          5226
5      CID: 53234326  5312          5226
6      CID: 56671814  5312          5226
7      CID: 60197531  5312          5226
8      CID: 60197626  5312          5226
9      CID: 60198523  5312          5226
10     CID: 60198524  5312          5226
11     CID: 60198525  5312          5226
12     CID: 60199015  5312          5226
13     CID: 60199016  5312          5226
14     CID: 60199073  5312          5226
15     CID: 60199075  5312          5226
16     CID: 60199076  5312          5226
17     CID: 60199077  5312          5226
18     CID: 62705017  5312          5226
19     CID: 68625002  5200          5342
20     CID: 54613769  5298          5308
21     CID: 11626823  5048          5226
22     CID: 44560358  5012          5386
23     CID: 71239831  5440          5776
24     CID: 71239833  5440          5776
25     CID: 71240010  5426          5504
26     CID: 71240011  5426          5504
27     CID: 11496366  5412          5420
28     CID: 11562021  5510          5492
29     CID: 11626824  5412          5420
30     CID: 11598102  5292          5294
31     CID: 11641497  5450          5138
32     CID: 11690598  4906          5138
33     CID: 68563708  4906          5138
34     CID: 11562217  5662          5908
35     CID: 11612136  5144          5032
36     CID: 58659130  5108          5294
37     CID: 11625675  4746          5052
38     CID: 67084493  4950          5334
39     CID: 11676204  4964          4962
40     CID: 11684380  4964          5424
41     CID: 59599446  5434          5704
42     CID: 11503318  5110          5138
43     CID: 11510387  5124          5372
44     CID: 11568619  5110          5138
45     CID: 11575401  4886          4826
46     CID: 11647760  5124          5372
47     CID: 58659136  5124          5372
48     CID: 72986690  5110          5138
49     CID: 11676140  4906          5484
50     CID: 58659141  5118          5278
51     CID: 11705849  5186          5370
52     CID: 11719356  5040          5238
53     CID: 11647759  5026          5118
54     CID: 21110753  5390          5526
55     CID: 21110757  4408          4604
56     CID: 73386634  4408          4604
57     CID: 11647795  5268          5212

Bold indicates the lead compounds that showed a higher binding score than crizotinib

It should also be noted that the pharmacokinetic and pharmacodynamic investigation of CID 11562217 indicated better results than the other lead compounds explored in our study (Fig. 2). The two-dimensional structure of crizotinib was compared with that of CID 11562217 to identify its structural attributes, and the result is shown in Fig. 3.
The comparison shows that CID 11562217 is a nitrile-substituted analogue of crizotinib. It is worth stressing that compounds bearing a nitrile (cyano) functional group could possess potential anti-tumor effects (US Patent 20060128724). The literature evidence also highlights that our lead molecule has kinase-inhibiting effects. Further, the cyano-containing analogues were able to produce DNA–DNA cross-linking. The reduced DNA cross-linking was paralleled by a similar reduction in cytotoxicity, indicating a relationship between cross-linking and anti-tumor effect (Jesson et al. 1987). Therefore, CID 11562217 was further validated with a molecular dynamics simulation study.

Fig. 2 Osiris property explorer showing drug-likeness properties of CID 11562217

Fig. 3 Structure comparison between (a) crizotinib and (b) CID 11562217

Molecular dynamics simulation

Molecular dynamics simulation was carried out with the GROMACS package 4.5.3 to explore the stability of the complex structures. In particular, the RMSD was calculated from the trajectory file and used to analyze complex stability. RMSD analysis indicates how much the three-dimensional structure has deviated over time. The result is shown in Fig. 4. The native-type ALK-crizotinib complex reached a backbone RMSD of ~0.34 nm at 1000 ps, while the mutant-type ALK-crizotinib complex reached ~0.28 nm at 1000 ps. On the other hand, the native-type ALK-CID 11562217 complex reached a backbone RMSD of ~0.18 nm, while the mutant-type ALK-CID 11562217 complex reached ~0.22 nm at 1000 ps. Between 2000 and 5000 ps, the native-type ALK-crizotinib complex maintained an RMSD of ~0.30 nm, whereas the mutant-type ALK-crizotinib complex deviated from ~0.25 to ~0.36 nm.
For the virtual compound complexes over the same period, the native-type ALK-CID 11562217 complex showed an RMSD between ~0.18 and ~0.20 nm and the mutant-type ALK-CID 11562217 complex maintained an RMSD of ~0.24 nm. From 5000 to 10,000 ps, the native-type ALK-crizotinib complex maintained an RMSD of ~0.34 nm, while the mutant-type ALK-crizotinib complex deviated from ~0.32 to ~0.36 nm. In contrast, the native-type ALK-CID 11562217 complex maintained an RMSD of ~0.25 nm, while the mutant-type ALK-CID 11562217 complex maintained an RMSD of ~0.20 to ~0.24 nm. From 11,000 to 15,000 ps, the mutant-type ALK-crizotinib complex showed a higher deviation and reached an RMSD of ~0.44 nm, while the native-type ALK-crizotinib complex maintained an RMSD of ~0.23 nm. The mutant-type ALK-CID 11562217 complex maintained an RMSD of ~0.25 nm in this period. Between 16,000 and 19,000 ps, the native-type ALK-crizotinib complex maintained an RMSD of ~0.35 nm, whereas the mutant-type ALK-crizotinib complex deviated from ~0.43 to ~0.45 nm. Over the same period, the native-type ALK-CID 11562217 complex showed an RMSD of ~0.25 nm and the mutant-type ALK-CID 11562217 complex maintained an RMSD of ~0.22 nm. At the end of 20,000 ps, the mutant-type ALK-crizotinib complex attained an RMSD of ~0.40 nm and the native-type ALK-crizotinib complex attained an RMSD of ~0.35 nm. This indicates that the ALK double mutation disturbs the structural stability of the complex and also its function. It is worth stressing that both the native-type and mutant-type ALK-CID 11562217 complexes were able to maintain an RMSD of ~0.24 nm. Overall, a clear difference in RMSD was observed between the crizotinib and CID 11562217 complex systems. The lower RMSD of the CID 11562217 complexes demonstrates the stable binding of CID 11562217 with both native-type and mutant-type ALK structures.

Fig. 4 Root mean square deviations of the native-type ALK-crizotinib complex (black), mutant-type ALK-crizotinib complex (red), native-type ALK-CID 11562217 complex (green) and mutant-type ALK-CID 11562217 complex (blue) along the MD simulation at 300 K
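For readers unfamiliar with the quantity plotted in Fig. 4, the RMSD between a reference structure and a trajectory frame is the square root of the mean squared displacement of the selected atoms. The Python sketch below illustrates that definition only: it assumes the two structures are already superimposed (trajectory tools normally perform a least-squares fit first), it is not the GROMACS implementation, and the coordinates are hypothetical values in nm.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(reference: Sequence[Coord], frame: Sequence[Coord]) -> float:
    """Root mean square deviation between two pre-aligned coordinate sets."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2
        for (xr, yr, zr), (x, y, z) in zip(reference, frame)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical backbone positions (nm); not taken from the ALK simulations.
reference = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (0.76, 0.0, 0.0)]
# Every atom displaced by 0.3 nm along x, so the RMSD is 0.3 nm.
shifted = [(x + 0.3, y, z) for x, y, z in reference]

print(f"RMSD = {rmsd(reference, shifted):.2f} nm")
```

Evaluating this function for every frame of a trajectory against the starting structure gives an RMSD-versus-time curve of the kind shown in Fig. 4.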