Materials and methods

Data set

The three-dimensional (3D) crystal structures of native and mutant (L1196M, G1269A) ALK were retrieved from the Brookhaven Protein Data Bank (PDB) for the analysis (Berman et al. 2000). The corresponding PDB codes were 2XP2 and 4ANS for the native and mutant structures, respectively (Cui et al. 2011). Crizotinib was used as the small molecule in our study. The SMILES strings of crizotinib and the lead molecules were collected from PubChem (Feldman et al. 2006) and submitted to CORINA to construct the 3D structures of the molecules (Gasteiger et al. 1990). The 3D structures of the target proteins (2XP2 and 4ANS), the drug molecule, and the lead compounds were energy-minimized using the GROMACS package 4.5.3 with the GROMOS43a1 force field parameters before the computational analysis (Hess et al. 2008; Spoel et al. 2005).

Virtual screening

Virtual screening (Shoichet 2004) is an important technique in computer-assisted drug discovery for screening potential molecules from a database. The approach has become popular in pharmaceutical research for lead identification. Its basic goal is to reduce the massive virtual chemical space of small organic molecules and to screen it against a specific target protein (Tondi et al. 1999). In the present study, virtual screening was performed against the PubChem database with crizotinib as the query (Bolton et al. 2008). It is worth stressing that the PubChem database holds over 27 million records of unique chemical structures (CID) derived from nearly 70 million substance depositions (SID). This publicly available database provides great opportunities for scientists to perform virtual screening (Xie 2010). Several hits were obtained from the PubChem database and were further analyzed using molecular docking studies.

ADME and toxicity

The bioavailability of the lead compounds was examined with the help of Lipinski's rule of five (Lipinski et al. 1997). Molecular properties such as logP (partition coefficient), molecular weight (MW), and the counts of hydrogen bond acceptors and donors in a molecule were used in formulating the "rule of five" (Ertl et al. 2000). The rule states that most molecules with good membrane permeability have a molecular weight ≤ 500, a calculated octanol-water partition coefficient (log P) ≤ 5, hydrogen bond donors ≤ 5, hydrogen bond acceptors ≤ 10, and a van der Waals polar surface area (PSA) < 120 Å² (Muegge 2003). In the present study, the molecular properties of all the lead compounds were estimated using the Molinspiration program (http://www.molinspiration.com/cgi-bin/properties) (Buntrock 2002).

Toxicity is the second important parameter that needs to be considered in the analysis of lead compounds. In fact, toxicity accounts for the failure of the majority of lead compounds. In the present study, the toxicity of the lead compounds was examined with the help of the OSIRIS program (http://www.organic-chemistry.org/prog/peo/). The program was also used to evaluate the drug-likeness and drug score of the lead compounds. Its drug-likeness estimate is based on a list of nearly 5300 distinct substructure fragments, derived from 3300 traded drugs and 15,000 commercially available chemicals, each with an associated drug-likeness value. The drug score combines drug-likeness, cLogP, logS, molecular weight, and toxicity risks into a single value that may be used to judge a compound's overall potential to qualify as a drug.

Molecular docking

The docking study is important for understanding the bioactivity of the screened lead compounds. Initially, SMILES strings were used to construct the three-dimensional structures of all the lead compounds. Subsequently, docking was performed with the help of the PatchDock server (Schneidman et al. 2005).
PatchDock is a geometry-based molecular docking algorithm. The energy-minimized PDB coordinate files of the protein and the ligand molecule are the inputs for docking. The algorithm has three major stages: (1) molecular shape representation, (2) surface patch matching, and (3) filtering and scoring. The PatchDock service is available at http://bioinfo3d.cs.tau.ac.il/PatchDock/. The docked complexes were ranked by their geometric matching score with the target proteins. The geometric matching scores of crizotinib with the target proteins (native and mutant structures) were used as the reference for filtering the lead compounds.

Molecular dynamics simulation

The GROMACS package 4.5.3 with the GROMOS43a1 force field was used to perform molecular dynamics (MD) simulations of the docked complexes, namely the native-type ALK-crizotinib complex, the mutant-type ALK-crizotinib complex, the native-type ALK-CID11562217 complex, and the mutant-type ALK-CID11562217 complex (Hess et al. 2008; Spoel et al. 2005). The protein was solvated in a cubic box (0.9 nm) under periodic boundary conditions with the SPC water model (Meagher and Carlson 2005). This resulted in the addition of 22,269 and 23,506 water molecules to the native and mutant complex structures, respectively. The PRODRG server was used to generate the topology of the ligand (Schuttelkopf and Van Aalten 2004). This server uses the GROMOS force field for generating the topology file and assigning atom types. Six sodium counter ions (Na⁺) were added to neutralize the total charge of the system, and one thousand steps of steepest descent energy minimization were carried out for the proteins. After the energy minimization step, the system was equilibrated at constant temperature and pressure. The non-bonded pair list was generated using an atom-based cutoff of 8 Å.
Bond lengths were constrained at their equilibrium values with the SHAKE algorithm, and long-range electrostatic interactions were treated with the particle-mesh Ewald algorithm (Darden et al. 1999; Van Gunsteren and Berendsen 1977). The total simulation time was set to 20,000 ps with an integration time step of 2 fs. Structural analysis was carried out at every picosecond, and the trajectories were stored in the traj.trr file. For instance, the root mean square deviation (RMSD) was analyzed with the GROMACS utility g_rms.
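
The simulation length and the RMSD measure mentioned above can be illustrated with a short sketch. It is not the g_rms implementation: g_rms normally performs a least-squares fit to the reference structure and can apply mass weighting, both of which are omitted here for brevity. The function names `integration_steps` and `rmsd` and the toy coordinates are assumptions made for illustration only.

```python
import math


def integration_steps(total_time_ps, time_step_fs):
    """Number of MD integration steps for a given run length and time step."""
    # 1 ps = 1000 fs
    return round(total_time_ps * 1000 / time_step_fs)


def rmsd(reference, frame):
    """Plain RMSD between two equally sized lists of (x, y, z) coordinates.

    Simplification (assumption): no superposition and no mass weighting.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared_sum = 0.0
    for (xr, yr, zr), (xf, yf, zf) in zip(reference, frame):
        squared_sum += (xr - xf) ** 2 + (yr - yf) ** 2 + (zr - zf) ** 2
    return math.sqrt(squared_sum / len(reference))


if __name__ == "__main__":
    # Values taken from the text: 20,000 ps total time, 2 fs time step.
    print(integration_steps(20000, 2))

    # Hypothetical two-atom coordinates (nm), not taken from any trajectory.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(rmsd(reference, frame))
```

With the parameters stated in the text, the run corresponds to ten million integration steps, and analysis at every picosecond yields one sample per 500 steps.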
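
The bioavailability criteria listed in the ADME and toxicity section can likewise be expressed as a simple filter. The sketch below is a minimal illustration, assuming the descriptors have already been computed elsewhere (in this study they were obtained from Molinspiration). The `Descriptors` class, the function name, and the example values are hypothetical and do not correspond to any compound in the study.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular properties (hypothetical container)."""
    mol_weight: float   # molecular weight
    log_p: float        # calculated octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors
    psa: float          # polar surface area in square angstroms


def rule_of_five_violations(d):
    """Return the names of the criteria from the text that a compound fails."""
    checks = {
        "molecular weight": d.mol_weight <= 500,
        "logP": d.log_p <= 5,
        "H-bond donors": d.h_donors <= 5,
        "H-bond acceptors": d.h_acceptors <= 10,
        "PSA": d.psa < 120,
    }
    return [name for name, passed in checks.items() if not passed]


if __name__ == "__main__":
    # Illustrative values only, not measured or computed data.
    candidate = Descriptors(mol_weight=450.3, log_p=3.7,
                            h_donors=2, h_acceptors=6, psa=78.0)
    print(rule_of_five_violations(candidate))
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easy to see why a lead compound was rejected.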