PMC:4331677 / 2461-7880

    2_test

    {"project":"2_test","denotations":[{"id":"25707321-17211405-14839501","span":{"begin":1176,"end":1177},"obj":"17211405"},{"id":"25707321-8780787-14839502","span":{"begin":1178,"end":1179},"obj":"8780787"},{"id":"25707321-15520816-14839503","span":{"begin":1424,"end":1425},"obj":"15520816"},{"id":"25707321-19708682-14839504","span":{"begin":2771,"end":2772},"obj":"19708682"},{"id":"25707321-18676415-14839505","span":{"begin":2921,"end":2922},"obj":"18676415"},{"id":"25707321-21893517-14839506","span":{"begin":3114,"end":3116},"obj":"21893517"},{"id":"25707321-19605421-14839507","span":{"begin":3272,"end":3274},"obj":"19605421"},{"id":"25707321-20840733-14839508","span":{"begin":3353,"end":3355},"obj":"20840733"},{"id":"25707321-21893517-14839509","span":{"begin":3356,"end":3358},"obj":"21893517"},{"id":"25707321-23162055-14839510","span":{"begin":3359,"end":3361},"obj":"23162055"}],"text":"Background\nThrough various high-throughput experimental projects for analyzing the genome, transcriptome and proteome, we are beginning to understand the genomic spaces. Simultaneously, the high-throughput screening of large-scale chemical compound libraries with various biological assays enable us to explore the chemical space. However, our knowledge about the relationship between the chemical and genomic spaces is very limited. For example, the PubChem database at NCBI [1] stores information on millions of chemical compounds, but the number of compounds with information on their target protein is very limited [2]. Therefore, there is a strong incentive to develop new methods capable of detecting these potential target-ligand interactions efficiently.\nDue to time and cost limitations of experimental approaches, a number of predictive approaches attempt to predict target-ligand relationships in silico. The traditional computational predictive methods roughly fall into two categories: target-based approaches and ligand-based approaches [3]. 
Target-based approaches mainly utilize the target information to predict. Molecular docking is a target-based approach [4,5], which predicts the preferred orientation by conformation searching and energy minimization. Docking could provide excellent conformation, but it is difficult to find a rank/evaluation function to select which orientation is more appropriate [6]. Another target-based method is comparing target similarities, which compares the targets of a given ligand by sequences, EC number, domains, 3D structures, etc. Ligand-based methods compare candidate ligands with the known ligands of a given target to make a prediction [3]. Three-dimensional quantitative structure-activity relationship (3D-QSAR) is a typical ligand-based model [7], which indirectly reflect non-bonding interaction characteristics between the ligand and target. The most widely used 3D-QSAR methods are comparative molecular field analysis (CoMFA) and comparative molecular similarity (CoMSIA). CoMFA first aligns the ligands capable of binding to a given target, and then measure field intensities around the aligned ligands by different atom probes (force field-based). Finally, the measured field intensities are regressed with the active values and the regression equation is applied to predict interactions. Moreover, we can map the coefficients of CoMFA back into 3D space to obtain a 3D-QSAR model, which could guide the optimization of lead compounds [7].\nRecently, some methods considering both the target and ligand information have been proven to be promising in drug design and discovery. Jacob et al. applied the EC Number (Enzyme Commission number) and PubChem fingerprints (a set of molecular substructures) [8] to represent targets and ligands respectively, and proposed a pairwise support vector machine (pSVM) method to predict target-ligand interactions [9]. Laarhoven et al. 
described the targets and ligands by sequences and compound 3D structures respectively, and introduced the target-ligand interaction network to build the prediction model [10]. Bleakley et al. proposed Bipartite Local Model (BLM), which integrated the ligand-based and target-based methods to generate a comprehensive prediction [11]. BLM has been further studied by Xia et al., Laarhoven et al. and Mei et al.[12,10,13]. BLM shows very good predictive ability; however, it cannot handle the case in which both the ligand and the target are absent from the training set. Yamanishi et al. represented the genomic space with sequences and target profiles, and the chemical space with compound 3D structures and ligand profiles, and then generated a uniform \"pharmacological space\" to build the prediction model [2]. Cheng et al. applied the mass distribution property from physics to the target-ligand network to predict target-ligand interactions [14]. Cao et al. integrated the genomic and chemical spaces into random forests to obtain better predictive ability [15].\nHowever, most existing methods consider the target as a whole, which makes it difficult to investigate the latent binding mechanism between target and ligand. In other words, we know little about how the chemical space interacts with the genomic space. In this article, based on the fact that the target-ligand interaction is largely a local event, we use the binding sites (local information) instead of the whole target to describe the genomic space. Furthermore, we assume that fragment-fragment interactions determine the target-ligand interactions. Thus we break the binding sites and ligands into fragments, and propose the fragment interaction model (FIM) to obtain a clear picture of how the chemical space interacts with the genomic space (Figure 1).\nFigure 1 Fragment interaction model (FIM). 
Each element in the target dictionary is a trimer cluster, and trimers belonging to the same cluster share similar chemical properties; each element in the ligand dictionary is a chemical substructure. The binding sites of the targets are represented by fragment (trimer cluster) information, and the ligands are encoded with substructure (fragment) information. We assume that interactions between binding-site fragments and ligand fragments facilitate site-ligand binding. The interaction between a binding-site fragment and a ligand substructure is determined by their distance."}