Representing chemical space involves two steps: defining a dictionary of substructures and describing ligands as features over that dictionary. We integrated several data sources to build the dictionary (the data are shown in the supplementary materials). The PubChem database defines 881 chemical substructures for its fingerprints. We modified these fingerprints to suit our model. First, single atoms and bonds were removed because they are not at the same structural level as trimers. Second, some substructures, such as benzene, were removed because they are too common to serve as discriminative features. Third, the molecular functional groups/fingerprints from Checkmol were integrated [22]. The resulting dictionary contains 747 substructures. Based on this dictionary, each ligand was represented by a 747-dimensional binary vector in which each element indicates the presence (1) or absence (0) of the corresponding substructure.
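The dictionary construction and binary encoding described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names `build_dictionary` and `encode_ligand`, the toy substructure labels, and the assumption that the substructures present in each ligand have already been detected by a cheminformatics tool are all hypothetical and serve only to show the data flow.

```python
from typing import Iterable, List, Set


def build_dictionary(
    pubchem_keys: Iterable[str],
    excluded: Set[str],
    extra_groups: Iterable[str],
) -> List[str]:
    """Filter the base keys, append extra groups, and drop duplicates.

    `excluded` stands in for the removed entries (single atoms, single
    bonds, and overly common substructures such as benzene).
    """
    seen: Set[str] = set()
    dictionary: List[str] = []
    for name in list(pubchem_keys) + list(extra_groups):
        if name in excluded or name in seen:
            continue
        seen.add(name)
        dictionary.append(name)
    return dictionary


def encode_ligand(present: Set[str], dictionary: List[str]) -> List[int]:
    """Return a binary vector: 1 if the substructure is present, else 0."""
    return [1 if name in present else 0 for name in dictionary]


# Toy stand-ins for the 881 PubChem keys and the Checkmol groups
# (hypothetical labels, not the real key definitions).
pubchem_keys = ["C", "C-C", "benzene", "C(=O)O", "C-N-C"]
excluded = {"C", "C-C", "benzene"}
extra_groups = ["primary amine", "C(=O)O"]

dictionary = build_dictionary(pubchem_keys, excluded, extra_groups)

# Substructures assumed to have been detected in one ligand.
ligand_substructures = {"C(=O)O", "primary amine"}
vector = encode_ligand(ligand_substructures, dictionary)

print(dictionary)
print(vector)
```

With the real dictionary, the same encoding step would yield a vector of length 747 for each ligand; detecting which substructures occur in a molecule would require an actual substructure-matching tool, which this sketch deliberately leaves out.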