In the task of computational protein function prediction, there are two major problems: choice of the classification algorithm and choice of the protein representation. Depending on the choice of protein representation, these computational predictive methods can be classified into two categories: i) analysis from protein structure [19,20,28,30] and ii) prediction from amino acid sequence[11,21,31-33]. In case of structure-based prediction methods, Stawiski et al. [19] examined positively charged patches on the surface of putative DNA-binding proteins in comparison with that on non DNA-binding proteins. They employed 12 features including the patch size, hydrogen-bonding potential, and the fraction of evolutionary conserved positively charged residues and other properties of the protein to train a neural network (NN) for identifying DNA-binding proteins. Ahmad and Sarai [20] trained a NN classifier using three features, including net charge, electric dipole and quadruple moments of the protein. Bhardwaj et al. [15] examined the sizes of positively charged patches on the surface of putative DNA-binding proteins. They based their SVM classifier on the protein's overall charge, overall and surface amino acid composition. Szilágyi and Skolnick [34] previously trained a logistic regression classifier using the amino acid composition, the asymmetry of the spatial distribution of specific residues and the dipole moment of the protein. Guy Nimrod and Andras Szilágyi et al. [23] recently developed a random forest classifier based on the electrostatic potential, cluster-based amino acid conservation patterns and the secondary structure content of the patches, as well as features of the whole protein including its dipole moment. Since the negative samples are much more than real DNA-binding proteins, this is an imbalanced binary classification problem from the view of machine learning. Song et al. [35] employed ensemble classifier [36] to solve this problem and improved the identification. Several methods considering the sequence-order effects were proposed, and the experimental results showed that this information can improve the predictive performance [37,38].