While of interest for comparison with previous studies, this set of species is not representative of the problem of incorporating phylogeny into scanning methods. Furthermore, evaluating scanning algorithms on real sequence data is difficult because such data contain transcription factor binding sites that are likely real but unreported. That is, because these sites have not yet been experimentally verified, some predicted sites counted as false positives may in fact be true positives. We therefore generated synthetic data in which we controlled the binding site content. Specifically, as a typical example, we generated four sets of sequence data modeled on the phylogenetic relationship of fourteen prokaryotic species: seven Enterobacteriales (E. coli, S. typhi, Klebsiella pneumoniae, Salmonella bongori, Citrobacter rodentium, Shigella flexneri, and Proteus mirabilis), four Vibrionales (Vibrio cholerae, Vibrio parahaemolyticus, Vibrio vulnificus, and Vibrio fischeri), and three Pasteurellales (Haemophilus influenzae, Haemophilus somnus, and Haemophilus ducreyi).
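The idea of synthetic data with controlled binding site content can be illustrated with a minimal sketch. It is not the procedure used in this study. The tree shape, branch lengths, site sequence, sequence length, and all function names below are hypothetical, and the sketch makes two simplifying assumptions: a branch length is used directly as a per-base substitution probability, and planted site positions are held perfectly conserved. Because the site location is chosen by the simulation, every prediction outside it is a known false positive.

```python
import random

ALPHABET = "ACGT"

def random_sequence(length, rng):
    """Draw a background sequence with uniform base frequencies (assumption)."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def plant_site(background, site, position):
    """Overwrite the background with a binding site at a known position."""
    return background[:position] + site + background[position + len(site):]

def mutate(sequence, rate, protected, rng):
    """Substitute each unprotected base with probability `rate`."""
    out = []
    for i, base in enumerate(sequence):
        if i not in protected and rng.random() < rate:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)

def evolve(node, sequence, protected, rng, leaves):
    """Pass a sequence down the tree, mutating it along every branch.

    A node is either a leaf name (str) or a list of
    (branch_length, child) pairs.
    """
    if isinstance(node, str):
        leaves[node] = sequence
        return
    for branch_length, child in node:
        descendant = mutate(sequence, branch_length, protected, rng)
        evolve(child, descendant, protected, rng, leaves)

# Hypothetical toy tree with invented branch lengths: two closely related
# Enterobacteriales and one more distant species from each other order.
TOY_TREE = [
    (0.05, [(0.02, "E. coli"), (0.02, "S. typhi")]),
    (0.20, "Vibrio cholerae"),
    (0.30, "Haemophilus influenzae"),
]

SITE = "TGTGATCTAGATCACA"   # hypothetical binding site
SITE_START = 50             # ground-truth position, known by construction
SEQ_LENGTH = 200

rng = random.Random(0)      # fixed seed so the data set is reproducible
root = plant_site(random_sequence(SEQ_LENGTH, rng), SITE, SITE_START)
protected = set(range(SITE_START, SITE_START + len(SITE)))

leaves = {}
evolve(TOY_TREE, root, protected, rng, leaves)
```

After this runs, `leaves` maps each species name to a sequence of equal length, and the only true binding site in each sequence lies at `SITE_START`. A more realistic simulation would replace the fixed substitution probability with a proper substitution model and let site positions evolve under selection rather than freezing them.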