Constructing multiple alignments

With only two closely related species in our set, we chose the Smith-Waterman [46] pairwise, gapped local alignment algorithm (implemented as BestFit in the Wisconsin Package Version 10.3, Accelrys Inc., San Diego, CA) to align their orthologous intergenic regions, using default parameters (match = 10.000; mismatch = -9.000; gap creation penalty = 50; gap extension penalty = 3). Aligning the orthologous upstream intergenic sequences of E. coli and S. typhi yielded 1662 unique aligned sequence pairs. The upstream intergenic sequences of a further 836 E. coli genes, which had no orthologs in S. typhi, were retained unaligned.

The sum of these two datasets (1662 + 836 = 2498) exceeds the number of E. coli intergenic regions of interest given above (2379 sequences) because of divergently transcribed genes. Specifically, we observed that for some divergently transcribed genes in E. coli, the orthologous genes in S. typhi are not syntenic, so S. typhi provided two separate intergenic regions for alignment to a single intergenic region of E. coli.

To perform the real-data tests, three databases representing the reference species clade were generated for scanning: (1) a database containing the 2379 E. coli intergenic regions of interest; (2) a database containing only E. coli data ("E. coli reduced"), in which 1662 E. coli intergenic regions have been reduced in sequence space by alignment with S. typhi orthologous data, plus the 836 E. coli sequences for which there was no orthologous S. typhi data; and (3) a database containing the 1662 E. coli-S. typhi aligned orthologous intergenic regions plus the same 836 E. coli sequences for which there was no orthologous S. typhi data.
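To illustrate the pairwise alignment step described at the start of this section, the following is a minimal sketch of Smith-Waterman local alignment scoring with affine gap penalties (the Gotoh recurrences). It is not the BestFit implementation. The function name `smith_waterman_score` is hypothetical, only the best score is returned (no traceback), and the sketch assumes that a gap of length k costs gap creation + k × gap extension. The default values mirror the parameters quoted above.

```python
def smith_waterman_score(a, b, match=10, mismatch=-9, gap_open=50, gap_extend=3):
    """Return the best local alignment score of sequences a and b.

    Assumption (not taken from the source): a gap of length k costs
    gap_open + k * gap_extend.
    """
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0] * (m + 1)        # H: best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)  # F: best score ending with a gap in b (vertical)
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # E: best score ending with a gap in a (horizontal)
        for j in range(1, m + 1):
            # Extend an existing gap or open a new one, whichever scores higher.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are floored at zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    # Toy sequences for illustration only; not data from this study.
    print(smith_waterman_score("AAAACAAAA", "AAAAGAAAA"))
```

With these defaults, opening a gap costs far more than a single mismatch (53 versus 9 under the assumed convention), so the scoring favors long ungapped blocks and introduces a gap only when the sequence on either side aligns well enough to offset the penalty.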