In the examples presented in this study, E. coli was the primary species of interest. We therefore assembled a set of E. coli promoter-containing sequences by selecting each E. coli protein-coding gene (excluding 111 genes encoded on transposons or prophage elements) that has at least 20 bp of upstream intergenic sequence. By these criteria, there are 2379 E. coli intergenic regions of interest. Orthologous upstream intergenic-sequence data files were then generated for this set of 2379 E. coli regions, using the results from INPARANOID to identify orthologs and the seven genome annotations to define intergenic boundaries. A table with these data [see Additional file 2] and a caption for the table [see Additional file 1] are provided in the Supplementary Materials.
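
The selection criterion described above (protein-coding genes with at least 20 bp of upstream intergenic sequence, after excluding transposon and prophage genes) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the `Gene` record, the `upstream_regions` function, and the toy coordinates are hypothetical, and the sketch assumes 1-based inclusive coordinates, a linear (non-wrapping) treatment of the chromosome, and that excluded genes still bound the intergenic regions of their neighbors.

```python
from dataclasses import dataclass

MIN_UPSTREAM_BP = 20  # threshold stated in the text


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str             # "+" or "-"
    excluded: bool = False  # e.g. encoded on a transposon or prophage element


def upstream_regions(genes, genome_length, min_bp=MIN_UPSTREAM_BP):
    """Map gene name -> (start, end) of its upstream intergenic interval.

    Only non-excluded genes whose upstream interval is at least min_bp long
    are returned. Chromosome circularity is ignored (an assumption).
    """
    regions = {}
    for g in genes:
        if g.excluded:
            continue
        others = [h for h in genes if h is not g]
        if g.strand == "+":
            # Upstream lies to the left: bounded by the furthest-reaching
            # gene that starts before this one.
            left_ends = [h.end for h in others if h.start < g.start]
            lo = max(left_ends, default=0) + 1
            hi = g.start - 1
        else:
            # Upstream lies to the right: bounded by the earliest-starting
            # gene that extends past this gene's end.
            right_starts = [h.start for h in others if h.end > g.end]
            lo = g.end + 1
            hi = min(right_starts, default=genome_length + 1) - 1
        if hi - lo + 1 >= min_bp:
            regions[g.name] = (lo, hi)
    return regions


# Toy annotation (invented coordinates, for illustration only).
toy_genes = [
    Gene("A", 100, 200, "+"),
    Gene("B", 210, 300, "+"),                 # only 9 bp upstream: dropped
    Gene("C", 350, 450, "-"),
    Gene("D", 500, 600, "+"),                 # shares a divergent region with C
    Gene("E", 650, 700, "+", excluded=True),  # excluded gene: dropped
]
toy_regions = upstream_regions(toy_genes, genome_length=1000)
```

In this toy annotation, the divergently transcribed genes C and D are assigned the same intergenic interval. Whether such a shared interval should count once or twice toward a total such as the 2379 regions reported above is not determined by this sketch.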