Combining evidence from more-distant sequences

As described above, a Pintergenic p-value is generated for each sequence alignment of an intergenic region, but a true site's p-value may still be too weak to distinguish that site from the false positives in a vast genome. To address this problem, we combine this p-value with the p-values for the same intergenic region that come from sequence alignments of more distantly related species. That is, we partition the input sequences for orthologous promoters into clades such that each clade is either an isolated sequence or contains sequences that can be reliably multiply aligned; we compute the Pintergenic value for each clade as above; and we combine these p-values using the formula of Bailey and Gribskov [32]. When there are n such clades whose Pintergenic values are $P_1, P_2, \ldots, P_n$, we compute:

$$P_{\text{product}} = \prod_{c=1}^{n} P_c$$

$$P_{\text{combined}} = P_{\text{product}} \sum_{i=0}^{n-1} \frac{\left(-\ln\left(P_{\text{product}}\right)\right)^{i}}{i!}.$$

This formula gives the exact p-value for the product of n values drawn independently and uniformly at random from the interval [0, 1].
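The combination formula above can be sketched in a few lines of Python. This is a minimal illustration and not the PhyloScan implementation: the function name `combine_pvalues` and the input checks are assumptions made for this example. The sum is accumulated term by term so that no factorial has to be computed explicitly.

```python
import math


def combine_pvalues(pvalues):
    """Combine per-clade p-values with the product formula given above.

    Hypothetical helper for illustration only. Returns
    P_product * sum_{i=0}^{n-1} (-ln P_product)^i / i!.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("each p-value must lie in (0, 1]")

    # Work with the log of the product to limit underflow for small p-values.
    log_product = sum(math.log(p) for p in pvalues)
    x = -log_product  # x = -ln(P_product) >= 0

    term = 1.0   # x^0 / 0!
    total = 1.0
    for i in range(1, n):
        term *= x / i  # x^i / i! built from the previous term
        total += term

    return math.exp(log_product) * total


if __name__ == "__main__":
    # Two clades, each with p-value 0.5: 0.25 * (1 + ln 4)
    print(combine_pvalues([0.5, 0.5]))
```

With a single clade the sum has only the i = 0 term, so the combined value is the input p-value unchanged.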
An example of this calculation is available in the Supplementary Materials [see Additional file 1].

PhyloScan allows a p-value cutoff α, which defaults to 0.05, such that sites in a user-specified clade of interest whose p-values are worse than (that is, larger than) this cutoff are not permitted to be strengthened by data from the other species via the combination process. This feature allows the user to concentrate on a single clade or species rather than the entire tree of species. Because of this cutoff, it is appropriate to modify the above formula for sites that survive the cutoff:

$$P_{\text{combined}} = P_{\text{product}} \sum_{i=0}^{n-1} \frac{\left(-\ln\left(\frac{P_{\text{product}}}{\alpha}\right)\right)^{i}}{i!}.$$
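A minimal Python sketch of the cutoff-adjusted formula follows. It is not the PhyloScan code: the function name `combine_with_cutoff`, its argument layout, and the choice to return `None` for a site that fails the cutoff are all assumptions made for illustration, since the text does not say how such sites are reported. Here n counts the clade of interest plus all other clades.

```python
import math


def combine_with_cutoff(p_interest, other_pvalues, alpha=0.05):
    """Cutoff-adjusted combination for a site in the clade of interest.

    Hypothetical helper for illustration only. Returns None when the site's
    own p-value exceeds alpha (assumed convention), otherwise
    P_product * sum_{i=0}^{n-1} (-ln(P_product / alpha))^i / i!.
    """
    pvalues = [p_interest] + list(other_pvalues)
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("each p-value must lie in (0, 1]")
    if not (0.0 < alpha <= 1.0):
        raise ValueError("alpha must lie in (0, 1]")

    # A site worse than the cutoff is not strengthened by the other clades.
    if p_interest > alpha:
        return None

    n = len(pvalues)
    log_product = sum(math.log(p) for p in pvalues)
    # P_product <= alpha here, so x = -ln(P_product / alpha) is non-negative.
    x = -(log_product - math.log(alpha))

    term = 1.0   # x^0 / 0!
    total = 1.0
    for i in range(1, n):
        term *= x / i  # x^i / i! built from the previous term
        total += term

    return math.exp(log_product) * total


if __name__ == "__main__":
    print(combine_with_cutoff(0.01, [0.5]))   # site survives the default cutoff
    print(combine_with_cutoff(0.20, [0.01]))  # site fails the cutoff: None
```

Setting `alpha=1.0` makes the ratio equal to the plain product, so the function then reduces to the unadjusted combination formula.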