Introduction

Genes may vary in evolvability for many reasons, including physical susceptibility to mutagenesis. Here we show that a class of genes with distinctive physical features, the heat-shock genes, is remarkably prone to mutagenesis by insertion of a specific transposable element (TE), the P element of Drosophila. TEs are mobile, repetitive DNA sequences and a structurally dynamic component of genomes [1]. TEs can cause gene and chromosome evolution in numerous ways, including insertional mutagenesis, retroposition, conveyance of regulatory elements to novel sites, and service as pivotal sites for ectopic recombination, and thus chromosomal rearrangements and gene duplication. For such evolution to occur, however, TEs must first insert into chromatin, which in turn requires that the target site be accessible to the transpositional machinery [2]. Indeed, insertion of Drosophila P elements, among the best-studied of TEs [3], into specific sites is associated with features of local chromatin architecture such as DNase I hypersensitivity, location in 5′-flanking sequence, presence of pre-existing TEs, and physical structure, but only weakly with insertion sites' nucleotide sequence (e.g., [4–6]). These features vary widely and frequently throughout genomes [7], which is consistent with the irregular, but repeated, occurrence of TEs.

Entire classes of genes also vary in TE frequency (and hence potentially in evolvability via transposition) in laboratory studies [8], but for natural populations, neither the mechanistic basis for this variation nor its relevance for evolvability is clear. In such experimental work with Drosophila, heat-shock genes (e.g., the major heat-shock gene Hsp70) stand out as a class receiving numerous TE insertions [8–10]. (By “gene,” we intend both the transcribed sequence and associated non-transcribed regulatory sequence.) This distinction is not unexpected from two perspectives.
First, the local chromatin architecture of heat-shock proximal promoters is peculiar, incorporating constitutively decondensed chromatin and nucleosome-free regions [11,12], and constitutive engagement of the transcriptional machinery. In addition to the 5′ location of these promoters, such features should predispose these regions to TE insertion (see above). Second, TEs segregate at high frequency in natural populations in the 5′-flanking regions of the five genomic copies of Hsp70 [13–17]. This finding is remarkable given that TEs typically are at low allelic frequency in the Drosophila genome, presumably because they are deleterious [18–20]. The Hsp70 intragenic TEs are seemingly adaptive, exhibiting repeatable demographic variation in allelic frequency along natural thermal gradients and beneficial impacts on Hsp70 expression and components of fitness [14–17,21].

Nonetheless, TEs constitute 22% of the Drosophila genome [22] and are numerous (more than 6,000 elements) [23]. Thus TE insertions in heat-shock genes could simply be a manifestation of general patterns because TEs are common in the Drosophila genome, rather than indicative of a specific insertion susceptibility and/or adaptive role. To distinguish between these possibilities, we carried out an unbiased screen with both negative and positive controls. Our working hypotheses were as follows:

First, because TE insertion can be mutagenic, naturally occurring transposition into Hsp70 genes could simply reflect that these are multicopy genes [24] and functionally redundant, and thus permit insertional mutagenesis of one to two copies. If so, then TEs occurring in proximal promoter regions should be restricted to multicopy genes like Hsp70 and not widespread in the “heat-shock genome” of natural populations, typically comprising single-copy genes.
Second, if as a class, heat-shock genes are especially susceptible to TE insertion in their proximal promoter region, the “heat-shock genome” of natural Drosophila populations should harbor numerous TEs in this region. Accordingly, we screened for TEs in the proximal promoters of 18 heat-shock genes other than Hsp70. This set of genes represents the prototypical heat-shock genes and cognates in Drosophila melanogaster other than Hsp70 (Gene Set I in Table 1).

Table 1. Genes Other Than Hsp70 Screened for Transposable Element Insertions in 5′-Flanking Sequence

Third, if heat-shock genes' peculiar chromatin architecture and its correlates (see above) predispose the heat-shock genome to TE insertion, then the proximal promoter regions of other genes in the Drosophila genome sharing some or all of these features should likewise harbor numerous TEs in natural populations. Accordingly, we screened for TEs in 18 non–heat-shock genes resembling heat-shock genes in relevant features (Gene Set II in Table 1).

Finally, if heat-shock genes' chromatin architecture and its correlates predispose the heat-shock genome to TE insertion, then in natural populations, genes dissimilar to heat-shock genes should less frequently harbor TEs in their proximal promoter regions. Accordingly, we screened for TEs in the proximal promoters of a “negative control” set of 18 such genes (Gene Set III in Table 1).

Relevant to all working hypotheses is that a TE in a gene will signify both that the TE has successfully inserted and that the TE has not (yet) been eliminated. Remobilization of TEs, their mutagenesis, and negative selection may all affect TEs' presence at a specific site.

Such screens pose a substantial analytical challenge. Only a single D. melanogaster genome has presently been sequenced, and that for an isogenized laboratory strain [25]. Although the sequenced genome is typical of wild D.
melanogaster with respect to many TEs, it is intentionally dissimilar with respect to others [26]. Moreover, an isogenized strain obviously cannot represent variability present in natural populations. Furthermore, most attempts to characterize the “transposome” of natural Drosophila populations, whether experimentally or in silico, are sequence-based; i.e., they rely on the distinctive canonical sequences of the various TEs for TE recognition and subsequent identification of the gene (or intergenic region) in which TEs have inserted. These methods range from genomic Southern blots to TE-specific PCR to TE display to bioinformatics searches. Our objective, by contrast, is to ascertain how often specified gene regions contain TEs. Given that each region to be screened might contain one of more than 120 different TE families in Drosophila [23,26], a sequence-based screen specific for each possible element in numerous genes and populations would be prohibitively laborious. Furthermore, our region of interest (proximal promoter) is non-coding, which may frustrate simple PCR-based screens when the region is highly variable. For these reasons we have exploited universal fast walking (UFW) [27,28], a method that can report TEs, not by their sequence, but by the size polymorphisms they create.

Here we demonstrate, by applying this technique to a screen of 48 natural Drosophila populations from around the world (Figure 1), that heat-shock genes as a class are a distinctive and repeatable natural target for TE insertion, as is predictable from the characteristic features of these promoters. Remarkably, of the many active TE families that might target heat-shock genes, the vast majority of the naturally occurring TEs that we discovered are P elements, notorious for their recent invasion of the D. melanogaster genome [29–31].
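The idea of reporting insertions by size rather than by sequence can be illustrated with a minimal sketch. This is not the UFW protocol itself, only the downstream comparison it makes possible: a product from a given gene region that is much longer than the same region in a reference strain is flagged as a candidate insertion, whatever element caused it. The names `Amplicon` and `flag_size_polymorphisms`, the 100 bp threshold, and all example lengths are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Assumption: length excesses below this many base pairs are treated as
# ordinary indel variation. The value is illustrative, not from the study.
MIN_INSERTION_BP = 100


@dataclass
class Amplicon:
    population: str   # hypothetical population label
    gene: str         # hypothetical gene label
    length_bp: int    # observed product length in base pairs


def flag_size_polymorphisms(amplicons, reference_lengths,
                            min_insertion_bp=MIN_INSERTION_BP):
    """Return (population, gene, excess_bp) for every amplicon that is at
    least min_insertion_bp longer than the reference length for its gene."""
    flagged = []
    for amp in amplicons:
        excess = amp.length_bp - reference_lengths[amp.gene]
        if excess >= min_insertion_bp:
            flagged.append((amp.population, amp.gene, excess))
    return flagged


# Made-up example data: one product matches the reference, one is longer.
reference = {"geneX": 800}
observed = [
    Amplicon("popA", "geneX", 800),
    Amplicon("popB", "geneX", 1950),
]
candidates = flag_size_polymorphisms(observed, reference)
print(candidates)
```

A size-based flag of this kind only says that something has inserted. Identifying the element still requires sequencing the enlarged product.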
Accordingly, we conclude that the proximal promoters of heat-shock genes in Drosophila are especially conducive to transposition of P elements in nature, which creates significant variation upon which evolutionary processes may act. Furthermore, dissimilarities between frequencies of naturally occurring and experimental P element transpositions into the various classes of promoters imply that weakened purifying selection and/or positive selection may contribute to the persistence of P elements in natural populations, a suggestion that invites future testing.

Figure 1. Geographic Origins of D. melanogaster Populations Screened in This Study
Screens revealed zero to 14 P elements per population (indicated by the number of squares), distinctive by insertion location, in the proximal promoter regions of genes examined (Table 1). Colors of squares correspond to gene set (see Introduction). Inset: Percentages of distinctive P elements discovered in Hsp70 genes and each of the three gene sets screened. A total of 161 P element insertions are shown (the ten P elements in the coding sequence and the five non–P element insertions are not included in the figure). These tallies potentially under-report the actual number of P elements; see Results. F06 (Celera) is the strain whose genome has been sequenced [25] and is the reference strain for the present study. Populations F18, F50, and F52 (in light gray text) were removed from the analysis after screens failed for multiple genes and primer sets. R