Pleiotropic Associations between DNAm and Gene Expression

Although it is widely hypothesized that DNAm influences gene expression, its relationship with transcriptional activity is not fully understood. DNAm across CpG-rich promoter regions, for example, is often assumed to repress gene expression by blocking transcription-factor binding and attracting methyl-binding proteins.⁵⁵ DNAm in the gene body, in contrast, is hypothesized to be a marker of active gene transcription⁵,⁵⁶ and to potentially play a role in regulating alternative splicing and isoform diversity. To identify associations between DNAm and gene expression, we applied the SMR approach to DNAm sites that were associated with an mQTL at our “discovery” significance threshold and located within a megabase of a gene expression probe included in the eQTL dataset generated by Westra and colleagues.⁴⁵ In total, we tested 488,342 pairs, exploring relationships between 96,694 DNAm sites and 4,721 gene expression probes annotated to 4,049 genes (Figure S15). Each DNAm site was tested against a median of four expression probes (interquartile range = 2–7) mapping to a median of three genes (interquartile range = 2–6). In contrast, each expression probe was tested against a median of 85 DNAm sites (interquartile range = 56–130). Of the tested pairs, 40,404 (8.27%), comprising 22,007 (22.8%) DNAm sites and 4,201 (89.0%) expression probes mapping to 3,628 (89.6%) genes, were characterized by a significant SMR result (significance threshold corrected for the number of DNAm site and gene expression probe pairs tested: p < 1.02 × 10⁻⁷). Of these significant SMR pairs, 6,798, comprising 5,420 (5.61%) DNAm sites and 1,913 (40.5%) expression probes mapping to 1,702 (42.0%) genes, also had a HEIDI p > 0.05 (Table S8; Figure S15).
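The two-stage filter described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis code used in the study. It assumes the multiple-testing correction is a Bonferroni adjustment of α = 0.05 (0.05 / 488,342 ≈ 1.02 × 10⁻⁷, which matches the reported threshold). The record type, the category labels, and the example site and probe identifiers are hypothetical.

```python
from dataclasses import dataclass

N_PAIRS_TESTED = 488_342  # DNAm site / expression probe pairs reported in the text
ALPHA = 0.05              # assumed family-wise error rate (Bonferroni assumption)
HEIDI_MIN_P = 0.05        # pairs with HEIDI p above this value are retained


@dataclass
class SmrResult:
    """Hypothetical container for one DNAm site / expression probe test."""
    dnam_site: str
    expression_probe: str
    p_smr: float
    p_heidi: float


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def classify(result: SmrResult, smr_threshold: float,
             heidi_min_p: float = HEIDI_MIN_P) -> str:
    """Assign a pair to one of three (hypothetically named) categories."""
    if result.p_smr >= smr_threshold:
        return "not_significant"
    if result.p_heidi > heidi_min_p:
        return "smr_significant_heidi_pass"
    return "smr_significant_heidi_fail"


threshold = bonferroni_threshold(ALPHA, N_PAIRS_TESTED)
print(f"SMR threshold: {threshold:.2e}")

# Made-up example pairs; the identifiers and p values are not from the study.
examples = [
    SmrResult("cg_example_1", "probe_example_1", 1e-9, 0.30),
    SmrResult("cg_example_2", "probe_example_1", 1e-9, 0.01),
    SmrResult("cg_example_3", "probe_example_2", 1e-4, 0.50),
]
for r in examples:
    print(r.dnam_site, r.expression_probe, classify(r, threshold))
```

Under these assumptions, only pairs in the first retained category (significant SMR result and HEIDI p > 0.05) would correspond to the 6,798 pairs reported in Table S8.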
These results suggest that although the expression of a large proportion of genes is associated with DNAm sites, not all DNAm sites are associated with gene expression in cis. The majority of significant gene expression probes (n = 1,192; 62.3%) are associated with a median of two DNAm sites (interquartile range = 1–4) spanning a median distance of 66,846 bp (interquartile range = 19,062–155,737) at a median density of 19,959 bp (interquartile range = 6,387–54,445) between sites. Interestingly, DNAm sites pleiotropically associated with gene expression are enriched in the gene body and at transcription start sites and depleted in intergenic regions (chi-square test p = 7.08 × 10⁻¹³³; Figure S16; Table S9). We identified a small but significant enrichment of negative associations between DNAm and gene expression at sites located in the 5′ UTR (mean effect = −0.0211; p = 0.00108), TSS200 (mean effect = −0.0479; p = 6.38 × 10⁻⁷), TSS1500 (mean effect = −0.0350; p = 5.82 × 10⁻¹¹), and 1st exon (mean effect = −0.0506; p = 6.19 × 10⁻⁵), consistent with the hypothesis that promoter DNAm often represses gene expression (Figure S17).