Results Identification of eQTL Primary and conditional eQTL were identified using genotype and RNA-seq data from the CommonMind Consortium post-mortem DLPFC samples (467 European-ancestry case and control subjects).16 We identified 12,813 primary and 16,082 conditional eQTL, totaling 28,895 independent eQTL. Of the genes tested, 81% (12,813 of 15,817 autosomal genes) had at least one eQTL and 63% of these (51% of all genes) also had at least one conditional eQTL, with an average of 1.83 independent eQTL per gene (2.26 among those with at least one eQTL) (Figure 1A). Conversely, when examining the distributions for the number of genes whose expression was affected by each eQTL (Table S1), the majority of eQTL were specific for a single gene, and only a small fraction of eQTL, 1.47%, affected more than one gene, with a maximum of six genes affected by a single eQTL. Figure 1 Characterization of Conditional eQTL (A) Counts of the numbers of genes (y axis) regulated by at least N (1 ≤ N ≤ 16) independent eQTL (x axis). (B) Median Tau value (y axis) for genes with N independent eQTL (x axis), colored by Tau type (cell type, developmental time point, or tissue type Tau). (C) Density plot representing the distance from eSNP to eGene transcription start site (TSS), colored by eQTL order. Dashed lines represent the median distance to TSS for each order of eQTL. We tested for replication of conditional eQTL in two independent datasets, the National Institute of Mental Health’s Human Brain Collection Core (HBCC, n = 279, microarray expression data) and the Religious Orders Study/Memory and Aging Project22 (ROSMAP, n = 494, RNA-seq expression). For each gene the same models were evaluated that were identified in forward-stepwise conditional analysis in the CMC data. We observed significant evidence of replication for both primary and conditional eQTL in the HBCC and ROSMAP post-mortem brain cohorts (Table S2). The estimated proportion of true associations (π1) in ROSMAP was 0.57 and 0.26 for primary and conditional eQTL, respectively; in HBCC π1 was 0.46 and 0.20 for primary and conditional eQTL. Therefore, replication was stronger for primary than for conditional eQTL, as expected given their stronger effect sizes. Replication rates were somewhat higher in the RNA-seq ROSMAP data than in HBCC. Genomic Characterization of Primary and Conditional eQTL The features for which primary and conditional eQTL and their respective eGenes displayed identifiable differences included distance from eQTL to its gene’s transcription start site (TSS), gene length, LD blocks per genic cis-region, genic constraint score, and genic cis-SNP-heritability. According to prior results, eQTL that are shared across tissues and cell types tend to be located closer to transcription start sites than context-specific eQTL;13, 14 we therefore first examined the relationship between primary or conditional eQTL status and distance to genic TSS. Primary eQTL fall closer to the TSS than conditional eQTL (Figure 1C): primary eQTL occur at a median distance of 70.4 kb from the TSS versus a median distance of 302 kb for conditional eQTL. This difference holds true even more proximally to the TSS (Figure S2); 8.1% and 2.5% of primary and conditional eQTL, respectively, fall within 3 kb of the TSS. We next characterized the relationship between the number of independent eQTL per gene and three different genomic features: gene length, number of LD blocks48 in the gene’s cis-region (±1 Mb), and Exome Aggregation Consortium (ExAC) genic constraint score,49 including possible interactions. The best multivariate model for eQTL number included gene length, number of LD blocks, and genic constraint as predictors, as well as a gene length-LD blocks interaction (Table 1). The number of independent eQTL was positively correlated with gene length and number of LD blocks and negatively correlated with genic constraint score (Figure S3). We then examined the variance of gene expression explained by cis-region SNPs, or cis-SNP-heritability, estimated by linear mixed model variance component analysis25 (Figure S4). We found a strong effect of estimated cis-heritability on number of independent eQTL (Table 1, Figure S5). In a joint model with cis-SNP-heritability, the main effects of gene length, number of LD blocks, and genic constraint on eQTL number remained at least nominally significant. Table 1 Number of eQTL per Gene Modeled on Genomic Features Predictor Model 1 Estimate Model 1 Robust SE Model 1 Pr(> |z|) Model 2 Estimate Model 2 Robust SE Model 2 Pr(> |z|) Model 3 Estimate Model 3 Robust SE Model 3 Pr(> |z|) log(Gene length) 0.27 0.04 5.16E−12 0.16 0.03 2.20E−06 0.17 0.03 9.87E−07 LD blocks 0.59 0.17 6.47E−04 0.33 0.15 2.92E−02 0.37 0.15 1.55E−02 log(Gene length): LD blocks −0.03 0.02 7.77E−02 −0.01 0.01 5.65E−01 −0.01 0.01 4.11E−01 Constraint −0.61 0.03 5.93E−85 −0.20 0.03 2.93E−13 −0.15 0.03 5.41E−08 cis-heritability – – – 7.03 0.18 0.00 7.02 0.18 0.00 Tau (tissue) – – – – – – 0.08 0.08 2.76E−01 Tau (DLPFC cell type) – – – – – – 0.20 0.09 3.69E−02 Tau (developmental time point) – – – – – – 0.17 0.09 5.99E−02 We then addressed whether genes with conditional eQTL exhibit greater context specificity as measured by the robust expression specificity metric Tau.26, 27 We calculated Tau across 53 tissues from the Genotype-Tissue Expression (GTEx) project, across 6 DLPFC cell types (astrocytes, endothelial cells, microglia, neurons, oligodendrocytes, and oligodendrocyte progenitor cells) from single-cell RNA-seq,29 and across 8 developmental periods30 (early prenatal, early mid-prenatal, late mid-prenatal, late prenatal, infant, child, adolescent, and adult) from the BrainSpan atlas DLPFC RNA-seq data. We confirmed that higher values of Tau reflect expression specificity by comparing the distributions of all three Tau measures for all genes with the distributions for a subset of housekeeping genes50 (Figure S6). We found positive correlations between eQTL number and tissue, cell type, and developmental time point specificities (Figure 1B, Table 1, Table S3, Figure S7). In a joint model, the strongest correlation was with DLPFC cell type Tau, which is consistent with previous data demonstrating tissue-specific, cell type-dependent expression in blood;12 however, we note that all three Tau sets were inter-correlated (Table S3). Epigenetic Enrichment Analyses One way in which eQTL may affect gene expression is through alteration of cis-regulatory elements such as promoters and enhancers. Putative causal eSNPs have been shown to be enriched in genomic regions containing functional annotations such as DNase hypersensitive sites, transcription factor binding sites, promoters, and enhancers.51, 52, 53, 54 Our observation that conditional eQTL fall farther from transcription start sites than primary eQTL led us to hypothesize that primary eQTL may affect transcription levels by altering functional sites in promoters whereas conditional eQTL may do so by altering more distal regulatory elements such as enhancers. We therefore assessed enrichment of primary and conditional eQTL in brain active promoter (TssA) and enhancer (merged Enh and EnhG) states derived from the NIH Roadmap Epigenomics Project,32, 33 and in H3K4me3 and H3K27ac neuronal (NeuN+) and non-neuronal (NeuN−) ChIP-seq peaks from a subset of the CMC post-mortem DLPFC samples. The overlap of H3K4me3 and H3K27ac ChIP-seq peaks was used as a proxy for active promoters, and H3K27ac peaks that do not overlap H3K4me3 peaks were used as a (relatively non-specific) proxy for enhancers.33 We performed logistic regression of SNP status (eQTL versus random matched SNP) on overlap with functional annotations, separately for each eQTL order (primary, secondary, and greater than secondary). Primary and conditional eQTL were significantly enriched in both promoter and enhancer chromatin states from REMC brain and CMC DLPFC tissues, with greatest enrichments overall observed in PFC neuronal (NeuN+) promoters and enhancers (Figure 2, Table S4). We found that whereas active promoter enrichments in all tissue/cell types markedly decreased with higher conditional order of eQTL, enhancer enrichments either only slightly decreased (REMC brain and PFC NeuN+, Figures 2A and 2C) or remained level (REMC brain-specific, Figure 2B). Though there was also significant enrichment of eQTL in non-neuronal nuclei (NeuN−) promoters and enhancers, this trend of a marked decrease in active promoters but steady levels of enhancer enrichment with greater eQTL order was not observed for non-neuronal PFC nuclei (Figure 2D). This greater decrease in enrichment for promoters compared to enhancers with increasing eQTL order was not confounded by an excess of eQTL near brain-expressed genes in comparison to matched SNPs (Figure S8, Table S5) and furthermore was not an artifact of varying effect size with eQTL order; the same overall pattern was observed when stratifying eQTL by variance in expression explained (R2) and comparing enrichment across eQTL order, within each R2 bin (Figures S9–S12, Table S6). Figure 2 Enrichments of Primary and Conditional eQTL in Active Regulatory Annotations Plotted are enrichments (regression coefficient estimate ± 95% CI from logistic regression, y axes) of primary (x axis eQTL order = 1) and conditional (eQTL order = 2, ≥ 3) eQTL in functional annotations. (A and B) Enrichment in brain (union of all individual brain regions) and brain-specific (present in brain but not in seven other non-brain tissues) active promoter (green) and enhancer (orange) ChromHMM states from the NIH Roadmap Epigenomics Project. (C) Enrichment in neuronal nuclei (NeuN+) for active promoters (intersection of DLPFC H3K4me3 and H3K27ac ChIP-seq peaks, green) and enhancers (H3K27 peaks that do not overlap H3K4me3 peaks, orange). (D) Enrichments in the same annotations, but for DLPFC non-neuronal nuclei (NeuN−). eQTL Co-localization with SCZ GWAS We performed co-localization analyses in order to evaluate the extent of overlap between eQTL and GWAS signatures in schizophrenia and to identify putative causal genes from GWAS associations. Considering 217 loci (Table S7) with lead SNPs reaching a significance threshold of p < 1 × 10−6 from the 2014 Psychiatric Genomics Consortium (PGC) schizophrenia GWAS,35 we tabulated the number of primary and conditional eQTL falling within GWAS loci. A total of 114 out of 217 loci contained primary and/or conditional eQTL for 346 genes; 110 of these genes had one eQTL only and 236 genes had more than one independent eQTL. To quantitatively compare the SCZ GWAS and eQTL association signatures, we modified the R package coloc39 for Bayesian inference of co-localization between the two sets of summary statistics across each gene’s cis-region. Coloc2, our modified implementation of coloc, analyzes the hierarchical model of gwas-pw,43 with likelihood-based estimation of dataset-wide probabilities of five hypotheses (H0, no association; H1, GWAS association only; H2, eQTL association only; H3, both but not co-localized; and H4, both and co-localized). We then used these probabilities as priors to calculate empirical Bayesian posterior probabilities for the five hypotheses for each locus, in particular PPH4 for co-localization. For genes with conditional eQTL overlapping SCZ GWAS loci, summary statistics from all-but-one conditional eQTL analyses were assessed for co-localization with the GWAS signature (Figure 3). To illustrate this analytical strategy, we show eQTL results for the iron responsive element binding protein 2 gene IREB2 (MIM: 147582, chr15:78729773–78793798) as an example (Figure 4). Forward stepwise selection analysis identified two independent cis-eQTL for IREB2. In order to generate summary statistics for each eQTL in isolation, we conducted two all-but-one conditional analyses, in each analysis conditioning on all but a focal independent eQTL (for IREB2 this entailed conditioning on only one eQTL per conditional analysis, but involved conditioning on up to six eQTL per gene across all genes considered in the SCZ co-localization analysis). We then tested for co-localization between the GWAS and all of the eQTL summary statistics resulting from the above conditioning analysis using coloc2 (Table S12). In the case of IREB2, the conditional eQTL (rs7171869) was implicated as co-localized with the GWAS signal at this locus with a posterior probability for co-localization (PPH4) of 0.94. A qualitative examination of the IREB2 locus supported the coloc2 results: the correlation between the GWAS p values and conditional eQTL p values was higher than that between the GWAS and primary eQTL p values (Figure 4A). In addition, the GWAS signature for the locus more closely resembled the conditional eQTL signature than either the non-conditional eQTL signature or the primary eQTL signature (Figure 4B). Figure 3 All-but-One Conditional Analysis to Isolate Independent eQTL Signatures (A) Hypothetical GWAS signature (top, green) at a given locus and an overlapping hypothetical eQTL signature (bottom, purple), which comprises two independent eQTL. (B) Same hypothetical GWAS and eQTL signatures after the all-but-one conditional eQTL analysis isolating the primary (red) and secondary (blue) eQTL signatures. Before conditional analysis there is a lack of co-localization between the GWAS signature and eQTL signature. After all-but-one conditional analysis, there is evidence for co-localization between the conditional (secondary) eQTL and GWAS signatures. Figure 4 GWAS Signature for IREB2 Co-localizes with the Conditional eQTL Signature (A) P-P plots comparing −log10 p values from GWAS (y axes) and all-but-one conditional eQTL analysis (x axes), which show the highest correlation to be between the GWAS and the conditional eQTL rs7171869 (blue, bottom). (B) LocusZoom plots for the IREB2 locus, where the GWAS signal (top) more closely resembles the conditional eQTL signal (rs7171869, bottom) than the primary eQTL signal (rs11639224, third from top) or non-conditional eQTL signal (second from top). For all LocusZoom plots, LD is colored with respect to the GWAS lead SNP (rs8042374, labeled). We found that 40 loci contained genes with strong evidence of co-localization between eQTL and GWAS signatures, with posterior probability of H4 (PPH4) ≥ 0.8 (Table 2). When restricting to genome-wide significance for the GWAS, we found co-localization in 24 of the 108 loci. Given the correlations between number of independent eQTL and expression specificity scores (Tau) across tissues, cell types, and development, we tabulated the reported genes’ Tau percentiles and expression levels, to highlight contexts in which the genes are specifically expressed (Table 2, Table S8). We acknowledge that while posterior probability PPH4 ≥ 0.8 demonstrates strong Bayesian evidence for co-localization, it is an arbitrary threshold for characterizing loci as GWAS-eQTL co-localized; we find that many loci with PPH4 ≥ 0.5 appear qualitatively consistent with co-localization. Table 2 GWAS-eQTL Co-localized Loci Chr GWAS Locus Start GWAS Locus End GWAS Lead SNP GWAS p Value eSNP eSNP p Value Primary/Conditional PPH4 Gene Relevant Tissue/Cell Type/Developmental Period 1 2372401 2402501 rs4648845 4.03E−09 rs12037821 4.9E−04 conditional 0.87 SLC35E2 –/–/early mid-prenatal 1 8355697 8638984 rs301797 2.03E−09 rs138050288 1.8E−04 primary 0.95 RERE –/–/– 1 30412551 30443951 rs1498232 1.28E−09 rs2015244 1.8E−08 primary 0.99 PTPRU –/neurons /early mid-prenatal 1 163582923 163766623 rs7521492 5.64E−07 rs10799961 3.18E−11 primary 0.91 PBX1 –/–/early prenatal 1 205015255 205189455 rs16937 8.69E−07 rs12724651 7.31E−07 primary 0.89 TMEM81 –/neurons/– rs12031350 8.15E−06 conditional 0.87 RBBP5 –/–/– 1 214137889 214163689 rs7529073 9.69E−07 rs1431983 1.67E−04 conditional 0.93 PROX1-AS1 cerebellar hemisphere/neurons/adult 2 73194203 73900439 rs56145559 8.42E−08 rs11679809 1.85E−34 primary 0.86 ALMS1P testis/–/– 2 110262036 110398236 rs9330316 7.69E−08 rs892464 2.35E−26 primary 0.92 SEPT10 –/–/late prenatal 2 198148577 198835577 rs6434928 1.48E−11 rs12621129 6.06E−12 primary 0.94 SF3B1 –/–/– 2 200715237 201247789 rs281768 1.78E−14 rs35220450 3.46E−14 primary 0.95 FTCDNL1, AC073043.2 –/–/adult rs186546506 8.77E−04 conditional 0.83 LINC01792, AC007163.3 putamen (basal ganglia)/ –/adult 2 208371631 208531731 rs2709410 5.75E−07 rs34171849 5.86E−17 primary 0.88 METTL21A –/–/– rs2551656 2.85E−09 primary 0.86 CREB1 –/–/early prenatal 2 220033801 220071601 rs6707588 9.51E−07 rs13404754 1.08E−09 primary 0.92 CNPPD1 –/–/– 3 36843183 36945783 rs75968099 3.39E−12 rs9834970 1.88E−05 primary 0.94 DCLK3 nerve - tibial /neurons/infant 3 52281078 53539269 rs2535627 3.96E−11 rs6801235 2.81E−08 conditional 0.86 PPM1M –/neurons/late prenatal 3 63792650 64004050 rs832187 2.58E−08 rs113386200 1.95E−12 primary 0.98 THOC7 –/–/– 3 135807405 136615405 rs7432375 5.27E−11 rs10935184 7.71E−25 primary 0.93 PCCB –/–/– 4 170357552 170646052 rs10520163 1.02E−08 rs7438 1.02E−09 primary 0.97 CLCN3 –/–/– 5 45291475 46404116 rs1501357 1.24E−08 rs9292918 4.45E−05 primary 0.94 BRCAT54, RP11-53O19.1 –/–/adult 6 83779798 84407274 rs3798869 8.57E−10 rs2016358 1.19E−09 primary 0.90 SNAP91 cerebellar hemisphere/–/– 6 108875527 109019327 rs9398171 3.37E−08 rs111727905 3.84E−06 primary 0.97 ZNF259P1 –/–/early mid-prenatal 7 21485312 21545712 rs73060317 6.60E−07 rs141984481 3.59E−05 primary 0.92 SP4 –/–/early prenatal 8 8088038 10056127 rs2945232 2.03E−08 rs2980441 7.68E−69 primary 0.82 FAM86B3P –/–/adolescent 8 26181524 26279124 rs1042992 2.27E−07 rs17055186 3.06E−24 conditional 0.91 SDAD1P1 testis/–/adult 8 38020424 38310924 rs57709857 2.32E−07 rs201999919 1.70E−07 primary 0.88 WHSC1L1 –/–/early prenatal 8 144822546 144871746 rs11784536 1.83E−07 rs12541792 6.45E−35 primary 0.90 FAM83H esophagus - mucosa/oligodendrocytes/adolescent 9 26839508 26909408 rs10967586 4.75E−07 rs12345197 3.90E−06 primary 0.80 IFT74 –/–/– 11 46340213 46751213 rs7951870 1.97E−11 rs16938506 5.08E−05 primary 0.88 MDK –/–/early mid-prenatal 12 57428314 57497814 rs324017 2.13E−07 rs4559 2.02E−05 conditional 0.91 STAT6 –/microglia/adolescent 14 35421614 35847614 rs77477310 1.52E−07 rs1028449 8.09E−04 primary 0.84 RP11-85K15.2 –/–/– 15 78803032 78926732 rs8042374 1.87E−12 rs7171869 1.44E−04 conditional 0.94 IREB2 –/–/early prenatal 15 84661161 85153461 rs950169 7.62E−11 rs35677834 1.54E−34 primary 0.80 LOC101929479, RP11-561C5.3 ovary/–/early mid-prenatal 15 91416560 91436560 rs4702 2.30E−12 rs4702 4.49E−13 primary 1.00 FURIN –/endothelial cells/adolescent 16 4447751 4596451 rs6500602 2.79E−07 rs3747580 4.75E−16 primary 0.90 CORO7 –/–/– rs8046295 2.68E−11 primary 0.89 NMRAL1 –/–/– 16 29924377 30144877 rs12691307 1.30E−10 rs4788203 1.95E−05 primary 0.88 TMEM219 –/–/– rs3935873 7.46E−14 primary 0.87 INO80E –/neurons/– rs4787491 1.60E−04 conditional 0.82 DOC2A brain - cortex/neurons/adolescent 16 58669293 58691393 rs12325245 1.15E−08 rs11647976 4.83E−04 primary 0.94 CNOT1 –/–/– 17 17722402 18030202 rs8082590 6.84E−09 rs4072739 4.74E−13 primary 0.92 DRG2 –/–/– 19 11839736 11859736 rs72986630 4.64E−08 rs72986630 2.20E−14 primary 1.00 ZNF823 –/endothelial cells/early prenatal 19 19374022 19658022 rs2905426 6.92E−09 rs2965199 9.22E−36 primary 0.87 GATAD2A –/–/– 19 50067499 50135399 rs56873913 2.19E−07 rs5023763 9.32E−05 primary 0.93 SNRNP70 –/–/– 22 41408556 42689414 rs9607782 6.76E−12 rs200447424 1.87E−04 primary 0.96 RANGAP1 –/–/– Importantly, for 6 of the 40 co-localizing loci, a conditional rather than primary eQTL co-localized with the GWAS with compelling qualitative support (Table 2, Figure 4, Table S11, Figures S13–S17). The genes showing strong evidence for conditional eQTL co-localization include SLC35E2, PROX1-AS1 (MIM: 601546), PPM1M (MIM: 608979), SDAD1P1, STAT6 (MIM: 601512), and IREB2. Also notable are the occurrences of complex patterns of co-localization for some loci; for example, three loci showed evidence for co-localization with a primary eQTL for one gene and a conditional eQTL for another. Comparison with Previous Co-localization Analyses In the prior CMC study, a GWAS-eQTL co-localization analysis implemented in Sherlock and using non-conditional eQTL summary statistics reported a total of 18 co-localized loci, representing 17% of the 108 genome-wide significant loci examined. Through our all-but-one conditional co-localization analysis, we replicate the majority of their findings and detect an additional 13 instances of co-localization, bringing the total number of co-localizations when considering only the genome-wide significant (and not including the MHC) loci up to 24 (representing 22% of these 108 loci) (Table S9). These 13 comprise instances of conditional eQTL co-localization (for genes SLC35E2 and IREB2) and improved detection of primary eQTL co-localization due to isolation of independent eQTL signatures and our choice of co-localization software (coloc2). Of the six co-localized loci identified in the previous but not current analysis, three resulted from differences in study design such as GWAS locus definition and eQTL overlap criteria, and two were suggestive in the current analysis (0.65 < PPH4 < 0.8). The one remaining discrepant locus (chr8:143302933–143403527) was found to co-localize with TSNARE1 eQTL previously (Sherlock p = 8.24 × 10−7) but not here (coloc2 primary eQTL PPH4 = 0.074, PPH3 = 0.93). A qualitative comparison of the eQTL and GWAS data (Figure S18) did not appear to support co-localization; while the strongest GWAS association and the strongest eQTL are in close physical proximity, the LD between the two index SNPs is low (r2∼0.2–0.4). Additionally, our attempts to disentangle independent eQTL signal via conditional analysis do not reveal the GWAS index SNP to be in high LD with any of the conditionally independent eQTL peaks. We also compared our conditional co-localization results with those from non-conditional eQTL analysis, using coloc2 and the same SCZ GWAS loci (Table S10). Conditional and non-conditional coloc2 results were highly concordant, with slightly higher PPH4s resulting from the same WABFs due to a higher prior probability of co-localization estimated in the non-conditional coloc2 analysis. Thirty-five loci were co-localized in both analyses; five loci that were co-localized in the non-conditional analysis only were highly suggestive in the conditional analysis (0.65 < PPH4 < 0.8), and the five loci that were co-localized only in the conditional coloc2 analysis involved conditional eQTL, emphasizing the utility of the conditional analysis. This conditional eQTL co-localization represents a substantial proportion (∼15%) of all instances of co-localization, and furthermore could reflect context-specific differential expression that has the potential to implicate cell types, tissue types, and developmental stages that are relevant to disease etiology.