Material and Methods

Genetic Model and Notation
Let Y be a quantitative trait and G be the minor allele count for the SNP under investigation (G = 0, 1, or 2); the additive assumption is not critical to the method development. Let E be an exposure variable interacting with the genetic factor. This exposure E could reflect continuous or categorical measures of environmental or genetic background. The underlying true genetic model can include main effects of both G ($\beta_G$) and E ($\beta_E$) on Y, as well as the interaction effect ($\beta_{GE}$):
$$Y \sim \beta_0 + \beta_G G + \beta_E E + \beta_{GE}\, G E + \varepsilon. \quad \text{(Equation 1)}$$
We assume that the trait Y is normally distributed with unit variance conditional upon G and E; in other words, $\mathrm{Var}(Y \mid G = g, E = e) = 1$ and $\varepsilon \sim N(0, 1)$.
When considering only G, the working model reduces to
$$Y \sim \beta_0 + \beta_G G + \varepsilon_G. \quad \text{(Equation 2)}$$
Pare et al.13 showed that the variance of Y conditional on G alone can be expressed as $\sigma_G^2 = \mathrm{Var}(Y \mid G = g) = (\beta_E + \beta_{GE}\, g)^2 + 1$. Thus, if an interaction effect is present (i.e., $\beta_{GE} \neq 0$), the trait variance differs between genotypes.
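The variance expression can be recovered in one step from Equation 1. The sketch below rests on two assumptions that are not stated explicitly in this section: that E is independent of G, and that E has been scaled to unit variance.
$$\mathrm{Var}(Y \mid G = g) = \mathrm{Var}\big((\beta_E + \beta_{GE}\, g)\, E + \varepsilon \mid G = g\big) = (\beta_E + \beta_{GE}\, g)^2\, \mathrm{Var}(E) + \mathrm{Var}(\varepsilon) = (\beta_E + \beta_{GE}\, g)^2 + 1.$$
As a purely illustrative example, $\beta_E = 0.5$ and $\beta_{GE} = 0.5$ would give genotype-specific variances of 1.25, 2, and 3.25 for g = 0, 1, and 2, respectively.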

Joint Location-Scale Testing Procedure for Single-SNP Analysis
Our proposed JLS testing framework, based on the working model of Equation 2, tests the following null hypothesis:
$$H_0^{\text{joint}}: \beta_G = 0 \ \text{and} \ \sigma_i = \sigma_j \ \text{for all} \ i \neq j, \quad i, j = 0, 1, 2.$$
The alternative hypothesis of interest is
$$H_1^{\text{joint}}: \beta_G \neq 0 \ \text{or} \ \sigma_i \neq \sigma_j \ \text{for some} \ i \neq j.$$
For a SNP under study, different JLS test statistics can be considered. Let $p_L$ be the p value for the location test of choice (i.e., testing $H_0^{\text{location}}: \beta_G = 0$ using, for example, ordinary least-squares regression), and $p_S$ be the p value for the scale test of choice (i.e., testing $H_0^{\text{scale}}: \sigma_i = \sigma_j$ for all $i \neq j$ using, for example, Levene’s test). We first consider Fisher’s method (JLS-Fisher) to combine the association evidence from the individual location and scale tests. The JLS-Fisher statistic is defined as
$$W_F = -2\,\big(\log(p_L) + \log(p_S)\big).$$
Large values of $W_F$ correspond to small values of $p_L$ and/or $p_S$ and provide evidence against the null $H_0^{\text{joint}}$. If $p_L$ and $p_S$ are independent under $H_0^{\text{joint}}$, $W_F$ is distributed as a $\chi^2_4$ random variable. Although Fisher’s method here is used to combine evidence from two tests applied to the same sample, the assumption of independence between $p_L$ and $p_S$ under $H_0^{\text{joint}}$ holds theoretically for a normally distributed trait (Appendix A, Lemma 1), as well as empirically for approximately normally distributed traits in finite samples (Figures S1 and S2, Tables S1 and S2).
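The JLS-Fisher combination can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `jls_fisher` is hypothetical, and the location and scale p values are assumed to have been computed beforehand by the tests of choice. It uses the closed form of the upper tail of a chi-square distribution with 4 degrees of freedom, so only the standard library is needed.

```python
import math


def jls_fisher(p_location, p_scale):
    """Combine a location and a scale p value with Fisher's method.

    Returns (W_F, p_joint), where p_joint is the upper-tail probability
    of a chi-square distribution with 4 degrees of freedom at W_F.
    Assumes the two p values are independent under the joint null.
    """
    if not (0.0 < p_location <= 1.0 and 0.0 < p_scale <= 1.0):
        raise ValueError("p values must lie in (0, 1]")
    w_f = -2.0 * (math.log(p_location) + math.log(p_scale))
    # For 4 degrees of freedom the chi-square survival function has the
    # closed form exp(-x / 2) * (1 + x / 2).
    p_joint = math.exp(-w_f / 2.0) * (1.0 + w_f / 2.0)
    return w_f, p_joint


# Hypothetical inputs: both individual tests are borderline at 0.05.
w_f, p_joint = jls_fisher(0.05, 0.05)
print(f"W_F = {w_f:.3f}, joint p = {p_joint:.4f}")
```

With these illustrative inputs, two individually borderline p values combine into stronger joint evidence, which is the behaviour the joint test is designed to exploit.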
One can also consider the minimum p value (JLS-minP) approach, or various alternatives based on combining the individual test statistics themselves with or without weights.20–23 The JLS-minP statistic is defined as
$$W_M = \min(p_L, p_S).$$
If $p_L$ and $p_S$ are independent under $H_0^{\text{joint}}$, $W_M$ is distributed as a Beta random variable (with shape parameters 1 and 2), where small values of $W_M$ correspond to small values of $p_L$ and/or $p_S$ and provide evidence against the null.
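The JLS-minP p value follows directly from the Beta(1, 2) distribution function, which equals 1 − (1 − w)² at w. The short Python sketch below illustrates this; the function name `jls_minp` is hypothetical, and independence of the two p values under the joint null is assumed, as in the text.

```python
def jls_minp(p_location, p_scale):
    """Return (W_M, p_joint) for the minimum-p-value combination.

    Under independence of the two p values, W_M = min(p_L, p_S) follows
    a Beta(1, 2) distribution, whose CDF at w is 1 - (1 - w)**2.
    """
    for p in (p_location, p_scale):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p values must lie in [0, 1]")
    w_m = min(p_location, p_scale)
    p_joint = 1.0 - (1.0 - w_m) ** 2
    return w_m, p_joint


# Hypothetical inputs: only the smaller p value drives the result.
w_m, p_joint = jls_minp(0.05, 0.20)
print(f"W_M = {w_m:.2f}, joint p = {p_joint:.4f}")
```

Unlike the Fisher combination, the larger of the two p values has no influence on the result once the minimum is fixed.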

Joint Location-Scale Testing Procedure for Gene-Set Analysis
The chosen JLS test statistic (e.g., WF) for single-SNP analysis can then be used for implementing gene-based, gene-set, or pathway analysis in a direct fashion.
Assume that J SNPs have been annotated to a gene or gene-set of interest. For each SNP j, the chosen JLS test statistic (e.g., $W_{F,j}$) is first obtained, and the association evidence is then aggregated across the SNPs by considering, for example, the sum statistic,5 $\sum_j W_{F,j}$. To account for LD between SNPs, the overall association evidence can be evaluated by a phenotype-permutation approach where the empirical p value is the proportion of K permutation replicates with sum statistics more extreme than the observed value. Because this multivariate method analyzes all J SNPs simultaneously, the number of permutations need not be exceedingly large and K = 10,000 provides accurate estimates for p values in the range of 0.05. If multiple genes or gene-sets are of interest, more replicates would be required to adjust for the corresponding number of hypothesis tests.
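The phenotype-permutation procedure can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: all function names are hypothetical, the per-SNP statistic is passed in as a callable, and `toy_snp_stat` (n times the squared genotype-trait correlation) is only a stand-in for the per-SNP JLS statistic. Permuting the phenotype vector while leaving each subject's genotypes intact preserves the LD structure among the J SNPs.

```python
import random


def toy_snp_stat(y, g):
    """Stand-in per-SNP statistic (n * r^2); NOT the JLS statistic."""
    n = len(y)
    mean_y, mean_g = sum(y) / n, sum(g) / n
    s_yy = sum((a - mean_y) ** 2 for a in y)
    s_gg = sum((b - mean_g) ** 2 for b in g)
    if s_yy == 0 or s_gg == 0:
        return 0.0
    s_gy = sum((a - mean_y) * (b - mean_g) for a, b in zip(y, g))
    return n * s_gy * s_gy / (s_yy * s_gg)


def sum_statistic(y, snps, snp_stat):
    """Sum the per-SNP statistic over all SNPs in the set."""
    return sum(snp_stat(y, g) for g in snps)


def permutation_p_value(y, snps, snp_stat, n_perm=1000, seed=0):
    """Proportion of permutation replicates whose sum statistic is at
    least as large as the observed one."""
    rng = random.Random(seed)
    observed = sum_statistic(y, snps, snp_stat)
    y_perm = list(y)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # shuffle phenotypes; genotypes stay fixed
        if sum_statistic(y_perm, snps, snp_stat) >= observed:
            n_extreme += 1
    return n_extreme / n_perm


# Simulated demo data (hypothetical): two SNPs in LD, the first causal.
rng = random.Random(42)
g1 = [rng.choice([0, 1, 2]) for _ in range(60)]
g2 = [a if rng.random() < 0.8 else rng.choice([0, 1, 2]) for a in g1]
y_demo = [1.5 * a + rng.gauss(0.0, 1.0) for a in g1]
p_demo = permutation_p_value(y_demo, [g1, g2], toy_snp_stat, n_perm=500)
print(f"empirical gene-set p value: {p_demo}")
```

In practice `snp_stat` would return the chosen JLS statistic for each SNP, and `n_perm` would be set to the number of replicates appropriate for the significance threshold of interest.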
To compare the strength of association evidence between sets of variants within the same gene or across different genes, an extension of the gene-set approach can be implemented. Sum statistics are obtained as previously described for each group of variants and then calibrated by the respective number of variants. The difference between the two proportional sum statistics is the test statistic of interest,
$$D_F = \frac{1}{J}\sum_{j} W_{F,j} - \frac{1}{K}\sum_{k} W_{F,k},$$
where the j = 1,…,J and k = 1,…,K subscripts index the competing sets of variants (K here denotes the size of the second set, not the number of permutation replicates), and the F subscript indicates that the variant-specific joint location-scale statistics are obtained by Fisher’s method (although other combination methods, such as minP, could be used as well). The significance of $D_F$ can be evaluated with the phenotype-permutation approach as described above.
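A hedged sketch of the set-comparison statistic and its permutation evaluation is given below. The function names are hypothetical, the per-SNP statistic is supplied by the caller, and `toy_snp_stat` is only a placeholder for the per-SNP JLS statistic. The text does not specify whether the comparison is one-sided or two-sided; this sketch assumes a two-sided comparison based on the absolute difference.

```python
import random


def toy_snp_stat(y, g):
    """Stand-in per-SNP statistic (n * r^2); NOT the JLS statistic."""
    n = len(y)
    mean_y, mean_g = sum(y) / n, sum(g) / n
    s_yy = sum((a - mean_y) ** 2 for a in y)
    s_gg = sum((b - mean_g) ** 2 for b in g)
    if s_yy == 0 or s_gg == 0:
        return 0.0
    s_gy = sum((a - mean_y) * (b - mean_g) for a, b in zip(y, g))
    return n * s_gy * s_gy / (s_yy * s_gg)


def d_statistic(y, set_a, set_b, snp_stat):
    """Difference between the two size-calibrated sum statistics."""
    mean_a = sum(snp_stat(y, g) for g in set_a) / len(set_a)
    mean_b = sum(snp_stat(y, g) for g in set_b) / len(set_b)
    return mean_a - mean_b


def d_permutation_p_value(y, set_a, set_b, snp_stat, n_perm=1000, seed=0):
    """Two-sided empirical p value for D (two-sidedness is an assumption)."""
    rng = random.Random(seed)
    observed = abs(d_statistic(y, set_a, set_b, snp_stat))
    y_perm = list(y)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # same phenotype permutation for both sets
        if abs(d_statistic(y_perm, set_a, set_b, snp_stat)) >= observed:
            n_extreme += 1
    return n_extreme / n_perm
```

Applying the same permuted phenotype to both variant sets in each replicate keeps any correlation between the two sets intact under the null.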
In all applications, genotypes were coded additively (G = 0, 1 or 2), and for X chromosome SNPs, female and male genotypes were analyzed together and coded as G = 0, 1, or 2 and G = 0 or 2, respectively.
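The X chromosome coding rule can be written as a small helper. This is only an illustration of the coding described above: the function name and the sex labels "F" and "M" are assumptions made for the example.

```python
def code_x_genotype(minor_allele_count, sex):
    """Code an X chromosome genotype on the additive 0/1/2 scale.

    Females carry 0, 1, or 2 copies and are coded as is; males are
    hemizygous (0 or 1 copy) and are coded as 0 or 2.
    """
    if sex == "F":
        if minor_allele_count not in (0, 1, 2):
            raise ValueError("female allele count must be 0, 1, or 2")
        return minor_allele_count
    if sex == "M":
        if minor_allele_count not in (0, 1):
            raise ValueError("male allele count must be 0 or 1")
        return 2 * minor_allele_count
    raise ValueError("sex must be 'F' or 'M'")


# Hypothetical subjects: (minor allele count, sex)
subjects = [(1, "F"), (1, "M"), (0, "M"), (2, "F")]
coded = [code_x_genotype(count, sex) for count, sex in subjects]
print(coded)
```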

Application Data: HbA1c Levels in Type 1 Diabetes and Cystic Fibrosis Lung Disease
We tested the proposed JLS approach with applications to genetic association studies of HbA1c levels in type 1 diabetes and lung disease in cystic fibrosis.
The T1D application used the Diabetes Control and Complications Trial (DCCT) sample in which 667 individuals were conventionally treated and 637 intensively treated.18 In this sample, an earlier GWAS of HbA1c levels in type 1 diabetes18 showed that rs1358030 near SORCS1 (10q25.1 [MIM: 606283]) interacts with treatment type (conventional versus intensive) on HbA1c levels (quarterly measured values spanning 6.5 years). To demonstrate that the JLS testing framework could leverage the interaction effect without knowledge of the interacting variable, we analyzed the association of rs1358030 with the average, inverse normal transformed HbA1c value, assuming the treatment variable was not available.
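The trait construction (per-subject average followed by an inverse normal transform) can be sketched as below. The exact transform used in the study is not specified here; this example assumes a rank-based transform with the Blom offset of 3/8 and average ranks for ties. The function name and the HbA1c values are hypothetical.

```python
from statistics import NormalDist, mean


def inverse_normal_transform(values, offset=0.375):
    """Rank-based inverse normal transform (Blom offset by default).

    Tied values receive the average of their 1-based ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    denom = n - 2.0 * offset + 1.0
    return [std_normal.inv_cdf((r - offset) / denom) for r in ranks]


# Hypothetical quarterly HbA1c series (%), one list per subject.
quarterly = [[8.9, 9.1, 9.4], [7.0, 7.2, 6.9], [8.0, 8.1, 7.9]]
subject_means = [mean(series) for series in quarterly]
transformed = inverse_normal_transform(subject_means)
print([round(z, 3) for z in transformed])
```

The transform depends only on the ranks of the averaged values, so it is unaffected by the scale or skewness of the raw measurements.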
The CF application involves association studies of an averaged lung function measure, forced expiratory volume in 1 s, adjusted for sex, age, height, and mortality, and normalized (SaKnorm),24 using data from the Canadian Cystic Fibrosis Gene Modifier Study (CGS).25 To reduce the duration of heterogeneous environmental exposures that were not measured, and recognizing that age can serve as a surrogate for these exposures, we19 previously restricted our (location-only) analysis to lung function measures from pediatric ages (<18 years, n = 815 subjects from 753 unique families), analyzing eight SNPs in three genes, SLC9A3 (MIM: 182307), SLC6A14 (MIM: 300444), and SLC26A9 (MIM: 608481), previously identified as associated with meconium ileus in a hypothesis-driven GWAS (GWAS-HD).5 This approach entailed a 46% reduction in sample size and required the age restriction to be fixed for all variants, despite the possibility that the optimal exposure (here, age) is gene specific. Here we re-analyzed the SNPs with the individual location-only and scale-only tests, as well as the joint JLS-Fisher, JLS-minP, LRT,17 and distribution16 tests, removing the age restriction and using the full CGS sample (n = 1,409 unrelated subjects). For comparison, location-only analyses restricted to the pediatric population using different age cut-off points were also investigated.
With the full CGS sample, we further tested the hypothesis that multiple proteins present on the apical plasma membrane contribute to lung disease severity as measured by SaKnorm; the hypothesis was considered previously for meconium ileus susceptibility in CF.5 In total, 3,814 GWAS SNPs (MAF > 0.02) were annotated to within ±10 kb of 155 apical genes obtained from the Gene Ontology project.5 The JLS-Fisher test was first applied to each SNP, and the SNP-specific test statistic was then aggregated across all SNPs to perform the multivariate apical gene-set analysis. We then used an independent French sample (n = 1,232) for replication. Imputation based on 1000 Genomes26,27 (as outlined in the Online Methods of Sun et al.5) was used to assess regional association within the SLC9A3R1, SLC9A3R2, and EZR binding sites of SLC9A3.
Institutional review committees at all participating institutions for the DCCT T1D study as well as all Canadian CF clinics approved this study. Consent was also obtained for participants from France with procedural approval (CPP 2004/15) and information collection approval by CNIL (04.404). Data collection, genotyping, and quality-control procedures are reported elsewhere for the T1D18 and CF (Canadian and French) studies.5,25

JLS Method Evaluation by Simulation
We conducted extensive simulation analyses to evaluate the performance of the proposed JLS-Fisher and JLS-minP tests for single-variant analysis and compared them with the individual location-only and scale-only tests, as well as the distribution test16 and the LRT.17 We also conducted a sensitivity analysis on the various tests, studying the impact of poorly captured genotypes that can be expected from imputed datasets. Simulation details are provided in Appendix A.