Genotyping and linkage statistics. Genomic DNA was isolated from kidney by phenol-chloroform extraction. An examination of existing databases identified over 1,300 SNPs that showed variation between the B6 and C3H strains, and a complete linkage map for all 19 autosomes was constructed using 1,032 of these SNPs at an average density of 1.5 cM. Genotyping was conducted by ParAllele (South San Francisco, California, United States) using the molecular-inversion probe (MIB) multiplex technique [37]. Testing for linkage of both clinical traits and gene expression (using mlratio) was conducted using a linear model. Consider a phenotype denoted by y. The linear model that relates variation in y to QTLs and other covariates (e.g., sex) is given by the general form where μ is the trait mean, X′ a vector of covariates, β being the associated vector of regression coefficients, and e the residual error. Linkage was computed for over 20 clinical traits as well as 23,574 liver transcripts. Standard genome scans calculate linkage by comparing the linear model to the null model where β1 and β2 are the regression coefficients of the additive and dominant parameters, respectively. The LOD score represents the difference in the log10 of the likelihood of the above two equations, where the individual model likelihoods are maximized with respect to the model parameters, given the marker genotype and phenotype data. If a trait y differs on average between the two sexes but the QTL has the same effect in both males and females, we can model this interaction by including sex as an additive covariate in the above models, resulting in the new model which is then compared to the null model where β3 is the regression coefficient of the sex parameter. The effect of a QTL may be dependent on the state of a covariate; for instance, a QTL may have an effect specific to one sex, or may have opposite effects in the two sexes. This interaction can be modeled using a full model, which accounts for all additive covariates, as well as interactions between the covariates: which is compared to the above null model (Equation 5). The full model (Equation 6) allows us to model all heritable and sex-specific interactions in a single equation and maximally powers us to detect significant QTLs when the sex and sex-interaction terms are significant, given all 334 animals are included in the analysis. Furthermore, using the full model obviates the need for modeling QTLs in one sex only, a procedure that could decrease our power by halving the sample size, rendering it impossible to detect interactions between QTLs and sex. However, because the full model contains two more parameters than the model that treats sex as an additive covariate (Equation 4), the LOD score threshold for significant linkage is higher (using the convention of Lander and Kruglyak [20]), with QTL-specific significance levels of 2 × 10−3 and 5 × 10−5 equivalent to LOD scores of 4.0 (suggestive linkage) and 5.4 (significant linkage, equivalent to genome-wide p < 0.05), respectively. As a result, if there are no significant sex-additive or sex-dominant interactions, the full model will actually reduce power to detect linkage. To minimize the loss in power of fitting the full model when the sex-interaction terms are not significant, we employed a model selection procedure that introduces sex-interaction terms only if they add significantly to the overall QTL model. The model selection procedure makes use of forward stepwise regression techniques to determine whether it is beneficial to include the sex-interaction terms, conditional on realizing a significant additive effect (p < 0.001). That is, the data are fitted to Equation 4 for a given marker, and if the add term is significant at the 0.001 significance level, then we attempt to add the sex*add term into the model. The sex*add term is retained in the model if the Bayesian information criterion (BIC) for this model is smaller than the BIC for Equation 4. If sex*add is included in the model as a result of this procedure, then we again use BIC to consider including the sex*dom term in the model. To determine which of the four models (Equations 2, 4, 6, and the model selection) is optimal, we empirically estimated the FDR for each model over a set of 3,000 genes randomly selected from the set of all genes detected as expressed in the liver samples. For each gene and for each marker we fitted each of the four models to the data. We performed this same analysis on ten separate permutation sets in which each of the 3,000 gene expression trait vectors was permuted such that the correlation structure among the genes was preserved. The FDR for a given LOD score threshold was then computed as the ratio of the mean number of QTL detected in the permuted datasets (the mean taken over the ten permuted datasets) and the total number of QTL detected in the observed 3,000-gene dataset. We then generated ROC-like curves by varying the LOD score threshold, resulting in a range of FDRs (from 0% to more than 50%). To simplify this type of summary, we considered no more than one QTL per chromosome per gene expression trait by considering only the QTL with the max LOD score on each chromosome for each trait. The ROC curves for each of the four models are shown in Figure S1, with the black curve corresponding to Equation 2, the blue curve corresponding to Equation 4, the red curve corresponding to Equation 6, and the green curve corresponding to the model selection procedure. It is clear from the ROC curves that the sex and sex-interaction terms are significant in the gene expression dataset. For example, at an FDR of 5% we note that 800 QTLs were detected in the set of 3,000 genes for model 1, while 968 (21% increase) and 1,159 (45% increase) QTLs were detected for Equations 4 and 6, respectively. These results demonstrate significantly increased power to detect QTLs when sex is taken into account. We further note that the stepwise selection procedure captures more information than Equation 4, indicating a significant interaction signature in this dataset and also demonstrating that this simple statistical procedure is capable of identifying significant interaction events even at conservative FDR thresholds. Finally, it is of particular note that Equation 4 performed better than Equation 6, the model that incorporated the interaction terms at all times. That is, despite there being a significant interaction signature, the signature was not large enough to justify including interaction terms for every expression trait and at every marker tested. This fact motivated the need to employ the forward regression procedure, and these results further motivate the need to explore sex effects by employing even more sophisticated QTL detection methods, such as that recently described by Yi et al. [38].