Empirical Distribution of FST Based on CNV Study populations We obtained the high-resolution CNV data by Conrad et al. [4] from the Database of Genomic Variants (http://projects.tcag.ca/variation/). The dataset consists of 90 Han Chinese (CHB) and Japanese (JPT), 180 Caucasians of European descent (CEU), and 180 Yoruba (YRI) samples. Among the 450 individuals, we removed 19 individuals with more than 400 missing CNV genotypes. Each CNV was analyzed as biallelic dominant data, classified as neutral versus non-neutral (losses or gains). Empirical distribution of FST To assess population differentiation with an allele frequency spectrum of CNVs, the locus-specific estimates by ANOVA method and by hierarchical Bayesian method were obtained from all autosomes. The estimated values by the ANOVA method (moments estimates) can be negative when levels of differentiation are close to 0 and/or sample sizes are small, indicating no population differentiation at these loci [18]. In the hierarchical Bayesian method, the locus-specific estimate values are all positive, since they are generated by posterior distributions, which is an important benefit of a Bayesian method. With the CNV data of this study, the summary statistics of FST was 0.0853 ± 0.1195 (median, 0.0360; range, -0.0071 to 0.8994) for the ANOVA method and 0.2459 ± 0.0115 (median, 0.2462; range, 0.2193 to 0.6166) for the Bayesian method. The empirical distributions of FST are found in Fig. 1. As can be seen from the Figure, low FST values are prevalent, and a large variability persisted in the ANOVA method, while FST distribution was sharply focused in the Bayesian method. The comparison of empirical distributions of the ANOVA and Bayesian methods demonstrates that Bayesian estimates perform better than ANOVA in identifying CNVs affected by natural selection. Bayesian outlier detections We searched the genome regions that show signals of natural selection by the Bayesian likelihood method implemented via reversible jump MCMC in BayeScan software [27]. For each CNV, BayeScan directly estimates the probability that a CNV is under positive selection, from which the PO are computed [27]. The posterior odds indicate how much more likely the model with selection is compared to the neutral model, and posterior odds of 5.0 was chosen as a threshold in the detection of CNVs under positive selection. This directly allows us to control the FDR, the expected proportion of false positives among outlier CNVs [29]. In our analysis, we chose FDR to be 0.05. In the BayeScan application, first following 10 pilot runs of 5,000 iterations and an additional burn-in of 50,000 iterations, we used 50,000 iterations (sample size of 5,000 and a thinning interval of 10). Outliers detected by BayeScan BayeScan identified the three outlier CNVs with Conrad et al.'s [4] CNV data (Table 1, Fig. 2). The most decisive outlier is CNVR2664.1 (chromosome 5), in which most CEU (90%) and all CHB + JPT populations are outlying gain or loss status. The next decisive outlier is CNVR371.1 (chr. 1), in which only the YRI population has non-neutral CNVs.