Bayesian method
Balding and Nichols have proposed the use of population- and locus-specific estimates in the context of a migration-drift equilibrium model [24]. They modeled allele frequencies of biallelic markers using a beta distribution with expectation p and variance p(1-p)/(1+θ) so that FST = 1/(1+θ). Then, they used a likelihood-based approach to estimate population- and locus-specific FST.
This formulation was extended later to consider multiallelic loci [19]. Falush et al. [25] considered the nonequilibrium fission model that subpopulations evolve in isolation after splitting from an ancestral population. We will not distinguish the migration-drift equilibrium model and nonequilibrium fission model, since both lead to the same multinomial-Dirichlet distribution (or beta-binomial for biallelic loci) for the subpopulation allele frequencies [26]. The main difference between the two models resides in the interpretation given to FST: in the case of the migration-drift equilibrium model, FST measures how divergent each subpopulation is from the total population, while in the case of the fission model, it measures the degree of genetic differentiation between each descendant population and the ancestral population.
We consider a collection of J subpopulations and a set of I loci. Let Ki> be the number of alleles at the i th locus. The extent of differentiation between subpopulation j and the ancestral population at locus i is measured by . Let pi = { pik } denote the allele frequencies of the ancestral population at locus i, where pik is the frequency of the allele k at locus i( = 1). We use p = { pi } to denote the entire set of allele frequencies of the ancestral population and  to denote the current allele frequencies at locus i for subpopulation j. Under the model and the definitions above, the allele frequencies at locus i in subpopulation j follow a Dirichlet distribution with parameters θjpi, a Bayesian prior distribution,
(3)    
where θij = 1/-1. The extent of differentiation at locus i between subpopulation j and the ancestral population is measured by  and is the result of its demographic history. The full prior distribution across loci and populations is given by
(4)    
We need a hierarchical model for locus- and population-specific effects to identify candidate loci under natural selection. It is no doubt that population- and locus-specific estimates of  by the moments (or ANOVA) methods are likely to be inaccurate, especially for loci with a small number of different alleles. As an alternative, Beaumont and Balding [22] proposed a familiar logistic regression model to decompose population- and locus-specific  coefficients (bounded between 0 and 1) into a population-specific component, βj, shared by all loci and a locus-specific component, αi, shared by all populations [22]:
(5)    
The locus-specific effects express mutation and some forms of selection, and population-specific effects express migration rates, population sizes, and population-specific mating patterns [22]. The advantage of this formulation is that instead of estimating locus- and population-specific I·J  coefficients (as in the method of moments or ANOVA), we only have to estimate I the parameters αi and the J parameters βj. With the estimates of αi and βj,  (and equivalently θij = exp(-(αi + βj)) can be estimated.
Above all, the estimate of αi is our objective to detect outlier loci as selection candidates. Departure from neutrality at a given locus is assumed when αi is significantly different from 0 at that locus. A positive value of αi suggests diversifying selection, whereas negative values suggest balancing selection. The posterior probability that a locus is subject to selection, P(αi ≠ 0), is estimated directly from the Markov Chain Monte Carlo method (MCMC) (see below).