PMC:3480682 / 7855-10577
Annotations
{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/3480682","sourcedb":"PMC","sourceid":"3480682","source_url":"https://www.ncbi.nlm.nih.gov/pmc/3480682","text":"Estimating FST by ANOVA methods\nThe estimators of F-statistics proposed by Weir [17] and Weir and Cockerham [18] are based on an analysis of variance (ANOVA) of allele frequencies, equivalently called the method-of-moments estimates. The weighted ANOVA estimates of FST, FIT, and FIS may be expressed in terms of the mean sum of squares for gametes (MSG), individuals (MSI), and populations (we sometimes say 'between subpopulations') (MSP), where the mean squares are estimated by an ANOVA model. In estimating FST specifically for our analysis of CNV data, we need to consider unbalanced samples (i.e., populations of unequal size). However, as the formulas are messy, we present here those for balanced samples. Formulas for unbalanced samples can be found in Rousset (in Appendix A) [20].\nThe definition of F-statistics used here is\n\nwhere Q values are probabilities of identity in state: Q1 among the genes (gametes) within individuals, Q2 among genes in different individuals within populations, and Q3 among the populations. The estimates are expressed in terms of observed frequencies of identical pairs of genes in the sample, with the following relationships:\n\nand\n\nwhere n is the sample size of each population. Then, the single locus estimator is given by\n(1) \nwhich is found in Weir (1997: 178) [17]. nc will be defined below. If one needs to obtain the multilocus estimator of , it is usual to compute the estimator as a sum of locus-specific numerators over a sum of locus-specific denominators (see Weir [17] and Weir and Cockerham [18]). This is the case that map information for SNPs is obtained for each gene, and a weighted-average FST from all SNPs is estimated for each gene [18]. 
For a set of I loci, the multilocus ANOVA estimators are\n(2) \nfor nc = (S1-S2/S1)/(n-1), where S1 is the total sample size and S2 is the sum of squared sample sizes of populations [21]. For convenience, we denote the estimator by FST.\nRousset [21] explained that the multilocus estimators of Weir [17] and Weir and Cockerham [18] differ slightly, and these two also differ slightly from that proposed by Rousset [21], which assigns more weight to larger samples. In this paper, the GENEPOP software (version 3.4) (http://wbiomed.curtin.edu.au/genepop/) of Rousset was used for the calculation of FST. In order to distinguish from those of the method-of-moments estimates of Weir [17] and Weir and Cockerham [18], we will call the estimates of GENEPOP ANOVA estimates.\nThe estimated values of FST can be negative when levels of differentiation are close to zero and/or sample sizes are small, indicating no population differentiation at these loci [18]. One can assign a value of zero to negative FST estimates.","divisions":[{"label":"Title","span":{"begin":0,"end":31}}],"tracks":[{"project":"2_test","denotations":[{"id":"23105934-21585727-44845815","span":{"begin":1892,"end":1894},"obj":"21585727"},{"id":"23105934-21585727-44845816","span":{"begin":1956,"end":1958},"obj":"21585727"},{"id":"23105934-21585727-44845817","span":{"begin":2125,"end":2127},"obj":"21585727"}],"attributes":[{"subj":"23105934-21585727-44845815","pred":"source","obj":"2_test"},{"subj":"23105934-21585727-44845816","pred":"source","obj":"2_test"},{"subj":"23105934-21585727-44845817","pred":"source","obj":"2_test"}]}],"config":{"attribute types":[{"pred":"source","value type":"selection","values":[{"id":"2_test","color":"#c6ec93","default":true}]}]}}
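The annotated text describes three computational steps in prose: the sample-size term nc = (S1-S2/S1)/(n-1), the multilocus estimator as a sum of locus-specific numerators over a sum of locus-specific denominators, and the assignment of zero to negative FST estimates. The following is a minimal Python sketch of those steps only, not the GENEPOP implementation. It assumes that n in the nc formula is the number of populations (the usual reading of this formula, although the text does not say so explicitly). Because the bodies of equations (1) and (2) are missing from the extracted text, the locus-specific numerators and denominators are taken as already-computed inputs, and all function names are hypothetical.

```python
from typing import Sequence


def effective_sample_size(sizes: Sequence[int]) -> float:
    """nc = (S1 - S2/S1) / (n - 1).

    S1 is the total sample size and S2 the sum of squared population
    sample sizes. ASSUMPTION: n is taken to be the number of populations.
    """
    n_pops = len(sizes)
    if n_pops < 2:
        raise ValueError("at least two populations are required")
    s1 = sum(sizes)
    s2 = sum(size * size for size in sizes)
    return (s1 - s2 / s1) / (n_pops - 1)


def multilocus_fst(numerators: Sequence[float],
                   denominators: Sequence[float]) -> float:
    """Ratio of summed locus-specific numerators to summed denominators.

    The per-locus terms are hypothetical inputs; their formulas are not
    recoverable from the extracted text.
    """
    if len(numerators) != len(denominators):
        raise ValueError("one numerator and one denominator per locus")
    total_denominator = sum(denominators)
    if total_denominator == 0:
        raise ValueError("sum of denominators is zero")
    return sum(numerators) / total_denominator


def truncate_negative(fst: float) -> float:
    """Assign zero to a negative FST estimate, as the text permits."""
    return max(0.0, fst)
```

For balanced samples the nc term reduces to the common per-population sample size, for example effective_sample_size([10, 10, 10]) gives 10.0, which matches the balanced case the text chooses to present.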