The Footprints of Ancestral Population Diversity

In apparent contrast to the strong selective constraint described for CD209, we observed an unusual excess of diversity, with 35 fixed differences separating the two basal branches of the gene tree (fig. 6). In addition, we estimated a TMRCA of 2.8±0.22 MYA, a time that places the most recent common ancestor of CD209 back in the Pliocene epoch, before the estimated time for the origins of the genus Homo ∼1.9 MYA (Wood 1996; Wood and Collard 1999). A number of studies have already reported loci that present unusually deep coalescent times (Harris and Hey 1999; Zhao et al. 2000; Webster et al. 2003; Garrigan et al. 2005a, 2005b), but our estimate for CD209 remains one of the deepest TMRCA values yet reported (Excoffier 2002). The probability of finding such a deep coalescence time in a random-mating population was estimated, through coalescent simulations (Laval and Excoffier 2004), to be very low (P=.018; see fig. 7). In addition to the unexpected antiquity of the CD209 locus, we observed a peculiar tree topology made of two highly divergent and frequency-unbalanced lineages, with cluster A comprising only 2 internal haplotypes and cluster B comprising the remaining 23 (fig. 6).

Figure 7. Coalescent-based simulations (2×10^4) of the expected TMRCA distribution of CD209.

Different hypotheses can account for such elongated and divergent haplotype patterns. Indeed, the high levels of nucleotide identity between CD209 and CD209L could have led to gene conversion between the two genes, an event that would explain the outlier position of cluster A in the context of CD209 phylogeny. We reasoned that, if gene conversion had occurred, the derived alleles distinguishing clusters A and B in CD209 would correspond to the allelic state observed at their homologous positions in CD209L. Of all positions, only four fit this criterion.
In addition, these positions were not physically clustered, which excludes a major gene-conversion event as the explanation for the divergent CD209 phylogeny.

Two other circumstances may be responsible for the topology and the time depth of the CD209 gene tree: long-standing balancing selection or ancient population structure, with Africa, in both cases, being the arena of such events, since cluster A is restricted to Africa. Several lines of evidence argue against the balancing-selection hypothesis. First, under this selective regime, Tajima's D would be expected to be significantly positive, which is not the case (table 2). Second, long-standing balancing selection in Africa would have produced a number of recombinant haplotypes between clusters A and B, which, again, is not the case, as illustrated by the high LD levels at CD209 (fig. 3). Third, a claim of balancing selection at this locus implies a functional difference between the two balanced alleles. Indeed, three nonsynonymous mutations, situated in the neck region, separate clusters A and B, and they could correspond to the alleles under selection. But, if the neck region is the target of selection, it is more likely that the balanced alleles would differ in the number of repeats rather than in point nucleotide variation within each track, as observed for CD209L and suggested by functional studies (Bernhard et al. 2004; Feinberg et al. 2005). Since no variation in the number of repeats was detected between the two clusters, we predict that there are no major functional differences between the two lineages. Taken together, these observations indicate that maintenance of ancient lineages by balancing selection does not seem to be responsible for the observed haplotype divergence. In this view, the patterns observed are best explained by an ancestral population structure on the African continent.
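The first argument against balancing selection rests on the sign of Tajima's D. As a rough illustration only, the following Python sketch computes D from a set of aligned haplotypes using Tajima's (1989) standard formula. The function name `tajimas_d` and the toy haplotypes are assumptions made for this example; they are not the CD209 data and not the software used in the study. The sketch also reports no significance level, which would require comparison with a null distribution.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Return Tajima's D for equal-length haplotype strings (no missing data)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    # S: number of segregating (polymorphic) sites.
    num_seg = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if num_seg == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * num_seg + e2 * num_seg * (num_seg - 1)
    return (pi - num_seg / a1) / sqrt(variance)


# Toy data (hypothetical): two lineages at equal frequency, as expected
# under long-standing balancing selection.
balanced = ["AAAA"] * 5 + ["TTTT"] * 5
# Toy data (hypothetical): every variant is a singleton.
singletons = ["TAAA", "ATAA", "AATA", "AAAT"] + ["AAAA"] * 6

d_balanced = tajimas_d(balanced)
d_singletons = tajimas_d(singletons)
print(f"balanced lineages: D = {d_balanced:.2f}")
print(f"singletons only:   D = {d_singletons:.2f}")
```

In the toy data, the two equally frequent lineages give a positive D and the singleton-only sample gives a negative D. This is why two frequency-unbalanced lineages, as seen at CD209, need not produce the positive D expected under balancing selection.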
Indeed, several studies have already proposed that African populations must have been more strongly subdivided and isolated than non-African ones (Harris and Hey 1999; Labuda et al. 2000; Excoffier 2002; Goldstein and Chikhi 2002; Harding and McVean 2004; Satta and Takahata 2004; Garrigan et al. 2005a). In particular, a recent study of the Xp21.1 locus presented convincing statistical evidence that supports the hypothesis that our species does not descend from a single, historically panmictic population (Garrigan et al. 2005a). The divergent haplotype pattern observed at the Xp21.1 locus prompted those authors to interpret their data under the isolation-and-admixture (IAA) model and/or a metapopulation model (Harding and McVean 2004; Wakeley 2004). Indeed, as observed for CD209, under an IAA model, the two basal branches are expected to be longer than those under a Wright-Fisher model, depending on the length of time subpopulations spent in isolation. The extent to which the IAA model fits the data depends on the number of mutations, referred to as “congruent sites,” occurring in the two basal branches of the genealogy. For Xp21.1, 10 congruent sites out of 24 polymorphisms were observed (i.e., ∼42% of the total number of sites). We applied the same approach to CD209 and obtained a very similar percentage of ∼45%, in good agreement with the IAA model. Our observations, together with a number of autosomal diversity studies, suggest that modern human diversity has retained genetic traces of admixture among archaic hominid populations. However, a number of questions remain unanswered, such as the time when these admixture events occurred (i.e., before or after the appearance of anatomically modern humans), the precise quantitative contribution of ancient genetic material to our modern gene pool, and the geographic provenance of these genetic vestiges.
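The proportion of congruent sites discussed above can be illustrated with a short Python sketch. It assumes a simplified working definition: a site is congruent if each basal cluster is monomorphic at that site and the two clusters carry different alleles, so that the mutation maps to one of the two basal branches. The function name `congruent_fraction` and the toy haplotypes are hypothetical and are not the CD209 or Xp21.1 data.

```python
def congruent_fraction(cluster_a, cluster_b):
    """Fraction of polymorphic sites that perfectly separate two clusters.

    Assumed definition: a site is congruent when each cluster is fixed
    for one allele and the two clusters differ at that site.
    """
    polymorphic = 0
    congruent = 0
    for site in range(len(cluster_a[0])):
        alleles_a = {hap[site] for hap in cluster_a}
        alleles_b = {hap[site] for hap in cluster_b}
        if len(alleles_a | alleles_b) > 1:
            polymorphic += 1
            # Both clusters fixed, and the site is polymorphic overall,
            # so the clusters must carry different alleles.
            if len(alleles_a) == 1 and len(alleles_b) == 1:
                congruent += 1
    if polymorphic == 0:
        raise ValueError("no polymorphic sites in the sample")
    return congruent / polymorphic


# Toy haplotypes (hypothetical): sites 1 and 2 separate the clusters,
# site 3 varies within cluster B, site 4 is invariant, and site 5
# varies within cluster A.
toy_cluster_a = ["TTAAA", "TTAAC"]
toy_cluster_b = ["AAAAA", "AAGAA", "AAAAA"]
toy_fraction = congruent_fraction(toy_cluster_a, toy_cluster_b)
print(f"toy congruent fraction: {toy_fraction:.2f}")

# The Xp21.1 figure quoted in the text: 10 congruent sites out of 24.
xp21_fraction = 10 / 24
print(f"Xp21.1 congruent fraction: {xp21_fraction:.2f}")
```

Two of the four polymorphic sites in the toy sample are congruent, giving a fraction of 0.50, and the quoted Xp21.1 counts give 10/24 ≈ 0.42.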