Subjects and Methods

Subjects

A total of 1140 DNA samples isolated from males from the Iberian Peninsula and the Balearic Islands were analyzed; below, we refer to these samples as “Iberian,” for brevity. All samples were collected with appropriate ethical approval and informed consent. Individuals were assigned to geographical locations on the basis of paternal grandfather's place of birth, and they were then grouped on the basis of traditional regions. Andalusia was divided into western and eastern (including Murcia) parts; Castilla y Leon was divided into northeast and northwest Castile; and a set of individuals from the Pyrenees, including some from north of the Spain-France border, were pooled as “Gascony.” Samples from Portugal were divided into two sets, those north and those south of the Mondego river.

Y Chromosome Haplotyping

Binary markers (Figure 1) on the nonrecombining region of the Y chromosome were typed in hierarchical multiplexes,43 via the SNaPshot minisequencing procedure (Applied Biosystems) and an ABI3100 Genetic Analyzer (Applied Biosystems). All samples were initially analyzed with multiplex I43 (containing the markers M9, M69, M89, M145, M170, M172, M201, and 12f2). Samples derived for M9 (haplogroup [hg] K) were analyzed with multiplex II43 (containing M17, M45, M173, M207, P25, and SRY10831). Two individuals derived for M45 but ancestral for M207 (hgP∗[xR]) were analyzed with the markers MEH2 and M3 and could thus be assigned to hgQ∗(xQ3). Samples derived for M173 but ancestral for SRY10831.2 and M17 (hgR1∗[xR1a]) were further analyzed with multiplex IV (which, to our knowledge, is previously unpublished), containing the markers M65, M153, M222, M269, and SRY-2627. Ten individuals carry reversions of the marker P25 through gene conversion,44 and the allelic state of this marker in these chromosomes is therefore ignored for the purposes of this study.
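The hierarchical typing logic (multiplex I for every sample, then follow-up assays chosen by which markers are derived, including the hgDE branch described next) can be illustrated with a minimal sketch. The function name `typing_plan`, the representation of a chromosome as a set of derived marker names, and the assay labels are assumptions made for illustration only; this is not the laboratory workflow software.

```python
# Illustrative sketch only: names and data layout are hypothetical.
# A chromosome is represented by the set of markers it carries in the derived state.

def typing_plan(derived):
    """Return the ordered list of assays a sample would pass through."""
    plan = ["multiplex I"]  # every sample is typed with multiplex I first

    if "M9" in derived:  # hgK: follow up with multiplex II
        plan.append("multiplex II")
        # Derived for M45 but ancestral for M207: hgP*(xR), typed for MEH2 and M3
        if "M45" in derived and "M207" not in derived:
            plan.append("MEH2 + M3")
        # Derived for M173 but ancestral for SRY10831.2 and M17: hgR1*(xR1a)
        # (the text names this marker both SRY10831 and SRY10831.2)
        if "M173" in derived and not ({"SRY10831.2", "M17"} & derived):
            plan.append("multiplex IV")

    if "M145" in derived:  # hgDE: follow up with multiplex III
        plan.append("multiplex III")

    return plan


if __name__ == "__main__":
    print(typing_plan({"M9", "M45", "M207", "M173", "M269"}))
    print(typing_plan({"M145", "M35", "M81"}))
```

Each follow-up multiplex is triggered only by markers that have already been typed at an earlier level, which is what keeps the scheme hierarchical.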
Samples derived for M145 within multiplex I (in hgDE) were further analyzed with multiplex III43 (containing M33, M35, M75, M78, M81, M96, M123, and P2), and the marker M2 was also analyzed as appropriate. Previously unreported primers were designed on the basis of published information about polymorphic sites.45 Note that hgR2 (R-M124) is reported in Figure 1, because it was detected in the Sephardic Jewish sample (Table S1, available online), but was not typed in the Iberian samples, because all chromosomes derived for M207 (hgR) were also derived for M173 (hgR1). Nomenclature of haplogroups is in accordance with the Y Chromosome Consortium,45 uses updated names,40 and is given in Figure 1. We employ shorthand names as follows: E3b∗ refers to E3b∗(xE3b1, E3b2, E3b3), also known as E-M35∗(xM78, M81, M123); and R1b3∗ refers to R1b3∗(xR1b3b, R1b3d, R1b3f, R1b3g), also known as R-M269∗(xM65, M153, SRY-2627, M222). Nineteen Y-chromosomal STRs (DYS19, DYS385a/b, DYS388, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS434, DYS435, DYS436, DYS437, DYS438, DYS439, DYS460, DYS461, DYS462) were typed in three multiplexes, as described previously,46 on an ABI3100 Genetic Analyzer (Applied Biosystems). Allele nomenclature is as reported by Bosch et al.,46 and DYS385a and DYS385b were omitted from statistical analyses.

Comparative Data

Comparative data for North African populations were obtained from the literature.34,47 For Moroccan and Saharawi data,34 haplogroup resolution was increased, to facilitate comparison with the haplogroups determined here, by considering previously published data48 on the marker 12f2 and by typing the hgG-defining marker M201 on chromosomes belonging to hgF∗(xH,I,J,K).
Haplogroup G was also undetermined in the published Algerian and Tunisian data,47 so this haplogroup was predicted from Y-STR haplotypes via a published method.49 We used the Bayesian and support vector machine (SVM) approaches, with our Iberian sample as the training set, and we based the predictions on the 14 Y-STRs (the list above, omitting DYS385a/b, and DYS460-462) that are shared between our data and the published data.47 A single Algerian chromosome among the ten hgF∗(xH,I,J,K) cases was predicted with high confidence to belong to hgG (100% [Bayesian] and 96% [SVM]); this low level of the haplogroup in North Africa is consistent with the Moroccan and Saharawi samples and with an independently published set of Algerian data.50 In comparisons among Iberian samples typed here, 17 Y-STRs were considered; when comparisons were done with published data on North African samples,34,47 the number of Y-STRs was reduced to eight for compatibility with the published data, after adjustment of allele nomenclature for DYS389I.47 Comparative data for Sephardic Jewish populations were extracted from a large collection of Y haplotypes assembled by D.M.B. and K.S. The term “Sephardic Jews” is used here in its narrow sense,51 referring to Jewish men deriving from originally Ladino-speaking communities that emanated directly from the Iberian Exile. Included males noted in their informed consents that they, their fathers, and their paternal grandfathers are Sephardic Jews from the specified community. A sample of 174 males was compiled (Table S1), made up of self-defined Sephardic Jewish males either from the Iberian Peninsula itself or from countries that received major migrations of Sephardic Jews after the expulsion of 1492–1496, as follows: Belmonte, Portugal (16); Bulgaria (49); Djerba (13); Greece (2); Spain (3); Turkey (91). 
Countries that received exiles from the Iberian Peninsula but that themselves had substantial preexisting Jewish communities (Italy and the North African countries) were not included. Haplogroups were equivalent to those typed in the Iberian Peninsula samples, except that sublineages of hgR1b3 were not defined. In haplogroup comparisons, therefore, all of these sublineages were combined into hgR1b3 (also known as R-M269) itself. Data on eight Y-STRs were available, allowing comparison with Iberian and North African data.

Data Analysis

In many cases, sample sizes for haplogroup-based analyses are larger than those used for the same populations for Y-STR-based analyses. For the Moroccan sample (total n = 147), this is because only a subset (n = 104) was typed for Y-STRs. For the remainder, small reductions in sample sizes are due to the removal of chromosomes carrying STR-allele duplications, or “partial” alleles, which cannot be readily accommodated in STR-based analyses. Summary statistics (Nei's estimator of gene diversity, population-pairwise FST [for haplogroups] and RST [for Y-STR haplotypes]) were calculated with Arlequin.52 Multidimensional scaling based on FST and RST matrices was carried out with PROXSCAL in SPSS 14.0. Relationships between Y-STR haplotypes within specific haplogroups were displayed via reduced-median networks53 constructed within Network 4.500 with the use of intrahaplogroup variance-based weighting as described previously.54 Chromosomes carrying Y-STR allele duplications, or partial alleles, were omitted before analysis.
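As an illustration of two of the steps above, the following minimal sketch filters out haplotypes carrying duplicated or partial STR alleles and then computes Nei's unbiased estimator of gene diversity, H = n/(n − 1) × (1 − Σ p_i²). The encoding is an assumption made for the example (a duplication stored as a tuple of alleles, a partial allele as a non-integer such as 13.2), and the function names are hypothetical; the analyses themselves were run in Arlequin, not with this code.

```python
from collections import Counter

# Illustrative sketch only: the allele encoding and function names are hypothetical.

def is_standard(haplotype):
    """True if every locus carries a single whole-number repeat count."""
    return all(isinstance(allele, int) for allele in haplotype)


def gene_diversity(items):
    """Nei's unbiased gene diversity for a list of haplogroups or haplotypes."""
    n = len(items)
    if n < 2:
        raise ValueError("at least two chromosomes are required")
    counts = Counter(items)
    sum_sq = sum((c / n) ** 2 for c in counts.values())  # sum of squared frequencies
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    haplotypes = [
        (14, 13, 24),
        (14, 13, 24),
        (15, 13, 23),
        (14, (13, 14), 24),  # duplication at the second locus: removed
        (14, 13.2, 24),      # partial allele at the second locus: removed
    ]
    kept = [h for h in haplotypes if is_standard(h)]
    print(len(kept), gene_diversity(kept))
```

The same `gene_diversity` function applies to haplogroup labels, since it only depends on the frequency of each distinct class.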
Admixture proportions were estimated with mY statistics implemented in Admix 2.0.55 This coalescent-based estimator takes into account allele frequencies, as well as molecular information.56 All potential parental populations are expected to be sampled and constant in size, and the effects of genetic drift or gene flow since the admixture event are considered negligible.11 We have used three parental populations: Basques (n = 115), Moroccans34 (n = 104), and Sephardic Jews (n = 174). Molecular distances between haplotypes were calculated with binary markers and microsatellites, weighted respectively at 100 and 1 to reflect their differences in mutation rates. Standard errors were calculated on the basis of 10,000 bootstraps. Average square difference (ASD) was calculated as described previously.57
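The 100:1 weighting of binary markers relative to microsatellites can be illustrated with a minimal sketch. Two points are assumptions rather than statements about Admix 2.0: binary markers are scored as the number of mismatching states, and microsatellites as the sum of squared differences in repeat number. The function name `molecular_distance` and the haplotype layout are likewise hypothetical.

```python
# Illustrative sketch only: the per-marker metrics are assumptions, not the
# documented behaviour of Admix 2.0.

def molecular_distance(hap_a, hap_b, w_binary=100, w_str=1):
    """Weighted distance between two haplotypes.

    Each haplotype is a pair (binary_states, str_alleles), where binary_states
    is a tuple of 0/1 marker states and str_alleles a tuple of repeat counts.
    """
    binary_a, str_a = hap_a
    binary_b, str_b = hap_b
    if len(binary_a) != len(binary_b) or len(str_a) != len(str_b):
        raise ValueError("haplotypes must be typed for the same markers")

    # Assumed metric for binary markers: count of mismatching states
    binary_part = sum(x != y for x, y in zip(binary_a, binary_b))
    # Assumed metric for microsatellites: sum of squared repeat differences
    str_part = sum((x - y) ** 2 for x, y in zip(str_a, str_b))

    return w_binary * binary_part + w_str * str_part


if __name__ == "__main__":
    a = ((0, 1, 1), (14, 13, 24))
    b = ((0, 1, 0), (15, 13, 22))
    print(molecular_distance(a, b))
```

With these weights, a single binary-marker difference outweighs several repeat-number changes, reflecting the much lower mutation rate of binary markers.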