Materials and Methods

Plant material

The F6 population, composed of 94 individuals, was obtained by single seed descent from the F2 population described in Moretzsohn et al. (2009). The progenies were derived from a cross between A. ipaënsis (accession GKBSPSc 30076, hereafter abbreviated as K 30076) and the closely related A. magna (accession GKSSc 30097, hereafter abbreviated as K 30097), used as the female and male parents, respectively. Seeds were obtained from the Brazilian Arachis germplasm collection, maintained at Embrapa Genetic Resources and Biotechnology (Brasília-DF, Brazil).

Phenotyping

Rust phenotyping: The recombinant inbred line (RIL) population and the parents were phenotyped for resistance to P. arachidis. Arachis hypogaea cv. Runner IAC 886 was included as a susceptible control. The population was evaluated in the F6 and F7 generations. Phenotyping was performed using the detached leaf technique (Moraes and Salgado 1982; Leal-Bertioli et al. 2009). Field assays would not be suitable because of the architecture of the wild-derived diploid plants. Rust spores were collected from infested peanut plants in Pindorama, São Paulo State, Brazil (coordinates 21.1858° S, 48.9072° W). Two bioassays were done, one in 2012 and the other in 2013. In the 2012 bioassay, leaves were inoculated with a suspension of ca. 4 × 10⁵ urediniospores/mL in 0.05% Tween 20 and maintained at 26–28° under a photoperiod of 10-hr light and 14-hr dark. In the 2013 bioassay, ca. 2 × 10⁵ urediniospores/mL were used. Four replicates of each individual were analyzed 25 d after inoculation. Susceptibility was measured using the following parameters: total number of lesions/leaf area (cm²) (TL/LA), number of sporulated lesions/leaf area (cm²) (SL/LA), incubation period (time to appearance of the first lesion, in days after inoculation) (IncPer), and susceptibility index (SI). SI was calculated with the scale of Savary et al.
(1989), with the following modification: the index was the number of lesions multiplied by a score reflecting lesion size/reaction, summed over lesion classes and divided by leaf area. SI = Σ(s × n)/LA, where s = lesion size score (1 = necrotic aborted lesion; 2−6 = ruptured, sporulating pustules, varying between 0.5 and 3 mm in diameter), n = number of lesions of a particular size, and LA = leaf area (mm²). Sporulation was evaluated with the aid of a stereomicroscope. LA was calculated with the software Quant (Vale et al. 2001). For genotypes that showed no symptoms, and therefore had no incubation period, this trait was artificially tabulated as 200 for the QTL analyses.

Other agronomic/domestication traits: Plants were grown in long trays (1 m × 30 cm × 30 cm), with enough space for lateral branch trailing and seed set. Branches were regularly trailed back to the pots to ensure that pegs would penetrate the soil. Between 40 and 60 d after planting, the height of the main stem of up to 10 plants of each RIL was measured (main stem height; MSH). At harvest (approximately 120 d after planting), peg length (PL) and pod constriction (PC) were measured with six replications. Harvested seeds were counted (seed number; SN) and dried at 20° and 15% relative humidity (RH) for 15 d, and 10 randomly selected seeds were weighed (10-SW). Evaluations were performed in three years (2011, 2012, and 2013), except for PL and PC, which were evaluated only in 2012.

DNA extraction

Total genomic DNA was extracted from young leaflets essentially as described by Grattapaglia and Sederoff (1994). The quality and quantity of the DNA were evaluated by 1% agarose gel electrophoresis and with a NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific).

Genetic mapping and QTL analyses

A linkage map for this population has been published in Shirasawa et al. (2013). This map contained 773 microsatellite and 25 miniature inverted-repeat transposable element loci.
We used these 798 markers plus 26 newly genotyped microsatellite markers to construct an updated linkage map with Mapmaker Macintosh 2.0 and Mapmaker/EXP 3.0 (Lander et al. 1987; Lincoln et al. 1992). A χ² test was performed on all scored markers to test the null hypothesis of 1:1 segregation. A minimum logarithm of the odds (LOD) score of 6.0 and a maximum recombination fraction (θ) of 0.35 were set as thresholds for determining linkage groups (LGs) with the “group” command. The most likely marker order within each LG was estimated by the matrix correlation method using the “first order” command. Marker orders were confirmed by comparing the log-likelihoods of the possible orders, permuting all adjacent triple orders (“ripple” command). After the group orders were established, the LOD score was set to 3.0 to include additional markers in the groups. The “try” command was then used to determine the exact position of the new markers within each group. The new marker orders were again confirmed with the “ripple” command. Recombination fractions were converted into map distances in centimorgans (cM) using the Kosambi mapping function and the “error detection” command available in Mapmaker/EXP 3.0 (Lander et al. 1987; Lincoln et al. 1992). Based on this map, genomic regions with no recombination or with identical markers (pairs or groups of loci at 0 cM distance) were identified, and all loci but one were removed from each of these regions.

This newly developed framework map was used for QTL analysis. Phenotypic data included the components of resistance to P. arachidis and agronomic traits (Supporting Information, File S1). Traits phenotyped in different trials or bioassays were analyzed separately. The normality of the data distribution was evaluated by skewness and kurtosis values using WinQTL Cartographer, version 2.5 (Wang et al. 2006). QTL were mapped by the composite interval mapping method proposed by Zeng (1993, 1994), also in WinQTL Cartographer.
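As an illustration of two calculations named in the mapping procedure, the sketch below computes a χ² test of 1:1 segregation for one marker and converts a recombination fraction to Kosambi centimorgans. It is a minimal standalone example, not the Mapmaker implementation; the function names and the genotype counts in the usage note are hypothetical.

```python
import math


def chi_square_1to1(count_a, count_b):
    """Chi-square statistic and P value (1 d.f.) for a 1:1 segregation test.

    count_a and count_b are the numbers of lines homozygous for each
    parental allele at a single marker (hypothetical inputs).
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("at least one genotype call is required")
    expected = total / 2
    stat = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def kosambi_cm(theta):
    """Kosambi map distance (cM) for a recombination fraction 0 <= theta < 0.5."""
    if not 0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    # Kosambi: d (Morgans) = 1/4 * ln((1 + 2*theta) / (1 - 2*theta)).
    return 25.0 * math.log((1 + 2 * theta) / (1 - 2 * theta))
```

For example, `chi_square_1to1(60, 34)` tests a hypothetical marker scored in 94 lines, and `kosambi_cm(0.35)` shows that the maximum recombination fraction used for grouping corresponds to roughly 43.4 cM under the Kosambi function.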
WinQTL Cartographer assumes that the quantitative data under analysis are normally distributed. Some of the data sets did not fit this assumption and were log(x) transformed. We performed composite interval mapping with the standard model (Model 6), scanning the genetic map and estimating the likelihood of a QTL and its corresponding effects at every 1 cM, while using eight significant marker cofactors to adjust for the phenotypic effects associated with other positions in the genetic map. A window size of 10 cM was used, and therefore cofactors within 10 cM on either side of the QTL test site were not included in the QTL model. Thresholds were determined for each trait by permutation tests (Churchill and Doerge 1994; Doerge and Churchill 1996), using 1000 permutations and a significance level of 0.05. Graphic representations of the LGs and the significant QTL were drawn with MapChart, version 2.1 (Voorrips 2002).

KASP marker development and validation in tetraploid genetics

Rationale: The aim of the methods in this section was to develop reliable and easy-to-use DNA markers for the genomic region in A. magna K 30097 that confers rust resistance. Although A. magna K 30097 was of primary interest for this study, A. batizocoi K9484 is also being used in our research and for introgression in breeding programs (Leal-Bertioli et al. 2014b). Therefore, we aimed to develop markers that would function for both of these species (Leal-Bertioli et al. 2014a,b). Because introgression will be into allotetraploid cultivated peanut, the markers must function in this genetic context; for SNP discovery, however, we used a strategy of SNP calling in the diploid context. This strategy relies on the very close relationship between A. ipaënsis and the B genome of A. hypogaea (Moretzsohn et al. 2013). Because of this close relationship, a polymorphism identified between A. magna and A. ipaënsis is very likely to be conserved between A. magna and the B genome of A. hypogaea.
After marker design, this conservation was confirmed by marker assays.

Production and assembly of transcript sequences: Total RNA from A. magna K 30097 and A. batizocoi K9484 was extracted from the first expanded leaf of the main axis using the QIAGEN Plant RNeasy kit (QIAGEN) with on-column DNase treatment. cDNA libraries were constructed using equal amounts of RNA from five individuals of each genotype with the TruSeq v2 library construction kit (Illumina), as described in Leal-Bertioli et al. (2015). To obtain long reads to improve transcriptome assemblies, size-selected libraries were sequenced using MiSeq v3.0. Adapter and quality trimming was performed using Trim Galore! v0.3.5 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Adapters were trimmed with Cutadapt (http://code.google.com/p/cutadapt/). FastQC (http://galaxy.csdb.cn:8000/tool_runner?tool_id=fastqc) was used to display quality information for the cleaned reads. Transcripts were assembled using Trinity (Haas et al. 2013). Assembled transcripts were filtered to include only the longest isoform from each read cluster. The longest isoforms were then aligned to each other using NCBI blastn v2.2.29. Alignments with 100% sequence identity covering ≥90% of the sequence length were considered redundant and removed from the final assembly.

SNP discovery: The A. ipaënsis reference genome sequence (www.Peanutbase.org) was used as a proxy for the B genome of A. hypogaea to discover SNPs between the rust-resistant accessions and the susceptible peanut. This was done by aligning A. magna and A. batizocoi transcripts (resistant) with the reference genome of A. ipaënsis (GenBank assembly accession GCA_000816755.1) using the NGSEP pipeline (Duitama et al.
2014) tagging the region where the main QTL for rust resistance was identified (pseudomolecule Araip.B08, peanutbase.org), in the vicinity of the microsatellite marker Ah-280 (between 117,048,352 and 129,519,037 bp), and the region around another QTL-linked marker on Araip.B08, AHGS1350 (between 346,729 and 848,328 bp) (Table 1). Default parameters were used, except for the minimum and maximum fragment lengths for valid paired-end alignments, which we estimated separately for each genotype by aligning its first 250,000 fragments and then plotting the distribution of estimated insert lengths (script available at the NGSEP Web site, http://sourceforge.net/projects/ngsep/files/Library/scripts/). We used the parameters recommended by NGSEP for the analysis of WGS data: (1) minimum genotype quality, 40; (2) minimum value allowed for a base quality score, 30; and (3) maximum number of alignments allowed to start at the same reference site, 2. We also used NGSEP for filtering (the most relevant filter was a maximum minor allele frequency of 0.01) and for conversion from VCF to other formats for primer design and for visualization of SNPs with Flapjack software (Milne et al. 2010).

Table 1 Quantitative trait loci identified for resistance to Puccinia arachidis and agronomic traits in an A. ipaënsis × A.
magna F6 population

| Trait category | Trait symbol | LG (a) | Position (b) | Nearest marker(s) | LOD (c) | Additive effect (d) | R2 (%) (e) |
|---|---|---|---|---|---|---|---|
| Rust resistance | SI_2012 | 4 | 17.0 | TC7G10 | 2.7 | 0.18 | 7.8 |
| | | 7 | 38.7 | AHS0598 | 3.0 | 0.16 | 8.2 |
| | | 8 | 25.4 | AHGS1350 / AHS2541 | 3.3 | 0.20 | 13.2 |
| | | 8 | 35.9 | Ah-280 | 6.9 | 0.25 | 21.2 |
| | SI_2013 | 8 | 35.9 | Ah-280 | 3.2 | 0.31 | 5.8 |
| | TL/LA_2012 | 8 | 25.4 | AHGS1350 / AHS2541 | 4.1 | 0.13 | 16.0 |
| | | 8 | 35.1 | Ah-280 | 2.9 | 0.09 | 8.9 |
| | TL/LA_2013 | 8 | 35.9 | Ah-280 | 3.8 | 0.20 | 12.3 |
| | SL/LA_2012 | 8 | 35.9 | Ah-280 | 3.8 | 0.07 | 11.1 |
| | SL/LA_2013 | 8 | 35.9 | Ah-280 | 3.5 | 0.14 | 11.0 |
| | Log_IncPer_2012 | 8 | 42.9 | Ah-280 / Ah-558 | 8.2 | −0.46 | 59.3 |
| | Log_IncPer_2013 | 8 | 33.1 | AHS2541 / Ah-280 | 7.6 | −0.33 | 33.9 |
| | | 8 | 38.9 | Ah-280 | 7.6 | −0.33 | 34.8 |
| Productivity | Log_SN | 3 | 82.3 | ML2A05 | 3.4 | −0.11 | 8.6 |
| | | 4 | 28.0 | AHGS2785 | 2.6 | −0.10 | 6.2 |
| | | 10 | 35.5 | AHS1488 | 3.1 | −0.15 | 10.3 |
| | 10-SW | 4 | 68.8 | AHGS1279 / AHS2728 | 3.9 | −0.32 | 18.4 |
| | | 5 | 44.5 | AHGS2602 | 2.9 | 0.20 | 8.3 |
| Seed Characteristics | Peg_Length | 1 | 40.8 | AHGS2019 / Seq12B2 | 3.4 | −2.07 | 7.2 |
| | | 4 | 64.9 | AHGS2155 | 11.2 | 4.18 | 25.8 |
| | | 4 | 70.8 | AHGS1279 / AHS2728 | 8.7 | 4.38 | 30.2 |
| | | 5 | 10.4 | AHS2897 | 3.4 | 2.06 | 6.9 |
| | | 9 | 43.7 | AHGS2018 / AHGS2235 | 3.0 | 1.91 | 6.3 |
| | Pod_constriction | 1 | 37.1 | AHGS2332 | 4.1 | −0.67 | 9.4 |
| | | 4 | 64.3 | AHGS1917 | 3.4 | 0.59 | 7.4 |
| | | 5 | 47.0 | AHGS2513 | 3.5 | −0.80 | 8.0 |
| | | 6 | 25.4 | AHGS2106 | 3.7 | −0.66 | 9.7 |
| | | 8 | 0.0 | AHGS1383 | 3.5 | −0.60 | 8.1 |
| | | 9 | 36.3 | AHGS1478_b3 / AHGS2537_b2 | 3.3 | 0.56 | 7.3 |
| Plant Architecture | MSH_2009 | 2 | 37.8 | RN31F06 | 4.8 | −5.44 | 14.5 |
| | | 4 | 64.3 | AHGS1917 / AHGS2155 | 6.5 | −5.79 | 17.6 |
| | | 5 | 48.7 | AHGS1228 | 3.2 | 3.61 | 8.1 |
| | Log_MSH2011 | 4 | 64.9 | AHGS2155 | 4.1 | −0.10 | 10.7 |
| | | 5 | 41.1 | AHGS1980 | 4.0 | 1.64 | 10.0 |
| | | 6 | 13.0 | AHS2153 | 4.5 | −0.11 | 11.4 |
| | MSH_2012 | 3 | 23.2 | TC1E06 | 4.0 | −1.97 | 8.1 |
| | | 4 | 64.3 | AHGS1917 / AHGS2155 | 12.3 | −3.56 | 30.4 |
| | | 5 | 48.7 | AHGS1228 | 3.6 | 2.13 | 7.4 |

(a) Linkage group.
(b) Expressed in Kosambi cM.
(c) LOD score, logarithm of the odds.
(d) Positive values indicate that higher-value alleles come from A. ipaënsis K 30076, and negative values indicate that higher-value alleles come from A. magna K 30097.
(e) Proportion of the phenotypic variance explained by the quantitative trait loci.
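The SNP discovery step restricts calls to two intervals on Araip.B08 and applies a minimum genotype quality of 40. The sketch below illustrates that selection logic on simple in-memory records. It is not NGSEP code: the `SnpCall` record, the function names, and the treatment of interval ends as inclusive are assumptions made for illustration, and only the coordinates and the quality threshold come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpCall:
    """Hypothetical, simplified SNP record (NGSEP itself writes VCF files)."""
    chrom: str
    pos: int                # 1-based position on the pseudomolecule (bp)
    genotype_quality: int   # phred-scaled genotype quality


# Intervals on Araip.B08 quoted in the text, keyed by the linked marker.
TARGET_REGIONS = {
    "Ah-280": ("Araip.B08", 117_048_352, 129_519_037),
    "AHGS1350": ("Araip.B08", 346_729, 848_328),
}
MIN_GENOTYPE_QUALITY = 40  # threshold (1) in the text


def target_region(call):
    """Return the marker name of the interval containing the call, or None."""
    for marker, (chrom, start, end) in TARGET_REGIONS.items():
        # Interval ends are treated as inclusive (an assumption).
        if call.chrom == chrom and start <= call.pos <= end:
            return marker
    return None


def select_calls(calls):
    """Keep calls inside a target interval that meet the quality threshold."""
    selected = []
    for call in calls:
        marker = target_region(call)
        if marker is not None and call.genotype_quality >= MIN_GENOTYPE_QUALITY:
            selected.append((marker, call))
    return selected
```

Calling `select_calls` on a list of `SnpCall` objects returns `(marker, call)` pairs, so candidate SNPs can be grouped by the QTL-linked marker they lie near before primer design.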

Primer design and test: Allele-specific forward primers and a common reverse primer were designed for use in KASP assays (LGC Genomics Ltd., Hoddesdon, UK; http://www.lgcgenomics.com/kasp-genotyping-reagents) using BatchPrimer3 (http://probes.pw.usda.gov/batchprimer3/) with the “Allele specific primers and allele flanking primers” option. The parameters used were 60−120 bp in size, Tm between 58 and 60°, and GC content between 30 and 80%. The alternative alleles were marked with 6-FAM and the reference alleles with VIC. For each SNP, two allele-specific forward primers and one common reverse primer were designed. A schematic diagram of SNP discovery and primer design is shown in Figure 1, A and B. Primer information is listed in File S1.

Figure 1 A schematic diagram of single-nucleotide polymorphism (SNP) discovery and Kompetitive allele-specific polymerase chain reaction (KASP) primer design. A. ipaënsis K 30076 is used as a proxy for the B genome of A. hypogaea. (A) Alignment of A. ipaënsis K 30076, A. magna K 30097, and A. batizocoi K9484 paired-end cDNA reads onto the A. ipaënsis K 30076 genomic sequence, pseudomolecule Araip.B08 (where the rust resistance QTL reside), and the identification of SNPs. (B) Example of the design of allele-specific and site-specific (common) primers for the SNPs identified.

KASP assays were performed with the following genotypes: the diploids A. ipaënsis K 30076, A. batizocoi K9484, and A. magna K 30097; the induced allotetraploids (A. magna K 30097 × A. stenosperma V15076)4x (here called MagSten) and (A. batizocoi K9484 × A. stenosperma V10309)4x (here called BatSten1); and six A. hypogaea cultivars (Tifrunner, Tifguard, GA-06G, NC3033, ICVG 88145, and SPTG_06). Reactions consisted of 2 μL of KASP 2X reaction mix, 0.055 μL of assay primer mix (12 mM of each allele-specific primer and 30 mM of common primer), and 20 ng of genomic DNA, in a 4-µL volume.
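Because the assay primer mix volume per reaction is very small, the reaction components are easier to pipette as a master mix. The sketch below scales the per-reaction volumes given above to a batch of reactions. The 10% pipetting overage, the function names, and the idea that the remaining volume is filled by DNA template plus water are assumptions for illustration, not details from the protocol.

```python
# Per-reaction volumes taken from the text (microliters).
KASP_2X_MIX_UL = 2.0
ASSAY_PRIMER_MIX_UL = 0.055
REACTION_VOLUME_UL = 4.0


def master_mix(n_reactions, overage=0.10):
    """Volumes (uL) of shared reagents for n reactions plus a pipetting overage.

    The overage fraction is a hypothetical choice; adjust to lab practice.
    """
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    scale = n_reactions * (1 + overage)
    return {
        "kasp_2x_mix_ul": round(KASP_2X_MIX_UL * scale, 3),
        "assay_primer_mix_ul": round(ASSAY_PRIMER_MIX_UL * scale, 3),
    }


def template_volume_ul():
    """Volume (uL) left in each reaction for the 20 ng of DNA plus water.

    Assumes the rest of the 4-uL reaction holds only template and water.
    """
    return REACTION_VOLUME_UL - KASP_2X_MIX_UL - ASSAY_PRIMER_MIX_UL
```

For a 96-well plate, `master_mix(96)` gives the volumes of KASP 2X mix and assay primer mix to combine, and `template_volume_ul()` gives the volume available per well for diluted DNA.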
A C1000 Thermal Cycler (Bio-Rad) was used with the following cycling conditions: 94° for 15 min; nine touchdown cycles of 94° for 20 sec and 65° for 60 sec (decreasing 0.8° per cycle); and 29 cycles of 94° for 20 sec and 57° for 60 sec (http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/PDFs/KASP_SNP_Genotyping_Manual.pdf). To improve the results, a second KASP program was run as follows: nine cycles of 94° for 20 sec and 57° for 60 sec. Fluorescence was read on a LightCycler 480 Instrument II (Roche Life Science) and analyzed using the LightCycler 480 Software (v.1.5.1).
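The touchdown phase of the cycling program can be made explicit by listing the annealing temperature of each cycle. The sketch below does this and builds the full first program as a list of (temperature, seconds) steps. It assumes that the first touchdown cycle anneals at 65° and that the 0.8° decrement is applied before each subsequent cycle; the function names are hypothetical, and ramp times are ignored.

```python
def touchdown_temperatures(start_c=65.0, step_c=0.8, cycles=9):
    """Annealing temperature (degrees C) for each touchdown cycle."""
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


def kasp_program():
    """First KASP program from the text as (temperature_c, seconds) steps."""
    steps = [(94.0, 15 * 60)]               # initial 94 degrees for 15 min
    for anneal_c in touchdown_temperatures():
        steps.append((94.0, 20))            # denaturation
        steps.append((anneal_c, 60))        # touchdown annealing/extension
    for _ in range(29):
        steps.append((94.0, 20))            # denaturation
        steps.append((57.0, 60))            # annealing/extension at 57 degrees
    return steps


def hold_time_seconds(steps):
    """Total programmed hold time, excluding temperature ramping."""
    return sum(seconds for _, seconds in steps)
```

Under these assumptions the nine touchdown cycles anneal at 65.0, 64.2, 63.4, 62.6, 61.8, 61.0, 60.2, 59.4, and 58.6°, after which the program continues at 57°.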