KASP marker development and validation in tetraploid genetics

Rationale: The aim of the methods in this section was to develop reliable and easy-to-use DNA markers for the genomic region in A. magna K 30097 that confers rust resistance. Although A. magna K 30097 was of primary interest for this study, A. batizocoi K 9484 is also being used in our research and for introgression in breeding programs (Leal-Bertioli et al. 2014b). Therefore, we aimed to develop markers that would function for both species (Leal-Bertioli et al. 2014a,b). Because introgression will be into allotetraploid cultivated peanut, the markers must function in this genetic context; for SNP discovery, however, we called SNPs in the diploid context. This strategy relies on the very close relationship between A. ipaënsis and the B genome of A. hypogaea (Moretzsohn et al. 2013). Because of this close relationship, a polymorphism identified between A. magna and A. ipaënsis is very likely to be conserved between A. magna and the B genome of A. hypogaea. After marker design, this conservation was confirmed by marker assays.

Production and assembly of transcript sequences: Total RNA from A. magna K 30097 and A. batizocoi K 9484 was extracted from the first expanded leaf of the main axis using the QIAGEN Plant RNeasy kit (QIAGEN) with on-column DNase treatment. cDNA libraries were constructed from equal amounts of RNA from five individuals of each genotype using the TruSeq v2 library construction kit (Illumina), as described in Leal-Bertioli et al. (2015). To obtain long reads and thereby improve transcriptome assemblies, size-selected libraries were sequenced using MiSeq v3.0. Adapter and quality trimming was performed using Trim Galore! v0.3.5 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Adapters were trimmed with Cutadapt (http://code.google.com/p/cutadapt/).
FastQC (http://galaxy.csdb.cn:8000/tool_runner?tool_id=fastqc) was used to display quality information for the cleaned reads. Transcripts were assembled using Trinity (Haas et al. 2013). Assembled transcripts were filtered to retain only the longest isoform from each read cluster. The longest isoforms were then aligned to each other using NCBI blastn v2.2.29. Alignments with 100% sequence identity covering ≥ 90% of the sequence length were considered redundant and removed from the final assembly.

SNP discovery: The Arachis ipaënsis reference genome sequence (www.Peanutbase.org) was used as a proxy for the B genome of A. hypogaea to discover SNPs between the rust-resistant accessions and peanut (susceptible). To do this, A. magna and A. batizocoi transcripts (resistant) were aligned to the A. ipaënsis reference genome (GenBank assembly accession GCA_000816755.1) using the NGSEP pipeline (Duitama et al. 2014), targeting the region where the main QTL for rust resistance was identified (pseudomolecule Araip.B08, peanutbase.org), in the vicinity of the microsatellite marker Ah-280 (region between 117048352 and 129519037 bp), as well as the region around another QTL-linked marker on Araip.B08, AHGS1350 (region between 346729 and 848328 bp) (Table 1). Default parameters were used, except for the minimum and maximum fragment lengths for valid paired-end alignments, which we estimated separately for each genotype by aligning its first 250,000 fragments and plotting the distribution of estimated insert lengths (script available at the NGSEP Web site http://sourceforge.net/projects/ngsep/files/Library/scripts/). We used the parameters recommended by NGSEP for the analysis of WGS data: (1) minimum genotype quality, 40; (2) minimum base quality score, 30; and (3) maximum number of alignments allowed to start at the same reference site, 2.
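As an illustration of how candidate SNPs can be restricted to the two Araip.B08 intervals and the genotype-quality threshold given above, the following is a minimal sketch in Python. It is not part of NGSEP and does not reproduce its behavior: the `Snp` record, the function names, and the example records are hypothetical, and treating the interval coordinates as 1-based and inclusive is an assumption.

```python
from typing import NamedTuple, Optional

# Target intervals on pseudomolecule Araip.B08, keyed by the linked marker.
# Coordinates are taken from the text; treating them as 1-based and
# inclusive is an assumption made for this sketch.
QTL_REGIONS = {
    "Ah-280": ("Araip.B08", 117_048_352, 129_519_037),
    "AHGS1350": ("Araip.B08", 346_729, 848_328),
}

MIN_GENOTYPE_QUALITY = 40  # minimum genotype quality quoted in the text


class Snp(NamedTuple):
    """Hypothetical, simplified SNP record (not an NGSEP data structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    genotype_quality: int


def linked_marker(snp: Snp) -> Optional[str]:
    """Return the marker whose interval contains the SNP, or None."""
    for marker, (chrom, start, end) in QTL_REGIONS.items():
        if snp.chrom == chrom and start <= snp.pos <= end:
            return marker
    return None


def select_candidates(snps):
    """Keep SNPs inside a target interval that pass the quality threshold."""
    selected = []
    for snp in snps:
        marker = linked_marker(snp)
        if marker is not None and snp.genotype_quality >= MIN_GENOTYPE_QUALITY:
            selected.append((marker, snp))
    return selected


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        Snp("Araip.B08", 120_000_000, "A", "G", 55),  # inside Ah-280 interval
        Snp("Araip.B08", 500_000, "C", "T", 35),      # genotype quality too low
        Snp("Araip.B08", 900_000, "G", "A", 60),      # outside both intervals
        Snp("Araip.B03", 400_000, "T", "C", 60),      # other pseudomolecule
    ]
    for marker, snp in select_candidates(example):
        print(marker, snp.chrom, snp.pos, f"{snp.ref}>{snp.alt}")
```

Keeping the intervals in a single dictionary keyed by marker name makes it straightforward to report which QTL-linked marker each retained SNP lies near.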
We also used NGSEP for filtering (the most relevant filter was a maximum minor allele frequency of 0.01) and for conversion from VCF to other formats for primer design and for visualization of SNPs with Flapjack software (Milne et al. 2010).

Table 1 Quantitative trait loci identified for resistance to Puccinia arachidis and agronomic traits in an A. ipaënsis × A. magna F6 population

| Trait Category | Trait Symbol | LGᵃ | Positionᵇ | Nearest Marker(s) | LODᶜ | Additive Effectᵈ | R² (%)ᵉ |
|---|---|---|---|---|---|---|---|
| Rust resistance | SI_2012 | 4 | 17.0 | TC7G10 | 2.7 | 0.18 | 7.8 |
| | | 7 | 38.7 | AHS0598 | 3.0 | 0.16 | 8.2 |
| | | 8 | 25.4 | AHGS1350 / AHS2541 | 3.3 | 0.20 | 13.2 |
| | | 8 | 35.9 | Ah-280 | 6.9 | 0.25 | 21.2 |
| | SI_2013 | 8 | 35.9 | Ah-280 | 3.2 | 0.31 | 5.8 |
| | TL/LA_2012 | 8 | 25.4 | AHGS1350 / AHS2541 | 4.1 | 0.13 | 16.0 |
| | | 8 | 35.1 | Ah-280 | 2.9 | 0.09 | 8.9 |
| | TL/LA_2013 | 8 | 35.9 | Ah-280 | 3.8 | 0.20 | 12.3 |
| | SL/LA_2012 | 8 | 35.9 | Ah-280 | 3.8 | 0.07 | 11.1 |
| | SL/LA_2013 | 8 | 35.9 | Ah-280 | 3.5 | 0.14 | 11.0 |
| | Log_IncPer_2012 | 8 | 42.9 | Ah-280 / Ah-558 | 8.2 | −0.46 | 59.3 |
| | Log_IncPer_2013 | 8 | 33.1 | AHS2541 / Ah-280 | 7.6 | −0.33 | 33.9 |
| | | 8 | 38.9 | Ah-280 | 7.6 | −0.33 | 34.8 |
| Productivity | Log_SN | 3 | 82.3 | ML2A05 | 3.4 | −0.11 | 8.6 |
| | | 4 | 28.0 | AHGS2785 | 2.6 | −0.10 | 6.2 |
| | | 10 | 35.5 | AHS1488 | 3.1 | −0.15 | 10.3 |
| | 10-SW | 4 | 68.8 | AHGS1279 / AHS2728 | 3.9 | −0.32 | 18.4 |
| | | 5 | 44.5 | AHGS2602 | 2.9 | 0.20 | 8.3 |
| Seed Characteristics | Peg_Length | 1 | 40.8 | AHGS2019 / Seq12B2 | 3.4 | −2.07 | 7.2 |
| | | 4 | 64.9 | AHGS2155 | 11.2 | 4.18 | 25.8 |
| | | 4 | 70.8 | AHGS1279 / AHS2728 | 8.7 | 4.38 | 30.2 |
| | | 5 | 10.4 | AHS2897 | 3.4 | 2.06 | 6.9 |
| | | 9 | 43.7 | AHGS2018 / AHGS2235 | 3.0 | 1.91 | 6.3 |
| | Pod_constriction | 1 | 37.1 | AHGS2332 | 4.1 | −0.67 | 9.4 |
| | | 4 | 64.3 | AHGS1917 | 3.4 | 0.59 | 7.4 |
| | | 5 | 47.0 | AHGS2513 | 3.5 | −0.80 | 8.0 |
| | | 6 | 25.4 | AHGS2106 | 3.7 | −0.66 | 9.7 |
| | | 8 | 0.0 | AHGS1383 | 3.5 | −0.60 | 8.1 |
| | | 9 | 36.3 | AHGS1478_b3 / AHGS2537_b2 | 3.3 | 0.56 | 7.3 |
| Plant Architecture | MSH_2009 | 2 | 37.8 | RN31F06 | 4.8 | −5.44 | 14.5 |
| | | 4 | 64.3 | AHGS1917 / AHGS2155 | 6.5 | −5.79 | 17.6 |
| | | 5 | 48.7 | AHGS1228 | 3.2 | 3.61 | 8.1 |
| | Log_MSH2011 | 4 | 64.9 | AHGS2155 | 4.1 | −0.10 | 10.7 |
| | | 5 | 41.1 | AHGS1980 | 4.0 | 1.64 | 10.0 |
| | | 6 | 13.0 | AHS2153 | 4.5 | −0.11 | 11.4 |
| | MSH_2012 | 3 | 23.2 | TC1E06 | 4.0 | −1.97 | 8.1 |
| | | 4 | 64.3 | AHGS1917 / AHGS2155 | 12.3 | −3.56 | 30.4 |
| | | 5 | 48.7 | AHGS1228 | 3.6 | 2.13 | 7.4 |

ᵃ Linkage group.
ᵇ Expressed in Kosambi cM.
ᶜ LOD score, logarithm of the odds.
ᵈ Positive values indicate that higher-value alleles come from A. ipaënsis K 30076, and negative values indicate that higher-value alleles come from A. magna K 30097.
ᵉ Proportion of the phenotypic variance explained by the quantitative trait locus.

Primer design and test: For each SNP, two allele-specific forward primers and one common reverse primer were designed for use in KASP assays (LGC Genomics Ltd. Hoddesdon, UK; http://www.lgcgenomics.com/kasp-genotyping-reagents) using BatchPrimer3 (http://probes.pw.usda.gov/batchprimer3/) with the “Allele specific primers and allele flanking primers” option. The parameters used were a product size of 60−120 bp, a Tm between 58 and 60°, and a GC content between 30 and 80%. The alternative alleles were marked with 6-FAM and the reference alleles with VIC. A schematic diagram of SNP discovery and primer design is shown in Figure 1, A and B. Primer information is listed in File S1.

Figure 1 A schematic diagram of single-nucleotide polymorphism (SNP) discovery and Kompetitive allele-specific polymerase chain reaction (KASP) primer design. A. ipaënsis K 30076 is used as a proxy for the B genome of A. hypogaea. (A) Alignment of A. ipaënsis K 30076, A. magna K 30097, and A. batizocoi K 9484 paired-end cDNA reads onto the A. ipaënsis K 30076 genomic sequence, pseudomolecule Araip.B08 (where the rust resistance QTL reside), and the identification of SNPs. (B) Example of the design of allele-specific and site-specific (common) primers for the SNPs identified.

KASP assays were performed with the following genotypes: the diploids A. ipaënsis K 30076, A. batizocoi K 9484, and A. magna K 30097; the induced allotetraploids (A. magna K 30097 × A. stenosperma V15076)4x (here called MagSten) and (A. batizocoi K 9484 × A. stenosperma V10309)4x (here called BatSten1); and six A. hypogaea cultivars (Tifrunner, Tifguard, GA-06G, NC3033, ICVG 88145, and SPTG_06).
Reactions consisted of 2 μL of KASP 2X reaction mix, 0.055 μL of assay primer mix (12 mM of each allele-specific primer and 30 mM of common primer), and 20 ng of genomic DNA in a 4-μL volume. A C1000 Thermal Cycler (Bio-Rad) was used with the following cycling conditions: 94° for 15 min; nine cycles of 94° for 20 sec and touchdown starting at 65° for 60 sec (decreasing 0.8° per cycle); and 29 cycles of 94° for 20 sec and 57° for 60 sec (http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/PDFs/KASP_SNP_Genotyping_Manual.pdf). To improve the results, a second KASP program was run as follows: nine cycles of 94° for 20 sec and 57° for 60 sec. Fluorescence was read on a LightCycler 480 Instrument II (Roche Life Science) and analyzed using the LightCycler 480 Software (v1.5.1).
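The touchdown profile described above can be written out step by step, which makes the annealing temperature of each cycle explicit. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the first touchdown cycle anneals at 65° and that the 0.8° decrement applies to each subsequent cycle. Temperatures are in °C, hold times in seconds, and ramp times are ignored.

```python
def touchdown_annealing_temps(start=65.0, step=0.8, cycles=9):
    """Annealing temperature for each touchdown cycle.

    Assumes the first cycle anneals at `start` and each later cycle
    is `step` degrees lower than the one before it.
    """
    return [round(start - step * cycle, 1) for cycle in range(cycles)]


def cycling_program():
    """Expand the first KASP program into (label, temperature, seconds) steps."""
    steps = [("initial denaturation", 94.0, 15 * 60)]
    # Nine touchdown cycles: 94 for 20 sec, then annealing for 60 sec.
    for temp in touchdown_annealing_temps():
        steps.append(("denaturation", 94.0, 20))
        steps.append(("touchdown annealing", temp, 60))
    # 29 further cycles at a fixed annealing temperature of 57.
    for _ in range(29):
        steps.append(("denaturation", 94.0, 20))
        steps.append(("annealing", 57.0, 60))
    return steps


if __name__ == "__main__":
    print(touchdown_annealing_temps())
    # [65.0, 64.2, 63.4, 62.6, 61.8, 61.0, 60.2, 59.4, 58.6]
    hold_seconds = sum(seconds for _, _, seconds in cycling_program())
    print(f"Total hold time: {hold_seconds / 60:.1f} min (ramping excluded)")
```

Under these assumptions the last touchdown cycle anneals at 58.6°, so the program steps down to the fixed 57° annealing temperature used for the remaining 29 cycles.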