Novel transcripts and gene orthologs

Cufflinks predicted a large number of novel single-exon transcripts (56,719). During the mapping of sequence reads with Life Technologies LifeScope software, spliced reads were included only for annotated genes, which prevented the construction of novel multi-exon transcripts. After filtering the novel transcripts by minimum length (>130 bp) and FPKM value (>5), a total of 13,660 novel candidate transcripts remained (Figure 2). Furthermore, Cufflinks predicted 38,418 new isoforms, 1595 transcripts with generic overlap with reference genes, and 1311 transcripts with exonic overlap with a reference gene on the opposite strand (Figure 2). For 448 of these opposite-strand transcripts in the oviduct and for 787 in the testis, the FPKM was >1. Functional analysis of these genes did not indicate any enriched GO terms, but they may represent regulatory sequences for genes expressed in the testis and oviduct.

Figure 2 Flowchart of the analysis pipeline for novel transcripts. Several filtering steps were included for the identification of previously unannotated exons using a BLAST search against the available genomes in the NCBI chromosome database. The number of hits for each species is shown in the bar chart. Species with the highest number of hits (cow, human, and sheep) were selected for identification of novel DE genes between the testis and oviduct in the pig.

For the identification of gene orthologs in the dataset, we ran a BLAST search against the genomic sequences of all species available in the NCBI “chromosome” database. Most hits (similarity >90%) were identified in the Bos taurus assembly UMD3.1 (n = 888). The alignment of genomic sequences between the pig and the three other mammalian species with the highest number of sequence hits (human, cow, and sheep) showed high concordance between corresponding chromosomes (Figure S2, A–C).
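The length and expression filter applied to the novel Cufflinks transcripts (length >130 bp and FPKM >5) can be illustrated with a minimal Python sketch. The record type, field names, and toy values below are hypothetical and are not taken from the study's pipeline; only the two thresholds come from the text, and both are assumed to be strict inequalities as written.

```python
from dataclasses import dataclass

# Thresholds as stated in the text; both comparisons are strict (">").
MIN_LENGTH_BP = 130
MIN_FPKM = 5.0


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one predicted transcript."""
    transcript_id: str
    length_bp: int
    fpkm: float


def passes_filter(t: Transcript) -> bool:
    """Keep a transcript only if it exceeds both thresholds."""
    return t.length_bp > MIN_LENGTH_BP and t.fpkm > MIN_FPKM


def filter_novel_transcripts(transcripts):
    """Return the transcripts that pass the length and FPKM filter."""
    return [t for t in transcripts if passes_filter(t)]


# Toy records for illustration only (not data from the study).
example = [
    Transcript("novel_1", 450, 12.3),  # passes both thresholds
    Transcript("novel_2", 120, 40.0),  # too short
    Transcript("novel_3", 900, 2.1),   # expression too low
    Transcript("novel_4", 130, 5.0),   # exactly at both thresholds, so excluded
]
kept = filter_novel_transcripts(example)
print([t.transcript_id for t in kept])  # prints ['novel_1']
```

Because the comparisons are strict, a transcript sitting exactly on a threshold is discarded; if the original filter was inclusive, the operators would need to change to `>=`.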
The numbers of novel and common hits among the top three species are shown in Figure S2D. Novel gene orthologs with intronic or exonic hits in these genomes (n = 869) were selected for differential expression analysis between the testis and oviduct, which resulted in 152 DE hits (FDR ≤0.01). These hits were selected based on the criterion that the hit sequence was completely included in an exon or intron of an annotated gene. The number of exonic hits increased considerably when partial exon hits were included (Figure S2E). In total, DE hits were identified in 55 unique genes. Most hits were assigned to the nucleoporin 210kDa-like (NUP210L) gene (Table S5 and Figure 3). NUP210L appears to be testis-specific and probably has a role in spermatid development. Annotation of the NUP210L gene is incomplete in the pig genome (10.2.73), where only 2 of the 40 exons annotated in the human (Ensembl database) are present. Based on our expression data, most exons annotated in the human, cow, and sheep are also expressed in the pig testis (Figure 3, A–C).

Thirty-five genes in the testis and 20 in the oviduct showed high expression (Table S5). These included some recent annotations, which highlight possible novel reproduction-related genes. To elucidate the possible role of these genes, we investigated the expression pattern of those with an identified mouse ortholog during the first wave of mouse spermatogenesis, using our previous data (Laiho et al. 2013). Two genes, with the mouse orthologs 4930538K18Rik (ENSBTAG00000017387/ENSOARG00000020461) and 4930522H14Rik (C3H1orf185/C1orf185), showed increased expression during spermatogenesis, with the highest mRNA level at postnatal day (PND) 28 (Figure 4). Each time point corresponds to the appearance of specific cell types in the collected tissue sample: spermatogonia at PND 7; early pachytene spermatocytes at PND 14; late pachytene spermatocytes at PND 17; round spermatids at PND 21; and elongating spermatids at PND 28 (Laiho et al. 2013).
4930522H14Rik appeared to be testis-specific in the mouse, and 4930538K18Rik showed high expression in the oviduct and testis (http://www.ncbi.nlm.nih.gov/UniGene). This expression pattern suggests a role for these genes in the late steps of spermatid elongation. The expression of NUP210L also increased during the first wave of spermatogenesis (Figure 4).

Figure 3 Conserved genomic sequences of NUP210L between pig and cow, human, and sheep. (A) Comparison of the NUP210L genomic region between the pig and cow. (B) Comparison of the NUP210L genomic region between the pig and human. (C) Comparison of the NUP210L genomic region between the pig and sheep. Log-expression levels (FPKM) are shown as red peaks for the testis and blue peaks for the oviduct, and the similarity of each hit is indicated by deepening shades of green. Gene annotations for the pig are shown above the alignment, and those for the human, cow, or sheep are shown below the alignment.

Figure 4 Expression of NUP210L and 4930522H14Rik (C3H1orf185/C1orf185) during the first wave of spermatogenesis in the mouse.
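The criterion used above to select DE hits, namely that a hit sequence lies completely within an exon or an intron of an annotated gene, amounts to an interval-containment test. The following Python sketch shows one way such a test could look. It is not the authors' implementation: the function names, the 1-based inclusive coordinates, and the assumption of a single gene model with non-overlapping exons are all illustrative choices.

```python
def contains(outer, inner):
    """True if the closed interval `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def introns_from_exons(exons):
    """Derive introns as the gaps between consecutive exons.

    Assumes 1-based inclusive coordinates and non-overlapping exons
    belonging to one gene model (an illustrative simplification).
    """
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        if next_start - prev_end > 1
    ]


def classify_hit(hit, exons):
    """Label a hit as fully exonic, fully intronic, or neither."""
    if any(contains(exon, hit) for exon in exons):
        return "exonic"
    if any(contains(intron, hit) for intron in introns_from_exons(exons)):
        return "intronic"
    return "partial_or_outside"


# Toy gene model and hits for illustration only (not data from the study).
exons = [(100, 200), (500, 650), (900, 1000)]
for hit in [(120, 180), (250, 400), (180, 260), (1100, 1200)]:
    print(hit, classify_hit(hit, exons))
```

Hits that straddle an exon boundary fall into the third category here, which corresponds to the partial exon hits that the text reports separately (Figure S2E).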