Function of CNV genes

A total of 110 Ensembl genes from the UMD 3.1 assembly were identified as CNV genes, that is, genes overlapping (either completely or partially) with our detected CNVRs (Supplementary Table 4). These comprised 96 protein-coding genes, 7 snRNA genes, 6 pseudogenes, and 1 rRNA gene. Using PANTHER's functional annotation tool to inspect the GO slim terms mapping to protein-coding CNV genes, we found that many of these genes were involved in binding (35%), catalytic activity (23%), receptor activity (39%), signal transducer activity (36%), biological regulation (38%), cellular process (35%), and response to stimulus (48%).

Enrichment analysis was performed, using both the full Bos taurus GO database and the GO slim database, to identify GO terms that were significantly over- or underrepresented in our gene set. GO slim terms are a subset of the terms in the entire GO that give a broad overview of the ontology content. GO slim enrichment analysis showed that the terms extracellular transport, response to toxic substance, response to stimulus, response to interferon-gamma, amino acid transport, sensory perception of smell, G-protein coupled receptor signaling pathway, regulation of biological process, MHC protein complex, heterotrimeric G-protein complex, and plasma membrane were significantly overrepresented in the protein-coding genes overlapped by CNVRs (Bonferroni-corrected P < 0.05; Table 3). Results from the full GO database analysis are shown in Supplementary Table 3.

Table 3 Significantly over- and underrepresented GO slim terms in the set of CNV genes.
| Ontology | Term | Annotated genes [a] (n = 19879) | CNV genes [b] (n = 89) | CNV genes expected | Over (+) or Under (−) | P-value |
|---|---|---|---|---|---|---|
| Biological process | Extracellular transport | 53 | 6 | 0.24 | + | 4.16E-05 |
| Biological process | Response to toxic substance | 45 | 5 | 0.20 | + | 5.08E-04 |
| Biological process | Response to stimulus | 2880 | 43 | 12.89 | + | 9.08E-12 |
| Biological process | Response to interferon-gamma | 58 | 4 | 0.26 | + | 3.50E-02 |
| Biological process | Amino acid transport | 81 | 5 | 0.36 | + | 8.45E-03 |
| Biological process | Macrophage activation | 131 | 6 | 0.59 | + | 7.18E-03 |
| Biological process | Sensory perception of smell | 667 | 30 | 2.99 | + | 9.14E-20 |
| Biological process | Sensory perception of chemical stimulus | 875 | 32 | 3.92 | + | 1.23E-18 |
| Biological process | Sensory perception | 1108 | 32 | 4.96 | + | 1.19E-15 |
| Biological process | Neurological system process | 1593 | 36 | 7.13 | + | 1.18E-14 |
| Biological process | System process | 1809 | 36 | 8.10 | + | 6.23E-13 |
| Biological process | Single-multicellular organism process | 2189 | 36 | 9.80 | + | 2.01E-10 |
| Biological process | Multicellular organismal process | 2199 | 36 | 9.85 | + | 2.30E-10 |
| Biological process | G-protein coupled receptor signaling pathway | 789 | 13 | 3.53 | + | 1.21E-02 |
| Biological process | Regulation of biological process | 2260 | 34 | 10.12 | + | 1.36E-08 |
| Biological process | Biological regulation | 2636 | 34 | 11.80 | + | 8.17E-07 |
| Biological process | Metabolic process | 6613 | 14 | 29.61 | − | 3.88E-02 |
| Molecular function | n/a | | | | | |
| Cellular component | MHC protein complex | 19 | 3 | 0.09 | + | 5.59E-03 |
| Cellular component | Heterotrimeric G-protein complex | 38 | 4 | 0.17 | + | 1.72E-03 |
| Cellular component | Integral to membrane | 1478 | 37 | 6.62 | + | 3.11E-17 |
| Cellular component | Membrane | 2433 | 37 | 10.89 | + | 2.19E-10 |
| Cellular component | Plasma membrane | 1458 | 24 | 6.53 | + | 1.01E-06 |
| Cellular component | Cell part | 4063 | 6 | 18.19 | − | 1.97E-02 |
| Cellular component | Intracellular | 3993 | 6 | 17.88 | − | 2.58E-02 |

[a] Number of genes in the background Bos taurus GO slim annotation set with the given GO term. The total number of annotated genes is shown in parentheses.
[b] Number of CNV genes with the given GO term. The total number of CNV genes with annotations in the background Bos taurus GO slim annotation set is shown in parentheses.

In addition, CNV genes were separated into three categories: duplication genes (genes overlapped by gain CNVs), deletion genes (genes overlapped by deletion CNVs), and mixed genes (genes overlapped by mixed CNVs) (Supplementary Table 3). Enrichment analysis was performed separately for each group.
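To make the columns of Table 3 concrete, the following minimal sketch shows how an expected count and an over/under call can be derived for a single GO slim term. The expected count is the term's annotated gene count scaled by 89/19879, which matches the values in the table. The one-sided binomial test and the `n_tests` Bonferroni multiplier are assumptions made for illustration: the text does not state which test statistic was used or how many terms were tested, so the P-values produced here are not claimed to reproduce those in Table 3. All function names are hypothetical.

```python
from math import comb

TOTAL_ANNOTATED = 19879  # background genes with GO slim annotation (Table 3)
TOTAL_CNV = 89           # CNV genes with GO slim annotation (Table 3)


def expected_count(term_annotated, total_cnv=TOTAL_CNV,
                   total_annotated=TOTAL_ANNOTATED):
    """Expected number of CNV genes carrying a term if genes were drawn at random."""
    return term_annotated * total_cnv / total_annotated


def binomial_tail(k, n, p, upper):
    """Return P(X >= k) if upper else P(X <= k), for X ~ Binomial(n, p)."""
    ks = range(k, n + 1) if upper else range(0, k + 1)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in ks)


def enrichment(term_annotated, observed, n_tests=1):
    """Return (expected, direction, Bonferroni-adjusted one-sided P).

    Assumption: a one-sided binomial test; n_tests is the (unknown)
    number of GO slim terms tested and defaults to no correction.
    """
    p_term = term_annotated / TOTAL_ANNOTATED
    expected = expected_count(term_annotated)
    over = observed > expected
    raw_p = binomial_tail(observed, TOTAL_CNV, p_term, upper=over)
    adjusted = min(1.0, raw_p * n_tests)
    return expected, "+" if over else "-", adjusted


# Two rows of Table 3: sensory perception of smell and metabolic process.
for name, annotated, observed in [("smell", 667, 30), ("metabolic", 6613, 14)]:
    exp, sign, p = enrichment(annotated, observed)
    print(f"{name}: expected={exp:.2f} direction={sign} uncorrected P={p:.2e}")
```

The same function could be applied to the duplication, deletion, and mixed gene groups by replacing `TOTAL_CNV` with the number of annotated genes in each group.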
The GO slim terms antigen processing and presentation of peptide or polysaccharide antigen via MHC class II, antigen processing and presentation, immune system process, and MHC protein complex were significantly overrepresented in the set of 25 genes overlapped by gain CNVs. For the 38 genes overlapped by deletion CNVs, the terms response to toxic substance, response to stimulus, extracellular transport, sensory perception of smell, neurological system process, and regulation of biological process were significantly overrepresented. Genes overlapped by mixed CNVs showed overrepresentation of the GO terms response to interferon-gamma, response to stimulus, response to toxic substance, sensory perception of smell, neurological system process, and regulation of biological process.

Several of the biological process categories identified for our cattle CNVs have also been identified in other species. For example, MHC class II genes, olfactory receptor (OR) genes, and amino acid transporters have been identified within CNV regions in humans (Schmidt et al., 2003; Traherne, 2008; Young et al., 2008). Human MHC class II and class III genes lie within CNVRs, and some of these have been linked to phenotypic variation such as congenital hyperplasia, systemic lupus erythematosus disease risk, and host control of HIV-1 (Traherne, 2008). Olfactory receptors are G-protein coupled receptors involved in signal transduction. Young et al. (2008) showed that 18 OR genes and OR pseudogenes displayed varying copy numbers among 50 people. This variation may play a role in olfactory ability and sensitivity. Olfactory receptors may also play a chemosensory role, as they are expressed on sperm and are thought to direct them to the egg via chemotaxis (Spehr et al., 2006). Across several subspecies of the Sus genus, OR genes were also overrepresented among CNVRs (Paudel et al., 2015).
These genes may have been important components of swine evolution, as scent would have been critical for foraging for food, avoiding predators, and finding a mate.