CNVR discovery and statistics in the GPE bulls

Putative CNVs across the population of 175 bulls were identified using the exome cn.MOPS software package (Supplementary Table 1B). We chose cn.MOPS because it has been shown to have a lower false-positive rate than other exome CNV detection methods (Guo et al., 2013). CNVs were then merged across samples into CNVRs. In this work, we aimed to study common coding sequence CNVs across the Bos taurus genome. To filter out possible false-positive and rare CNVs, CNVRs were removed if they were not present in at least 3 samples (>2% of the population). Note that the 2% threshold was chosen arbitrarily. A total of 74 CNVRs were removed in this step. The final set consisted of 57 CNVRs (48 on the autosomes and 9 on the X chromosome). Sizes of the CNVRs ranged from 0.0018 to 1.56 Mb, with an average of 0.1419 Mb and a median of 0.0567 Mb. The CNVRs occupied a total of 5.27 unique Mb, or 0.19% of the UMD 3.1 Bos taurus genome. Among the CNVRs, 30 showed copy number loss, 16 showed copy number gain, and 11 showed a mix of copy number loss and gain in different individuals. A full list of the CNVRs can be found in Supplementary Table 1A. The distribution of CNVRs along each of the chromosomes is shown in Figure 1.

Many CNVRs were present in a small number of bulls (24 of 57 were present in at most 5 bulls). One CNVR (CNVR 4 in Supplementary Table 1A) was present in 36% of the bulls. We observed some variation in the number of CNVRs between breeds. The greatest numbers of CNVRs were seen in Hereford (70), Angus (82), Simmental (72), and Red Angus (70), while the smallest numbers were seen in Braunvieh (4) and Charolais × Angus (7). None of the CNVRs were breed-specific.

Figure 1 CNVRs in GPE bulls. Plot shows the CNVRs identified from the 175 sequenced GPE bull genomes in Circos format (Krzywinski et al., 2009).
The outer ideogram runs clockwise from chromosome 1 to chromosome X, with labels in Mb of physical distance. The copy number data are represented in the inner tracks. The two innermost tracks show scatter plots of the CNVRs, where the red track shows copy number loss and the green track shows copy number gain. The size of each dot in the scatter plots is proportional to the number of samples containing the CNVR. The remaining track shows a heat map indicating the parts of the genome that contain copy number gain or loss; it simply collapses the scatter plot values onto a single radial position.
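
As a rough illustration of the merging and filtering steps described in this section, the following Python sketch merges per-sample CNV calls into CNVRs, keeps only those seen in at least 3 samples, and labels each region as loss, gain, or mixed. It is not the cn.MOPS implementation and is not necessarily the procedure used in this study: the record layout, the function names, the toy coordinates, and the overlap-based merge rule are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record for one CNV call; field names are illustrative only.
# kind is either "gain" or "loss".
CNV = namedtuple("CNV", ["sample", "chrom", "start", "end", "kind"])


def merge_into_cnvrs(cnvs):
    """Merge CNV calls that overlap on the same chromosome into regions.

    Assumption: any overlap (or shared boundary) between calls is enough
    to place them in the same CNVR.
    """
    regions = []
    for cnv in sorted(cnvs, key=lambda c: (c.chrom, c.start, c.end)):
        last = regions[-1] if regions else None
        if last and last["chrom"] == cnv.chrom and cnv.start <= last["end"]:
            # Extend the current region and record the contributing sample.
            last["end"] = max(last["end"], cnv.end)
            last["samples"].add(cnv.sample)
            last["kinds"].add(cnv.kind)
        else:
            regions.append({
                "chrom": cnv.chrom,
                "start": cnv.start,
                "end": cnv.end,
                "samples": {cnv.sample},
                "kinds": {cnv.kind},
            })
    return regions


def filter_common(regions, min_samples=3):
    """Keep only regions observed in at least min_samples distinct samples."""
    return [r for r in regions if len(r["samples"]) >= min_samples]


def classify(region):
    """Label a region as 'loss', 'gain', or 'mixed'."""
    return "mixed" if len(region["kinds"]) > 1 else next(iter(region["kinds"]))


# Toy calls with made-up coordinates and sample names.
calls = [
    CNV("bull_A", "1", 100, 500, "loss"),
    CNV("bull_B", "1", 400, 900, "loss"),
    CNV("bull_C", "1", 850, 1200, "gain"),
    CNV("bull_D", "1", 5000, 6000, "gain"),
    CNV("bull_A", "2", 100, 300, "gain"),
    CNV("bull_B", "2", 150, 350, "gain"),
]

regions = merge_into_cnvrs(calls)
common = filter_common(regions, min_samples=3)

for r in common:
    print(r["chrom"], r["start"], r["end"], len(r["samples"]), classify(r))
```

In this toy input, only the first region on chromosome 1 is supported by three distinct samples, so it is the only one that survives the filter; the regions supported by one or two samples are dropped, mirroring the removal of rare CNVRs described in the text.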