5. Application of SNP Arrays for Profiling CNVs in Breast Cancer

Structural variants, including CNVs, contribute to many complex diseases and could account for some of the missing heritability of breast cancer. CNVs have been reported to encompass genes known to be involved in breast cancer susceptibility, including BRCA1 and BRCA2, and may therefore similarly affect other genes involved in breast cancer-related pathways [12].

5.1. Inherited Copy Number Polymorphisms and Breast Cancer Risk

Analysis of large genome-wide association studies carried out by the Wellcome Trust Case Control Consortium suggested that common CNVs were unlikely to play a major role in breast cancer susceptibility [70]. This study used a 105K-probe Agilent CGH array design containing probes that tag copy number loci previously identified from: (1) the Genome Structural Variation (GSV) Consortium [39]; (2) CNV studies using the Affymetrix 6.0, Illumina 1M, and Affymetrix 500k SNP arrays; (3) novel sequence absent from the reference sequence; (4) candidate genes; and (5) additional risk-associated loci. However, this study was not sufficiently powered to detect the effects of low-penetrance alleles with a minor allele frequency (MAF) of less than 5%. Moreover, the genomic regions assessed by this study were limited by the design of the arrays used to generate genotype information across the genome.

More recently, a genome-wide association study of common CNVs (MAF ≥ 5%) conducted among Chinese women using high-resolution data from the Affymetrix SNP Array 6.0 identified a deletion in the APOBEC3 gene cluster associated with breast cancer risk. Within this population, the deletion was identified in 65% of cases and 45% of controls, conferring odds ratios (ORs) of 1.3 and 1.8 for a hemizygous and homozygous deletion, respectively (p = 2.0 × 10⁻²⁴) [6].
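To make the genotype-specific odds ratios quoted above concrete, the following is a minimal sketch of how a crude OR and a Wald 95% confidence interval can be computed for each deletion genotype relative to individuals with no deletion. The function name `genotype_odds_ratio` and all counts are hypothetical and were invented for illustration; they are not taken from [6] or [71], and the published estimates may have been adjusted for covariates that this sketch ignores.

```python
import math


def genotype_odds_ratio(cases, controls, genotype, reference=0):
    """Crude odds ratio for `genotype` versus `reference`, with a Wald 95% CI.

    `cases` and `controls` map the number of deleted copies (0, 1 or 2)
    to the number of individuals with that genotype. All four counts used
    must be non-zero, otherwise a ZeroDivisionError is raised.
    """
    a, b = cases[genotype], cases[reference]
    c, d = controls[genotype], controls[reference]
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (assumption, for illustration only)
cases = {0: 400, 1: 450, 2: 150}
controls = {0: 500, 1: 400, 2: 100}

hemi = genotype_odds_ratio(cases, controls, genotype=1)  # one copy deleted
homo = genotype_odds_ratio(cases, controls, genotype=2)  # both copies deleted

for label, (or_, lo, hi) in (("hemizygous", hemi), ("homozygous", homo)):
    print(f"{label}: OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Comparing each deletion genotype with the same reference group is what allows separate ORs to be reported for hemizygous and homozygous carriers, rather than a single carrier-versus-non-carrier estimate.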
Subsequent investigations of women of European ancestry using quantitative PCR also observed the deletion, albeit at a much lower population frequency [71]. Comparable to the study of Chinese women, a higher proportion of breast cancer-affected European women than controls carried the APOBEC3 deletion allele (12.4% vs. 10.4%, respectively), conferring a low to moderate risk of disease (ORs of 1.2 and 2.3 (p = 0.005) for a hemizygous and homozygous deletion, respectively). Interestingly, the same deletion (CNV ID: CNVR8164.1) was originally identified by the Wellcome Trust Case Control Consortium; however, replication experiments did not show a significant association with breast cancer.

As mentioned above, there is now a wealth of array data available from SNP-based genome-wide association studies that can be utilised for assessing the contribution of CNVs to breast cancer risk. Furthermore, the large number of cases and controls available for future CNV association studies will provide sufficient power to evaluate many CNVs that occur at low frequency. A major limitation of these array data is the inability to genotype highly repetitive copy number-variable regions. More than 1000 regions across the human genome have been found to overlap CNVs with three or more segregating alleles [72]. Non-array-based technologies that can resolve multicopy integer states, such as qPCR, NanoString and massively parallel sequencing, will therefore be necessary to determine the clinical significance of these multiallelic variants in breast cancer and other human diseases.

5.2. Inherited and de novo Rare CNVs and Breast Cancer Risk

At least seven array-based studies have reported lists of rare CNVs overlapping genes that may contribute towards the development of breast cancer [8,73,74]. Despite a number of candidate susceptibility genes being proposed, there has been a notable lack of concordance between these studies.
More than 120 genes overlapping rare genomic deletions or duplications have been found exclusively or at a greater frequency in familial breast cancer cases; however, none have been replicated between studies (Supplementary Table S1). Such a finding is not surprising, as many individuals carry rare or private CNVs regardless of their disease status [2,75]. Furthermore, four of these studies used SNP-based arrays, which are known to generate signal-to-noise ratios that are much lower than those of array-CGH platforms and are therefore more prone to false CNV calls [58]. It remains unclear whether future large-scale studies will provide the reproducible evidence needed to implicate these rare CNVs as breast cancer risk variants and to overcome the issue of false discovery.

Growing evidence suggests that the frequency and size of constitutional CNVs are significantly increased in breast cancer-affected individuals [73,74,76]. Studies have assessed the global burden of deletions and duplications in cases and controls by measuring: (1) the number of CNVs per sample; (2) the number of CNVs overlapping genes (and vice versa) per sample; (3) the average length of CNVs per sample; and (4) the total number of base pairs affected by CNVs per sample. Although studies have revealed a common trend of increased CNV burden in breast cancer cases, the trend appears to be strongest when assessing CNVs that overlap gene regions [73,74]. Further evaluation of these genes by pathway analysis suggests that two networks, centred on the known factors TP53 and β-estradiol [73], may be important in breast cancer risk and development; however, these findings are yet to be reproduced. The feature of “CNV burden” has also been observed in the genomes of patients with other cancers, suggesting that an uncharacterised subset of these variants may be causal [77,78,79,80]. Further studies are needed to identify recurring variants at shared loci.

5.3. Is There a Relationship between Germline CNVs and Breast Tumourigenesis?

A characteristic of sporadic and familial breast tumours is genomic instability, resulting from either inherited mutations in genes that control genome integrity or mutations that are acquired in somatic cells during development. Breast tumour cells in carriers of the APOBEC3A-APOBEC3B germline deletion show a greater number of C>T transitions than those in non-carriers [81], highlighting the importance of this common CNV in breast cancer development.

It has previously been proposed that germline CNVs may also contribute to somatically acquired chromosome changes in tumours. Previous studies of Li-Fraumeni Syndrome (LFS) tumours [80] and of colon cancer-affected individuals [82] suggested that constitutional CNVs may act as a foundation on which chromosome copy number aberrations develop in tumour cells. These findings suggested a direct relationship between constitutional genomic variation and tumour genome evolution. The notion that inherited CNVs may influence the occurrence of somatically acquired copy number changes during breast cancer progression has not only prognostic significance but also important consequences for early decisions relating to clinical management. Subsequent analyses of constitutional and tumour-specific CNVs in matched breast tumour and normal tissue using data from the Illumina Human CNV370 duo beadarray provided evidence that the locations of copy number aberrations in tumour cells do not associate with constitutional CNVs [83]. However, the SNP arrays used in these studies had a relatively low number of probes and therefore poor spatial resolution for detecting CNVs and defining the variant boundaries. To determine the relationship between inherited genomic variation and genome evolution in breast cancer, sequencing-based studies are necessary to ensure accurate mapping of CNV breakpoints.
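Both the per-sample burden measures listed in Section 5.2 and the question of whether tumour copy number aberrations coincide with constitutional CNVs reduce to simple interval arithmetic once CNV calls are available. The sketch below is illustrative only: the `Interval` type, the function names, and all coordinates are hypothetical, and it is not the method used in [73], [74] or [83]. It assumes half-open coordinates in base pairs and that CNV calls within a sample do not overlap one another, so that summing lengths gives the total number of base pairs affected.

```python
from collections import namedtuple

# Genomic interval with half-open coordinates [start, end) in base pairs
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def overlaps(a, b):
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def cnv_burden(cnvs, genes):
    """Per-sample burden measures (1) to (4) described in Section 5.2."""
    lengths = [c.end - c.start for c in cnvs]
    genic = [c for c in cnvs if any(overlaps(c, g) for g in genes)]
    return {
        "n_cnvs": len(cnvs),                 # (1) number of CNVs
        "n_genic_cnvs": len(genic),          # (2) CNVs overlapping genes
        "mean_length": sum(lengths) / len(lengths) if lengths else 0.0,  # (3)
        "total_bp": sum(lengths),            # (4) assumes non-overlapping calls
    }


def fraction_overlapping(tumour_cnas, germline_cnvs):
    """Fraction of tumour aberrations overlapping at least one germline CNV."""
    if not tumour_cnas:
        return 0.0
    hits = sum(
        1 for t in tumour_cnas if any(overlaps(t, g) for g in germline_cnvs)
    )
    return hits / len(tumour_cnas)


# Hypothetical calls for one matched normal/tumour pair (illustration only)
germline = [
    Interval("chr1", 100, 200),
    Interval("chr1", 500, 900),
    Interval("chr22", 1000, 1300),
]
genes = [Interval("chr1", 150, 600), Interval("chr2", 0, 1000)]
tumour = [Interval("chr1", 180, 450), Interval("chr3", 0, 5000)]

stats = cnv_burden(germline, genes)
frac = fraction_overlapping(tumour, germline)
print(stats)
print(f"tumour aberrations overlapping a germline CNV: {frac:.0%}")
```

An overlap fraction computed this way is only as reliable as the interval boundaries that feed it, which is why the limited breakpoint resolution of low-density SNP arrays matters for this kind of comparison. A real analysis would also need a null expectation, for example from permuted interval positions, before interpreting the observed fraction.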