The Role of Constitutional Copy Number Variants in Breast Cancer.
ABSTRACT: Constitutional copy number variants (CNVs) include inherited and de novo deviations from a diploid state at a defined genomic region. These variants contribute significantly to genetic variation and disease in humans, including breast cancer susceptibility. Identification of genetic risk factors for breast cancer in recent years has been dominated by the use of genome-wide technologies, such as single nucleotide polymorphism (SNP)-arrays, with a significant focus on single nucleotide variants. To date, these large datasets have been underutilised for generating genome-wide CNV profiles despite offering a massive resource for assessing the contribution of these structural variants to breast cancer risk. Technical challenges remain in determining the location and distribution of CNVs across the human genome due to the accuracy of computational prediction algorithms and resolution of the array data. Moreover, better methods are required for interpreting the functional effect of newly discovered CNVs. In this review, we explore current and future application of SNP array technology to assess rare and common CNVs in association with breast cancer risk in humans.
Project description:Genome-wide studies of patients carrying pathogenic variants (mutations) in BRCA1 or BRCA2 have reported strong associations between single-nucleotide polymorphisms (SNPs) and cancer risk. To conduct the first genome-wide association analysis of copy-number variants (CNVs) with breast or ovarian cancer risk in a cohort of 2500 BRCA1 pathogenic variant carriers, CNV discovery was performed using multiple calling algorithms and Illumina 610k SNP array data from a previously published genome-wide association study. Our analysis, which focused on functionally disruptive genomic deletions overlapping gene regions, identified a number of loci associated with risk of breast or ovarian cancer for BRCA1 pathogenic variant carriers. Although only putative deletions called by at least two algorithms were included, detection of selected CNVs by ancillary molecular technologies confirmed only 40% of predicted common (>1% allele frequency) variants. These include four loci that were associated (unadjusted P<0.05) with breast cancer (GTF2H2, ZNF385B, NAALADL2 and PSG5), and two loci associated with ovarian cancer (CYP2A7 and OR2A1). An interesting finding from this study was an association of a validated CNV deletion at the CYP2A7 locus (19q13.2) with decreased ovarian cancer risk (relative risk=0.50, P=0.007). Genomic analysis found that this deletion coincides with a region displaying strong regulatory potential in ovarian tissue, but not in breast epithelial cells. This study highlights the need to verify CNVs in vitro, but also provides evidence that experimentally validated CNVs (with plausible biological consequences) can modify risk of breast or ovarian cancer in BRCA1 pathogenic variant carriers.
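The abstract above restricts the analysis to deletions called by at least two algorithms but does not say how concordance between callers was decided. As a minimal sketch, assuming concordance means base-pair overlap within one sample, the regions supported by a minimum number of callers can be found with a sweep over interval boundaries. The function name `consensus_regions`, the caller names and the coordinates are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def consensus_regions(calls_by_algorithm, min_support=2):
    """Return regions covered by calls from at least `min_support` algorithms.

    calls_by_algorithm maps an algorithm name to a list of
    (chrom, start, end) tuples with half-open coordinates [start, end).
    """
    events = defaultdict(list)  # chrom -> list of (position, +1 or -1)
    for calls in calls_by_algorithm.values():
        per_chrom = defaultdict(list)
        for chrom, start, end in calls:
            per_chrom[chrom].append((start, end))
        # Merge overlapping calls from the same algorithm so that each
        # algorithm contributes at most 1 to the support at any position.
        for chrom, intervals in per_chrom.items():
            intervals.sort()
            cur_start, cur_end = intervals[0]
            for start, end in intervals[1:]:
                if start <= cur_end:
                    cur_end = max(cur_end, end)
                else:
                    events[chrom] += [(cur_start, 1), (cur_end, -1)]
                    cur_start, cur_end = start, end
            events[chrom] += [(cur_start, 1), (cur_end, -1)]

    consensus = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # At equal positions, process starts before ends so that
        # abutting supported segments are reported as one region.
        for pos, delta in sorted(events[chrom], key=lambda e: (e[0], -e[1])):
            previous = depth
            depth += delta
            if previous < min_support <= depth:
                region_start = pos
            elif depth < min_support <= previous:
                if pos > region_start:  # skip zero-length regions
                    consensus.append((chrom, region_start, pos))
    return consensus


# Hypothetical calls for a single sample from three callers.
example_calls = {
    "caller_a": [("chr19", 100, 200)],
    "caller_b": [("chr19", 150, 300)],
    "caller_c": [("chr19", 200, 250), ("chr5", 10, 20)],
}
print(consensus_regions(example_calls, min_support=2))
```

Real pipelines often use a reciprocal-overlap threshold instead of any base-pair overlap; that choice changes which calls count as concordant and is not specified in the abstract.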
Project description:Breast cancer is one of the most common cancers among women, and susceptibility is explained by genetic, lifestyle and environmental components. Copy Number Variants (CNVs) are structural DNA variations that contribute to diverse phenotypes via gene-dosage effects or cis-regulation. In this study, we aimed to identify germline CNVs associated with breast cancer susceptibility and their relevance to prognosis. We performed whole genome CNV genotyping in 422 cases and 348 controls using the Human Affymetrix SNP 6 array. Principal component analysis for population stratification revealed 84 outliers, leaving 366 cases and 320 controls of Caucasian ancestry for association analysis; CNVs with frequency > 10% and overlapping with protein coding genes were considered for breast cancer risk and prognostic relevance. Coding genes within the CNVs identified were interrogated for gene-dosage effects by correlating copy number status with gene expression profiles in breast tumor tissue. We identified 200 CNVs associated with breast cancer (q-value < 0.05). Of these, 21 CNV regions (overlapping with 22 genes) also showed association with prognosis. We validated representative CNVs overlapping with the APOBEC3B and GSTM1 genes using the TaqMan assay. Germline CNVs conferred dosage effects on gene expression in breast tissue. The candidate CNVs identified in this study warrant independent replication.
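The q-value threshold in the abstract above implies a multiple-testing correction across the tested CNVs, but the procedure is not named. As a minimal sketch, assuming the Benjamini-Hochberg step-up method, q-values can be derived from per-CNV p-values as shown here. The function name `bh_qvalues` and the example p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity:
    # q_(k) = min over j >= k of p_(j) * m / j, capped at 1.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Hypothetical p-values for four CNV regions.
example_p = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(example_p, bh_qvalues(example_p)):
    print(f"p={p:.3f} q={q:.3f} significant={q < 0.05}")
```

Other q-value estimators exist (for example Storey's method), so values computed this way need not match those behind the reported count of associated CNVs.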
Project description:Copy Number Variants (CNVs) are a class of structural variations of DNA. Germline CNVs are known to confer disease susceptibility, but their role in breast cancer warrants further investigation. We hypothesized that breast cancer associated germline CNVs contribute to disease risk through gene dosage or other post-transcriptional regulatory mechanisms, possibly through tissue specific expression of CNV-embedded small-noncoding RNAs (CNV-sncRNAs). Our objectives were to identify breast cancer associated CNVs using a genome wide association study (GWAS), identify sncRNA genes embedded within CNVs, confirm breast tissue (tumor and normal) expression of the sncRNAs, correlate their expression with germline copy status and identify pathways influenced by the genes regulated by sncRNAs. We used an association study design and accessed germline CNV data generated on the Affymetrix Human SNP 6.0 array in 686 (in-house data) and 495 (TCGA data) subjects, who served as the discovery and validation cohorts, respectively. We identified 1812 breast cancer associated CNVs harboring miRNA (n = 38), piRNA (n = 9865), snoRNA (n = 71) and tRNA (n = 12) genes. A subset of CNV-sncRNAs expressed in breast tissue also showed correlation with germline copy status. We identified targets potentially regulated by miRNAs and snoRNAs. In summary, we demonstrate the potential impact of embedded CNV-sncRNAs on expression and regulation of down-stream targets.
Project description:Structural variation is thought to play a major etiological role in the development of autism spectrum disorders (ASDs), and numerous studies documenting the relevance of copy number variants (CNVs) in ASD have been published since 2006. To determine if large ASD families harbor high-impact CNVs that may have broader impact in the general ASD population, we used the Affymetrix genome-wide human SNP array 6.0 to identify 153 putative autism-specific CNVs present in 55 individuals with ASD from 9 multiplex ASD pedigrees. To evaluate the actual prevalence of these CNVs, as well as 185 CNVs reportedly associated with ASD in published studies (many of which are insufficiently powered), we designed a custom Illumina array and used it to interrogate these CNVs in 3,000 ASD cases and 6,000 controls. Additional single nucleotide variants (SNVs) on the array identified 25 CNVs that we did not detect in our family studies at the standard SNP array resolution. After molecular validation, our results demonstrated that 15 CNVs identified in high-risk ASD families were also found in two or more ASD cases with odds ratios greater than 2.0, strengthening their support as ASD risk variants. In addition, of the 25 CNVs identified using SNV probes on our custom array, 9 also had odds ratios greater than 2.0, suggesting that these CNVs are also ASD risk variants. Eighteen of the validated CNVs have not been reported previously in individuals with ASD and three have only been observed once. Finally, we confirmed the association of 31 of 185 published ASD-associated CNVs in our dataset with odds ratios greater than 2.0, suggesting they may be of clinical relevance in the evaluation of children with ASDs. Taken together, these data provide strong support for the existence and application of high-impact CNVs in the clinical genetic evaluation of children with ASD.
Project description:Genome-wide association studies have found SNPs at 17q22 to be associated with breast cancer risk. To identify potential causal variants related to breast cancer risk, we performed a high resolution fine-mapping analysis that involved genotyping 517 SNPs using a custom Illumina iSelect array (iCOGS) followed by imputation of genotypes for 3,134 SNPs in more than 89,000 participants of European ancestry from the Breast Cancer Association Consortium (BCAC). We identified 28 highly correlated common variants, in a 53 kb region spanning two introns of the STXBP4 gene, that are strong candidates for driving breast cancer risk (lead SNP rs2787486 (OR = 0.92; CI 0.90-0.94; P = 8.96 × 10(-15))) and are correlated with two previously reported risk-associated variants at this locus, SNPs rs6504950 (OR = 0.94, P = 2.04 × 10(-09), r(2) = 0.73 with lead SNP) and rs1156287 (OR = 0.93, P = 3.41 × 10(-11), r(2) = 0.83 with lead SNP). Analyses indicate only one causal SNP in the region and several enhancer elements targeting STXBP4 are located within the 53 kb association signal. Expression studies in breast tumor tissues found SNP rs2787486 to be associated with increased STXBP4 expression, suggesting this may be a target gene of this locus.
Project description:Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing-based technologies. Numerous array-based platforms for CNV detection exist, utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications.
Project description:BACKGROUND: Inherited factors predisposing individuals to breast and ovarian cancer are largely unidentified in a majority of families with hereditary breast and ovarian cancer (HBOC). We aimed to identify germline copy number variations (CNVs) contributing to HBOC susceptibility in the Finnish population. METHODS: A cohort of 84 HBOC individuals (negative for BRCA1/2-founder mutations and pre-screened for the most common breast cancer genes) and 36 healthy controls were analysed with a genome-wide SNP array. CNV-affecting genes were further studied by Gene Ontology term enrichment, pathway analyses, and database searches to reveal genes with potential for breast and ovarian cancer predisposition. CNVs that were considered to be important were validated and genotyped in 20 additional HBOC individuals (6 CNVs) and in additional healthy controls (5 CNVs) by qPCR. RESULTS: An intronic deletion in the EPHA3 receptor tyrosine kinase was enriched in HBOC individuals (12 of 101, 11.9%) compared with controls (27 of 432, 6.3%) (OR = 1.96; P = 0.055). EPHA3 was identified in several enriched molecular functions including receptor activity. Both a novel intronic deletion in the CSMD1 tumor suppressor gene and a homozygous intergenic deletion at 5q15 were identified in 1 of 101 (1.0%) HBOC individuals but were very rare (1 of 436, 0.2% and 1 of 899, 0.1%, respectively) in healthy controls, suggesting that these variants confer disease susceptibility. CONCLUSION: This study reveals new information regarding the germline CNVs that likely contribute to HBOC susceptibility in Finland. This information may be used to facilitate the genetic counselling of HBOC individuals, but the preliminary results warrant additional studies of a larger study group.
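The carrier counts reported for the EPHA3 deletion above are enough to work through a crude odds ratio by hand. As a minimal sketch, assuming an unadjusted 2x2 table and a Woolf (log-based) 95% confidence interval, the calculation looks like this. The function name `odds_ratio_ci` is hypothetical, and the abstract does not state which estimator or test produced the published values.

```python
import math


def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total,
                  z=1.959964):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    Returns (odds_ratio, lower, upper). All four cells of the 2x2 table
    must be non-zero for this simple estimator.
    """
    a = case_carriers                      # carriers among cases
    b = case_total - case_carriers         # non-carriers among cases
    c = control_carriers                   # carriers among controls
    d = control_total - control_carriers   # non-carriers among controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells of the 2x2 table must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts quoted in the abstract: 12 of 101 HBOC individuals, 27 of 432 controls.
estimate, lower, upper = odds_ratio_ci(12, 101, 27, 432)
print(f"OR={estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these counts the crude estimate comes out near 2.02, slightly above the reported 1.96, so the published figure was presumably obtained with a different estimator or adjustment.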
Project description:BACKGROUND: Genome-wide association studies (GWASs) have identified multiple genetic susceptibility loci for breast cancer. However, these loci explain only a small fraction of the heritability. Very few studies have evaluated copy number variation (CNV), another important source of human genetic variation, in relation to breast cancer risk. METHODS: We conducted a CNV GWAS in 2623 breast cancer patients and 1946 control subjects using data from Affymetrix SNP Array 6.0 (stage 1). We then replicated the most promising CNV using real-time quantitative polymerase chain reaction (qPCR) in an independent set of 4254 case patients and 4387 control subjects (stage 2). All subjects were recruited from population-based studies conducted among Chinese women in Shanghai. RESULTS: Of the 268 common CNVs (minor allele frequency ≥ 5%) investigated in stage 1, the strongest association was found for a common deletion in the APOBEC3 genes (P = 1.1×10(-4)) and was replicated in stage 2 (odds ratio = 1.35, 95% confidence interval [CI] = 1.27 to 1.44; P = 9.6×10(-22)). Analyses of all samples from both stages using qPCR data produced odds ratios of 1.31 (95% CI = 1.21 to 1.42) for a one-copy deletion and 1.76 (95% CI = 1.57 to 1.97) for a two-copy deletion (P = 2.0×10(-24)). CONCLUSIONS: We provide convincing evidence for a novel breast cancer locus at the APOBEC3 genes. This CNV is one of the strongest common genetic risk variants identified so far for breast cancer.
Project description:Copy-number variants (CNVs) are a major source of genetic variation in human health and disease. Previous studies have implicated replication stress as a causative factor in CNV formation. However, existing data are technically limited in the quality of comparisons that can be made between human CNVs and experimentally induced variants. Here, we used two high-resolution strategies, single nucleotide polymorphism (SNP) arrays and mate-pair sequencing, to compare CNVs that occur constitutionally to those that arise following aphidicolin-induced DNA replication stress in the same human cells. Although the optimized methods provided complementary information, sequencing was more sensitive to small variants and provided superior structural descriptions. The majority of constitutional and all aphidicolin-induced CNVs appear to be formed via homology-independent mechanisms, while aphidicolin-induced CNVs were of a larger median size than constitutional events even when mate-pair data were considered. Aphidicolin thus appears to stimulate formation of CNVs that closely resemble human pathogenic CNVs and the subset of larger nonhomologous constitutional CNVs.
Project description:In clinical diagnostics, both array comparative genomic hybridization (array CGH) and single nucleotide polymorphism (SNP) genotyping have proven to be powerful genomic technologies utilized for the evaluation of developmental delay, multiple congenital anomalies, and neuropsychiatric disorders. Differences in the ability to resolve genomic changes between these arrays may constitute an implementation challenge for clinicians: which platform (SNP vs array CGH) might best detect the underlying genetic cause for the disease in the patient? While only SNP arrays enable the detection of copy number neutral regions of absence of heterozygosity (AOH), they have limited ability to detect single-exon copy number variants (CNVs) due to the distribution of SNPs across the genome. To provide comprehensive clinical testing for both CNVs and copy-neutral AOH, we enhanced our custom-designed high-resolution oligonucleotide array that has exon-targeted coverage of 1860 genes with 60,000 SNP probes, referred to as Chromosomal Microarray Analysis - Comprehensive (CMA-COMP). Of the 3240 cases evaluated by this array, clinically significant CNVs were detected in 445 cases including 21 cases with exonic events. In addition, 162 cases (5.0%) showed at least one AOH region >10 Mb. We demonstrate that even though this array has a lower density of SNP probes than other commercially available SNP arrays, it reliably detected AOH events >10 Mb as well as exonic CNVs beyond the detection limitations of SNP genotyping. Thus, combining SNP probes and exon-targeted array CGH into one platform provides clinically useful genetic screening in an efficient manner.