Interrogating Mutant Allele Expression via Customized Reference Genomes to Define Influential Cancer Mutations.
ABSTRACT: Genetic alterations are essential for cancer initiation and progression. However, differentiating mutations that drive the tumor phenotype from mutations that do not affect tumor fitness remains a fundamental challenge in cancer biology. To better understand the impact of a given mutation within cancer, RNA-sequencing data was used to categorize mutations based on their allelic expression. For this purpose, we developed the MAXX (Mutation Allelic Expression Extractor) software, which is highly effective at delineating the allelic expression of both single nucleotide variants and small insertions and deletions. Results from MAXX demonstrated that mutations can be separated into three groups based on their expression of the mutant allele, lack of expression from both alleles, or expression of only the wild-type allele. By taking into consideration the allelic expression patterns of genes that are mutated in PDAC, it was possible to increase the sensitivity of widely used driver mutation detection methods, as well as identify subtypes that have prognostic significance and are associated with sensitivity to select classes of therapeutic agents in cell culture. Thus, differentiating mutations based on their mutant allele expression via MAXX represents a means to parse somatic variants in tumor genomes, helping to elucidate a gene's respective role in cancer.
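The three-way grouping described in the abstract can be sketched from per-mutation read counts. This is a minimal illustration only: the thresholds, the `AlleleCounts` type, and the `classify_mutation` function are hypothetical and are not taken from MAXX, whose actual criteria are not given here.

```python
from dataclasses import dataclass

# Hypothetical cut-offs chosen for illustration; not MAXX's actual parameters.
MIN_TOTAL_READS = 10         # below this, treat both alleles as not expressed
MIN_MUTANT_FRACTION = 0.05   # below this, treat the mutant allele as not expressed


@dataclass
class AlleleCounts:
    """RNA-seq read counts supporting each allele at a mutated position."""
    wild_type_reads: int
    mutant_reads: int


def classify_mutation(counts: AlleleCounts) -> str:
    """Assign a mutation to one of the three expression groups."""
    total = counts.wild_type_reads + counts.mutant_reads
    if total < MIN_TOTAL_READS:
        return "not_expressed"        # lack of expression from both alleles
    mutant_fraction = counts.mutant_reads / total
    if mutant_fraction >= MIN_MUTANT_FRACTION:
        return "mutant_expressed"     # mutant allele is expressed
    return "wild_type_only"           # only the wild-type allele is expressed


if __name__ == "__main__":
    for counts in (AlleleCounts(40, 35), AlleleCounts(2, 1), AlleleCounts(60, 0)):
        print(counts, classify_mutation(counts))
```

A real pipeline would also need to handle sequencing error and alignment bias against the mutant allele, which is the problem the customized reference genomes in the title are meant to address.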
Project description:INTRODUCTION: Cis-acting regulatory single nucleotide polymorphisms (SNPs) at specific loci may modulate penetrance of germline mutations at the same loci by introducing different levels of expression of the wild-type allele. We have previously reported that BRCA2 shows differential allelic expression and we hypothesize that the known variable penetrance of BRCA2 mutations might be associated with this mechanism. METHODS: We combined haplotype analysis and differential allelic expression of BRCA2 in breast tissue to identify expression haplotypes and candidate cis-regulatory variants. These candidate variants underwent selection based on in silico predictions for regulatory potential and disruption of transcription factor binding, and were functionally analyzed in vitro and in vivo in normal and breast cancer cell lines. SNPs tagging the expression haplotypes were correlated with the total expression of several genes in breast tissue measured by Taqman and microarray technologies. The effect of the expression haplotypes on breast cancer risk in BRCA2 mutation carriers was investigated in 2,754 carriers. RESULTS: We identified common haplotypes associated with differences in the levels of BRCA2 expression in human breast cells. We characterized three cis-regulatory SNPs located at the promoter and two intronic regulatory elements which affect the binding of the transcription factors C/EBPα, HMGA1, D-binding protein (DBP) and ZF5. We showed that the expression haplotypes also correlated with changes in the expression of other genes in normal breast. Furthermore, there was suggestive evidence that the minor allele of SNP rs4942440, which is associated with higher BRCA2 expression, is also associated with a reduced risk of breast cancer (per-allele hazard ratio (HR) = 0.85, 95% confidence interval (CI) = 0.72 to 1.00, P-trend = 0.048).
CONCLUSIONS: Our work provides further insights into the role of cis-regulatory variation in the penetrance of disease-causing mutations. We identified small-effect genetic variants associated with allelic expression differences in BRCA2 which could possibly affect the risk in mutation carriers through altering expression levels of the wild-type allele.
Project description:Due to growing throughput and shrinking cost, massively parallel sequencing is rapidly becoming an attractive alternative to microarrays for the genome-wide study of gene expression and copy number alterations in primary tumors. The sequencing of transcripts (RNA-Seq) should offer several advantages over microarray-based methods, including the ability to detect somatic mutations and accurately measure allele-specific expression. To investigate these advantages we have applied a novel, strand-specific RNA-Seq method to tumors and matched normal tissue from three patients with oral squamous cell carcinomas. Additionally, to better understand the genomic determinants of the gene expression changes observed, we have sequenced the tumor and normal genomes of one of these patients. We demonstrate here that our RNA-Seq method accurately measures allelic imbalance and that measurement on the genome-wide scale yields novel insights into cancer etiology. As expected, the set of genes differentially expressed in the tumors is enriched for cell adhesion and differentiation functions, but, unexpectedly, the set of allelically imbalanced genes is also enriched for these same cancer-related functions. By comparing the transcriptomic perturbations observed in one patient to his underlying normal and tumor genomes, we find that allelic imbalance in the tumor is associated with copy number mutations and that copy number mutations are, in turn, strongly associated with changes in transcript abundance. These results support a model in which allele-specific deletions and duplications drive allele-specific changes in gene expression in the developing tumor.
Project description:p53 is one of the most extensively studied proteins in cancer research. Mutations in p53 generally abolish normal p53 function, and some mutants can gain new oncogenic functions. However, the mechanisms underlying p53 mutation-driven cancer remain to be elucidated. Our study investigated the function of a heterozygous p53 mutation (p.Asn268Glufs*4) in a Li-Fraumeni syndrome (LFS) patient. We used episomal technology to perform somatic reprogramming, and used molecular and cell biology methods to determine the p53 mutation levels in patient-originated induced pluripotent stem (iPS) cells at the RNA and protein levels. We found that p53 protein expression was not increased in this patient's somatic cells compared with those of a healthy control. The p53 mutation facilitates the proliferation of tumor cells by inhibiting apoptosis and promoting cell division. It can reduce the efficiency of somatic reprogramming by inhibiting OCT4 expression during the reprogramming stage. Moreover, not all p53 mutant iPS cell lines have mutant p53 RNA sequences. A small percentage of mutant p53 mRNA is present in the somatic cells from the patient and his mother. In summary, this p53 mutation can promote tumor cell proliferation, inhibit somatic reprogramming, and exhibit random p53 allelic expression of heterozygous mutations in the patient and iPS cells, which may be one reason why people with p53 mutations develop cancer at random. This finding suggests that mutant p53 allelic expression should be considered in cancer risk prediction.
Project description:Loss of chromosome 18q21 is well documented in colorectal cancer, and it has been suggested that this loss targets the DCC, DPC4/SMAD4, and SMAD2 genes. Recently, the importance of SMAD4, a downstream regulator in the TGF-beta signaling pathway, in colorectal cancer has been highlighted, although the frequency of SMAD4 mutations appears much lower than that of 18q21 loss. We set out to investigate allele loss, mutations, protein expression, and cytogenetics of chromosome 18 copy number in a collection of 44 colorectal cancer cell lines of known status with respect to microsatellite instability (MSI). Fourteen of thirty-two MSI(-) lines showed loss of SMAD4 protein expression; usually, one allele was lost and the other was mutated in one of a number of ways, including deletions of various sizes, splice site changes, and missense and nonsense point mutations (although no frameshifts). Of the 18 MSI(-) cancers with retained SMAD4 expression, four harbored missense mutations in the 3' part of the gene and showed allele loss. The remaining 14 MSI(-) lines had no detectable SMAD4 mutation, but all showed allele loss at SMAD4 and/or DCC. SMAD4 mutations can therefore account for about 50-60% of the 18q21 allele loss in colorectal cancer. No MSI(+) cancer showed loss of SMAD4 protein or SMAD4 mutation, and very few had allelic loss at SMAD4 or DCC, although many of these MSI(+) lines did carry TGFBIIR changes. Although SMAD4 mutations have been associated with late-stage or metastatic disease, our combined molecular and cytogenetic data best fit a model in which SMAD4 mutations occur before colorectal cancers become aneuploid/polyploid, but after the MSI(+) and MSI(-) pathways diverge. Thus, MSI(+) cancers may diverge first, followed by CIN(+) (chromosomal instability) cancers, leaving other cancers to follow a CIN(-)MSI(-) pathway.
Project description:Mosaic Variegated Aneuploidy (MVA) syndrome is a rare autosomal recessive disorder characterized by inaccurate chromosome segregation and high rates of near-diploid aneuploidy. Children with MVA syndrome die at an early age, are cancer prone, and have progeroid features like facial dysmorphisms, short stature, and cataracts. The majority of MVA cases are linked to mutations in BUBR1, a mitotic checkpoint gene required for proper chromosome segregation. Affected patients either have bi-allelic BUBR1 mutations, with one allele harboring a missense mutation and the other a nonsense mutation, or mono-allelic BUBR1 mutations combined with allelic variants that yield low amounts of wild-type BubR1 protein. Parents of MVA patients that carry single allele mutations have mild mitotic defects, but whether they are at risk for any of the pathologies associated with MVA syndrome is unknown. To address this, we engineered a mouse model for the nonsense mutation 2211insGTTA (referred to as GTTA) found in MVA patients with bi-allelic BUBR1 mutations. Here we report that both the median and maximum lifespans of the resulting BubR1(+/GTTA) mice are significantly reduced. Furthermore, BubR1(+/GTTA) mice develop several aging-related phenotypes at an accelerated rate, including cataract formation, lordokyphosis, skeletal muscle wasting, impaired exercise ability, and fat loss. BubR1(+/GTTA) mice develop mild aneuploidies and show enhanced growth of carcinogen-induced tumors. Collectively, these data demonstrate that the BUBR1 GTTA mutation compromises longevity and healthspan, raising the interesting possibility that mono-allelic changes in BUBR1 might contribute to differences in aging rates in the general population.
Project description:Background: The Cub and Sushi Multiple Domains 1 (CSMD1) gene, located on the short arm of chromosome 8, codes for a type I transmembrane protein whose function is currently unknown. CSMD1 expression is frequently lost in many epithelial cancers. Our goal was to characterize the relationships between CSMD1 somatic mutations, allele imbalance, DNA methylation, and the clinical characteristics in colorectal cancer patients. Methods: We sequenced the CSMD1 coding regions in 54 colorectal tumors using the 454FLX pyrosequencing platform to interrogate 72 amplicons covering the entire coding sequence. We used heterozygous SNP allele ratios at multiple CSMD1 loci to determine allelic balance and infer loss of heterozygosity. Finally, we performed methylation-specific PCR on 76 colorectal tumors to determine DNA methylation status for CSMD1 and known methylation targets ALX4, RUNX3, NEUROG1, and CDKN2A. Results: Using 454FLX sequencing and confirming with Sanger sequencing, 16 CSMD1 somatic mutations were identified in 6 of the 54 colorectal tumors (11%). The nonsynonymous to synonymous mutation ratio of the 16 somatic mutations was 15:1, a ratio significantly higher than the expected 2:1 ratio (p = 0.014). This ratio indicates a presence of positive selection for mutations in the CSMD1 protein sequence. CSMD1 allelic imbalance was present in 19 of 37 informative cases (56%). Patients with allelic imbalance and CSMD1 mutations were significantly younger (average age, 41 years) than those without somatic mutations (average age, 68 years). The majority of tumors were methylated at one or more CpG loci within the CSMD1 coding sequence, and CSMD1 methylation significantly correlated with two known methylation targets ALX4 and RUNX3.
C:G>T:A substitutions were significantly overrepresented (47%), suggesting extensive cytosine methylation predisposing to somatic mutations. Conclusions: Deep amplicon sequencing and methylation-specific PCR reveal that CSMD1 alterations can correlate with earlier clinical presentation in colorectal tumors, thus further implicating CSMD1 as a tumor suppressor gene.
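The comparison of the observed 15:1 nonsynonymous-to-synonymous ratio against the expected 2:1 ratio can be illustrated with a one-sided exact binomial test. The abstract does not say which test the authors used, so the choice of test here is an assumption; the sketch only shows that treating each of the 16 mutations as nonsynonymous with probability 2/3 under the null gives a value of the same size as the one reported.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p_success: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p_success)."""
    return sum(
        comb(trials, k) * p_success**k * (1 - p_success) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Expected 2:1 nonsynonymous:synonymous ratio -> null probability of 2/3.
expected_nonsyn_fraction = 2 / 3

# Observed: 15 nonsynonymous out of 16 somatic mutations.
p_value = binomial_upper_tail(successes=15, trials=16, p_success=expected_nonsyn_fraction)

print(f"one-sided p = {p_value:.3f}")  # one-sided p = 0.014
```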
Project description:The hypervariable human minisatellite locus D7S22 (g3) is highly polymorphic. The allelic distribution in D7S22 features a size clustering of the alleles and a comparably low allelic diversity among small alleles. This reduced diversity could reflect a situation where some alleles are less likely to mutate than others. Several factors could explain such an effect, including allele size, variation in repeat composition, and allelic differences in nearby cis-acting elements affecting the mutation rate. We have characterized 40 de novo mutations found on Southern blots in a large amount of paternity-testing material. There is a significant excess of paternal mutations, and small size changes are most frequent. Mutation rate is affected by allele length, with highest rates in larger alleles. Alleles of the family groups with D7S22 mutations and 50 small alleles were analyzed by nucleotide sequencing. Two hundred thirty-six base pairs of the immediate flanking region upstream of the repeat array were PCR amplified and screened for point mutations by DNA sequencing of the PCR products. Two base substitution polymorphisms were identified: one C/G transversion and one A/G transition, 54 bp and 173 bp upstream of the repeat array, respectively. There is a significant association between mutation and occurrence of 54C, while association is not obvious between mutation rate and the 173A/G variants. There is a marked association between different flanking haplotypes and allele size, and within the smallest allele-size group, all alleles had the 54G/173A haplotype. Both allele size and allelic state at site 54 remain associated with mutation rate when the other factor is controlled. Possible mechanisms behind the variation in mutation rate in D7S22 are discussed.
Project description:Genetic heterogeneity contributes to clinical outcome and progression of most tumors, but little is known about allelic diversity for epigenetic compartments, and almost no data exist for acute myeloid leukemia (AML). We examined epigenetic heterogeneity as assessed by cytosine methylation within defined genomic loci with four CpGs (epialleles), somatic mutations, and transcriptomes of AML patient samples at serial time points. We observed that epigenetic allele burden is linked to inferior outcome and varies considerably during disease progression. Epigenetic and genetic allelic burden and patterning followed different patterns and kinetics during disease progression. We observed a subset of AMLs with high epiallele and low somatic mutation burden at diagnosis, a subset with high somatic mutation and lower epiallele burdens at diagnosis, and a subset with a mixed profile, suggesting distinct modes of tumor heterogeneity. Genes linked to promoter-associated epiallele shifts during tumor progression showed increased single-cell transcriptional variance and differential expression, suggesting functional impact on gene regulation. Thus, genetic and epigenetic heterogeneity can occur with distinct kinetics likely to affect the biological and clinical features of tumors.
Project description:Somatic mutations in the EGFR tyrosine kinase domain play a critical role in the development and treatment of non-small cell lung cancer (NSCLC). Strong genetic influence on susceptibility to these mutations has been suggested. To identify the genetic factors conferring risk for the EGFR tyrosine kinase mutations in NSCLC, a case-control study was conducted in 141 Taiwanese NSCLC patients by focusing on three functional polymorphisms in the EGFR gene [-216G/T, intron 1 (CA)n, and R497K]. Allelic imbalance of the EGFR -216G/T polymorphism was also tested in the heterozygous patients and in the NCI-60 cancer cell lines to further verify its function. We found that the frequencies of the alleles -216T and CA-19 are significantly higher in the patients with any mutation (P = 0.032 and 0.01, respectively), in particular in those with exon 19 microdeletions (P = 0.006 and 0.033, respectively), but not in the patients with the L858R mutation. The -216T allele is preferentially amplified in both tumor DNA of lung cancer patients and cancer cell lines. We conclude that the local haplotype structures across the EGFR gene may favor the development of cellular malignancies and thus significantly confer risk for the occurrence of EGFR mutations in NSCLC, particularly the exon 19 microdeletions.
Project description:Introduction: Cis-acting regulatory single nucleotide polymorphisms (SNPs) at specific loci may modulate penetrance of germline mutations at the same loci by introducing different levels of expression of the wild-type allele. We have previously reported that BRCA2 shows differential allelic expression and we hypothesize that the known variable penetrance of BRCA2 mutations might be associated with this mechanism. Methods: We combined haplotype analysis and differential allelic expression of BRCA2 in breast tissue to identify expression haplotypes and candidate cis-regulatory variants. These candidate variants underwent selection based on in-silico predictions for regulatory potential and disruption of transcription factor binding, and were functionally analysed in-vitro and in-vivo in normal and breast cancer cell lines. SNPs tagging the expression haplotypes were correlated with the total expression of several genes in breast tissue measured by Taqman and microarray technologies. The effect of the expression haplotypes on breast cancer risk in BRCA2 mutation carriers was investigated in 2754 carriers. Results: We identified common haplotypes associated with differences in the levels of BRCA2 expression in human breast cells. We characterised three cis-regulatory SNPs located at the promoter and two intronic regulatory elements, which affect the binding of the transcription factors C/EBPα, HMGA1, DBP and ZF5. We showed that the expression haplotypes also correlated with changes in the expression of other genes in normal breast. Furthermore, there was suggestive evidence that the minor allele of SNP rs4942440, which is associated with higher BRCA2 expression, is also associated with a reduced risk of breast cancer (per-allele HR=0.85, 95%CI=0.72-1.00, P-trend=0.048). Conclusion: Our work provides further insights into the role of cis-regulatory variation in the penetrance of disease-causing mutations. 
We identified small-effect genetic variants associated with allelic expression differences in BRCA2, which could possibly affect the risk in mutation carriers through altering expression levels of the wild-type allele. Total gene expression of normal breast samples from healthy controls. This submission represents the transcriptome component of the study.