Project description:We have combined high-quality genome sequencing and RNA-sequencing data within a 17-individual, three-generation family. Using these data, we have contrasted cis-acting expression, allele-specific expression and splicing quantitative trait loci (collectively termed eQTLs) within the family with eQTLs discovered within a cell-type and ethnicity-matched population sample. We found that eQTLs that exhibit larger effects in the family than in the population are enriched for rare regulatory and splicing variants and are more likely to influence essential genes. In addition, we identify several large effect-size eQTLs within the family for genes involved in complex disease. Through analysis of eQTLs in a large family we also report the utility of non-coding genome annotation for predicting the effect of rare non-coding variants. We find that a combination of distance to the transcription start site, evolutionary constraint and epigenetic annotation is considerably more informative for predicting the consequence of rare non-coding variants than for common variants. In summary, through transcriptome analyses within a large family we are able to identify the contribution of rare non-coding variants to expression phenotypes and further demonstrate the predictive potential of diverse non-coding genome annotation for interpretation of the impact of rare non-coding variants.
RNA-Sequencing of CEPH/UTAH family 1463
2014-09-01 | GSE56961 | GEO
Project description:Whole Exome Sequencing identifies multiple rare variants in a large south Indian family with Primary Open Angle Glaucoma
Project description:Understanding the function of rare non-coding genetic variants represents a significant challenge. Here, we developed MapUTR, a screen to identify rare 3’ UTR variants affecting mRNA abundance post-transcriptionally. Among 17,301 rare variants, an average of 24.5% were functional, with 70% in cancer-related genes, many in critical cancer pathways. This observation motivated a further interrogation of 11,929 cancer somatic mutations, uncovering 3,928 (33%) functional mutations in well-established cancer driver genes, such as CDKN2A. Functional MapUTR variants were enriched in miRNA targets and protein-RNA interaction sites. Based on MapUTR, we define a new metric, untranslated tumor mutation burden (uTMB), reflecting the number of somatic functional MapUTR variants in a tumor. We showed the potential of uTMB in predicting patient survival. Through prime editing, we characterized three variants in cancer-relevant genes (MFN2, FOSL2, and IRAK1), illustrating their cancer-driving potential. Our study elucidates the function of thousands of non-coding variants, nominates non-coding cancer driver mutations, and demonstrates their potential contributions to cancer.
Project description:The non-coding genome is substantially larger than the protein-coding genome, but has been largely unexplored by genetic association studies. Here, we performed region-based rare-variant association analysis of >25,000 variants in untranslated regions of 6,139 amyotrophic lateral sclerosis (ALS) whole-genomes and those of 70,403 non-ALS controls. We identified Interleukin-18 Receptor Accessory Protein (IL18RAP) 3′UTR variants as significantly enriched in non-ALS genomes and associated with five-fold reduced risk of developing ALS, and this was replicated in an independent cohort. These variants in the IL18RAP 3′UTR reduce mRNA stability and the binding of double-stranded RNA-binding proteins. Finally, the variants of IL18RAP 3′UTR confer a survival advantage for motor neurons because they dampen neurotoxicity of human iPSC-derived microglia bearing an ALS-associated expansion in C9orf72, and this depends on NF-κB signaling. This study reveals genetic variants that protect against ALS by reducing neuroinflammation, and emphasizes the importance of non-coding genetic association studies.
Project description:About 45% of congenital heart disease (CHD) is caused by rare gene mutations. Non-coding mutations that perturb cis-regulatory elements (CREs) likely contribute to CHD among the remaining cases without clear etiology. However, identifying CHD-causing non-coding variants has been problematic. We combined human induced pluripotent stem cell-derived cardiomyocyte (iPSC-CM) differentiation and a lentivirus-mediated massively parallel reporter assay (lentiMPRA) to create a high-throughput platform to measure human cardiac enhancer activity. We tested 2451 candidate human cardiac enhancers, identified 1185 with measurable activity, and functionally dissected 123 of these by systematic tiling mutagenesis. We functionally evaluated 6761 non-coding de novo variants (ncDNVs) prioritized from the whole genome sequencing (WGS) of 749 CHD trios. 397 ncDNVs significantly affected cardiac CRE activity. Remarkably, 53% of these ncDNVs increased enhancer activity, often at regions with undetectable enhancer activity in the reference sequence. We introduced 10 of these DNVs associated with CHD genes into iPSCs and found that 4 altered expression of neighboring genes. Moreover, these 4 DNVs also altered cardiomyocyte differentiation, as assessed by single nucleus RNA sequencing. Using the MPRA data, we developed a regression model to prioritize future DNVs for functional testing and demonstrate that this model finds enrichment of DNVs in a second, independent WGS cohort. Taken together, we developed a scalable system to measure the impact of non-coding DNVs on CRE activity and deployed this platform to systematically assess the contribution of non-coding DNVs to CHD.
Project description:A number of genetic studies have identified rare protein-coding DNA variations associated with autism spectrum disorder (ASD), a neurodevelopmental disorder with significant genetic etiology and heterogeneity. In contrast, the contributions of functional, regulatory genetic variations that occur in the extensive non-protein-coding regions of the genome remain poorly understood. Here we developed a genome-wide analysis to identify rare single nucleotide variants (SNVs) that occur in non-coding regions and determined regulatory function and evolutionary conservation of these variants. Using publicly available datasets and computational predictions, we identified SNVs within putative regulatory regions in promoters, transcription factor binding sites, microRNA genes and their target sites. Overall, we found regulatory variants in the ASD cases were enriched in autism-risk genes and genes involved in fetal neurodevelopment. As with previously reported coding mutations, we found an enrichment of regulatory variants associated with dysregulation of neurodevelopmental and synaptic signaling pathways. Among these were rare inherited non-coding SNVs found in the mature sequence of a number of microRNAs predicted to affect the regulation of autism-risk genes. We show a paternally inherited miR-873-5p variant, with reduced NRXN2 binding affinity, overlays a maternally inherited NRXN1 putative loss-of-function coding variation to likely increase genetic liability in an idiopathic ASD case. Our analysis pipeline provides a new resource for identifying loss-of-function regulatory DNA variations that may contribute to the genetic etiology of complex disorders.