Project description: Introduction: Lung cancer is the leading cause of cancer deaths in the world, and lung adenocarcinoma (LUAD) is its most prevalent subtype. Symptoms are often found in advanced disease in which treatment options are limited. Identifying genetic risk factors will enable better identification of high-risk individuals. Methods: To identify LUAD risk genes, we performed a case-control association study for gene-level burden of rare, deleterious variants (RDVs) in germline whole-exome sequencing data of 1083 patients with LUAD and 7650 controls, split into discovery and validation cohorts. Of these, we performed whole-exome sequencing on 97 patients and acquired the rest from multiple public databases. We annotated all rare variants for pathogenicity conservatively, using the guidelines of the American College of Medical Genetics and Genomics and ClinVar curation, and investigated gene-level RDV burden using penalized logistic regression. All statistical tests were two-sided. Results: We discovered and replicated the finding that the burden of germline ATM RDVs was significantly higher in patients with LUAD versus controls (combined cohort OR = 4.6; p = 1.7e-04; 95% confidence interval = 2.2-9.5; 1.21% of cases; 0.24% of controls). Germline ATM RDVs were also enriched in an independent clinical cohort of 1594 patients from the MSK-IMPACT study (0.63%). In addition, we observed that an Ashkenazi Jewish (AJ) founder ATM variant, rs56009889, was statistically significantly more frequent in AJ cases versus AJ controls in our cohort (combined AJ cohort OR = 2.7, p = 6.9e-03, 95% confidence interval = 1.3-5.3). Conclusions: Our results indicate that ATM is a moderate-penetrance LUAD risk gene and that LUAD may be a part of the ATM-related cancer syndrome spectrum.
Individuals with ATM RDVs are at an elevated LUAD risk and can benefit from increased surveillance (particularly computed tomography scanning), early detection, and chemoprevention programs, improving prognosis.
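As a rough sanity check on the effect size reported above, the crude (unadjusted) odds ratio can be computed directly from the two carrier frequencies quoted in the abstract (1.21% of cases, 0.24% of controls). This is only a sketch under stated assumptions: the study's OR of 4.6 came from penalized logistic regression, so the crude figure is not expected to match it exactly, and the function name `crude_odds_ratio` is illustrative rather than anything from the study.

```python
def crude_odds_ratio(case_freq: float, control_freq: float) -> float:
    """Unadjusted odds ratio from carrier frequencies given as fractions in (0, 1)."""
    if not (0 < case_freq < 1 and 0 < control_freq < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    case_odds = case_freq / (1 - case_freq)
    control_odds = control_freq / (1 - control_freq)
    return case_odds / control_odds


# Carrier frequencies quoted in the abstract for the combined cohort.
or_crude = crude_odds_ratio(0.0121, 0.0024)
print(f"crude OR = {or_crude:.2f}")
```

The crude value comes out near 5.1, inside the reported 95% confidence interval of 2.2-9.5, which is consistent with the regression-based estimate being slightly shrunk relative to the raw frequencies.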
Project description:Pathogenic variants underlying Mendelian diseases often disrupt the normal physiology of a few tissues and organs. However, variant effect prediction tools that aim to identify pathogenic variants are typically oblivious to tissue contexts. Here we report a machine-learning framework, denoted 'Tissue Risk Assessment of Causality by Expression for variants' (TRACEvar, https://netbio.bgu.ac.il/TRACEvar/), that offers two advancements. First, TRACEvar predicts pathogenic variants that disrupt the normal physiology of specific tissues. This was achieved by creating 14 tissue-specific models that were trained on over 14,000 variants and combined 84 attributes of genetic variants with 495 attributes derived from tissue omics. TRACEvar outperformed 10 well-established and tissue-oblivious variant effect prediction tools. Second, the resulting models are interpretable, thereby illuminating variants' mode-of-action. Application of TRACEvar to variants of 52 rare-disease patients highlighted pathogenicity mechanisms and relevant disease processes. Lastly, interpretation of large-scale models revealed that top-ranking determinants of pathogenicity included attributes of disease-affected tissues, particularly cellular process activities. Hence, tissue contexts and interpretable machine-learning models can greatly enhance the etiology of rare diseases.
Project description:The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD), we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases.
Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
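The idea behind a 'proportion expressed across transcripts' metric can be sketched in a few lines: for one tissue, sum the expression of the transcripts that carry the variant and divide by the gene's total expression across all of its transcripts. The following is a simplified illustration, not the authors' implementation; the function name, the transcript identifiers, and the TPM values are all hypothetical.

```python
from typing import Dict, Set


def proportion_expressed(
    transcript_tpm: Dict[str, float],
    transcripts_with_variant: Set[str],
) -> float:
    """Fraction of a gene's total expression (one tissue) that is carried
    by the transcripts containing the variant."""
    total = sum(transcript_tpm.values())
    if total == 0:
        return float("nan")  # gene not expressed in this tissue
    affected = sum(
        tpm for tx, tpm in transcript_tpm.items()
        if tx in transcripts_with_variant
    )
    return affected / total


# Hypothetical gene with three isoforms; the TPM values are made up.
tpm = {"tx_A": 40.0, "tx_B": 9.0, "tx_C": 1.0}

# Variant that falls only in an exon of the weakly expressed isoform.
low = proportion_expressed(tpm, {"tx_C"})
# Variant that falls in exons shared by the two dominant isoforms.
high = proportion_expressed(tpm, {"tx_A", "tx_B"})
print(low, high)
```

A pLoF variant with a low value, like the first one here, sits in a region that contributes little to the gene's expression in that tissue, which is the kind of variant the abstract describes as behaving like a synonymous change.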
Project description:Autism is a highly heritable complex disorder in which de novo mutation (DNM) variation contributes significantly to risk. Using whole-genome sequencing data from 3,474 families, we investigate another source of large-effect risk variation, ultra-rare variants. We report and replicate a transmission disequilibrium of private, likely gene-disruptive (LGD) variants in probands but find that 95% of this burden resides outside of known DNM-enriched genes. This variant class more strongly affects multiplex family probands and supports a multi-hit model for autism. Candidate genes with private LGD variants preferentially transmitted to probands converge on the E3 ubiquitin-protein ligase complex, intracellular transport and Erb signaling protein networks. We estimate that these variants are approximately 2.5 generations old and significantly younger than other variants of similar type and frequency in siblings. Overall, private LGD variants are under strong purifying selection and appear to act on a distinct set of genes not yet associated with autism.
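A transmission disequilibrium of the kind described above can be illustrated with an exact binomial test: under the null hypothesis, a variant carried by a heterozygous parent is passed to the proband half of the time. The abstract does not say which test the authors used, so this is a generic sketch; the function name and the counts in the usage lines are hypothetical.

```python
from math import comb


def transmission_test(transmitted: int, untransmitted: int) -> float:
    """Two-sided exact binomial p-value for transmitted vs. untransmitted
    counts under the null hypothesis of 50% transmission."""
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("need at least one informative transmission")
    k = max(transmitted, untransmitted)
    # Upper tail P(X >= k) for X ~ Binomial(n, 0.5).
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2 * upper_tail)


# Hypothetical counts: 60 transmissions to probands, 40 non-transmissions.
p_value = transmission_test(60, 40)
print(f"p = {p_value:.4f}")
```

Doubling the upper tail is valid here only because the Binomial(n, 0.5) distribution is symmetric; a test against any other null proportion would need a different two-sided definition.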
Project description: Background: Accurate interpretation of rare genetic variants is a challenge for clinical translation. Updates in recommendations for rare variant classification require reanalysis and reclassification of previously classified variants. We aim to perform an exhaustive reanalysis of rare variants associated with inherited arrhythmogenic syndromes, which were classified ten years ago, to determine whether their classification aligns with current standards and research findings. Methods: In 2010, the rare variants identified through genetic analysis were classified following the recommendations available at that time. The same variants have now been reclassified following current American College of Medical Genetics and Genomics recommendations. Findings: Our cohort included 104 cases diagnosed with inherited arrhythmogenic syndromes and 17 post-mortem cases in which an inherited arrhythmogenic syndrome was the cause of death. 71.87% of variants changed their classification. While 65.62% of variants were classified as likely pathogenic in 2010, after reanalysis only 17.96% remain likely pathogenic. In 2010, 18.75% of variants were classified as having an uncertain role, whereas 60.15% of variants are now classified as of unknown significance. Interpretation: Reclassification occurred in more than 70% of rare variants associated with inherited arrhythmogenic syndromes. Our results support the periodic reclassification and personalized clinical translation of rare variants to improve diagnosis and adjust treatment. Funding: Obra Social "La Caixa Foundation" (ID 100010434, LCF/PR/GN16/50290001 and LCF/PR/GN19/50320002), Fondo Investigacion Sanitaria (FIS PI16/01203 and FIS PI17/01690), Sociedad Española de Cardiología, and "Fundacio Privada Daniel Bravo Andreu".
Project description:Pathogenic variants underlying Mendelian diseases often disrupt the normal physiology of a few tissues and organs. However, variant effect prediction tools that aim to identify pathogenic variants are typically oblivious to tissue contexts. Here we report a machine-learning framework, denoted "Tissue Risk Assessment of Causality by Expression for variants" (TRACEvar, https://netbio.bgu.ac.il/TRACEvar/ ), that offers two advancements. First, TRACEvar predicts pathogenic variants that disrupt the normal physiology of specific tissues. This was achieved by creating 14 tissue-specific models that were trained on over 14,000 variants and combined 84 attributes of genetic variants with 495 attributes derived from tissue omics. TRACEvar outperformed 10 well-established and tissue-oblivious variant effect prediction tools. Second, the resulting models are interpretable, thereby illuminating variants' mode of action. Application of TRACEvar to variants of 52 rare-disease patients highlighted pathogenicity mechanisms and relevant disease processes. Lastly, the interpretation of all tissue models revealed that top-ranking determinants of pathogenicity included attributes of disease-affected tissues, particularly cellular process activities. Collectively, these results show that tissue contexts and interpretable machine-learning models can greatly enhance the etiology of rare diseases.
Project description: Purpose: Stringent variant interpretation guidelines can lead to high rates of variants of uncertain significance (VUS) for genetically heterogeneous diseases like long QT syndrome (LQTS) and Brugada syndrome (BrS). Quantitative and disease-specific customization of American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines can address this false negative rate. Methods: We compared rare variant frequencies from 1847 LQTS (KCNQ1/KCNH2/SCN5A) and 3335 BrS (SCN5A) cases from the International LQTS/BrS Genetics Consortia to population-specific gnomAD data and developed disease-specific criteria for two ACMG/AMP evidence classes: rarity (PM2/BS1 rules) and case enrichment of individual (PS4) and domain-specific (PM1) variants. Results: Rare SCN5A variant prevalence differed between European (20.8%) and Japanese (8.9%) BrS patients (p = 5.7 × 10^-18) and between diagnosis with spontaneous (28.7%) versus induced (15.8%) Brugada type 1 electrocardiogram (ECG) (p = 1.3 × 10^-13). Ion channel transmembrane regions and specific N-terminus (KCNH2) and C-terminus (KCNQ1/KCNH2) domains were characterized by high enrichment of case variants and >95% probability of pathogenicity. Applying the customized rules, 17.4% of European BrS and 74.8% of European LQTS cases had (likely) pathogenic variants, compared with estimated diagnostic yields (case excess over gnomAD) of 19.2%/82.1%, reducing VUS prevalence to close to background rare variant frequency. Conclusion: Large case-control data sets enable quantitative implementation of ACMG/AMP guidelines and increased sensitivity for inherited arrhythmia genetic testing.
Project description:A proper interpretation of the pathogenicity of rare variants is crucial before clinical translation. Ongoing addition of new data may modify previous variant classifications; however, how often a reanalysis is necessary remains undefined. We aimed to extensively reanalyze rare variants associated with inherited channelopathies that were originally classified 5 years ago and to assess the clinical impact of that reanalysis. In 2016, rare variants identified through genetic analysis were classified following the American College of Medical Genetics and Genomics' recommendations. Five years later, we have reclassified the same variants following the same recommendations but including newly available data. Potential clinical implications were discussed. Our cohort included 49 cases of inherited channelopathies diagnosed in 2016. The update shows that 18.36% of the variants changed classification, mainly due to improved global frequency data. Reclassifications mostly occurred in minority genes associated with channelopathies. A similar percentage of variants remain classified as deleterious; these are located in the main known genes (SCN5A, KCNH2 and KCNQ1). In 2016, 69.38% of variants were classified as of unknown significance; now 53.06% of variants are classified as such, which remains the most common group. No management was modified after translation of genetic data into the clinic. After 5 years, nearly 20% of rare variants associated with inherited channelopathies were reclassified. This supports performing periodic reanalyses at intervals of no more than 5 years after the last classification. Use of newly available data is necessary, especially concerning global frequencies and family segregation. Personalized clinical translation of rare variants can be crucial to management if a significant change in classification is identified.
Project description:Short QT syndrome, one of the most lethal entities associated with sudden cardiac death, is a rare genetic disease characterized by short QT intervals detected by electrocardiogram. Several genetic variants are causally linked to the disease, but there has yet to be a comprehensive analysis of variants among patients with short QT syndrome. To fill this gap, we performed an exhaustive study of variants currently catalogued as deleterious in short QT syndrome according to the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Analysis of the 32 variants described in the literature determined that only nine (28.12%) have a conclusive pathogenic role. All definitively pathogenic variants are located in KCNQ1, KCNH2, or KCNJ2, three genes encoding potassium channels. Other variants located in genes encoding calcium or sodium channels are associated with electrical alterations concomitant with shortened QT intervals but do not guarantee a diagnosis of short QT syndrome. We recommend caution regarding previously reported variants classified as pathogenic. An exhaustive reanalysis is necessary to clarify the role of each variant before routinely translating genetic findings to the clinical setting.
Project description:Genomic studies may generate massive amounts of data, bringing interpretation challenges. Efforts to differentiate benign from pathogenic variants are therefore gaining importance. In this article, we used segregation analysis and other molecular data to reclassify several rare, clinically curated variants of autosomal dominant inheritance as benign or likely benign in a cohort of 500 Brazilian patients with rare diseases. This study included only symptomatic patients who had undergone molecular investigation with exome sequencing for suspected diseases of genetic etiology. Variants that were clinically suspected as the causative etiology and located in genes associated with highly penetrant conditions of autosomal dominant inheritance underwent Sanger confirmation in the proband and determination of the inheritance pattern, because a "de novo" event was expected. Among all 327 variants studied, 321 variants were inherited from asymptomatic parents. Considering segregation analysis, we have reclassified 51 rare variants as benign and 211 as likely benign. In our study, inheritance of a variant that was expected to be highly penetrant, and therefore de novo if pathogenic, was considered non-segregation and hence a key step toward benign or likely benign classification. Studies like ours may help to identify rare benign variants and improve the correct interpretation of genetic findings.