Project description:With the rapid progress of cancer genome studies, many missense mutations in populations of somatic cells of different cancer types and at different stages have been identified. However, it is challenging to understand the implications of these cancer-related variants. We have developed a computational method that integrates structural, topographical, and evolutionary information to assess the biochemical effects and the extent of deleteriousness of cancer-related variants. We have mapped somatic missense mutations from the Catalogue of Somatic Mutations In Cancer (COSMIC) to 3D structures in the Protein Data Bank (PDB). Our results show that a large portion of these missense mutations is located on protein surface pockets, which often serve as a structural and functional unit of cancer variants. We provide detailed analysis of several examples and an assessment of the importance of these variants, including prediction of previously unreported cancer variants, along with independent evidence from the literature. Furthermore, we show that our predictions can inform on the functional roles and mechanisms of predicted cancer variants.
Project description:Computational models have made significant progress in predicting the effect of protein variants. However, deciphering numerous variants of uncertain significance (VUS) located within intrinsically disordered regions (IDRs) remains challenging. To address this issue, we introduce phase separation, which is tightly linked to IDRs, into the investigation of missense variants. Phase separation is vital for multiple physiological processes. By leveraging missense variants that alter phase separation propensity, we develop a machine learning approach named PSMutPred to predict the impact of missense mutations on phase separation. PSMutPred demonstrates robust performance in predicting missense variants that affect natural phase separation. In vitro experiments further underscore its validity. Applied to over 522,000 ClinVar missense variants, PSMutPred contributes significantly to decoding the pathogenesis of disease variants, especially those in IDRs. Our work provides insights into the understanding of a vast number of VUSs in IDRs, expediting clinical interpretation and diagnosis.
Project description:Correctly identifying the true driver mutations in a patient's tumor is a major challenge in precision oncology. Most efforts address frequent mutations, leaving medium- and low-frequency variants mostly unaddressed. For TP53, this identification is crucial for both somatic and germline mutations, with the latter associated with the Li-Fraumeni syndrome (LFS), a multiorgan cancer predisposition. We present TP53_PROF (prediction of functionality), a gene-specific machine learning model to predict the functional consequences of every possible missense mutation in TP53, integrating human cell- and yeast-based functional assay scores along with computational scores. Variants were labeled for the training set using well-defined criteria of prevalence in four cancer genomics databases. The model's predictions achieved an accuracy of 96.5%. They were validated experimentally and compared to population data, LFS datasets, ClinVar annotations, and TCGA survival data. Very high accuracy was shown through all methods of validation. TP53_PROF allows accurate classification of TP53 missense mutations applicable for clinical practice. Our gene-specific approach integrated machine learning, highly reliable features, and biological knowledge to create an unprecedented, thoroughly validated, and clinically oriented classification model. This approach currently addresses TP53 mutations and will be applied in the future to other important cancer genes.
Project description:Disruption of circadian rhythms increases the risk of several types of cancer. Mammalian cryptochromes (CRY1 and CRY2) are circadian transcriptional repressors that are related to DNA-repair enzymes. While CRYs lack DNA-repair activity, they modulate the transcriptional response to DNA damage, and CRY2 can promote SKP1 cullin 1-F-box (SCF)FBXL3-mediated ubiquitination of c-MYC and other targets. Here, we characterize five mutations in CRY2 observed in human cancers in The Cancer Genome Atlas. We demonstrate that two orthologous mutations of mouse CRY2 (D325H and S510L) accelerate the growth of primary mouse fibroblasts expressing high levels of c-MYC. Neither mutant affects steady-state levels of overexpressed c-MYC, and they have divergent impacts on circadian rhythms and on the ability of CRY2 to interact with SCFFBXL3. Unexpectedly, stable expression of either CRY2 D325H or of CRY2 S510L robustly suppresses P53 target-gene expression, suggesting that this may be a primary mechanism by which they influence cell growth.
Project description:Most human cancers contain mutations in the transcription factor p53, and the majority of these are missense mutations located in the DNA-binding core domain. In this study, the stabilities of all core domain missense mutations are predicted and used to infer their likely inactivation mechanisms. Overall, 47.0% of non-PRO/GLY mutants are stable (DeltaDeltaG < 1.0 kT), 36.3% are unstable (DeltaDeltaG > 3.0 kT), and 12.2% have 1.0 kT < DeltaDeltaG < 3.0 kT. Only 4.5% of mutants have no conclusive prediction. Certain types of either stable or unstable mutations are found not to depend on their local structures. Y, I, C, V, F and W (W, R and F) are the most common residues before (after) mutation in unstable mutants. Q, N, K, D, A, S and T (I, T, L and V) are the most common residues before (after) mutation in stable mutants. The correlations of stability with sequence, structure, and molecular contacts are also analyzed. No direct correlation between secondary structure and stability is apparent, but a strong correlation between solvent exposure and stability is noticeable. Our correlation analysis shows that loss of protein-protein contacts may be an alternative cause of p53 inactivation. Correlation with clinical data shows that loss of stability and loss of DNA contacts are the two main inactivation mechanisms. Finally, correlation with functional data shows that most mutations that retain function are stable and most mutations that gain function are unstable, indicating that destabilized and deformed p53 proteins are more likely to find new binding partners. PACS codes: 87.14.E-
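The stability bins quoted in the p53 abstract above can be illustrated with a minimal sketch. It assumes a predicted DeltaDeltaG value in units of kT is already available for each mutant; the function name `classify_stability`, the constant names, and the handling of the exact boundary values (1.0 and 3.0 kT, which the abstract does not specify) are assumptions made for illustration, not the authors' code.

```python
from typing import Optional

# Thresholds taken from the abstract (in units of kT).
STABLE_BELOW_KT = 1.0
UNSTABLE_ABOVE_KT = 3.0


def classify_stability(ddg_kt: Optional[float]) -> str:
    """Bin a predicted DeltaDeltaG (kT) into the abstract's categories.

    None stands for a mutant with no conclusive prediction.
    Assumption: values exactly equal to 1.0 or 3.0 kT are treated as
    intermediate, since the abstract leaves the boundaries open.
    """
    if ddg_kt is None:
        return "inconclusive"
    if ddg_kt < STABLE_BELOW_KT:
        return "stable"
    if ddg_kt > UNSTABLE_ABOVE_KT:
        return "unstable"
    return "intermediate"


# Hypothetical values, not data from the study.
for value in (0.4, 2.2, 5.1, None):
    print(value, classify_stability(value))
```

The printed values are placeholders chosen to hit each bin; real inputs would come from whichever stability predictor is used.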
Project description:ROS1 is the largest receptor tyrosine kinase in the human genome. Rearrangements of the ROS1 gene result in oncogenic ROS1 kinase fusion proteins that are currently the only validated biomarkers for targeted therapy with ROS1 TKIs in patients. While numerous somatic missense mutations in ROS1 exist in the cancer genome, their impact on catalytic activity and pathogenic potential is unknown. We interrogated the AACR Genie database and identified thirty-four missense mutations in the ROS1 tyrosine kinase domain for further analysis. Our experiments revealed that these mutations have varying effects on ROS1 kinase function, ranging from complete loss to significantly increased catalytic activity. Notably, Asn and Gly substitutions at Asp2113 in the ROS1 kinase domain were found to be TKI-sensitive transformative and oncogenic variants in cell-based model systems. In vivo experiments showed that ROS1 D2113N induced tumor formation that was sensitive to crizotinib and lorlatinib, FDA-approved ROS1-TKIs. Collectively, these findings highlight the tumorigenic potential of specific point mutations within the ROS1 kinase domain and their potential as therapeutic targets with FDA-approved ROS1-TKIs.
Project description:Resistance to small-molecule drugs is the main cause of the failure of therapeutic drugs in clinical practice. Missense mutations that alter the binding of ligands to proteins are one of the critical mechanisms underlying genetic disease and drug resistance. Computational methods have made considerable progress in predicting binding affinity changes and identifying resistance mutations, but their prediction accuracy and speed are still unsatisfactory and need further improvement. To address these issues, we introduce PremPLI, a structure-based machine learning method for quantitatively estimating the effects of single mutations on ligand binding affinity. A comprehensive comparison of the predictive performance of PremPLI with other available methods on two benchmark datasets confirms that our approach performs robustly and achieves similar or even higher predictive accuracy than approaches relying on first-principle statistical mechanics and mixed physics- and knowledge-based potentials, while requiring far fewer computational resources. PremPLI can be used for guiding the design of ligand-binding proteins, identifying and understanding disease driver mutations, and finding potential resistance mutations for different drugs. PremPLI is freely available at https://lilab.jysw.suda.edu.cn/research/PremPLI/ and supports large-scale mutational scanning.
Project description:Background: Several methods have been developed to predict the pathogenicity of missense mutations, but none has been specifically designed for classification of variants in mtDNA-encoded polypeptides. Moreover, no curated dataset of neutral and damaging mtDNA missense variants is available to test the accuracy of predictors. Because mtDNA sequencing of patients suffering from mitochondrial diseases is revealing many missense mutations, candidate substitutions need to be prioritized for further confirmation. Predictors can be useful as screening tools, but their performance must be improved. Results: We have developed an SVM classifier (Mitoclass.1) specific for mtDNA missense variants. Training and validation of the model were carried out with 2,835 damaging and neutral mtDNA amino acid substitutions, previously curated by a set of rigorous pathogenicity criteria with high specificity. Each instance is described by a set of three attributes based on evolutionary conservation in Eukaryota of wildtype and mutant amino acids, as well as coevolution and a novel evolutionary analysis of specific substitutions belonging to the same domain of mitochondrial polypeptides. Our classifier performed better than the other web-available predictors tested. We checked the performance of three broadly used predictors on all mutations in our curated dataset. PolyPhen-2 showed the best results for screening purposes, with good sensitivity. Nevertheless, the number of false positive predictions was too high. Our method has improved sensitivity and better specificity relative to PolyPhen-2. We also publish predictions for the complete set of 24,201 possible missense variants in the 13 human mtDNA-encoded polypeptides. Conclusions: Mitoclass.1 allows a better selection of candidate damaging missense variants from mtDNA.
A careful search for discriminatory attributes and a training step based on a curated dataset of amino acid substitutions belonging exclusively to human mtDNA genes allow improved performance. The accuracy of Mitoclass.1 could be improved in the future, when more mtDNA missense substitutions become available for updating the attributes and retraining the model.
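The Mitoclass.1 abstract above compares predictors by sensitivity and specificity. As a reminder of how those two figures follow from a confusion matrix, here is a minimal sketch using the standard definitions. The function name `screening_metrics` and the counts in the usage lines are hypothetical; they are not results from the study.

```python
from typing import Tuple


def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of damaging variants flagged.
    specificity = TN / (TN + FP): share of neutral variants cleared.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one damaging and one neutral variant")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only.
sens, spec = screening_metrics(tp=8, fn=2, tn=15, fp=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A predictor with many false positives, as the abstract describes for PolyPhen-2 on this dataset, shows up here as a large `fp` count and therefore a low specificity even when sensitivity is good.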
Project description:Disruption of circadian rhythms increases the risk of several types of cancer. Mammalian cryptochromes (CRY1 and CRY2) are circadian transcriptional repressors that are related to DNA repair enzymes. While CRYs lack DNA repair activity, they modulate the transcriptional response to DNA damage, and CRY2 can promote SCFFBXL3-mediated ubiquitination of c-MYC and other targets. Here, we characterize five mutations in CRY2 observed in human cancers in The Cancer Genome Atlas. We demonstrate that two orthologous mutations of mouse CRY2 (D325H and S510L) accelerate the growth of primary mouse fibroblasts expressing high levels of c-MYC. Neither mutant affects steady state levels of overexpressed c-MYC, and they have divergent impacts on circadian rhythms and on the ability of CRY2 to interact with SCFFBXL3. Unexpectedly, stable expression of either CRY2 D325H or of CRY2 S510L robustly suppresses P53 target gene expression, suggesting that this is the primary mechanism by which they influence cell growth.
Project description:Recurrent human epidermal growth factor receptor 2 (HER2) missense mutations have been reported in human cancers. These mutations occur primarily in the absence of HER2 gene amplification such that most HER2-mutant tumors are classified as "negative" by FISH or immunohistochemistry assays. It remains unclear whether nonamplified HER2 missense mutations are oncogenic and whether they are targets for HER2-directed therapies that are currently approved for the treatment of HER2 gene-amplified breast cancers. Here we functionally characterize HER2 kinase and extracellular domain mutations through gene editing of the endogenous loci in HER2 nonamplified human breast epithelial cells. In in vitro and in vivo assays, the majority of HER2 missense mutations do not impart detectable oncogenic changes. However, the HER2 V777L mutation increased biochemical pathway activation and, in the context of a PIK3CA mutation, enhanced migratory features in vitro. Even so, the V777L mutation did not alter in vivo tumorigenicity or sensitivity to HER2-directed therapies in proliferation assays. Our results suggest that the oncogenicity and potential targeting of HER2 missense mutations should be considered in the context of cooperating genetic alterations, and they provide previously unidentified insights into functional analysis of HER2 mutations and strategies to target them.