Supervised mutational signatures for obesity and other tissue-specific etiological factors in cancer.
ABSTRACT: Determining the etiologic basis of the mutations that are responsible for cancer is one of the fundamental challenges in modern cancer research. Different mutational processes induce different types of DNA mutations, providing 'mutational signatures' that have led to key insights into cancer etiology. The most widely used signatures for assessing genomic data are based on unsupervised patterns that are then retrospectively correlated with certain features of cancer. We show here that supervised machine-learning techniques can identify signatures, called SuperSigs, that are more predictive than those currently available. Surprisingly, we found that aging yields different SuperSigs in different tissues, and the same is true for environmental exposures. We were able to discover SuperSigs associated with obesity, the most important lifestyle factor contributing to cancer in Western populations.
Project description:Mutational signatures have proved to be a valuable pattern in somatic genomics, mainly in cancer, with potential application as a biomarker in clinical practice. To date, several bioinformatic packages addressing this topic have been developed in different languages and platforms. MutationalPatterns has emerged as the most efficient tool for comparison with the signatures currently reported in the Catalogue of Somatic Mutations in Cancer (COSMIC) database. However, the analysis of mutational signatures is currently restricted to a small community of bioinformatic experts. In this work we present Mutational Signatures in Cancer (MuSiCa), a new web tool based on MutationalPatterns and built using the Shiny framework in the R language. Through a simple interface suited to non-specialized researchers, it provides a comprehensive analysis of the somatic mutational status of the supplied cancer samples. It permits characterizing the profile and burden of mutations, as well as quantifying COSMIC-reported mutational signatures. It also allows classifying samples according to these signature contributions. MuSiCa is a helpful web application to characterize mutational signatures in cancer samples. It is accessible online at http://bioinfo.ciberehd.org/GPtoCRC/en/tools.html and source code is freely available at https://github.com/marcos-diazg/musica.
Project description:Cancer cells accumulate somatic mutations as a result of DNA damage, inaccurate repair and other mechanisms. Different genetic instability processes result in characteristic non-random patterns of DNA mutations, also known as mutational signatures. We developed mutSignatures, an integrated R-based computational framework aimed at deciphering DNA mutational signatures. Our software provides advanced functions for importing DNA variants, computing mutation types, and extracting mutational signatures via non-negative matrix factorization. Specifically, mutSignatures accepts multiple types of input data, is compatible with non-human genomes, and supports the analysis of non-standard mutation types, such as tetra-nucleotide mutation types. We applied mutSignatures to analyze somatic mutations found in smoking-related cancer datasets. We characterized mutational signatures that were consistent with those reported before in independent investigations. Our work demonstrates that selected mutational signatures correlate with specific clinical and molecular features across different cancer types, and reveals a complementarity of specific mutational patterns that has not previously been identified. In conclusion, we propose mutSignatures as a powerful open-source tool for detecting the molecular determinants of cancer and gathering insights into cancer biology and treatment.
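To make the non-negative matrix factorization step above concrete, here is a minimal sketch in plain Python. It is not the mutSignatures implementation (which is an R package); it assumes the standard Lee-Seung multiplicative updates for the squared-error objective, and the names `nmf`, `matmul`, `transpose` and `frobenius_error`, as well as the toy count matrix, are hypothetical and chosen only for illustration.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared reconstruction error ||V - W H||^2."""
    WH = matmul(W, H)
    return sum((V[i][j] - WH[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0])))


def nmf(V, rank, iterations=300, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (mutation types x samples) as W H.

    W (mutation types x rank) plays the role of the signatures and
    H (rank x samples) the per-sample exposures. Assumption: plain
    multiplicative updates, with no restarts or model selection.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


# Toy catalog: 4 mutation types (rows) x 3 samples (columns), made-up counts.
toy_counts = [[30.0, 10.0, 20.0],
              [5.0, 5.0, 5.0],
              [5.0, 5.0, 5.0],
              [10.0, 30.0, 20.0]]
signatures, exposures = nmf(toy_counts, rank=2)
print(frobenius_error(toy_counts, signatures, exposures))
```

In practice the columns of W are usually rescaled to sum to one so that each signature reads as a probability distribution over mutation types; that normalization is left out here to keep the sketch short.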
Project description:Some cancer therapies damage DNA and cause mutations in both cancerous and healthy cells. Therapy-induced mutations may underlie some of the long-term and late side effects of treatments, such as mental disabilities, organ toxicity and secondary neoplasms. Nevertheless, the burden of mutation contributed by different chemotherapies has not been explored. Here we identify the mutational signatures or footprints of six widely used anticancer therapies across more than 3,500 metastatic tumors originating from different organs. These include previously known and new mutational signatures generated by platinum-based drugs as well as a previously unknown signature of nucleoside metabolic inhibitors. Exploiting these mutational footprints, we estimate the contribution of different treatments to the mutation burden of tumors and their risk of contributing coding and potential driver mutations in the genome. The mutational footprints identified here allow for precise assessment of the mutational risk of different cancer therapies to understand their long-term side effects.
Project description:Mutational processes and signatures that drive early tumorigenesis are centrally important for early cancer prevention. Yet, to date, biomarkers and risk factors for polyps (adenomas) that inordinately and rapidly develop into colon cancer remain poorly defined. Here, we describe surprisingly high mutational profiles through whole-genome sequence (WGS) analysis in 2 of 4 pairs of benign colorectal adenoma tissue samples. Unsupervised hierarchical clustered transcriptomic analysis of a further 7 pairs of adenomas reveals distinct mutational signatures regardless of adenoma size. Transitional single nucleotide substitutions of C:G>T:A predominate in the adenoma mutational spectrum. Strikingly, we observe mutations in the TGF-β pathway and CEA-associated genes in 4 out of 11 adenomas, overlapping with the Wnt pathway. Immunohistochemical labeling reveals a nearly 5-fold increase in CEA levels in 23% of adenoma samples with a concomitant loss of TGF-β signaling. We also define a functional role by which the CEA B3 domain interacts with TGFBR1, potentially inactivating the tumor suppressor function of TGF-β signaling. Our study uncovers diverse mutational processes underlying the transition from early adenoma to cancer. This has broad implications for biomarker-driven targeting of CEA/TGF-β in high-risk adenomas and may lead to early detection of aggressive adenoma to CRC progression.
Project description:OBJECTIVE: Recently, many tumor sequencing studies have inferred and reported on mutational signatures, short nucleotide patterns at which particular somatic base substitutions appear more often. A number of signatures reflect biological processes in the patient and factors associated with cancer risk. Our goal is to infer mutational signatures appearing in colon cancer, a cancer for which environmental risk factors vary by cancer subtype, and compare the signatures to those in adult stem cells from normal colon. We also compare the mutational signatures to others in the literature. RESULTS: We apply a probabilistic mutation signature model to somatic mutations previously reported for six adult normal colon stem cells and 431 colon adenocarcinomas. We infer six mutational signatures in colon cancer, four being specific to tumors with hypermutation. Just two signatures explained the majority of mutations in the small number of normal aging colon samples. All six signatures are independently identified in a series of 295 Chinese colorectal cancers.
Project description:Driver mutations are the genetic variants responsible for oncogenesis, but how specific somatic mutational events arise in cells remains poorly understood. Mutational signatures derive from the frequency of mutated trinucleotides in a given cancer sample, and they provide an avenue for investigating the underlying mutational processes that operate in cancer. Here we analyse somatic mutations from 7,815 cancer exomes from The Cancer Genome Atlas (TCGA) across 26 cancer types. We curate a list of 50 known cancer driver mutations by analysing recurrence in our cohort and annotations of known cancer-associated genes from the Cancer Gene Census, IntOGen database and Cancer Genome Interpreter. We then use these datasets to perform binary univariate logistic regression and establish the statistical relationship between individual driver mutations and known mutational signatures across different cancer types. Our analysis led to the identification of 39 significant associations between driver mutations and mutational signatures (P < 0.004, with a false discovery rate of < 5%). We first validate our methodology by establishing statistical links for known and novel associations between driver mutations and the mutational signature arising from Polymerase Epsilon proofreading deficiency. We then examine associations between driver mutations and mutational signatures for AID/APOBEC enzyme activity and deficient mismatch repair. We also identify negative associations (odds ratio < 1) between mutational signatures and driver mutations, and here we examine the role of aging and cigarette smoke mutagenesis in the generation of driver mutations in IDH1 and KRAS in brain cancers and lung adenocarcinomas respectively. Our study provides statistical foundations for hypothesised links between otherwise independent biological processes and we uncover previously unexplored relationships between driver mutations and mutagenic processes during cancer development. These associations give insights into how cancers acquire advantageous mutations and can provide direction to guide further mechanistic studies into cancer pathogenesis.
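The statement above that signatures derive from the frequency of mutated trinucleotides can be illustrated with a short sketch. It assumes the widely used convention of reporting each substitution with a pyrimidine (C or T) reference base, which gives 6 substitution types x 16 flanking contexts = 96 channels; the function names and the three example mutations are hypothetical, and this is not the pipeline used in the study.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mutation_channel(context, alt):
    """Label a single-base substitution by its trinucleotide channel.

    `context` is the 3-base reference sequence centred on the mutated base
    and `alt` is the alternate base. If the reference base is a purine, the
    mutation is re-expressed on the opposite strand so that the reference
    base is always C or T.
    """
    if context[1] in "AG":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def mutation_catalog(mutations):
    """Count mutations per channel for one sample."""
    return Counter(mutation_channel(context, alt) for context, alt in mutations)


# Made-up example: two mutations that collapse to the same channel after
# strand normalisation, plus one T>C mutation.
sample = [("ACG", "T"), ("CGT", "A"), ("TTA", "C")]
print(mutation_catalog(sample))
```

A per-sample vector of these counts, taken over all 96 channels, is the kind of input that signature extraction and refitting methods operate on.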
Project description:BACKGROUND: The incidence of breast cancer among young women (aged ≤40 years) has increased in North America and Europe. Fewer than 10% of cases among young women are attributable to inherited BRCA1 or BRCA2 mutations, suggesting an important role for somatic mutations. This study investigated genomic differences between young- and older-onset breast tumours. METHODS: In this study we characterized the mutational landscape of 89 young-onset breast tumours (≤40 years) and examined differences with 949 older-onset tumours (>40 years) using data from The Cancer Genome Atlas. We examined mutated genes, mutational load, and types of mutations. We used complementary R packages "deconstructSigs" and "SomaticSignatures" to extract mutational signatures. A recursively partitioned mixture model was used to identify whether combinations of mutational signatures were related to age of onset. RESULTS: Older patients had a higher proportion of mutations in PIK3CA, CDH1, and MAP3K1 genes, while young-onset patients had a higher proportion of mutations in GATA3 and CTNNB1. Mutational load was lower for young-onset tumours, and a higher proportion of these mutations were C>A mutations, but a lower proportion were C>T mutations compared to older-onset tumours. The most common mutational signatures identified in both age groups were signatures 1 and 3 from the COSMIC database. Signatures resembling COSMIC signatures 2 and 13 were observed among both age groups. We identified a class of tumours with a unique combination of signatures that may be associated with young age of onset. CONCLUSIONS: The results of this exploratory study provide some evidence that the mutational landscape and mutational signatures among young-onset breast cancer are different from those of older-onset patients. The characterization of young-onset tumours could provide clues to their etiology which may inform future prevention. Further studies are required to confirm our findings.
Project description:Motivation: Cancers arise as the result of somatically acquired changes in the DNA of cancer cells. However, in addition to the mutations that confer a growth advantage, cancer genomes accumulate a large number of somatic mutations resulting from normal DNA damage and repair processes as well as carcinogenic exposures or cancer related aberrations of DNA maintenance machinery. These mutagenic processes often produce characteristic mutational patterns called mutational signatures. The decomposition of a cancer genome's mutation catalog into mutations consistent with such signatures can provide valuable information about cancer etiology. However, the results from different decomposition methods are not always consistent. Hence, one needs to be able to not only decompose a patient's mutational profile into signatures but also establish the accuracy of such decomposition. Results: We proposed two complementary ways of measuring confidence and stability of decomposition results and applied them to analyze mutational signatures in breast cancer genomes. We identified both very stable and highly unstable signatures, as well as signatures that previously have not been associated with breast cancer. We also provided additional support for the novel signatures. Our results emphasize the importance of assessing the confidence and stability of inferred signature contributions. Availability: All tools developed in this paper have been implemented in an R package, called SignatureEstimation, which is available from https://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/index.cgi#signatureestimation. Contact: email@example.com (D.W.), firstname.lastname@example.org (T.M.P.). Supplementary information: Supplementary data are available at Bioinformatics online.
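The decomposition of one mutation catalog into known signatures, as described above, can be sketched as a non-negative least-squares fit. The abstract does not spell out its algorithm, so this example assumes a simple multiplicative-update solver rather than the SignatureEstimation method; `fit_exposures` and the two toy signatures over four mutation types are hypothetical.

```python
def fit_exposures(catalog, signatures, iterations=500, eps=1e-12):
    """Estimate non-negative exposures h so that sum_k h[k] * signatures[k]
    approximates the catalog in the least-squares sense.

    `catalog` is a list of counts per mutation type; `signatures` is a list
    of signatures, each a list of probabilities over the same mutation types.
    """
    k, n = len(signatures), len(catalog)
    # Precompute S^T v and S^T S, which stay fixed during the iterations.
    stv = [sum(signatures[a][i] * catalog[i] for i in range(n))
           for a in range(k)]
    sts = [[sum(signatures[a][i] * signatures[b][i] for i in range(n))
            for b in range(k)] for a in range(k)]
    h = [1.0] * k
    for _ in range(iterations):
        # Multiplicative update: h <- h * (S^T v) / (S^T S h)
        h = [h[a] * stv[a] / (sum(sts[a][b] * h[b] for b in range(k)) + eps)
             for a in range(k)]
    return h


# Toy data: a catalog built as 30 mutations from the first signature plus
# 70 from the second, over four made-up mutation types.
toy_signatures = [[0.7, 0.1, 0.1, 0.1],
                  [0.1, 0.1, 0.1, 0.7]]
toy_catalog = [28, 10, 10, 52]
print(fit_exposures(toy_catalog, toy_signatures))
```

On this noise-free toy input the fitted exposures should come out close to 30 and 70. One possible way to probe stability, in the spirit of the abstract, is to refit on resampled catalogs and look at the spread of the exposures; that step is not shown here.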
Project description:All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.
Project description:Although not all somatic mutations are cancer drivers, their mutational signatures, i.e. the patterns of genomic alterations at a genome-wide scale, provide insights into past exposure to mutagens, DNA damage and repair processes. Computational deconvolution of somatic mutation patterns and expert curation pan-cancer studies have identified a number of mutational signatures associated with point mutations, dinucleotide substitutions, insertions and deletions, and rearrangements, and have established etiologies for a subset of these signatures. However, the mechanisms underlying nearly one-third of all mutational signatures are not yet understood. The signatures with established etiology and those with hitherto unknown origin appear to have some differences in strand bias, GC content and nucleotide context diversity. It is possible that some of the hitherto ‘unknown’ signatures predominantly occur outside gene regions. While nucleotide contexts might be adequate to establish etiologies of some mutational signatures, in other cases additional features, such as broader (epi)genomic contexts, including chromatin, replication timing, processivity and local mutational patterns, may help fully understand the underlying DNA damage and repair processes. Nonetheless, remarkable progress in characterization of mutational signatures has provided fundamental insights into the biology of cancer, informed disease etiology and opened up new opportunities for cancer prevention, risk management, and therapeutic decision making.