Identification of constrained cancer driver genes based on mutation timing.
ABSTRACT: Cancer drivers are genomic alterations that provide cells containing them with a selective advantage over their local competitors, whereas neutral passengers do not change the somatic fitness of cells. Cancer-driving mutations are usually discriminated from passenger mutations by their higher degree of recurrence in tumor samples. However, there is increasing evidence that many additional driver mutations may exist that occur at very low frequencies among tumors. This observation has prompted alternative methods for driver detection, including finding groups of mutually exclusive mutations and incorporating prior biological knowledge about gene function or network structure. Dependencies among drivers due to epistatic interactions can also result in low mutation frequencies, but this effect has been ignored in driver detection so far. Here, we present a new computational approach for identifying genomic alterations that occur at low frequencies because they depend on other events. Unlike passengers, these constrained mutations display punctuated patterns of occurrence in time. We test this driver-passenger discrimination approach based on mutation timing in extensive simulation studies, and we apply it to cross-sectional copy number alteration (CNA) data from ovarian cancer, CNA and single-nucleotide variant (SNV) data from breast tumors, and SNV data from colorectal cancer. Among the top-ranked predicted drivers, we find low-frequency genes that have already been shown to be involved in carcinogenesis, as well as many new candidate drivers. The mutation timing approach is orthogonal and complementary to existing driver prediction methods. It will help identify, from cancer genome data, the alterations that drive tumor progression.
Project description:Cancer drivers require statistical modeling to distinguish them from passenger events, which accumulate during tumorigenesis but provide no fitness advantage to cancer cells. The discovery of driver genes and mutations relies on the assumption that exact positional recurrence is unlikely by chance; thus, the precise sharing of mutations across patients identifies drivers. Examining the mutation landscape in cancer genomes, we found that many recurrent cancer mutations previously designated as drivers are likely passengers. Our integrated bioinformatic and biochemical analyses revealed that these passenger hotspot mutations arise from the preference of APOBEC3A, a cytidine deaminase, for DNA stem-loops. Conversely, recurrent APOBEC-signature mutations not in stem-loops are enriched in well-characterized driver genes and may predict new drivers. This demonstrates that mesoscale genomic features need to be integrated into computational models aimed at identifying mutations linked to diseases.
Project description:Driver mutations are somatic mutations that provide a growth advantage to tumor cells, while passenger mutations are those not functionally related to oncogenesis. Distinguishing drivers from passengers is challenging because drivers occur much less frequently than passengers, they tend to have low prevalence, and their functions are multifactorial and not intuitively obvious. Missense mutations are excellent candidates as drivers, as they occur more frequently and are potentially easier to identify than other types of mutations. Although several methods have been developed for predicting the functional impact of missense mutations, only a few have been specifically designed for identifying driver mutations. As more mutations are being discovered, more accurate predictive models can be developed using machine learning approaches that systematically characterize the commonality and peculiarity of missense mutations against the background of specific cancer types. Here, we present a cancer driver annotation (CanDrA) tool that predicts missense driver mutations based on a set of 95 structural and evolutionary features computed by over 10 functional prediction algorithms such as CHASM, SIFT, and MutationAssessor. Through feature optimization and supervised training, CanDrA outperforms existing tools in analyzing the glioblastoma multiforme and ovarian carcinoma data sets in The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia project.
Project description:Cancer progression is driven by the accumulation of a small number of genetic alterations. However, these few driver alterations reside in a cancer genome alongside tens of thousands of additional mutations termed passengers. Passengers are widely believed to have no role in cancer, yet many passengers fall within protein-coding genes and other functional elements that can have potentially deleterious effects on cancer cells. Here we investigate the potential of moderately deleterious passengers to accumulate and alter the course of neoplastic progression. Our approach combines evolutionary simulations of cancer progression with an analysis of cancer sequencing data. From simulations, we find that passengers accumulate and largely evade natural selection during progression. Although individually weak, the collective burden of passengers alters the course of progression, leading to several oncological phenomena that are hard to explain with a traditional driver-centric view. We then tested the predictions of our model using cancer genomics data and confirmed that many passengers are likely damaging and have largely evaded negative selection. Finally, we use our model to explore cancer treatments that exploit the load of passengers by either (i) increasing the mutation rate or (ii) exacerbating their deleterious effects. Though both approaches lead to cancer regression, the latter is a more effective therapy. Our results suggest a unique framework for understanding cancer progression as a balance of driver and passenger mutations.
Project description:The dichotomous model of "drivers" and "passengers" in cancer posits that only a few mutations in a tumor strongly affect its progression, with the remaining ones being inconsequential. Here, we leveraged the comprehensive variant dataset from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) project to demonstrate that, in addition to the dichotomy of high- and low-impact variants, there is a third group of medium-impact putative passengers. Moreover, we also found that molecular impact correlates with subclonal architecture (i.e., early versus late mutations), and different signatures encode for mutations with divergent impact. Furthermore, we adapted an additive-effects model from complex-trait studies to show that the aggregated effect of putative passengers, including undetected weak drivers, provides significant additional power (~12% additive variance) for predicting cancerous phenotypes, beyond PCAWG-identified driver mutations. Finally, this framework allowed us to estimate the frequency of potential weak-driver mutations in PCAWG samples lacking any well-characterized driver alterations.
Project description:BACKGROUND: Cancer cells harbor a large number of molecular alterations such as mutations, amplifications and deletions on DNA sequences and epigenetic changes on DNA methylations. These aberrations may dysregulate gene expressions, which in turn drive the malignancy of tumors. Deciphering the causal and statistical relations of molecular aberrations and gene expressions is critical for understanding the molecular mechanisms of clinical phenotypes. RESULTS: In this work, we proposed a computational method to reconstruct association modules containing driver aberrations, passenger mRNA or microRNA expressions, and putative regulators that mediate the effects from drivers to passengers. By applying the module-finding algorithm to the integrated datasets of NCI-60 cancer cell lines, we found that gene expressions were driven by diverse molecular aberrations including chromosomal segments' copy number variations, gene mutations and DNA methylations, microRNA expressions, and the expressions of transcription factors. In-silico validation indicated that passenger genes were enriched with the regulator binding motifs, functional categories or pathways where the drivers were involved, and co-citations with the driver/regulator genes. Moreover, 6 of 11 predicted MYB targets were down-regulated in an MYB-siRNA treated leukemia cell line. In addition, microRNA expressions were driven by distinct mechanisms from mRNA expressions. CONCLUSIONS: The results provide rich mechanistic information regarding molecular aberrations and gene expressions in cancer genomes. This kind of integrative analysis will become an important tool for the diagnosis and treatment of cancer in the era of personalized medicine.
Project description:Identifying cancer driver genes and exploring their functions are among the most urgent needs in basic cancer research. Developing efficient methods to differentiate between driver and passenger somatic mutations revealed by large-scale cancer genome sequencing data is critical to cancer driver gene discovery. Here, we compared the features of SNP and SNV data in detail and found that the weighted ratio of SNV to SNP (termed WVPR) is an excellent indicator for cancer driver genes. The power of WVPR was validated by accurate predictions of known drivers. We ranked most human genes by WVPR and performed functional analyses on the list. The results demonstrate that driver genes are usually highly enriched in chromatin organization related genes/pathways. Some protein complexes, such as histone acetyltransferase, histone methyltransferase, telomerase, centrosome, sin3 and U12-type spliceosomal complexes, are hot spots of driver mutations. Furthermore, this study identified many new potential driver genes (e.g. NTRK3 and ZIC4) and pathways, including the oxidative phosphorylation pathway, which were not identified by previous methods. Taken together, our study not only developed a method to identify cancer driver genes/pathways but also provided new insights into the molecular mechanisms of cancer development.
Project description:Cancer progression is an example of a rapid adaptive process where evolving new traits is essential for survival and requires a high mutation rate. Precancerous cells acquire a few key mutations that drive rapid population growth and carcinogenesis. Cancer genomics demonstrates that these few driver mutations occur alongside thousands of random passenger mutations--a natural consequence of cancer's elevated mutation rate. Some passengers are deleterious to cancer cells, yet have been largely ignored in cancer research. In population genetics, however, the accumulation of mildly deleterious mutations has been shown to cause population meltdown. Here we develop a stochastic population model where beneficial drivers engage in a tug-of-war with frequent mildly deleterious passengers. These passengers present a barrier to cancer progression describable by a critical population size, below which most lesions fail to progress, and a critical mutation rate, above which cancers melt down. We find support for this model in cancer age-incidence and cancer genomics data that also allow us to estimate the fitness advantage of drivers and fitness costs of passengers. We identify two regimes of adaptive evolutionary dynamics and use these regimes to understand successes and failures of different treatment strategies. A tumor's load of deleterious passengers can explain previously paradoxical treatment outcomes and suggest that it could potentially serve as a biomarker of response to mutagenic therapies. The collective deleterious effect of passengers is currently an unexploited therapeutic target. We discuss how their effects might be exacerbated by current and future therapies.
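The tug-of-war between drivers and passengers described above can be illustrated with a toy simulation. This is a minimal sketch and not the authors' model: the multiplicative fitness form, the density-dependent death rate, the function names, and every parameter value (`s_d`, `s_p`, `mu_d`, `mu_p`, `capacity`) are assumptions chosen only for illustration.

```python
import random


def fitness(drivers, passengers, s_d=0.1, s_p=0.002):
    """Assumed multiplicative fitness: each driver helps, each passenger hurts."""
    return (1.0 + s_d) ** drivers / (1.0 + s_p) ** passengers


def simulate(n0=100, capacity=100, s_d=0.1, s_p=0.002,
             mu_d=0.002, mu_p=0.05, steps=200, max_cells=5000, seed=0):
    """Discrete-time birth-death process; returns (population history, cells).

    Each cell is a (drivers, passengers) pair. Per step a cell dies with a
    probability that grows with crowding, or divides with a probability
    proportional to its fitness, or otherwise persists unchanged.
    """
    rng = random.Random(seed)
    cells = [(0, 0)] * n0
    history = [len(cells)]
    for _step in range(steps):
        n = len(cells)
        if n == 0 or n >= max_cells:  # extinction or runaway growth
            break
        death = min(1.0, 0.5 * n / capacity)  # assumed crowding penalty
        next_cells = []
        for d, p in cells:
            birth = min(0.5 * fitness(d, p, s_d, s_p), 1.0 - death)
            u = rng.random()
            if u < death:
                continue  # cell dies
            if u < death + birth:
                # Division: each daughter may gain a driver and/or a passenger.
                for _daughter in range(2):
                    new_d = d + (1 if rng.random() < mu_d else 0)
                    new_p = p + (1 if rng.random() < mu_p else 0)
                    next_cells.append((new_d, new_p))
            else:
                next_cells.append((d, p))  # cell persists
        cells = next_cells
        history.append(len(cells))
    return history, cells
```

Under these assumptions the population settles near `capacity` times the mean fitness, so runs in which passengers accumulate faster than drivers shrink, while runs that acquire drivers early expand. Varying `n0` and `mu_p` gives a qualitative feel for the critical population size and critical mutation rate mentioned in the description.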
Project description:The identification of cancer drivers is a major goal of current cancer research. Finding driver genes within large chromosomal events is especially challenging because such alterations encompass many genes. Previously, we demonstrated that zebrafish malignant peripheral nerve sheath tumors (MPNSTs) are highly aneuploid, much like human tumors. In this study, we examined 147 zebrafish MPNSTs by massively parallel sequencing and identified both large and focal copy number alterations (CNAs). Given the low degree of conserved synteny between fish and mammals, we reasoned that comparative analyses of CNAs from fish versus human MPNSTs would enable elimination of a large proportion of passenger mutations, especially on large CNAs. Accordingly, we found less than one third of the human CNA genes were co-gained or co-lost in zebrafish, dramatically narrowing the list of candidate cancer drivers for both focal and large CNAs. We conclude that zebrafish-human comparative analysis represents a powerful, and broadly applicable, tool to enrich for evolutionarily conserved cancer drivers. --- This filing comprises data related to GEO entry GSE23666 ("Highly Aneuploid Zebrafish Malignant Peripheral Nerve Sheath Tumors have Genetic Alterations Similar to Human Cancers"), representing a followup study. 147 pairs of zebrafish (Danio rerio) tumor (MPNST) and normal tail control samples
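The cross-species filtering idea in this description can be sketched as a simple lookup: keep a human CNA gene only if at least one zebrafish ortholog is altered in the same direction. This is an illustrative sketch, not the authors' pipeline; the function name, the data layout, and the placeholder gene identifiers (`H1`, `z1`, and so on) are hypothetical.

```python
def conserved_cna_genes(human_cna, zebrafish_cna, orthologs):
    """Return human CNA genes whose direction is shared by a fish ortholog.

    human_cna:     dict mapping human gene -> "gain" or "loss"
    zebrafish_cna: dict mapping zebrafish gene -> "gain" or "loss"
    orthologs:     dict mapping human gene -> list of zebrafish orthologs
    """
    kept = {}
    for human_gene, direction in human_cna.items():
        for fish_gene in orthologs.get(human_gene, ()):
            if zebrafish_cna.get(fish_gene) == direction:
                kept[human_gene] = direction  # co-gained or co-lost
                break
    return kept


# Toy input with placeholder identifiers (not real genes).
human_cna = {"H1": "gain", "H2": "gain", "H3": "loss", "H4": "loss"}
zebrafish_cna = {"z1": "gain", "z2": "loss", "z3b": "loss"}
orthologs = {"H1": ["z1"], "H2": ["z2"], "H3": ["z3a", "z3b"]}

candidates = conserved_cna_genes(human_cna, zebrafish_cna, orthologs)
```

In this toy input, `H2` is dropped because its ortholog changes in the opposite direction, and `H4` is dropped because it has no listed ortholog. A real analysis would also have to decide how to treat one-to-many orthologs and genes with no ortholog at all.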
Project description:Cancer genome sequencing efforts are leading to the identification of genetic mutations in many types of malignancy. However, the majority of these genetic alterations have been considered random passengers that do not directly contribute to tumorigenesis. We have previously conducted a soft agar-based short hairpin RNA (shRNA) screen within colorectal cancer (CRC) candidate driver genes (CAN-genes) using a karyotypically diploid hTERT- and CDK4-immortalized human colonic epithelial cell (HCEC) model and discovered that depletion of 65 of the 151 CAN-genes enhanced anchorage-independent growth in HCECs with ectopic expression of K-Ras(V12) and/or TP53 knockdown. We now constructed an interaction map of the confirmed CAN-genes with CRC non-CAN-genes and screened for functional tumor suppressors. Remarkably, depletion of 15 out of 25 presumed passenger genes that interact with confirmed CAN-genes (60%) promoted soft agar growth in HCECs with TP53 knockdown compared to only 7 out of 55 (12.5%) of presumed passenger genes that do not interact. We have thus demonstrated a pool of driver mutations among the putative CRC passenger/incidental mutations, establishing the importance of employing biological filters, in addition to bioinformatics, to identify driver mutations.
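One way to check whether the difference between the two hit counts reported above (15 of 25 interacting genes versus 7 of 55 non-interacting genes) could plausibly arise by chance is a one-sided Fisher's exact test. The description does not say which test, if any, the authors used, so the sketch below is only an illustration; the function name is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the top-left cell under the hypergeometric
    null that rows and columns are independent.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of hits across both groups
    n = a + b + c + d     # total genes tested
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Rows: interacting vs. non-interacting genes; columns: hit vs. no hit.
p_value = fisher_one_sided(15, 25 - 15, 7, 55 - 7)
```

A small `p_value` would indicate that hits are enriched among the presumed passenger genes that interact with confirmed CAN-genes, under the assumption that each gene was tested independently.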