ABSTRACT: Current statistical models for assessing hotspot significance do not properly account for variation in site-specific mutability, thereby yielding many false positives. We thus (i) detail a Log-normal-Poisson (LNP) background model that accounts for this variability in a manner consistent with models of mutagenesis; (ii) use it to show that passenger hotspots arise from all common mutational processes; and (iii) apply it to a ~10,000-patient cohort to nominate driver hotspots with far fewer false positives than conventional methods. Overall, we show that many cancer hotspot mutations recurring at the same genomic site across multiple tumors are actually passenger events, recurring at inherently mutable genomic sites under no positive selection.
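The effect of site-to-site rate variation on apparent recurrence can be illustrated with a minimal sketch. This is not the authors' implementation: it only assumes that the per-site rate is log-normally distributed and that counts are Poisson given the rate, and the values of `mean_rate`, `sigma`, and `k` are hypothetical, chosen for illustration.

```python
import math
import random


def poisson_tail(rate, k):
    """Return P(X >= k) for X ~ Poisson(rate)."""
    cdf = sum(math.exp(-rate) * rate**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def lnp_tail(mean_rate, sigma, k, n_draws=20000, seed=0):
    """Return a Monte Carlo estimate of P(X >= k) under a log-normal-Poisson mix.

    log(rate) ~ Normal(mu, sigma^2), with mu chosen so that E[rate] = mean_rate.
    The Poisson tail is averaged over random draws of the site rate.
    """
    rng = random.Random(seed)
    mu = math.log(mean_rate) - 0.5 * sigma**2
    total = 0.0
    for _ in range(n_draws):
        rate = rng.lognormvariate(mu, sigma)
        total += poisson_tail(rate, k)
    return total / n_draws


if __name__ == "__main__":
    # Hypothetical values: same mean rate, with and without rate heterogeneity.
    mean_rate, sigma, k = 0.05, 1.5, 3
    print("Poisson tail:", poisson_tail(mean_rate, k))
    print("LNP tail:    ", lnp_tail(mean_rate, sigma, k))
```

With the same mean rate, the mixed model assigns a higher probability to seeing `k` or more mutations at one site, so a uniform-rate Poisson test would call some of those sites significant even without selection.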
Project description:High-affinity antibodies arise within weeks of infection from the evolution of B-cell receptors under selection to improve antigen recognition. This rapid adaptation is enabled by the distribution of highly mutable "hotspot" motifs in B-cell receptor genes. High mutability in antigen-binding regions (complementarity determining regions [CDRs]) creates variation in binding affinity, whereas low mutability in structurally important regions (framework regions [FRs]) may reduce the frequency of destabilizing mutations. During the response, loss of mutational hotspots and changes in their distribution across CDRs and FRs are predicted to compromise the adaptability of B-cell receptors, yet the contributions of different mechanisms to gains and losses of hotspots remain unclear. We reconstructed changes in anti-HIV B-cell receptor sequences and show that mutability losses were ~56% more frequent than gains in both CDRs and FRs, with the higher relative mutability of CDRs maintained throughout the response. At least 21% of the total mutability loss was caused by synonymous mutations. However, nonsynonymous substitutions caused most (79%) of the mutability loss in CDRs. Because CDRs also show strong positive selection, this result suggests that selection for mutations that increase binding affinity contributed to loss of mutability in antigen-binding regions. Although recurrent adaptation to evolving viruses could indirectly select for high mutation rates, we found no evidence of indirect selection to increase or retain hotspots. Our results suggest mutability losses are intrinsic to both the neutral and adaptive evolution of B-cell populations and might constrain their adaptation to rapidly evolving pathogens such as HIV and influenza.
Project description:BACKGROUND: APOBEC-driven mutagenesis and functional positive selection of mutated genes may synergistically drive the higher frequency of some hotspot driver mutations compared to other mutations within the same gene, as we reported for FGFR3 S249C. Only a few APOBEC-associated driver hotspot mutations have been identified in bladder cancer (BCa). Here, we systematically looked for and characterised APOBEC-associated hotspots in BCa. METHODS: We analysed 602 published exome-sequenced BCas, for some of which gene expression data were also available. APOBEC-associated hotspots were identified by motif mapping, mutation signature fitting and APOBEC-mediated mutagenesis comparison. Joint analysis of DNA hairpin stability and gene expression was performed to predict driver or passenger hotspots. Aryl hydrocarbon receptor (AhR) activity was calculated from the expression of its target genes. Effects of AhR knockout/inhibition on BCa cell viability were analysed. RESULTS: We established a panel of 44 APOBEC-associated hotspot mutations in BCa, which accounted for about half of the hotspot mutations. Fourteen of them overlapped with the hotspots found in other cancer types with high APOBEC activity. They mostly occurred in the DNA lagging-strand templates and the loops of DNA hairpins. APOBEC-associated hotspots were systematically more prevalent than the other mutations within each APOBEC-target gene, independently of their functional impact. A combined analysis of DNA loop stability and gene expression distinguished known passenger from known driver hotspot mutations in BCa, including loss-of-function mutations affecting tumour suppressor genes, and predicted new candidate drivers, such as AHR Q383H. We further characterised AHR Q383H as an activating driver mutation associated with high AhR activity in luminal tumours. High AhR activity was also found in tumours presenting amplifications of AHR and its co-receptor ARNT. We finally showed that BCa cells presenting those different genetic alterations were sensitive to AhR inhibition. CONCLUSIONS: Our study identified novel potential drivers within APOBEC-associated hotspot mutations in BCa, reinforcing the importance of APOBEC mutagenesis in BCa. It could allow a better understanding of BCa biology and aetiology and have clinical implications, such as AhR as a potential therapeutic target. Our results also challenge the dogma that all hotspot mutations are drivers and mostly gain-of-function mutations affecting oncogenes.
Project description:Determining the cancer type and molecular subtype has important clinical implications. The primary site is, however, unknown for some malignancies discovered in the metastatic stage. Moreover, liquid biopsies may be used to screen for tumoral DNA, which upon detection needs to be assigned to a site of origin. Classifiers based on genomic features are a promising approach to prioritize the tumor anatomical site, type and subtype. We examined the predictive ability of causal (driver) somatic mutations in this task, comparing it against global patterns of non-selected (passenger) mutations, including features based on regional mutation density (RMD). In the task of distinguishing 18 cancer types, the driver mutations (mutated oncogenes or tumor suppressors, pathways and hotspots) classified 36% of the patients to the correct cancer type. In contrast, the features based on passenger mutations did so at 92% accuracy, with similar contributions from the RMD and the trinucleotide mutation spectra. The RMD and the spectra covered distinct sets of patients with predictions. In particular, introducing the RMD features into a combined classification model increased the fraction of diagnosed patients by 50 percentage points (at 20% FDR). Furthermore, RMD was able to discriminate molecular subtypes and/or anatomical site of six major cancers. The advantage of passenger mutations was upheld under high rates of false negative mutation calls and with exome sequencing, even though overall accuracy decreased. We suggest whole genome sequencing is valuable for classifying tumors because it captures global patterns emanating from mutational processes, which are informative of the underlying tumor biology.
Project description:Repair of oxidative stress- and inflammation-induced DNA lesions by the base excision repair (BER) pathway prevents mutation, a form of genomic instability which is often observed in cancer as 'mutation hotspots'. This suggests that some sequences have inherent mutability, possibly due to sequence-related differences in repair. This study has explored intrinsic mutability as a consequence of sequence-specific repair of the lipid peroxidation-induced DNA adduct 1,N6-ethenoadenine (εA). For the first time, we observed significant delay in repair of εA at mutation hotspots in the tumor suppressor gene p53 compared to non-hotspots in live human hepatocytes and endothelial cells using an in-cell real-time PCR-based method. In-cell and in vitro mechanism studies revealed that this delay in repair was due to inefficient turnover of N-methylpurine-DNA glycosylase (MPG), which initiates BER of εA. We determined that the product dissociation rate of MPG at the hotspot codons was ~5-12-fold lower than at the non-hotspots, suggesting a previously unknown mechanism for slower repair at mutation hotspots and implicating sequence-related variability of DNA repair efficiency as responsible for mutation hotspot signatures.
Project description:Malaria hotspots, defined as areas where transmission intensity exceeds the average level, become more pronounced as transmission declines. Targeting hotspots may accelerate reductions in transmission and could be pivotal for malaria elimination. Determinants of hotspot location, particularly of their movement, are poorly understood. We used spatial statistical methods to identify foci of incidence of self-reported malaria in a large census population of 64,000 people, in 8,290 compounds over a 2.5-year study period. Regression models examine stability of hotspots and identify static and dynamic correlates with their location. Hotspot location changed over short time-periods, rarely recurring in the same area. Hotspots identified in spring versus fall season differed in their stability. Households located in a hotspot in the fall were more likely to be located in a hotspot the following fall (RR = 1.77, 95% CI: 1.66-1.89), but the opposite was true for compounds in spring hotspots (RR = 0.15, 95% CI: 0.08-0.28). Location within a hotspot was related to environmental and static household characteristics such as distance to roads or rivers. Human migration into a household was correlated with risk of hotspot membership, but the direction of the association differed based on the origin of the migration event.
Project description:Somatic hypermutation (SHM) of the immunoglobulin variable (IgV) loci is a key process in antibody affinity maturation. The enzyme activation-induced deaminase (AID) initiates SHM by creating C → U mismatches on single-stranded DNA (ssDNA). AID has preferential hotspot motif targets in the context of WRC/GYW (W = A/T, R = A/G, Y = C/T) and particularly at WGCW overlapping hotspots, where hotspots appear opposite each other on both strands. Subsequent recruitment of the low-fidelity DNA repair enzyme Polymerase eta (Polη) during mismatch repair creates additional mutations at WA/TW sites. Although there are more than 50 functional immunoglobulin heavy chain variable (IGHV) segments in humans, the fundamental differences between these genes and their ability to respond to all possible foreign antigens are still poorly understood. To better understand this, we generated profiles of WGCW hotspots in each of the human IGHV genes and found the expected high frequency in complementarity determining regions (CDRs) that encode the antigen binding sites but also an unexpectedly high frequency of WGCW in certain framework (FW) sub-regions. Principal Components Analysis (PCA) of these overlapping AID hotspot profiles revealed that one major difference between IGHV families is the presence or absence of WGCW in a sub-region of FW3 sometimes referred to as "CDR4." Further differences between members of each family (e.g., IGHV1) are primarily determined by their WGCW densities in CDR1. We previously suggested that the co-localization of AID overlapping and Polη hotspots was associated with high mutability of certain IGHV sub-regions, such as the CDRs. To evaluate the importance of this feature, we extended the WGCW profiles, combining them with local densities of Polη (WA) hotspots, thus describing the co-localization of both types of hotspots across all IGHV genes. We also verified that co-localization is associated with higher mutability. PCA of the co-localization profiles showed CDR1 and CDR2 as being the main contributors to variance among IGHV genes, consistent with the importance of these sub-regions in antigen binding. Our results suggest that AID overlapping (WGCW) hotspots alone or in conjunction with Polη (WA/TW) hotspots are key features of evolutionary variation between IGHV genes.
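The hotspot motifs named above (WRC, GYW, WGCW, with W = A/T, R = A/G, Y = C/T) can be located in a sequence with a short scan. The sketch below is only an illustration of motif matching, not the profiling pipeline used in the study; the function names and the example sequence are hypothetical.

```python
import re

# IUPAC codes used in the abstract: W = A/T, R = A/G, Y = C/T.
IUPAC = {"W": "[AT]", "R": "[AG]", "Y": "[CT]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC.get(base, base) for base in motif.upper())
    # A lookahead consumes no characters, so overlapping matches are reported.
    return re.compile(f"(?=({body}))")


def find_motif(seq, motif):
    """Return the 0-based start positions of every occurrence of the motif."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]


if __name__ == "__main__":
    example = "AGCTAGCA"  # hypothetical sequence, for illustration only
    for motif in ("WRC", "GYW", "WGCW"):
        print(motif, find_motif(example, motif))
```

Counting the returned positions within a window or an annotated sub-region would give a simple per-region hotspot density.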
Project description:Cancer drivers require statistical modeling to distinguish them from passenger events, which accumulate during tumorigenesis but provide no fitness advantage to cancer cells. The discovery of driver genes and mutations relies on the assumption that exact positional recurrence is unlikely by chance; thus, the precise sharing of mutations across patients identifies drivers. Examining the mutation landscape in cancer genomes, we found that many recurrent cancer mutations previously designated as drivers are likely passengers. Our integrated bioinformatic and biochemical analyses revealed that these passenger hotspot mutations arise from the preference of APOBEC3A, a cytidine deaminase, for DNA stem-loops. Conversely, recurrent APOBEC-signature mutations not in stem-loops are enriched in well-characterized driver genes and may predict new drivers. This demonstrates that mesoscale genomic features need to be integrated into computational models aimed at identifying mutations linked to diseases.
Project description:In some species, meiotic recombination is concentrated in small genomic regions. These "recombination hotspots" leave signatures in fine-scale patterns of linkage disequilibrium, raising the prospect that the genomic landscape of hotspots can be characterized from sequence variation. This approach has led to the inference that hotspots evolve rapidly in some species, but are conserved in others. Historic demographic events, such as population bottlenecks, are known to affect patterns of linkage disequilibrium across the genome, violating population genetic assumptions of this approach. Although such events are prevalent, demographic history is generally ignored when making inferences about the evolution of recombination hotspots. To determine the effect of demography on the detection of recombination hotspots, we use the coalescent to simulate haplotypes with a known recombination landscape. We measure the ability of popular linkage disequilibrium-based programs to detect hotspots across a range of demographic histories, including population bottlenecks, hidden population structure, population expansions, and population contractions. We find that demographic events have the potential to greatly reduce the power and increase the false positive rate of hotspot discovery. Neither the power nor the false positive rate of hotspot detection can be predicted without also knowing the demographic history of the sample. Our results suggest that ignoring demographic history likely overestimates the power to detect hotspots and therefore underestimates the degree of hotspot sharing between species. We suggest strategies for incorporating demographic history into population genetic inferences about recombination hotspots.
Project description:About 40% of the hotspots for meiotic recombination contain the degenerate consensus sequence 5'-CCNCCNTNNCCNC-3'. Here we present a novel protocol for enriching hotspot sequences from digested genomic DNA by using biotinylated oligonucleotides and streptavidin-coated magnetic beads. The captured hotspots can be released by simple digestion with restriction enzymes for subsequent characterization by second-generation sequencing or PCR. The capture protocol specifically enriches hotspot sequences, as judged by using fluorophore-conjugated synthetic oligonucleotides and synthetic double-stranded oligonucleotides in combination with PCR. The capture protocol enriches single-stranded DNA, denatured double-stranded DNA, and large fragments (>3000 bp) of digested plasmid DNA with good efficacy. No false positives or false negatives were detected when enriching digested DNA from human cell cultures and primary human cells. The protocol can probably be adapted to enriching sequences other than the hotspot sequence by altering the sequence in the capture oligonucleotide. We intend to apply this protocol in studies assessing effects of micronutrient status on meiotic recombination events in human sperm.
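As a computational counterpart to the capture step, the degenerate consensus 5'-CCNCCNTNNCCNC-3' can be searched for on both strands of a sequence. This is a minimal sketch under the assumption that N stands for any of A, C, G or T; the function names and test sequences are hypothetical and are not part of the described protocol.

```python
import re

# N is assumed to match any single base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_consensus(seq, motif="CCNCCNTNNCCNC"):
    """Return sorted (start, strand) hits of the motif on both strands.

    Start positions are 0-based and always refer to the given (plus) strand.
    """
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        body = "".join(IUPAC[base] for base in pattern)
        regex = re.compile(f"(?=({body}))")  # lookahead keeps overlapping hits
        hits.extend((m.start(), strand) for m in regex.finditer(seq.upper()))
    return sorted(hits)


if __name__ == "__main__":
    print(scan_consensus("AACCACCGTAACCACTT"))  # hypothetical plus-strand hit
    print(scan_consensus("AAGTGGTTACGGTGGTT"))  # hypothetical minus-strand hit
```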
Project description:The role of secondary structures and base mutability at different levels of transcription and supercoiling is analyzed in variable region antibody genes VH5, VH94 and VH186.2. The data are consistent with a model of somatic hypermutation in which increasing levels of transcription and secondary structure stability correlate with the initial formation of successive mutable sites. Encoded differences exist in stem length and the number of GC pairs at low versus high levels of transcription in CDRs. These circumstances simplify the complexities of coordinating mutagenesis by confining this process to each mutable site successively, as they form in response to increasing levels of transcription during affinity maturation.