The molecular genealogy of sequential overlapping inversions implies both homologous chromosomes of a heterokaryotype in an inversion origin.
ABSTRACT: Cytological and molecular studies have revealed that inversion chromosomal polymorphism is widespread across taxa and that inversions are among the most common structural changes fixed between species. Two major mechanisms have been proposed for the origin of inversions, depending on whether breaks occur at repetitive or non-homologous sequences. While inversions originating through the first mechanism might have a multiple origin, those originating through the latter mechanism would have a unique origin. Variation at regions flanking inversion breakpoints can be informative on the origin and history of inversions given the reduced recombination in heterokaryotypes. Here, we have analyzed nucleotide variation at a fragment flanking the most centromere-proximal shared breakpoint of several sequential overlapping inversions of the E chromosome of Drosophila subobscura (inversions E1, E2, E9 and E3). The molecular genealogy inferred from variation at this shared fragment does not exhibit the branching pattern expected according to the sequential origin of inversions. The detected discordance between the molecular and cytological genealogies has led us to consider a novel possibility for the origin of an inversion, and more specifically that one of these inversions originated on a heterokaryotype for chromosomal arrangements. Based on this premise, we propose three new models for the origin of inversions.
Project description:Background: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. Result: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using long-read sub-alignment of long-read sequencing data. We benchmark npInv against other tools on both simulated and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR-mediated inversions in NA12878 with flanking IR less than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near-linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats. Conclusion: The application of npInv shows high accuracy on both simulated and real data. The results give deeper insight into inversions.
Project description:Genome sequence comparison across the Drosophila genus revealed that some fixed inversion breakpoints had been multiply reused at this long timescale. Cytological studies of Drosophila inversion polymorphism had previously shown that, also at this shorter timescale, some breakpoints had been multiply reused. The paucity of molecularly characterized polymorphic inversion breakpoints has so far precluded contrasting whether cytologically shared breakpoints of these relatively young inversions are actually reused at the molecular level. The E chromosome of Drosophila subobscura stands out because it presents several inversion complexes. This is the case of the E1+2+9+3 arrangement that originated from the ancestral Est arrangement through the sequential accumulation of four inversions (E1, E2, E9 and E3) sharing some breakpoints. We recently identified the breakpoints of inversions E1 and E2, which allowed establishing reuse at the molecular level of the cytologically shared breakpoint of these inversions. Here, we identified and sequenced the breakpoints of inversions E9 and E3, because they share breakpoints at sections 58D and 64C with those of inversions E1 and E2. This has allowed establishing that E9 and E3 originated through the staggered-break mechanism. Most importantly, sequence comparison has revealed the multiple reuse at the molecular level of the proximal breakpoint (section 58D), which would have been used at least by inversions E2, E9 and E3. In contrast, the distal breakpoint (section 64C) might have been only reused once by inversions E1 and E2, because the distal E3 breakpoint is displaced >70 kb from the other breakpoint limits.
Project description:Chromosomal inversions are structural changes that alter gene order but generally not gene content in the affected region. In Drosophila, extensive cytological studies revealed the widespread character of inversion polymorphism, with evidence for its adaptive character. In Drosophila subobscura, polymorphism affects both its four large autosomal elements and its X (A) chromosome. The characterization of the breakpoints of eight of these autosomal inversions revealed that most of them originated through the staggered-breaks mechanism. Here, we have performed chromosomal walks to identify the breakpoints of two widely distributed X-chromosome inversions of D. subobscura (A2 and A1). Inversion A2 is considered a warm-adapted arrangement that exhibits parallel latitudinal clines in the species' ancestral distribution area and in both American subcontinents, whereas inversion A1 is only present in the Palearctic region, where it presents an east-west cline. The duplication detected at the A2 inversion breakpoints is consistent with its origin by the staggered-breaks mechanism. Inversion A1 breakpoints could not be molecularly identified even though they could be narrowly delimited. This result points to the limitations of chromosome walking when the genome of another species is used as a guide. These limitations stem from the rate of evolution by paracentric inversions, which in Drosophila is highest for the X chromosome.
Project description:Chromosomal inversions can contribute to the adaptation of organisms to their environment by capturing particular advantageous allelic combinations of a set of genes included in the inverted fragment and also by advantageous functional changes due to the inversion process itself that might affect not only the expression of flanking genes but also their dose and structure. Of the two mechanisms originating inversions (ectopic recombination, and staggered double-strand breaks and subsequent repair), only the latter confers on the inversion the potential to have dosage effects and/or to generate advantageous chimeric genes. In Drosophila subobscura, there is ample evidence for the adaptive character of its chromosomal polymorphism, with an important contribution of some warm-climate arrangements such as E1+2+9+12. Here, we have characterized the breakpoints of inversion E12 and established that it originated through the staggered-break mechanism, like four of the five inversions of D. subobscura previously studied. This mechanism, which also predominates in the D. melanogaster lineage, might be prevalent in the Sophophora subgenus and contribute to the adaptive character of the polymorphic and fixed inversions of its species. Finally, we have shown that the D. subobscura inversion breakpoint regions have generally been disrupted by additional structural changes that occurred at different time scales.
Project description:Genomic inversions are an increasingly recognized source of genetic variation. However, a lack of reliable high-throughput genotyping assays for these structures has precluded a full understanding of an inversion's phylogenetic, phenotypic, and population genetic properties. We characterize these properties for one of the largest polymorphic inversions in man (the ~4.5-Mb 8p23.1 inversion), a structure that encompasses numerous signals of natural selection and disease association. We developed and validated a flexible bioinformatics tool that utilizes SNP data to enable accurate, high-throughput genotyping of the 8p23.1 inversion. This tool was applied retrospectively to diverse genome-wide data sets, revealing significant population stratification that largely follows a clinal "serial founder effect" distribution model. Phylogenetic analyses establish the inversion's ancestral origin within the Homo lineage, indicating that 8p23.1 inversion has occurred independently in the Pan lineage. The human inversion breakpoint was localized to an inverted pair of human endogenous retrovirus elements within the large, flanking low-copy repeats; experimental validation of this breakpoint confirmed these elements as the likely intermediary substrates that sponsored inversion formation. In five data sets, mRNA levels of disease-associated genes were robustly associated with inversion genotype. Moreover, a haplotype associated with systemic lupus erythematosus was restricted to the derived inversion state. We conclude that the 8p23.1 inversion is an evolutionarily dynamic structure that can now be accommodated into the understanding of human genetic and phenotypic diversity.
Project description:Chromosomal inversions are a ubiquitous feature of genetic variation. Theoretical models describe several mechanisms by which inversions can drive adaptation and be maintained as polymorphisms. While inversions have been shown previously to be under selection, or contain genetic variation under selection, the specific phenotypic consequences of inversions leading to their maintenance remain unclear. Here we use genomic sequence and expression data from the Drosophila Genetic Reference Panel (DGRP) to explore the effects of two cosmopolitan inversions, In(2L)t and In(3R)Mo, on patterns of transcriptional variation. We demonstrate that each inversion has a significant effect on transcript abundance for hundreds of genes across the genome. Inversion-affected loci (IAL) appear both within inversions as well as on unlinked chromosomes. Importantly, IAL do not appear to be influenced by the previously reported genome-wide expression correlation structure. We found that five genes involved with sterol uptake, four of which are Niemann-Pick Type 2 orthologs, are upregulated in flies with In(3R)Mo but do not have SNPs in linkage disequilibrium (LD) with the inversion. We speculate that this upregulation is driven by genetic variation in mod(mdg4) that is in LD with In(3R)Mo. We find that there is little evidence for a regional or position effect of inversions on gene expression at the chromosomal level, but do find evidence for the distal breakpoint of In(3R)Mo interrupting one gene and possibly disassociating the two flanking genes from regulatory elements.
Project description:In recent years different types of structural variants (SVs) have been discovered in the human genome and their functional impact has become increasingly clear. Inversions, however, are poorly characterized and more difficult to study, especially those mediated by inverted repeats or segmental duplications. Here, we describe the results of a simple and fast inverse PCR (iPCR) protocol for high-throughput genotyping of a wide variety of inversions using a small amount of DNA. In particular, we analyzed 22 inversions predicted in humans ranging from 5.1 kb to 226 kb and mediated by inverted repeat sequences of 1.6-24 kb. First, we validated 17 of the 22 inversions in a panel of nine HapMap individuals from different populations, and we genotyped them in 68 additional individuals of European origin, with correct genetic transmission in ~12 mother-father-child trios. Global inversion minor allele frequency varied between 1% and 49% and inversion genotypes were consistent with Hardy-Weinberg equilibrium. By analyzing the nucleotide variation and the haplotypes in these regions, we found that only four inversions have linked tag-SNPs and that in many cases there are multiple shared SNPs between standard and inverted chromosomes, suggesting an unexpectedly high degree of inversion recurrence during human evolution. iPCR was also used to check 16 of these inversions in four chimpanzees and two gorillas, and 10 showed both orientations either within or between species, providing additional support for their multiple origin. Finally, we have identified several inversions that include genes in the inverted or breakpoint regions, and at least one disrupts a potential coding gene. Thus, these results represent a significant advance in our understanding of inversion polymorphism in human populations and challenge the common view of a single origin of inversions, with important implications for inversion analysis in SNP-based studies.
Project description:Inversion polymorphisms have occupied a privileged place in Drosophila genetic research since their discovery in the 1920s. Indeed, inversions seem to be nearly ubiquitous, and the majority of species that have been thoroughly surveyed have been found to be polymorphic for one or more chromosomal inversions. Despite enduring interest, however, inversions remain difficult to study because their effects are often cryptic, and few efficient assays have been developed. Even in Drosophila melanogaster, in which inversions can be reliably detected and have received considerable attention, the breakpoints of only three inversions have been characterized molecularly. Hence, inversion detection and assay design remain important unsolved problems. Here, we present a method for identification and local de novo assembly of inversion breakpoints using next-generation paired-end reads derived from D. melanogaster isofemale lines. PCR and cytological confirmations demonstrate that our method can reliably assemble inversion breakpoints, providing tools for future research on D. melanogaster inversions as well as a framework for detection and assay design of inversions and other chromosome aberrations in diverse taxa.
Project description:Alternative arrangements of chromosome 2 inversions in Anopheles gambiae are important sources of population structure, and are associated with adaptation to environmental heterogeneity. The forces responsible for their origin and maintenance are incompletely understood. Molecular characterization of inversion breakpoints provides insight into how they arose, and provides the basis for development of molecular karyotyping methods useful in future studies. Sequence comparison of regions near the cytological breakpoints of 2Rb allowed the molecular delineation of breakpoint boundaries. Comparisons were made between the standard 2R+b arrangement in the An. gambiae PEST reference genome and the inverted 2Rb arrangements in the An. gambiae M and S genome assemblies. Sequence differences between alternative 2Rb arrangements were exploited in the design of a PCR diagnostic assay, which was evaluated against the known chromosomal banding pattern of laboratory colonies and field-collected samples from Mali and Cameroon. The breakpoints of the 7.55 Mb 2Rb inversion are flanked by extensive runs of the same short (72 bp) tandemly organized sequence, which was likely responsible for chromosomal breakage and rearrangement. Application of the molecular diagnostic assay suggested that 2Rb has a single common origin in An. gambiae and its sibling species, Anopheles arabiensis, and also that the standard arrangement (2R+b) may have arisen twice through breakpoint reuse. The molecular diagnostic was reliable when applied to laboratory colonies, but its accuracy was lower in natural populations. The complex repetitive sequence flanking the 2Rb breakpoint region may be prone to structural and sequence-level instability. The 2Rb molecular diagnostic has immediate application in studies based on laboratory colonies, but its usefulness in natural populations awaits development of complementary molecular tools.
Project description:Cytological studies revealed that the number of chromosomes and their organization vary across species. The increasing availability of whole genome sequences of multiple species across specific phylogenies has confirmed and greatly extended these cytological observations. In the Drosophila genus, the ancestral karyotype consists of five rod-like acrocentric chromosomes (Muller elements A to E) and one dot-like chromosome (element F), each exhibiting a generally conserved gene content. Chromosomal fusions and paracentric inversions are thus the major contributors, respectively, to chromosome number variation among species and to gene order variation within chromosomal elements. The subobscura cluster of Drosophila consists of three species that retain the genus ancestral karyotype and differ by a reduced number of fixed inversions. Here, we have used cytological information and the D. guanche genome sequence to identify and molecularly characterize the breakpoints of inversions that became fixed since the D. guanche-D. subobscura split. Our results have led us to propose a modified version of the D. guanche cytological map of its X chromosome, and to establish (i) that most inversions became fixed in the D. subobscura lineage and (ii) the order in which the four X chromosome overlapping inversions occurred and became fixed.