An Alu transposition model for the origin and expansion of human segmental duplications.
ABSTRACT: Relative to genomes of other sequenced organisms, the human genome appears particularly enriched for large, highly homologous segmental duplications (≥90% sequence identity and ≥10 kbp in length). The molecular basis for this enrichment is unknown. We sought to gain insight into the mechanism of origin by systematically examining sequence features at the junctions of duplications. We analyzed 9,464 junctions within regions of high-quality finished sequence from a genomewide set of 2,366 duplication alignments. We observed a highly significant (P<.0001) enrichment of Alu short interspersed element (SINE) sequences near or within the junction. Twenty-seven percent of all segmental duplications terminated within an Alu repeat. The Alu junction enrichment was most pronounced for interspersed segmental duplications separated by ≥1 Mb of intervening sequence. Alu elements at the junctions showed higher levels of divergence, consistent with Alu-Alu-mediated recombination events. When we classified Alu elements into major subfamilies, younger elements (AluY and AluS) accounted for the enrichment, whereas the oldest primate family (AluJ) showed no enrichment. We propose that the primate-specific burst of Alu retroposition activity (which occurred 35-40 million years ago) sensitized the ancestral human genome for Alu-Alu-mediated recombination events, which, in turn, initiated the expansion of gene-rich segmental duplications and their subsequent role in nonallelic homologous recombination.
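The junction-enrichment idea in the abstract above can be illustrated with a small sketch. It is not the authors' procedure, which the abstract does not describe. It assumes a simple one-sided binomial test against a background Alu fraction of about 10% of the genome. The coordinates, the function names `count_in_intervals` and `binomial_upper_tail`, and the toy data are all hypothetical.

```python
from bisect import bisect_right
from math import comb

def count_in_intervals(positions, intervals):
    """Count positions that fall inside any half-open [start, end) interval.

    `intervals` must be sorted by start and non-overlapping.
    """
    starts = [start for start, _ in intervals]
    hits = 0
    for pos in positions:
        i = bisect_right(starts, pos) - 1  # rightmost interval starting at or before pos
        if i >= 0 and pos < intervals[i][1]:
            hits += 1
    return hits

def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical toy data: Alu annotations and duplication junctions on one contig.
alu_intervals = [(100, 400), (1000, 1300), (5000, 5300)]
junctions = [150, 390, 400, 1200, 1299, 2500, 5100, 7000]

# Assumed background: roughly 10% of bases lie within Alu elements.
background_alu_fraction = 0.10

hits = count_in_intervals(junctions, alu_intervals)
total = len(junctions)
p_value = binomial_upper_tail(hits, total, background_alu_fraction)
print(f"{hits}/{total} junctions in Alu, one-sided binomial P = {p_value:.2e}")
```

A real analysis would more likely compare the observed junctions with randomly placed ones, matched for local sequence composition, because Alu density varies widely across the genome.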
Project description:Low-copy repeats, or segmental duplications, are highly dynamic regions in the genome. The low-copy repeats on chromosome 22q11.2 (LCR22) are a complex mosaic of genes and pseudogenes formed by duplication processes; they mediate chromosome rearrangements associated with velo-cardio-facial syndrome/DiGeorge syndrome, der(22) syndrome, and cat-eye syndrome. The ability to trace the substrates and products of recombination events provides a unique opportunity to identify the mechanisms responsible for shaping LCR22s. We examined the genomic sequence of known LCR22 genes and their duplicated derivatives. We found Alu (SINE) elements at the breakpoints in the substrates and at the junctions in the truncated products of recombination for USP18, GGT, and GGTLA, consistent with Alu-mediated unequal crossing-over events. In addition, we were able to trace a likely interchromosomal Alu-mediated fusion between IGSF3 on 1p13.1 and GGT on 22q11.2. Breakpoints occurred inside Alu elements as well as in the 5' or 3' ends of them. A possible stimulus for the 5' or 3' terminal rearrangements may be the high sequence similarities between different Alu elements, combined with a potential recombinogenic role of retrotransposon target-site duplications flanking the Alu element, containing potentially kinkable DNA sites. Such sites may represent focal points for recombination. Thus, genome shuffling by Alu-mediated rearrangements has contributed to genome architecture during primate evolution.
Project description:We describe genomic structures of 59 X-chromosome segmental duplications that include the proteolipid protein 1 gene (PLP1) in patients with Pelizaeus-Merzbacher disease. We provide the first report of 13 junction sequences, which gives insight into underlying mechanisms. Although proximal breakpoints were highly variable, distal breakpoints tended to cluster around low-copy repeats (LCRs) (50% of distal breakpoints), and each duplication event appeared to be unique (100 kb to 4.6 Mb in size). Sequence analysis of the junctions revealed no large homologous regions between proximal and distal breakpoints. Most junctions had microhomology of 1-6 bases, and one had a 2-base insertion. Boundaries between single-copy and duplicated DNA were identical to the reference genomic sequence in all patients investigated. Taken together, these data suggest that the tandem duplications are formed by a coupled homologous and nonhomologous recombination mechanism. We suggest repair of a double-stranded break (DSB) by one-sided homologous strand invasion of a sister chromatid, followed by DNA synthesis and nonhomologous end joining with the other end of the break. This is in contrast to other genomic disorders that have recurrent rearrangements formed by nonallelic homologous recombination between LCRs. Interspersed repetitive elements (Alu elements, long interspersed nuclear elements, and long terminal repeats) were found at 18 of the 26 breakpoint sequences studied. No specific motif that may predispose to DSBs was revealed, but single or alternating tracts of purines and pyrimidines that may cause secondary structures were common. Analysis of the 2-Mb region susceptible to duplications identified proximal-specific repeats and distal LCRs in addition to the previously reported ones, suggesting that the unique genomic architecture may have a role in nonrecurrent rearrangements by promoting instability.
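The microhomology reported at junctions in the abstract above (1-6 bases) can be measured as in the following minimal sketch. It assumes that the reference sequence on both sides of each breakpoint is known and that the junction joins the left side of one reference to the right side of the other. The function names and example sequences are hypothetical and are not taken from the study.

```python
def common_prefix_len(a, b):
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def microhomology_length(a_left, a_right, b_left, b_right):
    """Bases at a junction that could derive from either parental sequence.

    Reference A is a_left|a_right and reference B is b_left|b_right, where
    '|' marks the called breakpoint. The junction is a_left + b_right.
    Bases shared immediately to the right of both breakpoints, or immediately
    to the left of both, make the exact breakpoint position ambiguous and are
    counted as microhomology.
    """
    forward = common_prefix_len(a_right, b_right)
    backward = common_prefix_len(a_left[::-1], b_left[::-1])
    return backward + forward

# Hypothetical flanking sequences around two breakpoints.
example = microhomology_length("ACGTAC", "GGTTCA", "TTTTAC", "GGACCC")
print(example)
```

In this toy case the shared bases are "TAC" to the left of the breakpoints and "GG" to the right, so the breakpoint could be placed anywhere within a 5-base window.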
Project description:Alu repetitive elements are known to be major contributors to genome instability by generating Alu-mediated copy-number variants (CNVs). Most of the reported Alu-mediated CNVs are simple deletions and duplications, and the mechanism underlying Alu-Alu-mediated rearrangement has been attributed to non-allelic homologous recombination (NAHR). Chromosome 17 at the p13.3 genomic region lacks extensive low-copy repeat architecture; however, it is highly enriched for Alu repetitive elements, with a fraction of 30% of total sequence annotated in the human reference genome, compared with 10% genome-wide and 18% on chromosome 17. We conducted mechanistic studies of the 17p13.3 CNVs by performing high-density oligonucleotide array comparative genomic hybridization, specifically interrogating the 17p13.3 region with ~150 bp per probe density; CNV breakpoint junctions were mapped to nucleotide resolution by polymerase chain reaction and Sanger sequencing. Studied rearrangements include 5 interstitial deletions, 14 tandem duplications, 7 terminal deletions and 13 complex genomic rearrangements (CGRs). Within the 17p13.3 region, Alu-Alu-mediated rearrangements were identified in 80% of the interstitial deletions, 46% of the tandem duplications and 50% of the CGRs, indicating that this mechanism was a major contributor to the formation of breakpoint junctions. Our studies suggest that Alu repetitive elements facilitate formation of non-recurrent CNVs, CGRs and other structural aberrations of chromosome 17 at p13.3. The common observation of Alu-mediated rearrangement in CGRs, together with breakpoint junction sequence analysis, further demonstrates that this type of mechanism is unlikely to be attributable to NAHR, but rather may be due to a recombination-coupled DNA replicative repair process.
Project description:Duplicated sequences are important sources of genetic instability and in the evolution of new gene function within species. Hominids have a preponderance of intrachromosomal duplications organized in an interspersed fashion, as opposed to tandem duplications, which are common in other mammalian genomes such as mouse, dog, and cow. Multiple lines of evidence, including sequence divergence, comparative primate genomes, and fluorescence in situ hybridization (FISH) analyses, point to an excess of segmental duplications in the common ancestor of humans and African great apes. We find that much of the interspersed human duplication architecture within chromosomes is focused around common sequence elements referred to as "core duplicons." These cores correspond to the expansion of gene families, some of which show signatures of positive selection and lack orthologs present in other mammalian species. This genomic architecture predisposes apes and humans not only to extensive genetic diversity, but also to large-scale structural diversity mediated by nonallelic homologous recombination. In humans, many de novo large-scale genomic changes mediated by these duplications are associated with neuropsychiatric and neurodevelopmental disease. We propose that the disadvantage of a high rate of new mutations is offset by the selective advantage of newly minted genes within the cores.
Project description:BACKGROUND: Alu elements are short (approximately 300 bp) interspersed elements that amplify in primate genomes through a process termed retroposition. The expansion of these elements has had a significant impact on the structure and function of primate genomes. Approximately 10% of the mass of the human genome is composed of Alu elements, making them the most abundant short interspersed element (SINE) in our genome. The majority of Alu amplification occurred early in primate evolution, and the current rate of Alu retroposition is at least 100-fold slower than the peak of amplification that occurred 30-50 million years ago. Alu elements are therefore a rich source of inter- and intra-species primate genomic variation. RESULTS: A total of 153 Alu elements from the Ye subfamily were extracted from the draft sequence of the human genome. Analysis of these elements resulted in the discovery of two new Alu subfamilies, Ye4 and Ye6, complementing the previously described Ye5 subfamily. DNA sequence analysis of each of the Alu Ye subfamilies yielded average age estimates of approximately 14, approximately 13 and approximately 9.5 million years old for the Alu Ye4, Ye5 and Ye6 subfamilies, respectively. In addition, 120 Alu Ye4, Ye5 and Ye6 loci were screened using polymerase chain reaction (PCR) assays to determine their phylogenetic origin and levels of human genomic diversity. CONCLUSION: The Alu Ye lineage appears to have started amplifying relatively early in primate evolution and continued propagating at a low level as many of its members are found in a variety of hominoid (humans, greater and lesser ape) genomes. Detailed sequence analysis of several Alu pre-integration sites indicated that multiple types of events had occurred, including gene conversions, near-parallel independent insertions of different Alu elements and Alu-mediated genomic deletions. A potential hotspot for Alu insertion in the Fer1L3 gene on chromosome 10 was also identified.
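Subfamily age estimates like those quoted above are commonly derived by dividing the average divergence of member elements from their consensus by a neutral substitution rate. The abstract does not state which rate or which sites were used, so the sketch below is only an illustration. The default rate of 0.0015 substitutions per site per million years, the function names and the toy sequences are all assumptions.

```python
def mean_divergence(consensus, copies):
    """Average fraction of mismatching positions between each copy and the consensus.

    Assumes every copy is already aligned to the consensus without gaps and
    has the same length; a real analysis would need a proper alignment.
    """
    per_copy = []
    for seq in copies:
        if len(seq) != len(consensus):
            raise ValueError("copies must be the same length as the consensus")
        mismatches = sum(1 for a, b in zip(consensus, seq) if a != b)
        per_copy.append(mismatches / len(consensus))
    return sum(per_copy) / len(per_copy)

def estimate_age_myr(divergence, rate_per_site_per_myr=0.0015):
    """Age in million years = divergence / substitution rate (rate is an assumed value)."""
    return divergence / rate_per_site_per_myr

# Hypothetical toy alignment, far shorter than a real ~300 bp Alu element.
toy_consensus = "ACGTACGTAC"
toy_copies = ["ACGTACGTAC", "ACGTACGTAA", "ACCTACGTAC"]
toy_divergence = mean_divergence(toy_consensus, toy_copies)
print(f"toy divergence = {toy_divergence:.4f}")
print(f"age at 2.1% divergence = {estimate_age_myr(0.021):.1f} Myr")
```

Under the assumed rate, a mean divergence of 2.1% corresponds to about 14 million years. The result scales inversely with the chosen rate, so the assumption matters.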
Project description:The long interspersed element-1 (LINE-1 or L1) and Alu elements are the most abundant mobile elements, comprising 21% and 11% of the human genome, respectively. Since the divergence of human and chimpanzee lineages, these elements have vigorously created chromosomal rearrangements causing genomic differences between humans and chimpanzees by either increasing or decreasing the size of the genome. Here, we report an exotic mechanism, retrotransposon recombination-mediated inversion (RRMI), that usually does not alter the amount of genomic material present. Through the comparison of the human and chimpanzee draft genome sequences, we identified 252 inversions whose respective inversion junctions can clearly be characterized. Our results suggest that L1 and Alu elements cause chromosomal inversions by either forming a secondary structure or providing a fragile site for double-strand breaks. The detailed analysis of the inversion breakpoints showed that L1 and Alu elements are responsible for at least 44% of the 252 inversion loci between human and chimpanzee lineages, including 49 RRMI loci. Among them, three RRMI loci inverted exonic regions in known genes, which implicates this mechanism in generating the genomic and phenotypic differences between human and chimpanzee lineages. This study is the first comprehensive analysis of mobile element-based inversion breakpoints between human and chimpanzee lineages, and highlights their role in primate genome evolution.
Project description:Alus are the most abundant and successful short interspersed nuclear elements found in primate genomes. In humans, they represent about 10% of the genome, although few are retrotransposition-competent and are clustered into subfamilies according to the source gene from which they evolved. Recombination between them can lead to genomic rearrangements of clinical and evolutionary significance. In this study, we have addressed the role of recombination in the origin of chimeric Alu source genes by the analysis of all known consensus sequences of human Alus. From the allelic diversity of Alu consensus sequences, validated in extant elements resulting from whole genome searches, distinct events of recombination were detected in the origin of particular subfamilies of AluS and AluY source genes. These results demonstrate that at least two subfamilies are likely to have emerged from ectopic Alu-Alu recombination, which stimulates further research regarding the potential of chimeric active Alus to punctuate the genome.
Project description:BACKGROUND: Segmental duplications (SDs) on 22q11.2 (LCR22) serve as substrates for meiotic non-allelic homologous recombination (NAHR) events resulting in several clinically significant genomic disorders. RESULTS: To understand the duplication activity leading to the complicated SD structure of this region, we have applied the A-Bruijn graph algorithm to decompose the 22q11.2 SDs into 523 fundamental duplication sequences, termed subunits. Cross-species syntenic analysis of primate genomes demonstrates that many of these LCR22 subunits emerged very recently, especially those implicated in human genomic disorders. Some subunits have expanded more actively than others, and young Alu SINEs are associated much more frequently with duplicated sequences that have undergone active expansion, confirming their role in mediating recombination events. Many copy number variations (CNVs) exist on 22q11.2, some flanked by SDs. Interestingly, two chromosome breakpoints for 13 CNVs (mean length 65 kb) are located in paralogous subunits, providing direct evidence that SD subunits could contribute to CNV formation. Sequence analysis of PACs or BACs identified extra CNVs, specifically, 10 insertions and 18 deletions within 22q11.2; four were more than 10 kb in size and most contained young AluYs at their breakpoints. CONCLUSIONS: Our study indicates that AluYs are implicated in the past and current duplication events, and moreover suggests that DNA rearrangements in 22q11.2 genomic disorders perhaps do not occur randomly but involve both actively expanded duplication subunits and Alu elements.
Project description:About 5% of the human genome consists of segmental duplications or low-copy repeats, which are large, highly homologous (>95%) fragments of sequence. It has been estimated that these segmental duplications emerged during the past approximately 35 million years (Myr) of human evolution and that they correlate with chromosomal rearrangements. Williams-Beuren syndrome (WBS) is a segmental aneusomy syndrome that is the result of a frequent de novo deletion at 7q11.23, mediated by large (approximately 400-kb) region-specific complex segmental duplications composed of different blocks. We have precisely defined the structure of the segmental duplications on human 7q11.23 and characterized the copy number and structure of the orthologous regions in other primates (macaque, orangutan, gorilla, and chimpanzee). Our data indicate a recent origin and rapid evolution of the 7q11.23 segmental duplications, starting before the diversification of hominoids (approximately 12-16 million years ago [Mya]), with species-specific duplications and intrachromosomal rearrangements that lead to significant differences among those genomes. Alu sequences are located at most edges of the large hominoid-specific segmental duplications, suggesting that they might have facilitated evolutionary rearrangements. We propose a mechanistic model based on Alu-mediated duplicated transposition along with nonallelic homologous recombination for the generation and local expansion of the segmental duplications. The extraordinary rate of evolutionary turnover of this region, rich in segmental duplications, results in important genomic variation among hominoid species, which could be of functional relevance and predispose to disease.
Project description:BACKGROUND: Alu elements are Short INterspersed Elements (SINEs) in primate genomes that have proven useful as markers for studying genome evolution, population biology and phylogenetics. Most of these applications, however, have been limited to humans and their nearest relatives, chimpanzees. In an effort to expand our understanding of Alu sequence evolution and to increase the applicability of these markers to non-human primate biology, we have analyzed available Alu sequences for loci specific to platyrrhine (New World) primates. RESULTS: Branching patterns along an Alu sequence phylogeny indicate three major classes of platyrrhine-specific Alu sequences. Sequence comparisons further reveal at least three New World monkey-specific subfamilies: AluTa7, AluTa10, and AluTa15. Two of these subfamilies appear to be derived from a gene conversion event that has produced a recently active fusion of AluSc- and AluSp-type elements. This is a novel mode of origin for new Alu subfamilies. CONCLUSION: The use of Alu elements as genetic markers in studies of genome evolution, phylogenetics, and population biology has been very productive when applied to humans. The characterization of these three new Alu subfamilies not only increases our understanding of Alu sequence evolution in primates, but also opens the door to the application of these genetic markers outside the hominid lineage.