Coalescence and genetic diversity in sexual populations under selection.
ABSTRACT: In sexual populations, selection operates neither on the whole genome, which is repeatedly taken apart and reassembled by recombination, nor on individual alleles that are tightly linked to the chromosomal neighborhood. The resulting interference between linked alleles reduces the efficiency of selection and distorts patterns of genetic diversity. Inference of evolutionary history from diversity shaped by linked selection requires an understanding of these patterns. Here, we present a simple but powerful scaling analysis identifying the unit of selection as the genomic "linkage block" with a characteristic length determined in a self-consistent manner by the condition that the rate of recombination within the block is comparable to the fitness differences between different alleles of the block. We find that an asexual model with the strength of selection tuned to that of the linkage block provides an excellent description of genetic diversity and the site frequency spectra when compared with computer simulations. This linkage block approximation is accurate for the entire spectrum of strength of selection and is particularly powerful in scenarios with many weakly selected loci. The latter limit allows us to characterize coalescence, genetic diversity, and the speed of adaptation in the infinitesimal model of quantitative genetics.
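The self-consistency condition described in the abstract can be sketched as a short scaling argument. The notation here is an assumption introduced for illustration and does not come from the abstract: \rho is the recombination rate per unit map length, \xi_b the block length, L the total map length, \sigma^2 the genome-wide fitness variance, and \sigma_b the typical fitness difference between alleles of one block. The sketch further assumes that fitness variance is spread evenly along the genome.

```latex
% Assumed notation (not taken from the abstract):
%   \rho      recombination rate per unit map length
%   \xi_b     length of the linkage block
%   L         total map length
%   \sigma^2  genome-wide fitness variance
%   \sigma_b  typical fitness difference between alleles of one block
\begin{align}
  \sigma_b^2 &\approx \sigma^2 \, \frac{\xi_b}{L}, \\
  \rho \, \xi_b &\sim \sigma_b = \sigma \sqrt{\frac{\xi_b}{L}}
  \quad\Longrightarrow\quad
  \xi_b \sim \frac{\sigma^2}{\rho^2 L}.
\end{align}
```

Under these assumptions, stronger selection (larger \sigma) lengthens the block and more frequent recombination (larger \rho) shortens it, which is the qualitative behavior the abstract describes.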
Project description:Under neutrality, linkage disequilibrium results from physically linked sites having nonindependent coalescent histories. In obligately sexual organisms, meiotic recombination is the dominant force separating linked variants from one another, and thus in determining the decay of linkage disequilibrium with physical distance. In facultatively sexual diploid organisms that principally reproduce clonally, mechanisms of mitotic exchange are expected to become relatively more important in shaping linkage disequilibrium. Here we outline mathematical and computational models of a facultative-sex coalescent process that includes meiotic and mitotic recombination, via both crossovers and gene conversion, to determine how linkage disequilibrium is affected by facultative sex. We demonstrate that the degree to which linkage disequilibrium is broken down by meiotic recombination simply scales with the probability of sex if it is sufficiently high (much greater than 1/N for population size N). However, with very rare sex (occurring with frequency on the order of 1/N), mitotic gene conversion plays a particularly important and complicated role because it both breaks down associations between sites and removes within-individual diversity. Strong population structure under rare sex leads to lower average linkage disequilibrium values than in panmictic populations, due to the influence of low-frequency polymorphisms created by allelic sequence divergence acting in individual subpopulations. These analyses provide information on how to interpret observed linkage disequilibrium patterns in facultative sexuals and to determine what genomic forces are likely to shape them.
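For readers unfamiliar with how linkage disequilibrium between two sites is quantified, the following is a minimal sketch of the standard coefficient D and the squared correlation r^2 for a pair of biallelic sites. It is not the coalescent model described above. The function name and the 0/1 allele coding are assumptions made for illustration.

```python
from collections import Counter


def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    `haplotypes` is a sequence of (a, b) tuples, one per sampled
    chromosome, with each allele coded 0 or 1 (an assumed encoding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    counts = Counter(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at site B
    p_ab = counts[(1, 1)] / n                 # frequency of the 1-1 haplotype

    d = p_ab - p_a * p_b                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either site is monomorphic in the sample.
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Complete association: only the 1-1 and 0-0 haplotypes are present.
d, r2 = linkage_disequilibrium([(1, 1), (1, 1), (0, 0), (0, 0)])
print(d, r2)
```

Recombination, whether meiotic or mitotic, creates the missing haplotype classes and drives D toward zero, which is the breakdown of association that the models above track.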
Project description:Genetic diversity is unusually high at loci in the S-locus region of the self-incompatible species of the flowering plant, Arabidopsis lyrata, not just in the S loci themselves, but also at two nearby loci. In a previous study of a single natural population from Iceland, we attributed this elevated polymorphism to linkage disequilibrium (LD) between variants at loci close to the S locus and the S alleles, which are maintained in the population by balancing selection. With the four S-flanking loci whose diversity we previously studied, we could not determine the extent of the region linked to the S loci in which neutral sites are affected. We also could not exclude the possibility of a population bottleneck, or of admixture, as causes of the LD. We have now studied four more distant loci flanking the S-locus region, and more populations, and we analyze the results using a theoretical model of the effect of balancing selection on diversity at linked neutral sites within and between different functional S-allelic classes. In the model, diversity is a function of the number of selectively maintained alleles and the recombination distances from the selectively maintained sites. We use the model to estimate the number of different functional S alleles, their turnover rate, and recombination rates between the S-locus region and other loci. Our estimates suggest that there is a small region of very low recombination surrounding the S-locus region.
Project description:In Brassica species, self-incompatibility is controlled genetically by haplotypes involving two known genes, SLG and SRK, and possibly an as yet unknown gene controlling pollen incompatibility types. Alleles at the incompatibility loci are maintained by frequency-dependent selection, and diversity at SLG and SRK appears to be very ancient, with high diversity at silent and replacement sites, particularly in certain "hypervariable" portions of the genes. It is important to test whether recombination occurs in these genes before inferences about function of different parts of the genes can be made from patterns of diversity within their sequences. In addition, it has been suggested that, to maintain the relationship between alleles within a given S-haplotype, recombination is suppressed in the S-locus region. The high diversity makes many population genetic measures of recombination inapplicable. We have analyzed linkage disequilibrium within the SLG gene of two Brassica species, using published coding sequences. The results suggest that intragenic recombination has occurred in the evolutionary history of these alleles. This is supported by patterns of synonymous nucleotide diversity within both the SLG and SRK genes, and between domains of the SRK gene. Finally, clusters of linkage disequilibrium within the SLG gene suggest that hypervariable regions are under balancing selection, and are not merely regions of relaxed selective constraint.
Project description:The organization and allelic recombination of the merozoite surface protein-1 gene of Plasmodium vivax (PvMsp-1), the most widely prevalent human malaria parasite, were evaluated in complete nucleotide sequences of 40 isolates from various geographic areas. Alignment of 31 distinct alleles revealed the mosaic organization of PvMsp-1, consisting of seven interallele conserved blocks flanked by six variable blocks. The variable blocks showed extensive variation in repeats and nonrepeat unique sequences. Numerous recombination sites were distributed throughout PvMsp-1, in both conserved blocks and variable block unique sequences, and the distribution was not uniform. Heterozygosity of PvMsp-1 alleles was higher in Asia (0.953 +/- 0.009) than in Brazil (0.813 +/- 0.047). No identical alleles were shared between Asia and Brazil, whereas all but one variable block nonrepeat sequence found in Brazil occurred in Asia. These observations suggest that P. vivax populations in Asia are ancestral to Brazilian populations, and that PvMsp-1 has heterogeneity in frequency of allelic recombination events. Recurrent origins of new PvMsp-1 alleles by repeated recombination events were supported by a rapid decline in linkage disequilibrium between pairs of synonymous sites with increasing nucleotide distance, with little linkage disequilibrium at a distance of over 3 kb in a P. vivax population from Thailand, evidence for an effectively high recombination rate of the parasite. Meanwhile, highly reduced nucleotide diversity was noted in a region encoding the 19-kDa C-terminal epidermal growth factor-like domain of merozoite surface protein-1, a vaccine candidate.
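The heterozygosity values quoted above summarize how likely two randomly drawn isolates are to carry different alleles. The abstract does not state which estimator was used, so the following is only a sketch of one common choice, the unbiased expected heterozygosity computed from allele counts. The function name and the example counts are hypothetical.

```python
def expected_heterozygosity(allele_counts):
    """Unbiased expected heterozygosity, n/(n-1) * (1 - sum of p_i^2).

    `allele_counts` lists how many sampled isolates carry each distinct
    allele. This estimator is an assumption; the source abstract does
    not specify the one actually used.
    """
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    sum_sq = sum((c / n) ** 2 for c in allele_counts)
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: four isolates split evenly between two alleles.
print(expected_heterozygosity([2, 2]))
```

When every sampled isolate carries a distinct allele the estimate reaches 1, and when all isolates share one allele it is 0, so values near 0.95 indicate that almost every pair of isolates differs.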
Project description:Evolutionary genetic studies have shown a positive correlation between levels of nucleotide diversity and either rates of recombination or genetic distance to genes. Both positive-directional and purifying selection have been offered as the source of these correlations via genetic hitchhiking and background selection, respectively. Phylogenetically conserved elements (CEs) are short (~100 bp), widely distributed (comprising ~5% of the genome) sequences that are often found far from genes. While the function of many CEs is unknown, CEs also are associated with reduced diversity at linked sites. Using high coverage (>80×) whole genome data from two human populations, the Yoruba and the CEU, we perform fine scale evaluations of diversity, rates of recombination, and linkage to genes. We find that the local rate of recombination has a stronger effect on levels of diversity than linkage to genes, and that these effects of recombination persist even in regions far from genes. Our whole genome modeling demonstrates that, rather than recombination or GC-biased gene conversion, selection on sites within or linked to CEs better explains the observed genomic diversity patterns. A major implication is that very few sites in the human genome are predicted to be free of the effects of selection. These sites, which we refer to as the human "neutralome," comprise only 1.2% of the autosomes and 5.1% of the X chromosome. Demographic analysis of the neutralome reveals larger population sizes and lower rates of growth for ancestral human populations than inferred by previous analyses.
Project description:Patterns of nucleotide sequence diversity are analyzed for three duplicate alcohol dehydrogenase loci (adh1-adh3) within a species-wide sample of 25 accessions of wild barley (Hordeum vulgare ssp. spontaneum). The adh1 and adh2 loci are tightly linked (recombination fraction <0.01) while the adh3 locus is inherited independently. Wild barley is predominantly self-fertilizing (approximately 98%), and as a consequence, effective recombination is restricted by the extreme reduction in heterozygosity. Large reductions in effective recombination, in turn, widen the conditions for linkage to influence nucleotide sequence diversity through the action of selective sweeps or background selection. These considerations would appear to predict (1) homogeneity in patterns of nucleotide sequence diversity, especially between closely linked loci, and (2) extensive linkage disequilibrium relative to random-mating species. In contrast to these expectations, the wild barley data reveal heterogeneity in patterns of nucleotide sequence diversity and levels of linkage disequilibrium that are indistinguishable from those observed at adh1 in maize, an outbreeding grass species.
Project description:Recombination shapes nucleotide variation within genomes. Patterns are thought to arise from the local recombination landscape, influencing the degree to which neutral variation experiences hitchhiking with selected variation. This study examines DNA polymorphism along Chromosome 4 (element B) of Drosophila americana to identify effects of hitchhiking arising as a consequence of Y-linked transmission. A centromeric fusion between the X and 4th chromosomes segregates in natural populations of D. americana. Frequency of the X-4 fusion exhibits a strong positive correlation with latitude, which has explicit consequences for unfused 4th chromosomes. Unfused Chromosome 4 exists as a non-recombining Y chromosome or as an autosome proportional to the frequency of the X-4 fusion. Furthermore, Y linkage along the unfused 4 is disrupted as a function of the rate of recombination with the centromere. Inter-population and intra-chromosomal patterns of nucleotide diversity were assayed using six regions distributed along unfused 4th chromosomes derived from populations with different frequencies of the X-4 fusion. No difference in overall level of nucleotide diversity was detected among populations, yet variation along the chromosome exhibits a distinct pattern in relation to the X-4 fusion. Sequence diversity is inflated at loci experiencing the strongest Y linkage. These findings are inconsistent with the expected reduction in nucleotide diversity resulting from hitchhiking due to background selection or selective sweeps. In contrast, excessive polymorphism is accruing in association with transient Y linkage, and furthermore, hitchhiking with sexually antagonistic alleles is potentially responsible.
Project description:BACKGROUND: Sex-determination genes drive the evolution of adjacent chromosomal regions. Sexually antagonistic selection favors the accumulation of inversions that reduce recombination in regions adjacent to the sex-determination gene. Once established, the clonal inheritance of sex-linked inversions leads to the accumulation of deleterious alleles, repetitive elements and a gradual decay of sex-linked genes. This in turn creates selective pressures for the evolution of mechanisms that compensate for the unequal dosage of gene expression. Here we use whole genome sequencing to characterize the structure of a young sex chromosome and quantify sex-specific gene expression in the developing gonad. RESULTS: We found an 8.8 Mb block of strong differentiation between males and females that corresponds to the location of a previously mapped sex-determiner on linkage group 1 of Oreochromis niloticus. Putatively disruptive mutations are found in many of the genes within this region. We also found a significant female-bias in the expression of genes within the block of differentiation compared to those outside the block of differentiation. Eight candidate sex-determination genes were identified within this region. CONCLUSIONS: This study demonstrates a block of differentiation on linkage group 1, suggestive of an 8.8 Mb inversion encompassing the sex-determining locus. The enrichment of female-biased gene expression inside the proposed inversion suggests incomplete dosage compensation. This study helps establish a model for studying the early-to-intermediate stages of sex chromosome evolution.
Project description:Much of quantitative genetics is based on the 'infinitesimal model', under which selection has a negligible effect on the genetic variance. This is typically justified by assuming a very large number of loci with additive effects. However, it applies even when genes interact, provided that the number of loci is large enough that selection on each of them is weak relative to random drift. In the long term, directional selection will change allele frequencies, but even then, the effects of epistasis on the ultimate change in trait mean due to selection may be modest. Stabilising selection can maintain many traits close to their optima, even when the underlying alleles are weakly selected. However, the number of traits that can be optimised is apparently limited to ~4Ne by the 'drift load', and this is hard to reconcile with the apparent complexity of many organisms. Just as for the mutation load, this limit can be evaded by a particular form of negative epistasis. A more robust limit is set by the variance in reproductive success. This suggests that selection accumulates information most efficiently in the infinitesimal regime, when selection on individual alleles is weak, and comparable with random drift. A review of evidence on selection strength suggests that although most variance in fitness may be because of alleles with large Nes, substantial amounts of adaptation may be because of alleles in the infinitesimal regime, in which epistasis has modest effects.
Project description:Recombination confers a major evolutionary advantage by breaking up linkage disequilibrium between harmful and beneficial mutations, thereby facilitating selection. However, in species that are only periodically sexual, such as many microbial eukaryotes, the realized rate of recombination is also affected by the frequency of sex, meaning that infrequent sex can increase the effects of selection at linked sites despite high recombination rates. Despite this, the rate of sex of most facultatively sexual species is unknown. Here, we use genomewide patterns of linkage disequilibrium to infer fine-scale recombination rate variation in the genome of the facultatively sexual green alga Chlamydomonas reinhardtii. We observe recombination rate variation of up to two orders of magnitude and find evidence of recombination hotspots across the genome. Recombination rate is highest flanking genes, consistent with trends observed in other nonmammalian organisms, though intergenic recombination rates vary by intergenic tract length. We also find a positive relationship between nucleotide diversity and physical recombination rate, suggesting a widespread influence of selection at linked sites in the genome. Finally, we use estimates of the effective rate of recombination to calculate the rate of sex that occurs in natural populations, estimating a sexual cycle roughly every 840 generations. We argue that the relatively infrequent rate of sex and large effective population size creates a population genetic environment that increases the influence of selection on linked sites across the genome.