Project description:fwdpp is a C++ library of routines intended to facilitate the development of forward-time simulations under arbitrary mutation and fitness models. The library design provides a combination of speed, low memory overhead, and modeling flexibility not currently available from other forward simulation tools. The library is particularly useful when the simulation of large populations is required, as programs implemented using the library are much more efficient than other available forward simulation programs.
Project description:Due to the increasing power of personal computers, as well as the availability of flexible forward-time simulation programs like simuPOP, it is now possible to simulate the evolution of complex human diseases using a forward-time approach. This approach is potentially more powerful than the coalescent approach since it allows simulations of more than one disease susceptibility locus using almost arbitrary genetic and demographic models. However, the application of such simulations has been deterred by the lack of a suitable simulation framework. For example, it is not clear when and how to introduce disease mutants, especially those under purifying selection, to an evolving population, and how to control the disease allele frequencies at the last generation. In this paper, we introduce a forward-time simulation framework that allows us to generate large multi-generation populations with complex diseases caused by unlinked disease susceptibility loci, according to specified demographic and evolutionary properties. Unrelated individuals, or small or large pedigrees, can be drawn from the resulting population to provide samples for a wide range of study designs and ascertainment methods. We demonstrate our simulation framework using three examples that map genes associated with affection status, a quantitative trait, and the age of onset of a hypothetical cancer, respectively. Nonadditive fitness models, population structure, and gene-gene interactions are simulated. Case-control, sibpair, and large pedigree samples are drawn from the simulated populations and are examined by a variety of gene-mapping methods.
Project description:Motivation: The analysis of the evolutionary dynamics of a population with many polymorphic loci is challenging, as a large number of possible genotypes needs to be tracked. In the absence of analytical solutions, forward computer simulations are an important tool in multi-locus population genetics. The run time of standard algorithms to simulate sexual populations increases as $8^L$ with the number of loci $L$, or with the square of the population size $N$. Results: We have developed algorithms to simulate large populations with arbitrary genetic maps, including multiple crossovers, with a run time that scales as $3^L$. If the number of crossovers is restricted to at most one, the run time is reduced to $L 2^L$. The algorithm is based on an analogue of the Fast Fourier Transform (FFT) and allows for arbitrary fitness functions (i.e. any epistasis). In addition, we include a streamlined individual-based framework. The library is implemented as a collection of C++ classes and a Python interface.
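The abstract above does not spell out its FFT analogue. One standard transform of this kind is the fast Walsh-Hadamard transform, which processes a function defined on all 2^L genotypes of L biallelic loci in roughly L·2^L additions, compared with 4^L for the direct sum. The following is a minimal sketch under that assumption, not the library's code: the function names are hypothetical, and genotypes are assumed to be indexed by an L-bit integer.

```python
def walsh_hadamard(values):
    """Unnormalised fast Walsh-Hadamard transform of a list of length 2**L.

    values[g] is a quantity (e.g. fitness) for the genotype whose alleles
    are the bits of the integer g. Cost: L * 2**L additions/subtractions.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:  # one pass per locus
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b  # butterfly step
        half *= 2
    return out


def naive_transform(values):
    """Direct evaluation of the same transform, for comparison (cost 4**L)."""
    n = len(values)
    return [
        sum(v * (-1) ** bin(g & k).count("1") for g, v in enumerate(values))
        for k in range(n)
    ]


if __name__ == "__main__":
    fitness = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical values for L = 3 loci
    print(walsh_hadamard(fitness) == naive_transform(fitness))
```

Applying the transform twice returns the input scaled by 2^L, so the same routine also serves as its own inverse up to normalisation.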
Project description:Summary: forqs is a forward-in-time simulation of recombination, quantitative traits and selection. It was designed to investigate haplotype patterns resulting from scenarios where substantial evolutionary change has taken place in a small number of generations due to recombination and/or selection on polygenic quantitative traits. Availability and implementation: forqs is implemented as a command-line C++ program. Source code and binary executables for Linux, OSX and Windows are freely available under a permissive BSD license: https://bitbucket.org/dkessner/forqs.
Project description:Background: Forward-time simulations have unique advantages in power and flexibility for the simulation of genetic samples of complex human diseases because they can closely mimic the evolution of human populations carrying these diseases. However, a number of methodological and computational constraints have prevented the power of this simulation method from being fully explored in existing forward-time simulation methods. Results: Using a general-purpose forward-time population genetics simulation environment, we developed a forward-time simulation method that can be used to simulate realistic samples for genome-wide association studies. We examined the properties of this simulation method by comparing simulated samples with real data and demonstrated its wide applicability using four examples, including a simulation of case-control samples with a disease caused by multiple interacting genetic and environmental factors, a simulation of trio families affected by a disease-predisposing allele that had been subjected to either slow or rapid selective sweep, and a simulation of a structured population resulting from recent population admixture. Conclusions: Our algorithm simulates populations that closely resemble the complex structure of the human genome, while allowing the introduction of signals of natural selection. Because of its flexibility to generate different types of samples with arbitrary disease or quantitative trait models, this simulation method can simulate realistic samples to evaluate the performance of a wide variety of statistical gene mapping methods for genome-wide association studies.
Project description:The notion of fitness landscapes, a map between genotype and fitness, was proposed more than 80 years ago. For most of this time data was only available for a few alleles, and thus we had only a restricted view of the whole fitness landscape. Recently, advances in genetics and molecular biology allow a more detailed view of them. Here we review experimental and theoretical studies of fitness landscapes of functional RNAs, especially aptamers and ribozymes. We find that RNA structures can be divided into critical structures, connecting structures, neutral structures and forbidden structures. Such characterisation, coupled with theoretical sequence-to-structure predictions, allows us to construct the whole fitness landscape. Fitness landscapes then can be used to study evolution, and in our case the development of the RNA world.
Project description:Epistatic interactions among genes can give rise to rugged fitness landscapes, in which multiple "peaks" of high-fitness allele combinations are separated by "valleys" of low-fitness genotypes. How populations traverse rugged fitness landscapes is a long-standing question in evolutionary biology. Sexual reproduction may affect how a population moves within a rugged fitness landscape. Sex may generate new high-fitness genotypes by recombination, but it may also destroy high-fitness genotypes by shuffling the genes of a fit parent with a genetically distinct mate, creating low-fitness offspring. Either of these opposing aspects of sex requires genotypic diversity in the population. Spatially structured populations may harbor more diversity than well-mixed populations, potentially amplifying both positive and negative effects of sex. On the other hand, spatial structure leads to clumping in which mating is more likely to occur between like types, diminishing the effects of recombination. In this study, we use computer simulations to investigate the combined effects of recombination and spatial structure on adaptation in rugged fitness landscapes. We find that spatially restricted mating and offspring dispersal may allow multiple genotypes inhabiting suboptimal peaks to coexist, and recombination at the "sutures" between the clusters of these genotypes can create genetically novel offspring. Sometimes such an offspring genotype inhabits a new peak on the fitness landscape. In such a case, spatially restricted mating allows this fledgling subpopulation to avoid recombination with distinct genotypes, as mates are more likely to be the same genotype. Such population "centers" can allow nascent peaks to establish despite recombination. Spatial structure may therefore allow an evolving population to enjoy the creative side of sexual recombination while avoiding its destructive side.
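The ideas of "peaks", "valleys" and the destructive side of recombination in the abstract above can be illustrated on a tiny landscape. This is a minimal sketch, not the study's simulation: the two-locus fitness values and the function names are hypothetical, and a peak is taken to be a genotype fitter than all of its single-mutant neighbours.

```python
def neighbours(genotype):
    """Yield every genotype that differs from `genotype` at exactly one locus."""
    for i, allele in enumerate(genotype):
        yield genotype[:i] + (1 - allele,) + genotype[i + 1:]


def local_peaks(fitness):
    """Return the genotypes that are fitter than all single-mutant neighbours."""
    return sorted(
        g for g, w in fitness.items()
        if all(w > fitness[n] for n in neighbours(g))
    )


def recombinants(parent_a, parent_b, breakpoint):
    """Return the two offspring haplotypes from a single crossover."""
    return (
        parent_a[:breakpoint] + parent_b[breakpoint:],
        parent_b[:breakpoint] + parent_a[breakpoint:],
    )


# Hypothetical two-locus landscape with two peaks separated by a valley.
fitness = {(0, 0): 1.0, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 1.2}

print(local_peaks(fitness))             # -> [(0, 0), (1, 1)]
print(recombinants((0, 0), (1, 1), 1))  # -> ((0, 1), (1, 0))
```

In this toy case, crossing the two peak genotypes yields only valley genotypes, which is the destructive effect of sex described above. Mating between like types would leave each peak genotype intact.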
Project description:Whether or not evolution by natural selection is predictable depends on the existence of general patterns shaping the way mutations interact with the genetic background. This interaction, also known as epistasis, has been observed during adaptation (macroscopic epistasis) and in individual mutations (microscopic epistasis). Interestingly, a consistent negative correlation between the fitness effect of beneficial mutations and background fitness (known as diminishing returns epistasis) has been observed across different species and conditions. We tested whether the adaptation pattern of an additional species, Schizosaccharomyces pombe, followed the same trend. We used strains that differed by the presence of large karyotype differences and observed the same pattern of fitness convergence. Using these data along with published datasets, we measured the ability of different models to describe adaptation rates. We found that a phenotype-fitness landscape shaped like a power law is able to correctly predict adaptation dynamics in a variety of species and conditions. Furthermore we show that this model can provide a link between the observed macroscopic and microscopic epistasis. It may be very useful in the development of algorithms able to predict the adaptation of microorganisms from measures of the current phenotypes. Overall, our results suggest that even though adaptation quickly slows down, populations adapting to lab conditions may be quite far from a fitness peak.
Project description:While chromosomal rearrangements are ubiquitous in all domains of life, very little is known about their evolutionary significance, mostly because, apart from a few specifically studied and well-documented mechanisms (interaction with recombination, gene duplication, etc.), very few models take them into account. As a consequence, we lack a general theory to account for their direct and indirect contributions to evolution. Here, we propose Aevol, a forward-in-time simulation platform specifically dedicated to unravelling the evolutionary significance of chromosomal rearrangements (CR) compared to local mutations (LM). Using the platform, we evolve populations of organisms in four conditions characterized by an increasing diversity of mutational operators (from substitutions alone to a mix of substitutions, InDels and CR) but with a constant global mutational rate. By comparing the outcome in these four conditions, we show that CR, despite being almost invisible in the phylogeny owing to the scarcity of their fixation in the lineages, make a decisive contribution to the evolutionary dynamics. As expected, chromosomal rearrangements allow fast expansion of the gene repertoire through gene duplication, but they also reduce the effect of diminishing-returns epistasis, hence sustaining adaptation in the long run. Finally, we show that chromosomal rearrangements tightly regulate the size of the genome through indirect selection for reproductive robustness. Overall, these results confirm the need to improve our theoretical understanding of the contribution of chromosomal rearrangements to evolution and show that dedicated platforms like Aevol can efficiently contribute to this agenda.
Project description:There is an increasing demand for evolutionary models to incorporate relatively realistic dynamics, ranging from selection at many genomic sites to complex demography, population structure, and ecological interactions. Such models can generally be implemented as individual-based forward simulations, but the large computational overhead of these models often makes simulation of whole chromosome sequences in large populations infeasible. This situation presents an important obstacle to the field that requires conceptual advances to overcome. The recently developed tree-sequence recording method (Kelleher, Thornton, Ashander, & Ralph, 2018), which stores the genealogical history of all genomes in the simulated population, could provide such an advance. This method has several benefits: (1) it allows neutral mutations to be omitted entirely from forward-time simulations and added later, thereby dramatically improving computational efficiency; (2) it allows neutral burn-in to be constructed extremely efficiently after the fact, using "recapitation"; (3) it allows direct examination and analysis of the genealogical trees along the genome; and (4) it provides a compact representation of a population's genealogy that can be analysed in Python using the msprime package. We have implemented the tree-sequence recording method in SLiM 3 (a free, open-source evolutionary simulation software package) and extended it to allow the recording of non-neutral mutations, greatly broadening the utility of this method. To demonstrate the versatility and performance of this approach, we showcase several practical applications that would have been beyond the reach of previously existing methods, opening up new horizons for the modelling and exploration of evolutionary processes.
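The first benefit listed above, leaving neutral mutations out of the forward simulation and adding them afterwards, can be illustrated with a toy model. This is a minimal sketch under strong simplifying assumptions (haploid, non-recombining Wright-Fisher population, at most one mutation per branch). It is not the tree-sequence data structure itself and uses none of the SLiM or msprime APIs; all function names are hypothetical.

```python
import random


def simulate_genealogy(pop_size, generations, rng):
    """Forward simulation that records only parent-child links.

    parents[t][i] is the index, in generation t, of the parent of
    individual i in generation t + 1. No mutations are simulated here.
    """
    return [
        [rng.randrange(pop_size) for _ in range(pop_size)]
        for _ in range(generations)
    ]


def ancestral_lineages(parents, sample):
    """Trace the sample backwards in time.

    Returns a list of sets: the sample itself first, then its ancestors
    one generation back, and so on to the founding generation.
    """
    lineages = [set(sample)]
    for generation in reversed(parents):
        lineages.append({generation[i] for i in lineages[-1]})
    return lineages


def add_neutral_mutations(lineages, mu, rng):
    """Place mutations only on branches ancestral to the sample.

    Each branch (an ancestral individual and its link to its parent)
    mutates with probability mu. Returns (generation, individual) pairs.
    """
    last = len(lineages) - 1
    mutations = []
    for steps_back, individuals in enumerate(lineages[:-1]):
        for i in sorted(individuals):
            if rng.random() < mu:
                mutations.append((last - steps_back, i))
    return mutations


if __name__ == "__main__":
    rng = random.Random(1)
    parents = simulate_genealogy(pop_size=50, generations=200, rng=rng)
    lineages = ancestral_lineages(parents, sample=range(5))
    branches = sum(len(s) for s in lineages[:-1])
    print(branches, "ancestral branches out of", 50 * 200, "simulated births")
    print(len(add_neutral_mutations(lineages, mu=0.01, rng=rng)), "mutations added")
```

Because lineages coalesce going backwards in time, the number of ancestral branches is typically far smaller than the number of individuals simulated, which is why deferring neutral mutations saves work.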