Correction: Quantitative reconstruction of weaning ages in archaeological human populations using bone collagen nitrogen isotope ratios and approximate Bayesian computation.
Project description:Background: Nitrogen isotope analysis of bone collagen has been used to reconstruct the breastfeeding practices of archaeological human populations. However, weaning ages have been estimated subjectively because of a lack of both information on subadult bone collagen turnover rates and appropriate analytical models. Methodology: Temporal changes in human subadult bone collagen turnover rates were estimated from data on tissue-level bone metabolism reported in previous studies. A model for reconstructing precise weaning ages was then developed within a framework of approximate Bayesian computation, incorporating the estimated turnover rates. The model is presented as a new open-source R package, WARN (Weaning Age Reconstruction with Nitrogen isotope analysis), which computes the age at the start and end of weaning, the 15N-enrichment from maternal to infant tissue, and the δ15N value of collagen synthesized entirely from weaning foods, together with their posterior probabilities. The model was applied to 39 previously reported Holocene skeletal populations from around the world, and the results were compared with weaning ages observed in ethnographic studies. Conclusions: There were no significant differences in the age at the end of weaning between the archaeological (2.80±1.32 years) and ethnographic populations. Comparison among the archaeological populations suggests that weaning ages did not differ with the type of subsistence practiced (i.e., hunting-gathering or not). Most of the estimated 15N-enrichment values (2.44±0.90‰) were consistent with biologically valid values. The nitrogen isotope ratios of subadults after the weaning process were lower than those of adults in most of the archaeological populations (-0.48±0.61‰), and this depletion was greater in non-hunter-gatherer populations. Our results suggest that the breastfeeding period in humans had already been shortened by the early Holocene compared with that of extant great apes.
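To make the weaning-age idea above concrete, the following is a minimal ABC-rejection sketch in Python: it fits a piecewise-linear δ15N weaning curve to hypothetical subadult data and keeps the parameter draws whose simulated values lie closest to the observations. The ages, δ15N values, priors, and the curve itself are assumptions for illustration only; the WARN package's actual model additionally accounts for bone-collagen turnover.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical cross-sectional data: subadult ages (years) and bone-collagen d15N (permil).
ages = np.array([0.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0])
d15n_obs = np.array([11.8, 12.0, 11.9, 11.2, 10.5, 9.9, 9.6, 9.4, 9.3])
adult_mean = 9.5  # assumed adult female d15N baseline

def weaning_curve(t, t_start, t_end, enrich, d15n_food):
    """Piecewise-linear d15N trajectory from fully breastfed to fully weaned."""
    top = adult_mean + enrich
    frac = np.clip((t - t_start) / max(t_end - t_start, 1e-6), 0.0, 1.0)
    return (1 - frac) * top + frac * d15n_food

params, dists = [], []
for _ in range(50_000):
    t_start = rng.uniform(0.0, 3.0)              # prior: age at the start of weaning
    t_end = t_start + rng.uniform(0.1, 4.0)      # prior: age at the end of weaning
    enrich = rng.uniform(1.0, 4.0)               # prior: 15N-enrichment (permil)
    d15n_food = rng.uniform(8.0, 11.0)           # prior: collagen value from weaning foods
    sim = weaning_curve(ages, t_start, t_end, enrich, d15n_food)
    params.append((t_start, t_end, enrich, d15n_food))
    dists.append(np.sqrt(np.mean((sim - d15n_obs) ** 2)))

params, dists = np.array(params), np.array(dists)
post = params[dists <= np.quantile(dists, 0.01)]  # keep the closest 1% of simulations
print("posterior means (t_start, t_end, enrichment, weaning-food d15N):", post.mean(axis=0).round(2))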
Project description:Population sex ratios are of high ecological relevance, but are challenging to determine in species lacking conspicuous external cues indicating their sex. Acoustic sexing is an option if vocalizations differ between sexes, but is precluded by overlapping distributions of the values of male and female vocalizations in many species. A method allowing the inference of sex ratios despite such an overlap will therefore greatly increase the information extractable from acoustic data. To meet this demand, we developed a novel approach using Approximate Bayesian Computation (ABC) to infer the sex ratio of populations from acoustic data. Additionally, parameters characterizing the male and female distribution of acoustic values (mean and standard deviation) are inferred. This information is then used to probabilistically assign a sex to a single acoustic signal. We furthermore developed a simpler means of sex ratio estimation based on the exclusion of calls from the overlap zone. Applying our methods to simulated data demonstrates that sex ratio and acoustic parameter characteristics of males and females are reliably inferred by the ABC approach. Applying both the ABC and the exclusion method to empirical datasets (echolocation calls recorded in colonies of lesser horseshoe bats, Rhinolophus hipposideros) yields sex ratios similar to those obtained by molecular sexing. Our methods aim to facilitate evidence-based conservation, and to benefit scientists investigating ecological or conservation questions related to sex- or group-specific behaviour across a wide range of organisms emitting acoustic signals. The developed methodology is non-invasive, low-cost and time-efficient, thus allowing the study of many sites and individuals. We provide an R-script for the easy application of the method and discuss potential future extensions and fields of applications. The script can be easily adapted to account for numerous biological systems by adjusting the type and number of groups to be distinguished (e.g. age, social rank, cryptic species) and the acoustic parameters investigated.
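As a rough illustration of the overlap problem and the ABC solution described above, the sketch below infers the proportion of females, the sex-specific means, and a shared standard deviation from a simulated mixture of call frequencies. All numerical values, priors, and summary statistics are assumptions for the example (the higher-frequency component is arbitrarily labelled "female"); they are not the published method's exact choices.

import numpy as np

rng = np.random.default_rng(0)

# "Observed" calls: a hidden mixture of female and male peak frequencies (kHz).
true_ratio, n_calls = 0.6, 500
sex = rng.random(n_calls) < true_ratio
obs = np.where(sex, rng.normal(111.0, 1.2, n_calls), rng.normal(109.5, 1.2, n_calls))

def summarize(x):
    # Quantiles capture the location of the two modes and the mixture weight.
    return np.quantile(x, [0.1, 0.25, 0.5, 0.75, 0.9])

s_obs = summarize(obs)
params, dists = [], []
for _ in range(50_000):
    ratio = rng.uniform(0.0, 1.0)                          # prior on the proportion of females
    mu_f, mu_m = np.sort(rng.uniform(105, 115, 2))[::-1]   # higher-frequency component labelled "female"
    sd = rng.uniform(0.5, 2.5)                             # shared spread (a simplification)
    z = rng.random(n_calls) < ratio
    sim = np.where(z, rng.normal(mu_f, sd, n_calls), rng.normal(mu_m, sd, n_calls))
    params.append((ratio, mu_f, mu_m, sd))
    dists.append(np.linalg.norm(summarize(sim) - s_obs))

params, dists = np.array(params), np.array(dists)
post = params[dists <= np.quantile(dists, 0.01)]           # keep the closest 1% of simulations
print("posterior mean proportion of females:", round(post[:, 0].mean(), 2))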
Project description:Approximate Bayesian computation (ABC) constitutes a class of computational methods rooted in Bayesian statistics. In all model-based statistical inference, the likelihood function is of central importance, since it expresses the probability of the observed data under a particular statistical model, and thus quantifies the support the data lend to particular values of parameters and to choices among different models. For simple models, an analytical formula for the likelihood function can typically be derived. However, for more complex models, an analytical formula might be elusive or the likelihood function might be computationally very costly to evaluate. ABC methods bypass the evaluation of the likelihood function. In this way, ABC methods widen the realm of models for which statistical inference can be considered. ABC methods are mathematically well-founded, but they inevitably make assumptions and approximations whose impact needs to be carefully assessed. Furthermore, the wider application domain of ABC exacerbates the challenges of parameter estimation and model selection. ABC has rapidly gained popularity in recent years, in particular for the analysis of complex problems arising in the biological sciences (e.g., in population genetics, ecology, epidemiology, and systems biology).
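The core ABC-rejection recipe described above can be stated in a few lines of Python. The example below pretends the Poisson likelihood is unavailable and recovers the rate purely by simulation; the model, prior, summary statistics, and tolerance are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(42)

obs = rng.poisson(lam=4.0, size=100)           # data whose likelihood we pretend is intractable
s_obs = np.array([obs.mean(), obs.var()])      # observed summary statistics

accepted = []
for _ in range(100_000):
    lam = rng.gamma(shape=2.0, scale=2.0)      # 1. draw a parameter from the prior
    sim = rng.poisson(lam=lam, size=obs.size)  # 2. simulate a dataset from the model
    s_sim = np.array([sim.mean(), sim.var()])
    if np.linalg.norm(s_sim - s_obs) < 0.5:    # 3. accept if the summaries are close enough
        accepted.append(lam)

print(f"ABC posterior mean: {np.mean(accepted):.2f} ({len(accepted)} of 100000 draws accepted)")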
Project description:Despite an increasingly vast literature on cophylogenetic reconstructions for studying host-parasite associations, understanding the common evolutionary history of such systems remains a problem that is far from being solved. Most algorithms for host-parasite reconciliation use an event-based model, where the events include in general (a subset of) cospeciation, duplication, loss, and host switch. All known parsimonious event-based methods then assign a cost to each type of event in order to find a reconstruction of minimum cost. The main problem with this approach is that the cost of the events strongly influences the reconciliation obtained. Some earlier approaches attempt to avoid this problem by finding a Pareto set of solutions and hence by considering event costs under some minimization constraints. To deal with this problem, we developed an algorithm, called Coala, for estimating the frequency of the events based on an approximate Bayesian computation approach. The benefits of this method are twofold: (i) it provides more confidence in the set of costs to be used in a reconciliation, and (ii) it allows estimation of the frequency of the events in cases where the data set consists of trees with a large number of taxa. We evaluate our method on simulated and biological data sets. We show that in both cases, for the same pair of host and parasite trees, different sets of frequencies for the events lead to equally probable solutions. Moreover, often these solutions differ greatly in terms of the number of inferred events. It appears crucial to take this into account before attempting any further biological interpretation of such reconciliations. More generally, we also show that the set of frequencies can vary widely depending on the input host and parasite trees. Indiscriminately applying a standard vector of costs may thus not be a good strategy.
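The sketch below conveys the spirit of estimating event frequencies by ABC, using a drastically simplified simulator: a parasite is evolved down a complete binary host tree, and at every host speciation each parasite lineage cospeciates, duplicates, host-switches, or is lost according to the frequency vector being inferred. The tree depth, summary statistics, and priors are invented for the example and bear no relation to Coala's simulator or its summaries.

import numpy as np

rng = np.random.default_rng(7)
DEPTH = 6                                        # host tree: complete binary tree with 2**DEPTH leaves

def simulate(freqs):
    """Return (number of parasite leaves, mean parasites per occupied host leaf)."""
    hosts = np.array([0])                        # one parasite lineage on the host root branch
    for level in range(DEPTH):
        new, n_branches = [], 2 ** (level + 1)
        for h in hosts:
            ev = rng.choice(4, p=freqs)          # 0 cospeciation, 1 duplication, 2 host switch, 3 loss
            if ev == 0:
                new += [2 * h, 2 * h + 1]        # cospeciation: follow both host children
            elif ev == 1:
                c = 2 * h + rng.integers(2)
                new += [c, c]                    # duplication: two copies on the same host child
            elif ev == 2:
                new += [2 * h + rng.integers(2), rng.integers(n_branches)]  # switch: one copy jumps to a random host
        hosts = np.array(new)                    # ev == 3 (loss): the lineage simply disappears
        if hosts.size == 0 or hosts.size > 5000:
            break
    if hosts.size == 0:
        return np.array([0.0, 0.0])
    per_host = np.unique(hosts, return_counts=True)[1]
    return np.array([hosts.size, per_host.mean()])

true_freqs = np.array([0.70, 0.10, 0.10, 0.10])
s_obs = simulate(true_freqs)                     # stand-in for the observed parasite tree
while s_obs[0] == 0:                             # redraw if the pseudo-observed parasite went extinct
    s_obs = simulate(true_freqs)

params, dists = [], []
for _ in range(10_000):
    f = rng.dirichlet([1.0, 1.0, 1.0, 1.0])      # flat prior over the event-frequency vector
    params.append(f)
    dists.append(np.linalg.norm((simulate(f) - s_obs) / (s_obs + 1.0)))

params, dists = np.array(params), np.array(dists)
post = params[dists <= np.quantile(dists, 0.01)]
print("posterior mean frequencies (cospeciation, duplication, switch, loss):", post.mean(axis=0).round(2))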
Project description:Recent technological advances may lead to the development of small-scale quantum computers capable of solving problems that cannot be tackled with classical computers. A limited number of algorithms have been proposed, and their relevance to real-world problems is a subject of active investigation. Analysis of many-body quantum systems is particularly challenging for classical computers due to the exponential scaling of Hilbert space dimension with the number of particles. Hence, problems relevant to chemistry and condensed matter physics are expected to be among the first successful applications of quantum computers. In this paper, we propose another class of problems from the quantum realm that can be solved efficiently on quantum computers: model inference for nuclear magnetic resonance (NMR) spectroscopy, which is important for biological and medical research. Our results are based on three interconnected studies. Firstly, we use methods from classical machine learning to analyze a dataset of NMR spectra of small molecules. We perform a stochastic neighbor embedding, identify clusters of spectra, and demonstrate that these clusters are correlated with the covalent structure of the molecules. Secondly, we propose a simple and efficient method, aided by a quantum simulator, to extract the NMR spectrum of any hypothetical molecule described by a parametric Heisenberg model. Thirdly, we propose a simple variational Bayesian inference procedure for extracting Hamiltonian parameters of experimentally relevant NMR spectra.
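For intuition on the forward model used in the second step, the sketch below classically extracts the "stick spectrum" of a tiny parametric Heisenberg-type spin system by exact diagonalization. It uses a two-spin toy molecule with arbitrary chemical shifts and J-coupling; the paper's point is that a quantum simulator would make this forward step tractable for much larger molecules.

import numpy as np
from itertools import combinations

# Single-spin operators (spin-1/2).
sx = np.array([[0, 0.5], [0.5, 0]])
sy = np.array([[0, -0.5j], [0.5j, 0]])
sz = np.array([[0.5, 0], [0, -0.5]])

def op(single, site, n):
    """Embed a single-spin operator at `site` in an n-spin Hilbert space."""
    mats = [np.eye(2)] * n
    mats[site] = single
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def spectrum(shifts, couplings):
    """Transition frequencies/intensities of H = sum_i w_i Sz_i + sum_(i,j) J_ij S_i.S_j."""
    n = len(shifts)
    H = sum(w * op(sz, i, n) for i, w in enumerate(shifts))
    for (i, j), J in couplings.items():
        H = H + J * (op(sx, i, n) @ op(sx, j, n) + op(sy, i, n) @ op(sy, j, n) + op(sz, i, n) @ op(sz, j, n))
    energies, states = np.linalg.eigh(H)
    detect = states.conj().T @ sum(op(sx, i, n) for i in range(n)) @ states  # total-Sx detection operator
    lines = []
    for a, b in combinations(range(len(energies)), 2):
        intensity = abs(detect[a, b]) ** 2
        if intensity > 1e-8:
            lines.append((abs(energies[b] - energies[a]), intensity))
    return sorted(lines)

# Two coupled protons: chemical shifts 100 and 110 (arbitrary frequency units), J = 2.
for freq, intensity in spectrum([100.0, 110.0], {(0, 1): 2.0}):
    print(f"line at {freq:8.3f}, intensity {intensity:.3f}")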
Project description:The skyline plot is a graphical representation of historical effective population sizes as a function of time. Past population sizes for these plots are estimated from genetic data, without a priori assumptions on the mathematical function defining the shape of the demographic trajectory. Because of this flexibility in shape, skyline plots can, in principle, provide realistic descriptions of the complex demographic scenarios that occur in natural populations. Currently, the demographic estimates needed for skyline plots are obtained using coalescent samplers or a composite likelihood approach. Here, we provide a way to estimate historical effective population sizes using an Approximate Bayesian Computation (ABC) framework. We assess its performance using simulated and actual microsatellite datasets. Our method correctly retrieves the signal of contracting, constant and expanding populations, although the graphical shape of the plot is not always an accurate representation of the true demographic trajectory, particularly for recent changes in size and contracting populations. Because of the flexibility of ABC, similar approaches can be extended to other types of data, to multiple populations, or to other parameters that can change through time, such as the migration rate.
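A toy version of the ABC approach sketched above: simulate genealogies under a Kingman coalescent with two constant-size epochs, summarize each simulated dataset by the numbers of segregating sites and singleton mutations, and keep the population-size draws closest to the pseudo-observed summaries. Sample size, mutation rate, the epoch boundary, the priors, and the use of a single locus are assumptions for illustration; real skyline analyses use many more epochs and richer data.

import numpy as np

rng = np.random.default_rng(3)
N_SAMPLES, MU, T_SWITCH = 20, 1e-3, 500.0          # sample size, per-generation locus mutation rate, epoch boundary

def coalescent_summaries(n_recent, n_ancient):
    """Simulate one genealogy under a two-epoch size history; return (segregating sites, singletons)."""
    k, t, external = N_SAMPLES, 0.0, N_SAMPLES
    total_len = ext_len = 0.0
    while k > 1:
        n_now = n_recent if t < T_SWITCH else n_ancient
        w = rng.exponential(2.0 * n_now / (k * (k - 1)))
        if t < T_SWITCH and t + w > T_SWITCH:      # waiting time crosses the epoch boundary:
            total_len += k * (T_SWITCH - t)        # advance to the boundary and redraw (memoryless)
            ext_len += external * (T_SWITCH - t)
            t = T_SWITCH
            continue
        total_len += k * w
        ext_len += external * w
        t += w
        external -= rng.hypergeometric(external, k - external, 2)  # external lineages lost in this merger
        k -= 1
    return np.array([rng.poisson(MU * total_len),  # infinite-sites mutations on the whole tree
                     rng.poisson(MU * ext_len)])   # mutations on external branches = singletons

s_obs = coalescent_summaries(2_000, 20_000)        # pseudo-observed data: a recent contraction

params, dists = [], []
for _ in range(30_000):
    n_rec, n_anc = rng.uniform(100, 50_000, 2)     # priors on the epoch-wise effective sizes
    params.append((n_rec, n_anc))
    dists.append(np.linalg.norm(coalescent_summaries(n_rec, n_anc) - s_obs))

params, dists = np.array(params), np.array(dists)
post = params[dists <= np.quantile(dists, 0.01)]
print("posterior means (recent N, ancient N):", post.mean(axis=0).round(0))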
Project description:Formal models and history: Computational models are increasingly being used to study historical dynamics. This new trend, which could be named Model-Based History, makes use of recently published datasets and innovative quantitative methods to improve our understanding of past societies based on their written sources. The extensive use of formal models allows historians to re-evaluate hypotheses formulated decades ago that are still subject to debate owing to the lack of an adequate quantitative framework. The initiative has the potential to transform the discipline if it solves the challenges posed by the study of historical dynamics. These difficulties stem from the complexities of modelling social interaction and the methodological issues raised by evaluating formal models against data with low sample size, high variance and strong fragmentation. Case study: This work examines an alternative approach to this evaluation based on a Bayesian-inspired model-selection method. The validity of Lanchester's classical laws of combat is examined against a dataset comprising over a thousand battles spanning 300 years. Four variations of the basic equations are discussed, including the three most common formulations (linear, squared, and logarithmic) and a new variant introducing fatigue. Approximate Bayesian Computation is then used to infer parameter values and to perform model selection via Bayes factors. Impact: Results indicate decisive evidence favouring the new fatigue model. The interpretation of both parameter estimations and model selection provides new insights into the factors guiding the evolution of warfare. At a methodological level, the case study shows how model-selection methods can be used to guide historical research through the comparison between existing hypotheses and empirical evidence.
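To illustrate how ABC can both fit parameters and select among combat models, the sketch below simulates pseudo-observed battles under Lanchester's square law and then runs ABC rejection with a model indicator over the square and "linear" laws; the posterior model probability (and a crude Bayes factor) is read off from the models' shares among the accepted draws. The battle list, priors, rescaling, and Euler integration are invented for the example, and the study's fatigue and logarithmic variants are not reproduced.

import numpy as np

rng = np.random.default_rng(11)

def fight(model, a, b, A0, B0, dt=0.1, t_max=30.0):
    """Integrate a Lanchester ODE until one side is annihilated; return the survivor fractions."""
    A, B = float(A0), float(B0)
    for _ in range(int(t_max / dt)):
        if A <= 0 or B <= 0:
            break
        if model == "square":            # aimed fire: losses proportional to enemy strength
            dA, dB = -b * B, -a * A
        else:                            # "linear" law: losses proportional to the product of strengths
            dA, dB = -b * A * B, -a * A * B
        A, B = max(A + dt * dA, 0.0), max(B + dt * dB, 0.0)
    return np.array([A / A0, B / B0])

# Hypothetical battles (initial strengths); outcomes generated here under the square law.
battles = [(1000, 800), (500, 700), (1200, 1200)]
s_obs = np.concatenate([fight("square", 0.04, 0.03, A0, B0) for A0, B0 in battles])

models, dists = [], []
for _ in range(10_000):
    m = rng.choice(["square", "linear"])               # uniform prior over the two models
    a, b = rng.uniform(0.001, 0.1, 2)                  # priors on the effectiveness coefficients
    if m == "linear":
        a, b = a / 1000.0, b / 1000.0                  # crude rescaling so both laws give comparable dynamics
    sim = np.concatenate([fight(m, a, b, A0, B0) for A0, B0 in battles])
    models.append(m)
    dists.append(np.linalg.norm(sim - s_obs))

models, dists = np.array(models), np.array(dists)
kept = models[dists <= np.quantile(dists, 0.02)]       # keep the closest 2% of simulations
p_square = np.mean(kept == "square")
print(f"P(square | data) ~ {p_square:.2f}; Bayes factor square vs linear ~ {p_square / max(1 - p_square, 1e-9):.1f}")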
Project description:In the development of new cancer treatments, an essential step is to determine the maximum tolerated dose in a phase I clinical trial. In general, phase I trial designs can be classified as either model-based or algorithm-based approaches. Model-based phase I designs are typically more efficient by using all observed data, while there is a potential risk of model misspecification that may lead to unreliable dose assignment and incorrect maximum tolerated dose identification. In contrast, most of the algorithm-based designs are less efficient in using cumulative information, because they tend to focus on the observed data in the neighborhood of the current dose level for dose movement. To use the data more efficiently yet without any model assumption, we propose a novel approximate Bayesian computation approach to phase I trial design. Not only is the approximate Bayesian computation design free of any dose-toxicity curve assumption, but it can also aggregate all the available information accrued in the trial for dose assignment. Extensive simulation studies demonstrate its robustness and efficiency compared with other phase I trial designs. We apply the approximate Bayesian computation design to the MEK inhibitor selumetinib trial to demonstrate its satisfactory performance. The proposed design can be a useful addition to the family of phase I clinical trial designs due to its simplicity, efficiency and robustness.
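The following is a simplified sketch of the ABC idea applied to dose finding: draw monotone dose-toxicity curves from a prior, simulate the trial conducted so far under each curve, keep the curves that exactly reproduce the observed dose-limiting-toxicity (DLT) counts, and recommend the dose whose posterior toxicity estimate is closest to the target. The dose levels, counts, prior, and matching rule are invented for the example and do not reproduce the published design's algorithm.

import numpy as np

rng = np.random.default_rng(5)

target = 0.30                                    # target DLT probability
n_per_dose = np.array([3, 6, 6, 3, 0])           # patients treated so far at each dose level
dlt_obs = np.array([0, 1, 2, 2, 0])              # dose-limiting toxicities observed

accepted = []
for _ in range(200_000):
    p = np.sort(rng.uniform(0.0, 0.8, size=5))   # prior: monotone non-decreasing toxicity curve
    sim = rng.binomial(n_per_dose, p)            # simulate the same cohorts under this curve
    if np.all(sim == dlt_obs):                   # accept curves that reproduce the observed counts
        accepted.append(p)

post_mean = np.array(accepted).mean(axis=0)
recommended = int(np.argmin(np.abs(post_mean - target)))
print("posterior mean toxicity by dose:", post_mean.round(2))
print("recommended dose level for the next cohort (0-based):", recommended)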
Project description:The inference of genome rearrangement events has been extensively studied, as they play a major role in molecular evolution. However, probabilistic evolutionary models that explicitly mimic the evolutionary dynamics of such events, as well as methods to infer model parameters, are yet to be fully utilized. Here, we developed a probabilistic approach to infer genome rearrangement rate parameters using an Approximate Bayesian Computation (ABC) framework. We developed two genome rearrangement models: a basic model, which accounts for changes in gene order, and a more sophisticated one, which also accounts for changes in chromosome number. We characterized the ABC inference accuracy using simulations and applied our methodology to both prokaryotic and eukaryotic empirical datasets. Knowledge of genome-rearrangement rates can help elucidate their role in evolution, as well as support the simulation of genomes whose evolutionary dynamics reflect those of empirical genomes.
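As a minimal illustration of the ABC framework for rearrangement rates, the sketch below evolves a signed gene order by a Poisson number of random inversions, summarizes each simulated genome by its breakpoint count relative to the ancestral order, and retains the rates that reproduce the pseudo-observed count. Genome size, branch length, and the prior are assumptions; the published models, which also handle chromosome-number changes, are considerably richer.

import numpy as np

rng = np.random.default_rng(2)
N_GENES, BRANCH_LEN = 100, 1.0

def evolve(rate):
    """Apply Poisson(rate * BRANCH_LEN) random inversions to the ancestral gene order 1..N."""
    genome = np.arange(1, N_GENES + 1)             # signed permutation
    for _ in range(rng.poisson(rate * BRANCH_LEN)):
        i, j = np.sort(rng.integers(0, N_GENES, size=2))
        genome[i:j + 1] = -genome[i:j + 1][::-1]   # reverse the segment and flip the signs
    return genome

def breakpoints(genome):
    """Count adjacencies that no longer match the ancestral order (with sentinels 0 and N+1)."""
    ext = np.concatenate(([0], genome, [N_GENES + 1]))
    return int(np.sum(np.diff(ext) != 1))

obs_bp = breakpoints(evolve(12.0))                 # pseudo-observed genome, true rate = 12 inversions per branch

accepted = []
for _ in range(50_000):
    rate = rng.uniform(0.0, 50.0)                  # prior on the inversion rate
    if abs(breakpoints(evolve(rate)) - obs_bp) <= 1:
        accepted.append(rate)

print(f"ABC posterior mean inversion rate: {np.mean(accepted):.1f} (simulated truth: 12.0)")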
Project description:Admixture is a fundamental evolutionary process that has influenced genetic patterns in numerous species. Maximum-likelihood approaches based on allele frequencies and linkage disequilibrium have been extensively used to infer admixture processes from genome-wide data sets, mostly in human populations. Nevertheless, complex admixture histories, beyond one or two pulses of admixture, remain methodologically challenging to reconstruct. We developed an Approximate Bayesian Computation (ABC) framework to reconstruct highly complex admixture histories from independent genetic markers. We built the software package MetHis to simulate independent SNPs or microsatellites in a two-way admixed population for scenarios with multiple admixture pulses, monotonically decreasing or increasing recurring admixture, or combinations of these scenarios. MetHis allows model-parameter values to be drawn from user-specified prior distributions, and, for each simulation, it can calculate numerous summary statistics describing genetic diversity patterns and moments of the distribution of individual admixture fractions. We coupled MetHis with existing machine-learning ABC algorithms and investigated the admixture history of admixed populations. Results showed that random forest ABC scenario-choice could accurately distinguish among most complex admixture scenarios, and errors were mainly found in regions of the parameter space where scenarios were highly nested and thus biologically similar. We focused on African American and Barbadian populations as two case studies. We found that neural network ABC posterior parameter estimation was accurate and reasonably conservative under complex admixture scenarios. For both admixed populations, we found that monotonically decreasing contributions over time, from Europe and Africa, explained the observed data more accurately than multiple admixture pulses. This approach will allow for reconstructing detailed admixture histories when maximum-likelihood methods are intractable.
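The sketch below gives a flavour of random-forest ABC scenario choice for admixture histories: individual admixture fractions are forward-simulated under a single founding pulse and under monotonically decreasing recurring contributions, each simulation is reduced to a few summary statistics, and a random forest trained on the labelled simulations classifies the "observed" summaries. Population size, generation count, priors, and summaries are assumptions for the example; this is not MetHis or its summary-statistic set.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(9)
N_IND, N_GEN = 500, 10

def simulate(contrib):
    """Individual admixture fractions after N_GEN generations of random mating, where
    contrib[g] is the probability that a parent in generation g is a new source-A migrant."""
    frac = np.zeros(N_IND)                         # founding population entirely from source B
    for m in contrib:
        parent = np.where(rng.random((N_IND, 2)) < m, 1.0,
                          frac[rng.integers(0, N_IND, size=(N_IND, 2))])
        frac = parent.mean(axis=1)
    return frac

def summarize(frac):
    return np.concatenate(([frac.mean(), frac.std()], np.quantile(frac, [0.1, 0.5, 0.9])))

def draw(scenario):
    if scenario == 0:                              # scenario 0: a single founding pulse from source A
        contrib = np.zeros(N_GEN)
        contrib[0] = rng.uniform(0.1, 0.9)
    else:                                          # scenario 1: monotonically decreasing contributions
        contrib = np.sort(rng.uniform(0.0, 0.5, N_GEN))[::-1]
    return summarize(simulate(contrib))

y = np.repeat([0, 1], 2000)                        # reference table of labelled simulations
X = np.array([draw(s) for s in y])
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

s_obs = draw(1).reshape(1, -1)                     # pseudo-observed data, generated under the decreasing scenario
print("posterior-like probabilities (pulse, decreasing):", rf.predict_proba(s_obs)[0].round(2))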