Joint linkage and segregation analysis under multiallelic trait inheritance: simplifying interpretations for complex traits.
ABSTRACT: Identification of the genetic basis of common traits may be hindered by underlying complex genetic architectures that are inadequately captured by existing models, including both multiallelic and multilocus modes of inheritance (MOI). One useful approach for localizing genes underlying continuous complex traits is the joint oligogenic linkage and segregation analysis implemented in the package Loki. The method uses reversible jump Markov chain Monte Carlo to eliminate the need to prespecify the number of quantitative trait loci (QTLs) in the trait model, thus providing posterior distributions for the number of QTLs in a Bayesian framework. The current implementation assumes QTLs are diallelic, and therefore can overestimate the number of linked QTLs in the presence of a multiallelic QTL. To address the possibility of multiple alleles, we extended the QTL model to allow for a variable number of additive alleles at each locus. Application to simulated data shows that, under a diallelic MOI, the multiallelic and diallelic analysis models give similar results. Under a multiallelic MOI, the multiallelic analysis model provides better mixing and improved convergence, and leads to a more accurate estimate of the underlying trait MOI and model parameter values, than does the diallelic model. Application to real data shows the multiallelic model results in fewer estimated linked QTLs and that the predominant QTL model is similar to one of two predominant models estimated from the diallelic analysis. Our results indicate that use of a multiallelic analysis model can lead to better understanding of the genetic architecture underlying complex traits.
Project description:The emergence of multiparental mapping populations enabled plant geneticists to gain deeper insights into the genetic architecture of major agronomic traits and to map quantitative trait loci (QTLs) controlling the expression of these traits. Although the investigated mapping populations are similar, one open question is whether genotype data should be modelled as identical by state (IBS) or identical by descent (IBD). Whereas IBS simply makes use of raw genotype scores to distinguish alleles, IBD data are derived from parental offspring information. We report on comparing IBS and IBD by applying two multiple regression models on four traits studied in the barley nested association mapping (NAM) population HEB-25. We observed that modelling parent-specific IBD genotypes produced a lower number of significant QTLs with increased prediction abilities compared with modelling IBS genotypes. However, at lower trait heritabilities the IBS model produced higher prediction abilities. We developed a method to estimate multiallelic QTL effects in multiparental populations from simple biallelic IBS data. This method is based on cumulating IBS-derived single-nucleotide polymorphism (SNP) effect estimates in a defined genetic region surrounding a QTL. Comparing the resulting parent-specific QTL effects with those obtained from IBD approaches revealed high accordance that could be confirmed through simulations. The method turned out to be also applicable to a barley multiparent advanced generation inter-cross (MAGIC) population. The 'cumulation method' represents a universal approach to differentiate parent-specific QTL effects in multiparental populations, even if no IBD information is available. In future, the method could further benefit from the availability of much denser SNP maps.
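The cumulation idea described above can be illustrated with a minimal sketch. It assumes that each SNP has an IBS-derived effect estimate, that each parent is coded 1 where it carries the allele the effect refers to and 0 otherwise, and that the "defined genetic region" is a symmetric window around the QTL position. The function name, the coding, and the toy numbers are hypothetical and are not taken from the study.

```python
def cumulate_parent_effects(snp_positions, snp_effects, parent_alleles, qtl_pos, window):
    """Sum IBS-derived SNP effects within +/- window of a QTL, per parent.

    snp_positions  : SNP map positions (e.g. in cM); units are an assumption
    snp_effects    : one IBS effect estimate per SNP
    parent_alleles : dict mapping parent name -> list of 0/1 allele states per SNP
                     (1 = parent carries the allele the effect refers to; assumed coding)
    qtl_pos        : position of the QTL peak
    window         : half-width of the region around the QTL
    """
    effects = {}
    for parent, alleles in parent_alleles.items():
        total = 0.0
        for pos, effect, allele in zip(snp_positions, snp_effects, alleles):
            # Only SNPs inside the region around the QTL contribute.
            if abs(pos - qtl_pos) <= window:
                total += effect * allele
        effects[parent] = total
    return effects


# Toy data (hypothetical): four SNPs, two parents, QTL at position 12.
positions = [10.0, 12.0, 15.0, 30.0]
ibs_effects = [0.5, -0.2, 0.3, 1.0]
parents = {"A": [1, 0, 1, 1], "B": [0, 1, 0, 0]}

parent_effects = cumulate_parent_effects(positions, ibs_effects, parents, qtl_pos=12.0, window=5.0)
print(parent_effects)
```

In this toy case the SNP at position 30 lies outside the window, so it does not contribute to either parent's cumulated effect.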
Project description:Complex traits important for humans are often correlated phenotypically and genetically. Joint mapping of quantitative-trait loci (QTLs) for multiple correlated traits plays an important role in unraveling the genetic architecture of complex traits. Compared with single-trait analysis, joint mapping addresses more questions and has advantages for power of QTL detection and precision of parameter estimation. Some statistical methods have been developed to map QTLs underlying multiple traits, most of which are based on maximum-likelihood methods. We develop here a multivariate version of the Bayes methodology for joint mapping of QTLs, using the Markov chain-Monte Carlo (MCMC) algorithm. We adopt a variance-components method to model complex traits in outbred populations (e.g., humans). The method is robust, can deal with an arbitrary number of alleles with arbitrary patterns of gene actions (such as additive and dominant), and allows for multiple phenotype data of various types in the joint analysis (e.g., multiple continuous traits and mixtures of continuous traits and discrete traits). Under a Bayesian framework, parameters--including the number of QTLs--are estimated on the basis of their marginal posterior samples, which are generated through two samplers, the Gibbs sampler and the reversible-jump MCMC. In addition, we calculate the Bayes factor related to each identified QTL, to test coincident linkage versus pleiotropy. The performance of our method is evaluated in simulations with full-sib families. The results show that our proposed Bayesian joint-mapping method performs well for mapping multiple QTLs in situations of either bivariate continuous traits or mixed data types. Compared with the analysis for each trait separately, Bayesian joint mapping improves statistical power, provides stronger evidence of QTL detection, and increases precision in estimation of parameter and QTL position. 
We also applied the proposed method to a set of real data and detected a coincident linkage responsible for determining bone mineral density and areal bone size of wrist in humans.
Project description:BACKGROUND: Complex binary traits are influenced by many factors, including the main effects of many quantitative trait loci (QTLs), epistatic effects involving more than one QTL, environmental effects, and the effects of gene-environment interactions. Although a number of QTL mapping methods for binary traits have been developed, an efficient and powerful method that can handle both main and epistatic effects of a relatively large number of possible QTLs is still lacking. RESULTS: In this paper, we use a Bayesian logistic regression model as the QTL model for binary traits that includes both main and epistatic effects. Our logistic regression model employs hierarchical priors for regression coefficients similar to the ones used in the Bayesian LASSO linear model for multiple QTL mapping for continuous traits. We develop efficient empirical Bayesian algorithms to infer the logistic regression model. Our simulation study shows that our algorithms can easily handle a QTL model with a large number of main and epistatic effects on a personal computer, and outperform the five other methods examined (the LASSO, HyperLasso, BhGLM, RVM, and the single-QTL mapping method based on logistic regression) in terms of power of detection and false positive rate. The utility of our algorithms is also demonstrated through analysis of a real data set. A software package implementing the empirical Bayesian algorithms in this paper is freely available upon request. CONCLUSIONS: The EBLASSO logistic regression method can handle a large number of effects, possibly including the main and epistatic QTL effects, environmental effects, and the effects of gene-environment interactions. It will be a very useful tool for mapping multiple QTLs for complex binary traits.
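To make the model structure concrete, the sketch below builds one row of a design matrix with main effects and all pairwise epistatic (product) terms, then evaluates the logistic link. It shows only the shape of such a model, not the EBLASSO algorithm or its hierarchical priors. The -1/+1 genotype coding, the function names, and the example values are assumptions.

```python
import math
from itertools import combinations


def design_row(genotypes):
    """Return main-effect terms followed by all pairwise epistatic products.

    genotypes: list of numeric marker codes for one individual
               (e.g. -1/+1 for the two homozygotes; the coding is an assumption).
    """
    main = list(genotypes)
    epistatic = [a * b for a, b in combinations(genotypes, 2)]
    return main + epistatic


def case_probability(genotypes, intercept, coefficients):
    """Logistic-regression probability that the binary trait equals 1."""
    row = design_row(genotypes)
    if len(row) != len(coefficients):
        raise ValueError("need one coefficient per main and epistatic term")
    linear_predictor = intercept + sum(b * x for b, x in zip(coefficients, row))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Three markers give 3 main effects and 3 pairwise interactions.
row = design_row([1, -1, 1])
print(row)
print(case_probability([1, -1, 1], intercept=0.0, coefficients=[0.0] * 6))
```

With p markers the row has p + p(p-1)/2 terms, which is why the number of candidate effects grows quickly and a sparsity-inducing prior becomes attractive.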
Project description:The dissection of the genetic architecture of quantitative traits, including the number and locations of quantitative trait loci (QTL) and their main and epistatic effects, is an important topic in current QTL mapping. We extend the Bayesian model selection framework for mapping multiple epistatic QTL affecting continuous traits to dynamic traits in experimental crosses. The extension inherits the efficiency of Bayesian model selection and the flexibility of Legendre polynomials in fitting changes in genetic and environmental effects over time. We illustrate the proposed method by simultaneously detecting the main and epistatic QTLs for the growth of leaf age in a doubled-haploid population of rice. The behavior and performance of the method are also shown by computer simulation experiments. The results show that our method can more quickly identify interacting QTLs for dynamic traits in models with a large number of genetic effects, enhancing our understanding of the genetic architecture of dynamic traits. Our proposed method can be treated as a general form of QTL mapping for continuous quantitative traits, and is easier to extend to multiple traits and to a single trait with repeated records.
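As an illustration of how Legendre polynomials can describe an effect that changes over time, the sketch below rescales a measurement time to [-1, 1], evaluates the polynomials with Bonnet's recurrence, and forms the time-dependent effect as a weighted sum. This is a generic sketch, not the authors' implementation. The function names, the unnormalized form of the polynomials, and the example coefficients are assumptions.

```python
def standardize_time(t, t_min, t_max):
    """Map a time point from [t_min, t_max] onto [-1, 1]."""
    return 2.0 * (t - t_min) / (t_max - t_min) - 1.0


def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x).
    """
    values = [1.0]
    if order >= 1:
        values.append(x)
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values


def effect_at_time(coefficients, t, t_min, t_max):
    """Time-dependent effect: sum over k of coefficient_k * P_k(standardized t)."""
    x = standardize_time(t, t_min, t_max)
    basis = legendre_basis(x, len(coefficients) - 1)
    return sum(c * p for c, p in zip(coefficients, basis))


# Hypothetical coefficients for one QTL effect, measured over times 0..10.
print(legendre_basis(0.5, 2))
print(effect_at_time([1.0, 2.0, 0.0], t=10.0, t_min=0.0, t_max=10.0))
```

A low-order expansion keeps the number of parameters per QTL small regardless of how many time points are recorded.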
Project description:Genetic improvement of aluminum (Al) tolerance is one of the cost-effective solutions to improve plant productivity in acidic soils around the world. This study was performed to advance our understanding of the genetic mechanisms of aluminum tolerance underlying wheat (Triticum aestivum L.) flag leaf morphological and physiological traits. A recombinant inbred line population derived from SeriM82 and Babax was used for mapping quantitative trait loci (QTL) in wheat for tolerance to Al toxicity through 477 DNA markers. Based on a single-locus analysis, 48 QTLs, including 16 putative and 32 suggestive QTLs, were identified for all studied traits. Individual QTLs explained 4.57-11.29% of the phenotypic variance in different environments during both crop seasons. These QTLs were distributed unevenly throughout the wheat genome: 52.08%, 29.17%, and 18.75% were in the A, B, and D genomes, respectively. Based on two-locus analysis, 54 additive QTLs and 6 pairs of epistatic effects were detected, among which 29 additive and 5 pairs of epistatic QTLs showed significant QTL × environment interactions. The highest number of stable QTLs was identified on genome A. The detection of several QTL clusters indicated tight linkage or pleiotropy in the inheritance of different traits. The stable and major QTLs controlling traits in this research can be verified in different environments and genetic backgrounds and used to identify superior allelic variations in wheat, improving the efficiency of selecting high-yielding lines adapted to Al stress in breeding programs.
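The reported genome shares can be checked against the total of 48 QTLs from the single-locus analysis. The short sketch below searches for the whole-number count that matches each percentage to two decimals. The per-genome counts it recovers are inferred from the percentages and are not stated in the abstract. The function name is hypothetical.

```python
TOTAL_QTLS = 48  # total from the single-locus analysis
REPORTED_SHARES = {"A": 52.08, "B": 29.17, "D": 18.75}  # percentages as reported


def implied_count(share_percent, total, tolerance=0.005):
    """Return the integer count whose share of `total` matches the
    reported percentage to two decimals, or None if no count fits."""
    for count in range(total + 1):
        if abs(100.0 * count / total - share_percent) < tolerance:
            return count
    return None


# Inferred counts per genome (not stated in the source, only implied by it).
counts = {genome: implied_count(share, TOTAL_QTLS) for genome, share in REPORTED_SHARES.items()}
print(counts)
print(sum(counts.values()) == TOTAL_QTLS)
```

The recovered counts add up to the stated total, so the three percentages are mutually consistent.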
Project description:BACKGROUND: Hybrid poplar species are candidates for biomass production, but breeding efforts are needed to combine productivity and water use efficiency in improved cultivars. Understanding the genetic architecture of growth in poplar through a Quantitative Trait Loci (QTL) approach can help to elucidate the molecular basis of such integrative traits, but identifying candidate genes underlying these QTLs remains difficult. Nevertheless, the increase in genomic information, together with access to a reference genome sequence (Populus trichocarpa Nisqually-1), makes it possible to bridge QTL information on genetic maps and the physical location of candidate genes on the genome. The objective of the study is to identify QTLs controlling productivity, architecture and leaf traits in a P. deltoides x P. trichocarpa F1 progeny and to identify candidate genes underlying QTLs based on the anchoring of genetic maps on the genome and the gene ontology information linked to genome annotation. The strategy to explore genome annotation was to use Gene Ontology enrichment tools to test whether some functional categories are statistically over-represented in QTL regions. RESULTS: Four leaf traits and 7 growth traits were measured on 330 F1 P. deltoides x P. trichocarpa progeny. A total of 77 QTLs controlling 11 traits were identified, explaining from 1.8 to 17.2% of the variation of traits. For 58 QTLs, confidence intervals could be projected on the genome. An extended functional annotation was built based on data retrieved from the plant genome database Phytozome and from an inference of function using homology between Populus and the model plant Arabidopsis. Genes located within QTL confidence intervals were retrieved and enrichments in gene ontology (GO) terms were determined using different methods. Significant enrichments were found for all traits. 
Particularly relevant biological process GO terms were identified for QTLs controlling the number of sylleptic branches: intervals were enriched in biological process GO terms such as 'ripening' and 'adventitious roots development'. CONCLUSION: Beyond the simple identification of QTLs, this study is the first to use a global approach of GO term enrichment analysis to fully explore gene function under QTL confidence intervals in plants. This global approach may lead to the identification of new candidate genes for traits of interest.
Project description:Many traits play essential roles in determining crop yield. Wide variation for morphological traits exists in Hordeum vulgare L., but the genetic basis of this morphological variation is largely unknown. To understand the genetic basis of morphological traits affecting yield, a barley doubled haploid population (146 individuals) derived from Clipper × Sahara 3771 was used to map chromosome regions underlying days to awn appearance, plant height, fertile spike number, flag leaf length, spike length, harvest index, seed number per plant, thousand-kernel weight, and grain yield. Twenty-seven QTLs for nine traits were mapped to the barley genome, explaining 3-69% of the phenotypic variation, and some genomic regions harbored QTLs for more than one trait. Of the 27 QTLs identified, 19 were novel. Chromosomal regions on 1H, 2H, 4H, and 6H were associated with seed grain yield, and chromosome regions on 2H and 6H had major effects on grain yield (GY). One major QTL for seed number per plant was located in the marker interval VRS1-KSUF15 on chromosome 2H. This QTL was also associated with GY. Some loci controlling thousand-kernel weight (TKW), fertile spike number (FSN), and GY were the same. The major grain yield QTL detected on linkage PSR167 co-localized with TAM10. Two major QTLs controlling TKW and FSN were also mapped at this locus. Eight QTLs on chromosomes 1H, 2H, 3H, 4H, 5H, 6H, and 7H consistently affected spike characteristics. One major QTL (ANIONT1A-TACMD) on 4H affected both spike length (SL) and FSN, explaining 9 and 5% of the variation of SL and FSN, respectively. In conclusion, this study sheds some light on the genetic basis of the studied traits. Moreover, fine mapping of the identified major-effect markers may facilitate the application of molecular markers in barley breeding programs.
Project description:Identification of Quantitative Trait Loci (QTL) has been a challenge for complex traits due to the use of populations with a narrow genetic base. Most QTL mapping studies have used crosses made within a subspecies, either indica × indica or japonica × japonica. In this study we report the advantages of using a Multi-parent Advanced Generation Inter-Cross (MAGIC) global population, derived from a combination of eight indica and eight japonica elite parents, in QTL discovery for yield and grain quality traits. Genome-wide association study and interval mapping identified 38 and 34 QTLs, respectively, whereas Bayesian networking detected 60 QTLs with 22 marker-marker associations, 32 trait-trait associations and 65 marker-trait associations. Notably, nine known QTLs/genes (qPH1/OsGA20ox2, qDF3/OsMADS50, PL, QDg1, qGW-5b, grb7-2, qGL3/GS3, Amy6/Wx gene and OsNAS3) were consistently identified by all approaches for nine traits, whereas qDF3/OsMADS50 was co-located for both yield and days-to-flowering traits on chromosome 3. Moreover, we identified a number of candidate QTLs in only one or two analyses, and these will need further validation. The results indicate that this new population has enabled the identification of significant QTLs and interactions for 16 traits through multiple approaches. Pyramided recombinant inbred lines provide a valuable source for integration into future breeding programs.
Project description:The crop seed is a complex organ that may be composed of the diploid embryo, the triploid endosperm and the diploid maternal tissues. According to the genetic features of seed characters, two genetic models for mapping quantitative trait loci (QTLs) of crop seed traits are proposed, with inclusion of maternal effects, embryo or endosperm effects of QTLs, environmental effects and QTL-by-environment (QE) interactions. The mapping population can be generated from a double back-cross of immortalized F2 (IF2) to the two parents, from random crossing of IF2, or from selfing of the IF2 population. Candidate marker intervals potentially harboring QTLs are first selected through one-dimensional scanning across the whole genome. The selected candidate marker intervals are then included in the model as cofactors to control background genetic effects on the putative QTL(s). Finally, a full QTL model is constructed and model selection is conducted to eliminate false positive QTLs. The genetic main effects of QTLs, QE interaction effects and the corresponding P-values are computed by a Markov chain Monte Carlo algorithm for the Gaussian mixed linear model via Gibbs sampling. Monte Carlo simulations were performed to investigate the reliability and efficiency of the proposed method. The simulation results showed that the proposed method had higher power to accurately detect simulated QTLs and properly estimated the effects of these QTLs. To demonstrate its usefulness, the proposed method was used to identify the QTLs underlying fiber percentage in an upland cotton IF2 population. A software package, QTLNetwork-Seed, was developed for QTL analysis of seed traits.
Project description:In this article, we propose a model selection method, the Bayesian composite model space approach, to map quantitative trait loci (QTL) in a half-sib population for continuous and binary traits. In our method, the identity-by-descent-based variance component model is used. To demonstrate the performance of this model, the method was applied to map QTL underlying production traits on BTA6 in a Chinese half-sib dairy cattle population. A total of four QTLs were detected, whereas only one QTL was identified using the traditional least squares (LS) method. We also conducted two simulation experiments to validate the efficiency of our method. The results suggest that the proposed method, based on a multiple-QTL model, is efficient in mapping multiple QTL for an outbred half-sib population and is more powerful than the LS method based on a single-QTL model.