Properties and power of the Drosophila Synthetic Population Resource for the routine dissection of complex traits.
ABSTRACT: The Drosophila Synthetic Population Resource (DSPR) is a newly developed multifounder advanced intercross panel consisting of >1600 recombinant inbred lines (RILs) designed for the genetic dissection of complex traits. Here, we describe the inference of the underlying mosaic founder structure for the full set of RILs from a dense set of semicodominant restriction-site-associated DNA (RAD) markers and use simulations to explore how variation in marker density and sequencing coverage affects inference. For a given sequencing effort, marker density is more important than sequence coverage per marker in terms of the amount of genetic information we can infer. We also assessed the power of the DSPR by assigning genotypes at a hidden QTL to each RIL on the basis of the inferred founder state and simulating phenotypes for different experimental designs, different genetic architectures, different sample sizes, and QTL of varying effect sizes. We found the DSPR has both high power (e.g., 84% power to detect a 5% QTL) and high mapping resolution (e.g., ∼1.5 cM for a 5% QTL).
Project description:Natural populations exhibit a great deal of interindividual genetic variation in the response to toxins, exemplified by the variable clinical efficacy of pharmaceutical drugs in humans, and the evolution of pesticide-resistant insects. Such variation can result from several phenomena, including variable metabolic detoxification of the xenobiotic, and differential sensitivity of the molecular target of the toxin. Our goal is to genetically dissect variation in the response to xenobiotics, and characterize naturally-segregating polymorphisms that modulate toxicity. Here, we use the Drosophila Synthetic Population Resource (DSPR), a multiparent advanced intercross panel of recombinant inbred lines, to identify QTL (Quantitative Trait Loci) underlying xenobiotic resistance, and employ caffeine as a model toxic compound. Phenotyping over 1,700 genotypes led to the identification of ten QTL, each explaining 4.5-14.4% of the broad-sense heritability for caffeine resistance. Four QTL harbor members of the cytochrome P450 family of detoxification enzymes, which represent strong a priori candidate genes. The case is especially strong for Cyp12d1, with multiple lines of evidence indicating the gene causally impacts caffeine resistance. Cyp12d1 is implicated by QTL mapped in both panels of DSPR RILs, is significantly upregulated in the presence of caffeine, and RNAi knockdown robustly decreases caffeine tolerance. Furthermore, copy number variation at Cyp12d1 is strongly associated with phenotype in the DSPR, with a trend in the same direction observed in the DGRP (Drosophila Genetic Reference Panel). No additional plausible causative polymorphisms were observed in a full genomewide association study in the DGRP, or in analyses restricted to QTL regions mapped in the DSPR. Just as in human populations, replicating modest-effect, naturally-segregating causative variants in an association study framework in flies will likely require very large sample sizes.
Project description:Genetic dissection of complex, polygenic trait variation is a key goal of medical and evolutionary genetics. Attempts to identify genetic variants underlying complex traits have been plagued by low mapping resolution in traditional linkage studies, and an inability to identify variants that cumulatively explain the bulk of standing genetic variation in genome-wide association studies (GWAS). Thus, much of the heritability remains unexplained for most complex traits. Here we describe a novel, freely available resource for the Drosophila community consisting of two sets of recombinant inbred lines (RILs), each derived from an advanced generation cross between a different set of eight highly inbred, completely resequenced founders. The Drosophila Synthetic Population Resource (DSPR) has been designed to combine the high mapping resolution offered by multiple generations of recombination, with the high statistical power afforded by a linkage-based design. Here, we detail the properties of the mapping panel of >1600 genotyped RILs, and provide an empirical demonstration of the utility of the approach by genetically dissecting alcohol dehydrogenase (ADH) enzyme activity. We confirm that a large fraction of the variation in this classic quantitative trait is due to allelic variation at the Adh locus, and additionally identify several previously unknown modest-effect trans-acting QTL (quantitative trait loci). Using a unique property of multiparental linkage mapping designs, for each QTL we highlight a relatively small set of candidate causative variants for follow-up work. The DSPR represents an important step toward the ultimate goal of a complete understanding of the genetics of complex traits in the Drosophila model system.
Project description:Animals in nature are frequently challenged by toxic compounds, from those that occur naturally in plants as a defense against herbivory, to pesticides used to protect crops. On exposure to such xenobiotic substances, animals mount a transcriptional response, generating detoxification enzymes and transporters that metabolize and remove the toxin. Genetic variation in this response can lead to variation in the susceptibility of different genotypes to the toxic effects of a given xenobiotic. Here we use Drosophila melanogaster to dissect the genetic basis of larval resistance to nicotine, a common plant defense chemical and widely used addictive drug in humans. We identified quantitative trait loci (QTL) for the trait using the DSPR (Drosophila Synthetic Population Resource), a panel of multiparental advanced intercross lines. Mapped QTL collectively explain 68.4% of the broad-sense heritability for nicotine resistance. The two largest-effect loci, contributing 50.3 and 8.5% to the genetic variation, map to short regions encompassing members of classic detoxification gene families. The largest QTL resides over a cluster of ten UDP-glucuronosyltransferase (UGT) genes, while the next largest QTL harbors a pair of cytochrome P450 genes. Using RNAseq we measured gene expression in a pair of DSPR founders predicted to harbor different alleles at both QTL and showed that Ugt86Dd, Cyp28d1, and Cyp28d2 had significantly higher expression in the founder carrying the allele conferring greater resistance. These genes are very strong candidates to harbor causative, regulatory polymorphisms that explain a large fraction of the genetic variation in larval nicotine resistance in the DSPR.
Project description:We leverage two complementary Drosophila melanogaster mapping panels to genetically dissect starvation resistance, an important fitness trait. Using >1600 genotypes from the multiparental Drosophila Synthetic Population Resource (DSPR), we map numerous starvation stress QTL that collectively explain a substantial fraction of trait heritability. Mapped QTL effects allowed us to estimate DSPR founder phenotypes, predictions that were correlated with the actual phenotypes of these lines. We observe a modest phenotypic correlation between starvation resistance and triglyceride level, traits that have been linked in previous studies. However, overlap among QTL identified for each trait is low. Since we also show that DSPR strains with extreme starvation phenotypes differ in desiccation resistance and activity level, our data imply multiple physiological mechanisms contribute to starvation variability. We additionally exploited the Drosophila Genetic Reference Panel (DGRP) to identify sequence variants associated with starvation resistance. Consistent with prior work, these sites rarely fall within QTL intervals mapped in the DSPR. We were offered a unique opportunity to directly compare association mapping results across laboratories since two other groups previously measured starvation resistance in the DGRP. We found strong phenotypic correlations among studies, but extremely low overlap in the sets of genomewide significant sites. Despite this, our analyses revealed that the most highly associated variants from each study typically showed the same additive effect sign in independent studies, in contrast to otherwise equivalent sets of random variants. This consistency provides evidence for reproducible trait-associated sites in a widely used mapping panel, and highlights the polygenic nature of starvation resistance.
Project description:Soybean BAC-based physical maps provide a useful platform for gene and QTL map-based cloning, EST mapping, marker development, genome sequencing, and comparative genomic research. Soybean physical maps for "Forrest" and "Williams 82" representing the southern and northern US soybean germplasm base, respectively, have been constructed with different fingerprinting methods. These physical maps are complementary for coverage of gaps on the 20 soybean linkage groups. More than 5,000 genetic markers have been anchored onto the Williams 82 physical map, but only a limited number of markers have been anchored to the Forrest physical map. A mapping population of Forrest × Williams 82 made up of 1,025 F(8) recombinant inbred lines (RILs) was used to construct a reference genetic map. A framework map with almost 1,000 genetic markers was constructed using a core set of these RILs. The core set of the population was evaluated with the theoretical population using equality, symmetry and representativeness tests. A high-resolution genetic map will allow integration and utilization of the physical maps to target QTL regions of interest, and to place a larger number of markers into a map in a more efficient way using a core set of RILs.
Project description:Recombinant inbred lines (RILs) derived from bi-parental populations are stable genetic resources, which are widely used for constructing genetic linkage maps. These genetic maps are essential for QTL mapping and can aid contig and scaffold anchoring in the final stages of genome assembly. In this study, two Lotus sp. RIL populations, Lotus japonicus MG-20 × Gifu and Gifu × L. burttii, were characterized by Illumina re-sequencing. Genotyping of 187 MG-20 × Gifu RILs at 87,140 marker positions and 96 Gifu × L. burttii RILs at 357,973 marker positions allowed us to accurately identify 1,929 recombination breakpoints in the MG-20 × Gifu RILs and 1,044 breakpoints in the Gifu × L. burttii population. The resulting high-density genetic maps now facilitate high-accuracy QTL mapping, identification of reference genome mis-assemblies, and characterization of structural variants.
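A recombination breakpoint in a bi-parental RIL shows up as a switch between parental genotype calls along a chromosome. The following is a minimal sketch of that counting step, not the pipeline used in the study: the function name `count_breakpoints`, the `A`/`B` parental encoding, and the `-` symbol for missing calls are all assumptions made for illustration.

```python
def count_breakpoints(calls, missing="-"):
    """Count parental switches along one chromosome of one RIL.

    `calls` is a string (or sequence) of parental calls ordered by
    marker position, e.g. "AAABBBA". Missing calls are skipped, so a
    switch that spans a gap is still counted exactly once.
    (Encoding and function name are illustrative assumptions.)
    """
    informative = [c for c in calls if c != missing]
    return sum(1 for left, right in zip(informative, informative[1:])
               if left != right)


if __name__ == "__main__":
    # Hypothetical RIL with two chromosomes of marker calls.
    ril = {"chr1": "AAAABBBBBA", "chr2": "BB--AAAA"}
    print({chrom: count_breakpoints(c) for chrom, c in ril.items()})
```

For scale, the totals reported above work out to roughly 10.3 breakpoints per line in the MG-20 × Gifu population (1,929 / 187) and roughly 10.9 per line in the Gifu × L. burttii population (1,044 / 96).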
Project description:Multiparent Advanced Generation Inter-Cross (MAGIC) populations are now being utilized to more accurately identify the underlying genetic basis of quantitative traits through quantitative trait loci (QTL) analyses and subsequent gene discovery. The expanded genetic diversity present in such populations and the amplified number of recombination events mean that QTL can be identified at a higher resolution. Most QTL analyses are conducted separately for each trait within a single environment. Separate analysis does not take advantage of the underlying correlation structure found in multienvironment or multitrait data. By using this information in a joint analysis, be it multienvironment or multitrait, it is possible to gain a greater understanding of genotype- or QTL-by-environment interactions or of pleiotropic effects across traits. Furthermore, this can result in improvements in accuracy for a range of traits or in a specific target environment and can influence selection decisions. Data derived from MAGIC populations allow for founder probabilities of all founder alleles to be calculated for each individual within the population. This presents an additional layer of complexity and information that can be utilized to identify QTL. A whole-genome approach is proposed for multienvironment and multitrait QTL analysis in MAGIC. The whole-genome approach simultaneously incorporates all founder probabilities at each marker for all individuals in the analysis, rather than using a genome scan. A dimension reduction technique is implemented, which allows for high-dimensional genetic data. For each QTL identified, sizes of effects for each founder allele, the percentage of genetic variance explained, and a score to reflect the strength of the QTL are found. The approach was demonstrated to perform well in a small simulation study and for two experiments, using a wheat MAGIC population.
Project description:Considerable natural variation for lifespan exists within human and animal populations. Genetically dissecting this variation can elucidate the pathways and genes involved in aging, and help uncover the genetic mechanisms underlying risk for age-related diseases. Studying aging in model systems is attractive due to their relatively short lifespan, and the ability to carry out programmed crosses under environmentally-controlled conditions. Here we investigate the genetic architecture of lifespan using the Drosophila Synthetic Population Resource (DSPR), a multiparental advanced intercross mapping population. We measured lifespan in females from 805 DSPR lines, mapping five QTL (Quantitative Trait Loci) that each contribute 4-5% to among-line lifespan variation in the DSPR. Each of these QTL co-localizes with the position of at least one QTL mapped in 13 previous studies of lifespan variation in flies. However, given that these studies implicate >90% of the genome in the control of lifespan, this level of overlap is unsurprising. DSPR QTL intervals harbor 11-155 protein-coding genes, and we used RNAseq on samples of young and old flies to help resolve pathways affecting lifespan, and identify potentially causative loci present within mapped QTL intervals. Broad age-related patterns of expression revealed by these data recapitulate results from previous work. For example, we see an increase in antimicrobial defense gene expression with age, and a decrease in expression of genes involved in the electron transport chain. Several genes within QTL intervals are highlighted by our RNAseq data, such as Relish, a critical immune response gene, that shows increased expression with age, and UQCR-14, a gene involved in mitochondrial electron transport, that has reduced expression in older flies. The five QTL we isolate collectively explain a considerable fraction of the genetic variation for female lifespan in the DSPR, and implicate modest numbers of genes. In several cases the candidate loci we highlight reside in biological pathways already implicated in the control of lifespan variation. Thus, our results provide further evidence that functional genetics tests targeting these genes will be fruitful, lead to the identification of natural sequence variants contributing to lifespan variation, and help uncover the mechanisms of aging.
Project description:A major goal in the analysis of complex traits is to partition the observed genetic variation in a trait into components due to individual loci and perhaps variants within those loci. However, in both QTL mapping and genetic association studies, the estimated percent variation attributable to a QTL is upwardly biased conditional on it being discovered. This bias was first described in two-way QTL mapping experiments by William Beavis, and has been referred to extensively as "the Beavis effect." The Beavis effect is likely to occur in multiparent population (MPP) panels as well as collections of sequenced lines used for genome-wide association studies (GWAS). However, the strength of the Beavis effect is unknown, and often implicitly assumed to be negligible, when "hits" are obtained from an association panel consisting of hundreds of inbred lines tested across millions of SNPs, or in multiparent mapping populations where mapping involves fitting a complex statistical model with several d.f. at thousands of genetic intervals. To estimate the size of the effect in more complex panels, we performed simulations of both biallelic and multiallelic QTL in two major Drosophila melanogaster mapping panels: the GWAS-based Drosophila Genetic Reference Panel (DGRP) and the MPP-based Drosophila Synthetic Population Resource (DSPR). Our results show that overestimation is determined most strongly by sample size and is only minimally impacted by the mapping design. When < 100, 200, 500, and 1000 lines are employed, the variance attributable to hits is inflated by factors of 6, 3, 1.5, and 1.1, respectively, for a QTL that truly contributes 5% to the variation in the trait. This overestimation indicates that QTL could be difficult to validate in follow-up replication experiments where additional individuals are examined. Further, QTL could be difficult to cross-validate between the two Drosophila resources. We provide guidelines for: (1) the sample sizes necessary to accurately estimate the percent variance attributable to an identified QTL, (2) the conditions under which one is likely to replicate a mapped QTL in a second study using the same mapping population, and (3) the conditions under which a QTL mapped in one mapping panel is likely to replicate in the other (DGRP and DSPR).
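The upward bias described above can be illustrated with a toy simulation. The sketch below is an assumption-laden simplification, not the simulation design of the study: it uses a single biallelic QTL at frequency 0.5, a single-marker test, a normal approximation to the significance cutoff, and an arbitrary threshold `alpha`; the function names `r_squared` and `inflation` are hypothetical. It will not reproduce the inflation factors quoted in the abstract, only the qualitative trend with sample size.

```python
import math
import random
from statistics import NormalDist, mean


def r_squared(x, y):
    """Squared Pearson correlation, i.e. the estimated fraction of
    phenotypic variance explained by the marker."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def inflation(n_lines, true_pve=0.05, alpha=1e-4, reps=400, seed=1):
    """Mean estimated variance explained among significant replicates,
    divided by the true value (all parameter defaults are assumptions)."""
    rng = random.Random(seed)
    # Effect size giving the requested variance explained when the
    # genotype has variance 0.25 and the residual has variance 1.
    effect = math.sqrt(true_pve / ((1 - true_pve) * 0.25))
    # Normal approximation to the two-sided t cutoff, converted to r^2.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    r2_cut = z * z / (z * z + n_lines - 2)
    hits = []
    for _ in range(reps):
        geno = [rng.randint(0, 1) for _ in range(n_lines)]
        pheno = [effect * g + rng.gauss(0, 1) for g in geno]
        r2 = r_squared(geno, pheno)
        if r2 >= r2_cut:  # keep only "discovered" QTL
            hits.append(r2)
    return mean(hits) / true_pve if hits else float("nan")


if __name__ == "__main__":
    for n in (100, 200, 500, 1000):
        print(n, round(inflation(n), 2))
```

Because only replicates that clear the threshold are averaged, small panels retain only the overestimates, whereas large panels detect the QTL almost every time and the average stays close to the true value.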
Project description:High-density genetic maps are essential for high resolution mapping of quantitative traits. Here, we present a new genetic map for an Arabidopsis Bayreuth × Shahdara recombinant inbred line (RIL) population, built on RNA-seq data. RNA-seq analysis on 160 RILs of this population identified 30,049 single-nucleotide polymorphisms (SNPs) covering the whole genome. Based on a 100-kbp window SNP binning method, 1059 bin-markers were identified, physically anchored on the genome. The total length of the RNA-seq genetic map spans 471.70 centimorgans (cM) with an average marker distance of 0.45 cM and a maximum marker distance of 4.81 cM. This high resolution genotyping revealed new recombination breakpoints in the population. To highlight the advantages of such a high-density map, we compared it to two publicly available genetic maps for the same population, comprising 69 PCR-based markers and 497 gene expression markers derived from microarray data, respectively. In this study, we show that SNP markers can effectively be derived from RNA-seq data. The new RNA-seq map closes many existing gaps in marker coverage, saturating the previously available genetic maps. Quantitative trait locus (QTL) analysis for published phenotypes using the available genetic maps showed increased QTL mapping resolution and reduced QTL confidence interval using the RNA-seq map. The new high-density map is a valuable resource that facilitates the identification of candidate genes and map-based cloning approaches.
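Window-based SNP binning of the kind mentioned above can be sketched as follows. This is an illustrative simplification under stated assumptions rather than the method of the study: the function name `bin_snps`, the `A`/`B` parental encoding, and the majority-vote rule for calling each bin are all hypothetical choices; only the 100-kbp window size comes from the text.

```python
from collections import Counter, defaultdict

BIN_SIZE = 100_000  # 100-kbp window, as stated in the abstract


def bin_snps(snps, bin_size=BIN_SIZE):
    """Collapse per-SNP parental calls for one RIL into bin markers.

    `snps` is an iterable of (chromosome, position, allele) tuples,
    with allele "A" or "B" naming the parent of origin. Each SNP is
    assigned to the window containing its position, and every window
    is called by majority vote (an assumed rule).
    """
    votes = defaultdict(Counter)
    for chrom, pos, allele in snps:
        votes[(chrom, pos // bin_size)][allele] += 1
    return {key: counts.most_common(1)[0][0]
            for key, counts in sorted(votes.items())}


if __name__ == "__main__":
    # Hypothetical calls for a single RIL.
    calls = [
        ("Chr1", 1_200, "A"), ("Chr1", 54_000, "A"), ("Chr1", 99_000, "B"),
        ("Chr1", 150_000, "B"), ("Chr2", 20, "A"),
    ]
    print(bin_snps(calls))
```

As a consistency check on the reported figures, dividing the 471.70 cM map length by the 1059 bin markers gives about 0.45 cM, in line with the stated average marker distance.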