CRISPR-directed mitotic recombination enables genetic mapping without crosses.
ABSTRACT: Linkage and association studies have mapped thousands of genomic regions that contribute to phenotypic variation, but narrowing these regions to the underlying causal genes and variants has proven much more challenging. Resolution of genetic mapping is limited by the recombination rate. We developed a method that uses CRISPR (clustered, regularly interspaced, short palindromic repeats) to build mapping panels with targeted recombination events. We tested the method by generating a panel with recombination events spaced along a yeast chromosome arm, mapping trait variation, and then targeting a high density of recombination events to the region of interest. Using this approach, we fine-mapped manganese sensitivity to a single polymorphism in the transporter Pmr1. Targeting recombination events to regions of interest allows us to rapidly and systematically identify causal variants underlying trait differences.
Project description:Powdery mildew, caused by Leveillula taurica, is a major fungal disease affecting greenhouse-grown pepper (Capsicum annuum). Powdery mildew resistance has a complex mode of inheritance. In the present study, we investigated a novel powdery mildew resistance locus, PMR1, using two mapping populations: 102 'VK515' F2:3 families (derived from a cross between resistant parental line 'VK515R' and susceptible parental line 'VK515S') and 80 'PM Singang' F2 plants (derived from the F1 'PM Singang' commercial hybrid). Genetic analysis of the F2:3 'VK515' and F2 'PM Singang' populations revealed a single dominant locus for inheritance of the powdery mildew resistance trait. Genetic mapping showed that the PMR1 locus is located on syntenic regions of pepper chromosome 4 in a 4-Mb region between markers CZ2_11628 and HRM4.1.6 in 'VK515R'. Six molecular markers including one SCAR marker and five SNP markers were localized to a region 0 cM from the PMR1 locus. Two putative nucleotide-binding site leucine-rich repeat (NBS-LRR)-type disease resistance genes were identified in this PMR1 region. Genotyping-by-sequencing (GBS) and genetic mapping analysis revealed suppressed recombination in the PMR1 region, perhaps due to alien introgression. In addition, a comparison of species-specific InDel markers as well as GBS-derived SNP markers indicated that C. baccatum represents a possible source of such alien introgression of powdery mildew resistance into 'VK515R'. The molecular markers developed in this study will be especially helpful for marker-assisted selection in pepper breeding programs for powdery mildew resistance.
Project description:The genes underlying variation in skeletal muscle mass are poorly understood. Although many quantitative trait loci (QTLs) have been mapped in crosses of mouse strains, the limited resolution inherent in these conventional studies has made it difficult to reliably pinpoint the causal genetic variants. The accumulated recombination events in an advanced intercross line (AIL), in which mice from two inbred strains are mated at random for several generations, can improve mapping resolution. We demonstrate these advancements in mapping QTLs for hindlimb muscle weights in an AIL (n = 832) of the C57BL/6J (B6) and DBA/2J (D2) strains, generations F8-F13. We mapped muscle weight QTLs using the high-density MegaMUGA SNP panel. The QTLs highlight the shared genetic architecture of four hindlimb muscles and suggest that the genetic contributions to muscle variation are substantially different in males and females, at least in the B6D2 lineage. Out of the 15 muscle weight QTLs identified in the AIL, nine overlapped the genomic regions discovered in an earlier B6D2 F2 intercross. Mapping resolution, however, was substantially improved in our study to a median QTL interval of 12.5 Mb. Subsequent sequence analysis of the QTL regions revealed 20 genes with nonsense or potentially damaging missense mutations. Further refinement of the muscle weight QTLs using additional functional information, such as gene expression differences between alleles, will be important for discerning the causal genes.
Project description:Adaptation of domesticated species to diverse agroclimatic regions has led to abundant trait diversity. However, the resulting population structure and genetic heterogeneity confound association mapping of adaptive traits. To address this challenge in sorghum [Sorghum bicolor (L.) Moench], a widely adapted cereal crop, we developed a nested association mapping (NAM) population using 10 diverse global lines crossed with an elite reference line RTx430. We characterized the population of 2214 recombinant inbred lines at 90,000 SNPs using genotyping-by-sequencing. The population captures ~70% of known global SNP variation in sorghum, and 57,411 recombination events. Notably, recombination events were four- to fivefold enriched in coding sequences and 5' untranslated regions of genes. To test the power of the NAM population for trait dissection, we conducted joint linkage mapping for two major adaptive traits, flowering time and plant height. We precisely mapped several known genes for these two traits, and identified several additional QTL. Considering all SNPs simultaneously, genetic variation accounted for 65% of flowering time variance and 75% of plant height variance. Further, we directly compared NAM to genome-wide association mapping (using panels of the same size) and found that flowering time and plant height QTL were more consistently identified with the NAM population. Finally, for simulated QTL under strong selection in diversity panels, the power of QTL detection was up to three times greater for NAM vs. association mapping with a diverse panel. These findings validate the NAM resource for trait mapping in sorghum, and demonstrate the value of NAM for dissection of adaptive traits.
Project description:Very few causal genes have been identified by quantitative trait loci (QTL) mapping because of the large size of QTL, and most of them were identified thanks to functional links already known with the targeted phenotype. Here, we propose to combine selection signature detection, coding SNP annotation, and cis-expression QTL analyses to identify potential causal genes underlying QTL identified in divergent line designs. As a model, we chose experimental chicken lines divergently selected for only one trait, the abdominal fat weight, in which several QTL were previously mapped. Using new haplotype-based statistics exploiting the very high SNP density generated through whole-genome resequencing, we found 129 significant selective sweeps. Most of the QTL colocalized with at least one sweep, which markedly narrowed candidate region size. Some of those sweeps contained only one gene, therefore making them strong positional causal candidates with no presupposed function. We then focused on two of these QTL/sweeps. The absence of nonsynonymous SNPs in their coding regions strongly suggests the existence of causal mutations acting in cis on their expression, confirmed by cis-eQTL identification using either allele-specific expression or genetic mapping analyses. Additional expression analyses of those two genes in chickens and mice contrasted for adiposity reinforce their link with this phenotype. This study shows for the first time the value of combining selective sweep mapping, coding SNP annotation and cis-eQTL analyses for identifying causative genes for a complex trait, in the context of divergent lines selected for this specific trait. Moreover, it highlights two genes, JAG2 and PARK2, as new potential negative and positive key regulators of adiposity in chicken and mice.
Project description:The success of high resolution genetic mapping of disease predisposition and quantitative trait loci in humans and experimental animals depends on the positions of key crossover events around the gene of interest. In mammals, the majority of recombination occurs at highly delimited 1-2 kb long sites known as recombination hotspots, whose locations and activities are distributed unevenly along the chromosomes and are tightly regulated in a sex specific manner. The factors determining the location of hotspots started to emerge with the finding of PRDM9 as a major hotspot regulator in mammals; however, additional factors modulating hotspot activity and sex specificity are yet to be defined. To address this limitation, we have collected and mapped the locations of 4829 crossover events occurring on mouse chromosome 11 in 5858 meioses of male and female reciprocal F1 hybrids of C57BL/6J and CAST/EiJ mice. This chromosome was chosen for its medium size and high gene density and provided a comparison with our previous analysis of recombination on the longest mouse chromosome 1. Crossovers were mapped to an average resolution of 127 kb, and thirteen hotspots were mapped to <8 kb. Most crossovers occurred in a small number of the most active hotspots. Females had a higher recombination rate than males as a consequence of differences in crossover interference and regional variation of sex specific rates along the chromosome. Comparison with chromosome 1 showed that recombination events tend to be positioned in similar fashion along the centromere-telomere axis but independently of the local gene density. It appears that mammalian recombination is regulated on at least three levels, chromosome-wide, regional, and at individual hotspots, and these regulation levels are influenced by sex and genetic background but not by gene content.
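As a rough worked figure from the counts in the abstract above, and assuming each scored meiosis corresponds to one transmitted gamete (an assumption the abstract does not state), the mean number of crossovers per meiosis gives an approximate sex-averaged genetic length for chromosome 11:

```latex
\[
\bar{n}_{\mathrm{CO}} = \frac{4829}{5858} \approx 0.824 \ \text{crossovers per meiosis}
\quad\Longrightarrow\quad
L \approx 0.824\ \text{Morgan} \approx 82.4\ \text{cM}
\]
```

This pools male and female meioses, so it averages over the sex difference in recombination rate that the abstract reports.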
Project description:Recombination events occur nonuniformly across the maize genome. To dissect the genetic mechanisms underlying the nonuniformity of recombination, we performed quantitative trait locus (QTL) mapping using recombinant inbred line populations. A genome-wide QTL scan identified hundreds of QTLs with both cis-prone and trans-effects for recombination number variation. To provide detailed insights into cis-factors associated with recombination variation, we examined the genomic features around recombination hot regions, including density of genes, DNA transposons, retrotransposons, and some specific motifs. Compared to recombination variation in the whole genome, more QTLs were mapped for variations in recombination hot regions. The majority of QTLs for recombination hot regions are trans-QTLs and co-localized with genes from the recombination pathway. We also found that recombination variation was positively associated with the presence of genes and DNA transposons, but negatively related to the presence of long terminal repeat retrotransposons. Additionally, 41 recombination hot regions were fine-mapped. The high-resolution genotyping of five randomly selected regions in two F2 populations verified that they indeed have ultra-high recombination frequency, which is even higher than that of the well-known recombination hot regions sh1-bz and a1-sh2. Taken together, our results further our understanding of recombination variation in plants.
Project description:Whole genome sequence data have been shown to provide sufficient information to resolve detailed haplotypes in small pedigrees. Using such information, recombinations can be mapped onto chromosomes, compared with the segregation of a disease of interest and used to filter genome sequence variants. We now show that relatively inexpensive SNP array data from small pedigrees can be used in a similar manner to provide a means of identifying regions of interest in exome sequencing projects. We demonstrate that in those situations where one can assume complete penetrance and parental DNA is available, SNP recombination mapping using Boolean logic identifies chromosomal regions identical to those detected by multipoint linkage using microsatellites but with much less computation. We further show that this approach is successful because the probability of a double crossover between informative SNP loci is negligible. Our observations provide a rationale for using SNP arrays and recombination mapping as a rapid and cost-effective means of incorporating chromosome segregation information into exome sequencing projects intended for disease-gene identification.
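The abstract above does not spell out its Boolean logic, so the following is only a minimal sketch of one way such a filter could look. It assumes complete penetrance, a single transmitting parent, and that phasing against parental genotypes has already labelled which of that parent's two haplotypes (0 or 1) each child inherited at every informative SNP. The function names and data layout are hypothetical, not taken from the paper.

```python
def consistent_markers(inherited, affected):
    """Flag markers where haplotype sharing matches disease status.

    inherited: dict child -> list of 0/1 labels, one per informative SNP,
               giving which haplotype of the transmitting parent was inherited
               (assumed to come from prior phasing; hypothetical layout).
    affected:  dict child -> bool.
    A marker passes when all affected children share one haplotype and no
    unaffected child carries it (complete-penetrance assumption).
    """
    children = list(inherited)
    n_markers = len(inherited[children[0]])
    flags = []
    for m in range(n_markers):
        aff = {inherited[c][m] for c in children if affected[c]}
        unaff = {inherited[c][m] for c in children if not affected[c]}
        flags.append(len(aff) == 1 and aff.isdisjoint(unaff))
    return flags


def candidate_regions(flags, positions):
    """Collapse runs of passing markers into (first_pos, last_pos) intervals."""
    regions = []
    start = None
    for i, ok in enumerate(flags):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            regions.append((positions[start], positions[i - 1]))
            start = None
    if start is not None:
        regions.append((positions[start], positions[-1]))
    return regions


# Toy pedigree: two affected sibs, one unaffected sib, six informative SNPs.
inherited = {
    "sib1": [0, 0, 0, 0, 1, 1],
    "sib2": [1, 0, 0, 0, 0, 0],
    "sib3": [1, 1, 1, 1, 1, 0],
}
affected = {"sib1": True, "sib2": True, "sib3": False}
positions = [10, 20, 30, 40, 50, 60]  # arbitrary coordinates

flags = consistent_markers(inherited, affected)
regions = candidate_regions(flags, positions)
print(regions)
```

Because each marker is tested with set comparisons only, no likelihoods are computed. Treating each run of passing markers as one interval relies on the abstract's point that double crossovers between adjacent informative SNPs are negligible.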
Project description:We developed an algorithm named ViReMa (Viral-Recombination-Mapper) to provide a versatile platform for rapid, sensitive and nucleotide-resolution detection of recombination junctions in viral genomes using next-generation sequencing data. Rather than mapping read segments of pre-defined lengths and positions, ViReMa dynamically generates moving read segments. ViReMa initially attempts to align the 5' end of a read to the reference genome(s) with the Bowtie seed-based alignment. A new read segment is then made by either extracting any unaligned nucleotides at the 3' end of the read or by trimming the first nucleotide from the read. This continues iteratively until all portions of the read are either mapped or trimmed. With multiple reference genomes, it is possible to detect virus-to-host or inter-virus recombination. ViReMa is also capable of detecting insertion and substitution events and multiple recombination junctions within a single read. By mapping the distribution of recombination events in the genome of flock house virus, we demonstrate that this information can be used to discover de novo functional motifs located in conserved regions of the viral genome.
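The iterative segment-mapping loop described above can be illustrated with a toy sketch. This is not ViReMa's code: it swaps Bowtie's seed-based alignment for an exact substring search with `str.find`, allows no mismatches, and uses a single reference string. The function names and the `seed_len` parameter are hypothetical.

```python
def map_read_segments(read, reference, seed_len=6):
    """Greedy 5'-to-3' mapping of a read in successive segments.

    Returns (segments, trimmed), where each segment is
    (read_start, read_end, ref_start, ref_end) with half-open ends and
    trimmed counts nucleotides that could not be placed.
    """
    segments = []
    trimmed = 0
    pos = 0
    while pos < len(read):
        seed = read[pos:pos + seed_len]
        # Exact seed lookup stands in for the seed-based aligner (assumption).
        ref_start = reference.find(seed) if len(seed) == seed_len else -1
        if ref_start < 0:
            # No placement: trim the first nucleotide and try again.
            trimmed += 1
            pos += 1
            continue
        # Extend the match base by base until a mismatch or either end.
        length = seed_len
        while (pos + length < len(read)
               and ref_start + length < len(reference)
               and read[pos + length] == reference[ref_start + length]):
            length += 1
        segments.append((pos, pos + length, ref_start, ref_start + length))
        # The unaligned 3' remainder becomes the next read segment.
        pos += length
    return segments, trimmed


def junctions(segments):
    """Report (donor_end, acceptor_start) for non-contiguous neighbours."""
    return [(a[3], b[2]) for a, b in zip(segments, segments[1:]) if a[3] != b[2]]


reference = "GATTACAGGCTTCCAGTAACGTTGCATCGGAT"
# Synthetic recombinant read: reference[0:8] joined to reference[21:31].
read = reference[0:8] + reference[21:31]

segments, trimmed = map_read_segments(read, reference)
print(segments, trimmed, junctions(segments))
```

In this sketch a junction is reported whenever two consecutive mapped segments are not adjacent in the reference. A real implementation also has to handle mismatches, repeated seeds, multiple references, and ambiguous junction positions where the donor and acceptor sequences share bases.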
Project description:BACKGROUND:Genetic linkage maps are useful tools for mapping quantitative trait loci (QTL) influencing variation in traits of interest in a population. Genotyping-by-sequencing approaches such as Restriction-site Associated DNA sequencing (RAD-Seq) now enable the rapid discovery and genotyping of genome-wide SNP markers suitable for the development of dense SNP linkage maps, including in non-model organisms such as Atlantic salmon (Salmo salar). This paper describes the development and characterisation of a high density SNP linkage map based on SbfI RAD-Seq SNP markers from two Atlantic salmon reference families. RESULTS:Approximately 6,000 SNPs were assigned to 29 linkage groups, utilising markers from known genomic locations as anchors. Linkage maps were then constructed for the four mapping parents separately. Overall map lengths were comparable between male and female parents, but the distribution of the SNPs showed sex-specific patterns with a greater degree of clustering of sire-segregating SNPs to single chromosome regions. The maps were integrated with the Atlantic salmon draft reference genome contigs, allowing the unique assignment of ~4,000 contigs to a linkage group. 112 genome contigs mapped to two or more linkage groups, highlighting regions of putative homeology within the salmon genome. A comparative genomics analysis with the stickleback reference genome identified putative genes closely linked to approximately half of the ordered SNPs and demonstrated blocks of orthology between the Atlantic salmon and stickleback genomes. A subset of 47 RAD-Seq SNPs were successfully validated using a high-throughput genotyping assay, with a correspondence of 97% between the two assays. CONCLUSIONS:This Atlantic salmon RAD-Seq linkage map is a resource for salmonid genomics research as genotyping-by-sequencing becomes increasingly common. This is aided by the integration of the SbfI RAD-Seq SNPs with existing reference maps and the draft reference genome, as well as the identification of putative genes proximal to the SNPs. Differences in the distribution of recombination events between the sexes are evident, and regions of homeology have been identified which are reflective of the recent salmonid whole genome duplication.
Project description:Cacao is a crop of global relevance that faces constant demands for improved bean yield. However, little is known about the genomic regions controlling crop yield or the genes involved in cacao bean filling. Hence, to identify the quantitative trait loci (QTL) associated with cacao yield and bean filling, we performed QTL mapping in a segregating mapping population comprising 459 trees of a cross between 'TSH 1188' and 'CCN 51'. All variables showed considerable phenotypic variation and had moderate to high heritability values. We identified 24 QTLs using a genetic linkage map that contains 3526 single nucleotide polymorphism (SNP) markers. Haplotype analysis at the significant QTL region on chromosome IV pointed to the alleles from the maternal parent, 'TSH 1188', as the ones that affect the cacao yield components the most. The recombination events identified within these QTL regions allowed us to identify candidate genes that may take part in the different steps of pod growth and bean filling. Such candidate genes seem to play a significant role in the source-to-sink transport of sugars and amino acids, and lipid metabolism, such as fatty acid production. The SNP markers mapped in our study are now being used to select potential high-yielding cacao varieties through marker-assisted selection in our existing cacao-breeding experiments.