Nucleotide polymorphism and copy number variant detection using exome capture and next-generation sequencing in the polyploid grass Panicum virgatum.
ABSTRACT: Switchgrass (Panicum virgatum) is a polyploid, outcrossing grass species native to North America and has recently been recognized as a potential biofuel feedstock crop. Significant phenotypic variation including ploidy is present across the two primary ecotypes of switchgrass, referred to as upland and lowland switchgrass. The tetraploid switchgrass genome is approximately 1400 Mbp, split between two subgenomes, with significant repetitive sequence content limiting the efficiency of re-sequencing approaches for determining genome diversity. To characterize genetic diversity in upland and lowland switchgrass as a first step in linking genotype to phenotype, we designed an exome capture probe set based on transcript assemblies that represent approximately 50 Mb of annotated switchgrass exome sequences. We then evaluated and optimized the probe set using solid phase comparative genome hybridization and liquid phase exome capture followed by next-generation sequencing. Using the optimized probe set, we assessed variation in the exomes of eight switchgrass genotypes representing tetraploid lowland and octoploid upland cultivars to benchmark our exome capture probe set design. We identified ample variation in the switchgrass genome including 1,395,501 single nucleotide polymorphisms (SNPs), 8173 putative copy number variants and 3336 presence/absence variants. While the majority of the SNPs detected (84%) were bi-allelic, a substantial number were tri-allelic, with limited occurrence of tetra-allelic polymorphisms, consistent with the heterozygous and polyploid nature of the switchgrass genome. Collectively, these data demonstrate the efficacy of exome capture for discovery of genome variation in a polyploid species with a large, repetitive and heterozygous genome.
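The bi-, tri- and tetra-allelic breakdown reported above can be illustrated with a small sketch. It assumes each SNP site is represented simply as the list of bases observed across genotypes; the function name `classify_sites` and the toy data are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

VALID_BASES = {"A", "C", "G", "T"}
LABELS = {2: "bi-allelic", 3: "tri-allelic", 4: "tetra-allelic"}


def classify_sites(sites):
    """Count polymorphic sites by their number of distinct alleles.

    `sites` is an iterable in which each element lists the bases observed
    at one site across all genotypes (an assumed, simplified input format).
    """
    counts = Counter()
    for alleles in sites:
        # Keep only unambiguous bases and collapse duplicates.
        distinct = {a.upper() for a in alleles} & VALID_BASES
        if len(distinct) >= 2:  # monomorphic sites are not SNPs
            counts[LABELS[len(distinct)]] += 1
    return dict(counts)


# Hypothetical toy data: five sites, one of them monomorphic.
toy_sites = [
    ["A", "G"],
    ["C", "T", "C"],
    ["A", "C", "G"],
    ["A", "C", "G", "T"],
    ["A", "A"],
]
summary = classify_sites(toy_sites)
```

In a polyploid, more than two alleles can segregate at a single site within one individual, which is why counting distinct alleles per site is a natural first summary.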
Project description:Low-temperature related abiotic stress is an important factor affecting winter survival in lowland switchgrass when grown in northern latitudes in the United States. A better understanding of the genetic architecture of freezing tolerance in switchgrass will aid the development of lowland switchgrass cultivars with improved winter survival. The objectives of this study were to conduct a freezing tolerance assessment, generate a genetic map using single nucleotide polymorphism (SNP) markers, and identify quantitative trait loci (QTL) associated with freezing tolerance in a lowland × upland switchgrass population. A pseudo-F2 mapping population was generated from an initial cross between the lowland population Ellsworth and the upland cultivar Summer. The segregating progenies were screened for freezing tolerance in a controlled-environment facility. Two clonal replicates of each genotype were tested at six different treatment temperatures ranging from -15 to -5°C at an interval of 2°C for two time periods. Tiller emergence (days) and tiller number were recorded following the recovery of each genotype, with the hypothesis that the upland genotype is the source of higher tiller number and early tiller emergence. Survivorship of the pseudo-F2 population ranged from 89% at -5°C to 5% at -15°C with an average LT50 of -9.7°C. Genotype had a significant effect on all traits except tiller number at -15°C. A linkage map was constructed from bi-allelic single nucleotide polymorphism markers generated using exome capture sequencing. The final map comprised 1618 markers spanning 2626 cM, with an average inter-marker distance of 1.8 cM. Six significant QTL were identified, one each on chromosomes 1K, 5K, 5N, 6K, 6N, and 9K, for the following traits: tiller number, tiller emergence days and LT50.
A comparative genomics study revealed important freezing tolerance genes/proteins, such as COR47, DREB2B, zinc finger-CCCH, WRKY, GIGANTEA, HSP70, and NRT2, among others that reside within the 1.5 LOD confidence interval of the identified QTL.
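As an illustration of the 1.5 LOD confidence interval mentioned above, the following is a minimal sketch that assumes LOD scores are available at ordered map positions along one chromosome. It uses a simplified definition (the contiguous run of positions around the peak whose LOD is within 1.5 of the maximum); the function name and the example scan are hypothetical, and QTL software may extend the interval to flanking markers instead.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, peak, right) map positions of a LOD-drop support interval.

    Simplified rule: starting at the peak, extend in both directions while
    the LOD score stays at or above (peak LOD - drop).
    """
    if not positions or len(positions) != len(lods):
        raise ValueError("positions and lods must be non-empty and equal in length")

    peak_idx = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak_idx] - drop

    left = peak_idx
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1

    right = peak_idx
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1

    return positions[left], positions[peak_idx], positions[right]


# Hypothetical scan: positions in cM and their LOD scores.
scan_positions = [0, 5, 10, 15, 20, 25]
scan_lods = [1.0, 2.5, 4.0, 3.0, 2.0, 0.5]
interval = lod_support_interval(scan_positions, scan_lods)
```

Genes located between the left and right bounds would then be the candidates examined in a comparative genomics step.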
Project description:Switchgrass (Panicum virgatum L.) exists at multiple ploidies and two phenotypically distinct ecotypes. To facilitate interploidal comparisons and to understand the extent of sequence variation within existing breeding pools, two complete switchgrass chloroplast genomes were sequenced from individuals representative of the upland and lowland ecotypes. The results demonstrated a very high degree of conservation in gene content and order with other sequenced plastid genomes. The lowland ecotype reference sequence (Kanlow Lin1) was 139,677 base pairs while the upland sequence (Summer Lin2) was 139,619 base pairs. Alignments between the lowland reference sequence and short-read sequence data from existing sequence datasets identified as either upland or lowland confirmed known polymorphisms and indicated the presence of other differences. Insertions and deletions principally occurred near stretches of homopolymer simple sequence repeats in intergenic regions while most Single Nucleotide Polymorphisms (SNPs) occurred in intergenic regions and introns within the single copy portions of the genome. The polymorphism rate between upland and lowland switchgrass ecotypes was found to be similar to rates reported between chloroplast genomes of indica and japonica subspecies of rice which were believed to have diverged 0.2-0.4 million years ago.
Project description:The lowland ecotype of switchgrass has generated considerable interest because of its higher biomass yield and late flowering characteristics compared to the upland ecotype. However, lowland ecotypes planted in northern latitudes exhibit very low winter survival. Implementation of genomic selection (GS) could potentially enhance switchgrass breeding for winter survival by reducing generation time while eliminating the dependence on weather. The objectives of this study were to assess the potential of genomic selection for winter survival in lowland switchgrass by combining multiple populations in the training set and applying the selected model in two independent testing datasets for validation. Marker data were generated using exome capture sequencing. Validation was conducted using (1) indirect indicators of winter adaptation based on geographic and climatic variables of accessions from different source locations and (2) winter survival estimates of the phenotype. The prediction accuracies were significantly higher when the training dataset comprising all populations was used in fivefold cross validation but its application was not useful in the independent validation dataset. Nevertheless, modeling for population heterogeneity improved the prediction accuracy to some extent but the genetic relationship between the training and validation populations was found to be more influential. The predicted winter survival of lowland switchgrass indicated latitudinal and longitudinal variability, with the northeastern USA being the region with the most cold-tolerant lowland populations. Our results suggested that GS could provide valuable opportunities for improving winter survival and accelerating lowland switchgrass breeding programs toward the development of cold-tolerant cultivars suitable for northern latitudes.
Project description:Key message: Transcriptomes of two switchgrass genotypes representing the upland and lowland ecotypes will be key tools in switchgrass genome annotation and biotic and abiotic stress functional genomics. Switchgrass (Panicum virgatum L.) is an important bioenergy feedstock for cellulosic ethanol production. We report genome-wide transcriptome profiling of two contrasting tetraploid switchgrass genotypes, VS16 and AP13, representing the upland and lowland ecotypes, respectively. A total of 268 million Illumina short reads (50 nt) were generated, of which 133 million were obtained from AP13 and the remaining 135 million from VS16. More than 90% of these reads were mapped to the switchgrass reference genome (V1.1). We identified 6619 and 5369 differentially expressed genes in VS16 and AP13, respectively. Gene ontology and KEGG pathway analysis identified key genes that regulate important pathways including C4 photosynthesis, photorespiration and phenylpropanoid metabolism. A series of genes (33) involved in the photosynthetic pathway were up-regulated in AP13 but only two genes showed higher expression in VS16. We identified three dicarboxylate transporter homologs that were highly expressed in AP13. Additionally, genes that mediate drought, heat, and salinity tolerance were also identified. Vesicular transport proteins, syntaxin and signal recognition particles were seen to be up-regulated in VS16. Analyses of selected genes involved in biosynthesis of secondary metabolites, plant-pathogen interaction, membrane transporters, heat, drought and salinity stress responses confirmed significant variation in the relative expression reflected in RNA-Seq data between VS16 and AP13 genotypes. The phenylpropanoid pathway genes identified here are potential targets for biofuel conversion.
Project description:BACKGROUND: Advances in genomic technologies have expanded our ability to accurately and exhaustively detect natural genomic variants that can be applied in crop improvement and to increase our knowledge of plant evolution and adaptation. Switchgrass (Panicum virgatum L.), an allotetraploid (2n = 4× = 36) perennial C4 grass (Poaceae family) native to North America and a feedstock crop for cellulosic biofuel production, has a large potential for genetic improvement due to its high genotypic and phenotypic variation. In this study, we analyzed single nucleotide polymorphism (SNP) variation in 372 switchgrass genotypes belonging to 36 accessions for 12 genes putatively involved in biomass production to investigate signatures of selection that could have led to ecotype differentiation and to population adaptation to geographic zones. RESULTS: A total of 11,682 SNPs were mined from ~15 Gb of sequence data, out of which 251 SNPs were retained after filtering. Population structure analysis largely grouped upland accessions into one subpopulation and lowland accessions into two additional subpopulations. The most frequent SNPs were in homozygous state within accessions. Sixty percent of the exonic SNPs were non-synonymous and, of these, 45% led to non-conservative amino acid changes. The non-conservative SNPs were largely in linkage disequilibrium with one haplotype being predominantly present in upland accessions while the other haplotype was commonly present in lowland accessions. Tajima's test of neutrality indicated that PHYB, a gene involved in photoperiod response, was under positive selection in the switchgrass population. PHYB carried a SNP leading to a non-conservative amino acid change in the PAS domain, a region that acts as a sensor for light and oxygen in signal transduction.
CONCLUSIONS: Several non-conservative SNPs in genes potentially involved in plant architecture and adaptation have been identified and led to population structure and genetic differentiation of ecotypes in switchgrass. We suggest here that PHYB is a key gene involved in switchgrass natural selection. Further analyses are needed to determine whether any of the non-conservative SNPs identified play a role in the differential adaptation of upland and lowland switchgrass.
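For readers unfamiliar with Tajima's test of neutrality, the sketch below computes Tajima's D from the standard published formula, given the sample size, the number of segregating sites and the mean pairwise difference. It is an illustration only and not the analysis code used in the study; the function names and the toy alignment are hypothetical, and the input is assumed to be a gap-free alignment of haploid sequences.

```python
import math
from itertools import combinations


def segregating_sites_and_pi(seqs):
    """Return (S, pi) for equal-length aligned sequences without gaps."""
    if len(seqs) < 2 or any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("need at least two sequences of equal length")
    # S: number of alignment columns with more than one state.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return s, total / len(pairs)


def tajimas_d(n, s, pi):
    """Tajima's D from sample size n, segregating sites s, pairwise diversity pi."""
    if n < 4 or s == 0:
        return float("nan")  # variance term is zero or D is undefined
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment, used only to show the helper.
toy_s, toy_pi = segregating_sites_and_pi(["AAAA", "AAAT", "AATT"])

# Hypothetical summary statistics: 10 sequences, 16 segregating sites.
d_value = tajimas_d(10, 16, 3.888889)  # approximately -1.45
```

Negative values of D indicate an excess of rare variants relative to neutral expectation, while positive values indicate an excess of intermediate-frequency variants; interpreting either as selection requires ruling out demographic explanations.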
Project description:Connecting broad-scale patterns of genetic variation and population structure to genetic diversity on a landscape is a key step towards understanding historical processes of migration and adaptation. New genomic approaches can be used to increase the resolution of phylogeographic studies while reducing locus sampling effects and circumventing ascertainment bias. Here, we use a novel approach based on high-throughput sequencing to characterize genetic diversity in complete chloroplast genomes and >10,000 nuclear loci in switchgrass, at continental and landscape scales. Switchgrass is a North American tallgrass species, which is widely used in conservation and perennial biomass production, and shows strong ecotypic adaptation and population structure across the continental range. We sequenced 40.9 billion base pairs from 24 individuals from across the species' range and 20 individuals from the Indiana Dunes. Analysis of plastome sequence revealed 203 variable SNP sites that define eight haplogroups, which are differentiated by 4-127 SNPs and confirmed by patterns of indel variation. These include three deeply divergent haplogroups, which correspond to the previously described lowland-upland ecotypic split and a novel upland haplogroup split that dates to the mid-Pleistocene. Most of the plastome haplogroup diversity present in the northern switchgrass range, including in the Indiana Dunes, originated in the mid- or upper Pleistocene prior to the most recent postglacial recolonization. Furthermore, a recently colonized landscape feature (approximately 150 ya) in the Indiana Dunes contains several deeply divergent upland haplogroups. Nuclear markers also support a deep lowland-upland split, followed by limited gene flow, and show extensive gene flow in the local population of the Indiana Dunes.
Project description:The response of plant growth and development to nutrient and water availability is an important adaptation for abiotic stress tolerance. Roots need to intercept both passing nutrients and water while foraging into new soil layers for further resources. Substantial amounts of nitrate can be lost in the field when leaching into groundwater, yet very little is known about how deep rooting affects this process. Here, we phenotyped root system traits and deep 15N nitrate capture across 1.5 m vertical profiles of solid media using tall mesocosms in switchgrass (Panicum virgatum L.), a promising cellulosic bioenergy feedstock. Root and shoot biomass traits, photosynthesis and respiration measures, and nutrient uptake and accumulation traits were quantified in response to a water and nitrate stress factorial experiment for switchgrass upland (VS16) and lowland (AP13) ecotypes. The two switchgrass ecotypes shared common plastic abiotic responses to nitrogen (N) and water availability, and yet had substantial genotypic variation for root and shoot traits. A significant interaction between N and water stress combination treatments for axial and lateral root traits represents a complex and shared root development strategy for stress mitigation. Deep root growth and 15N capture were found to be closely linked to aboveground growth. Together, these results represent the wide genetic pool of switchgrass and show that deep rooting promotes nitrate capture, plant productivity, and sustainability.
Project description:Switchgrass (Panicum virgatum) is a native prairie grass and valuable bio-energy crop. The physiological change from juvenile to reproductive adult can draw important resources away from growth into producing reproductive structures, thereby limiting the growth potential of early flowering plants. Delaying the flowering of switchgrass is one approach by which to increase total biomass. The objective of this research was to identify genetic variants and candidate genes for controlling heading and anthesis in segregating switchgrass populations. Four pseudo-F2 populations (two pairs of reciprocal crosses) were developed from lowland (late flowering) and upland (early flowering) ecotypes, and heading and anthesis dates of these populations were collected in Lafayette, IN and DeKalb, IL in 2015 and 2016. Across 2 years, there was a 34- and 73-day difference in heading and a 52- and 75-day difference in anthesis at the Lafayette and DeKalb locations, respectively. A total of 37,901 single nucleotide polymorphisms obtained by exome capture sequencing of the populations were used in a genome-wide association study (GWAS) that identified five significant signals at three loci for heading and two loci for anthesis. Among them, a homolog of FLOWERING LOCUS T on chromosome 5b associated with heading date was identified at the Lafayette location across 2 years. A homolog of ARABIDOPSIS PSEUDO-RESPONSE REGULATOR 5, a light modulator in the circadian clock associated with heading date was detected on chromosome 8a across locations and years. These results demonstrate that genetic variants related to floral development could lend themselves to a long-term goal of developing late flowering varieties of switchgrass with high biomass yield.
Project description:Switchgrass (Panicum virgatum L.) is a perennial grass that has been designated as an herbaceous model biofuel crop for the United States of America. To facilitate accelerated breeding programs of switchgrass, we developed both an association panel and linkage populations for genome-wide association study (GWAS) and genomic selection (GS). All of the 840 individuals were then genotyped using genotyping by sequencing (GBS), generating 350 GB of sequence in total. As a highly heterozygous polyploid (tetraploid and octoploid) species lacking a reference genome, switchgrass is highly intractable with earlier methodologies of single nucleotide polymorphism (SNP) discovery. To access the genetic diversity of species like switchgrass, we developed a SNP discovery pipeline based on a network approach called the Universal Network-Enabled Analysis Kit (UNEAK). Complexities that hinder single nucleotide polymorphism discovery, such as repeats, paralogs, and sequencing errors, are easily resolved with UNEAK. Here, 1.2 million putative SNPs were discovered in a diverse collection of primarily upland, northern-adapted switchgrass populations. Further analysis of this data set revealed the fundamentally diploid nature of tetraploid switchgrass. Taking advantage of the high conservation of genome structure between switchgrass and foxtail millet (Setaria italica (L.) P. Beauv.), two parent-specific, synteny-based, ultra high-density linkage maps containing a total of 88,217 SNPs were constructed. Also, our results showed clear patterns of isolation-by-distance and isolation-by-ploidy in natural populations of switchgrass. Phylogenetic analysis supported a general south-to-north migration path of switchgrass. In addition, this analysis suggested that upland tetraploid arose from upland octoploid. 
Altogether, this study provides unparalleled insights into the diversity, genomic complexity, population structure, phylogeny, phylogeography, ploidy, and evolutionary dynamics of switchgrass.
Project description:Geographic patterns of genetic variation are shaped by multiple evolutionary processes, including genetic drift, migration and natural selection. Switchgrass (Panicum virgatum L.) has strong genetic and adaptive differentiation despite life history characteristics that promote high levels of gene flow and can homogenize intraspecific differences, such as wind-pollination and self-incompatibility. To better understand how historical and contemporary factors shape variation in switchgrass, we use genotyping-by-sequencing to characterize switchgrass from across its range at 98 042 SNPs. Population structuring reflects biogeographic and ploidy differences within and between switchgrass ecotypes and indicates that biogeographic history, ploidy incompatibilities and differential adaptation each have important roles in shaping ecotypic differentiation in switchgrass. At one extreme, we determine that two Panicum taxa are not separate species but are actually conspecific, ecologically divergent types of switchgrass adapted to the extreme conditions of coastal sand dune habitats. Conversely, we identify natural hybrids among lowland and upland ecotypes and visualize their genome-wide patterns of admixture. Furthermore, we determine that genetic differentiation between primarily tetraploid and octoploid lineages is not caused solely by ploidy differences. Rather, genetic diversity in primarily octoploid lineages is consistent with a history of admixture. This suggests that polyploidy in switchgrass is promoted by admixture of diverged lineages, which may be important for maintaining genetic differentiation between switchgrass ecotypes where they are sympatric. These results provide new insights into the mechanisms shaping variation in widespread species and provide a foundation for dissecting the genetic basis of adaptation in switchgrass.