Comparative population genetics of the panicoid grasses: sequence polymorphism, linkage disequilibrium and selection in a diverse sample of Sorghum bicolor.
ABSTRACT: Levels of genetic variation and linkage disequilibrium (LD) are critical factors in association mapping methods as well as in identification of loci that have been targets of selection. Maize, an outcrosser, has a high level of sequence variation and a limited extent of LD. Sorghum, a closely related but largely self-pollinating panicoid grass, is expected to have higher levels of LD. As a first step in estimation of population genetic parameters in sorghum, we surveyed 27 diverse S. bicolor accessions for sequence variation at a total of 29,186 bp in 95 short regions derived from genetically mapped RFLPs located throughout the genome. Consistent with its higher level of inbreeding, the extent of LD is at least severalfold greater in sorghum than in maize. Total sequence variation in sorghum is about fourfold lower than that in maize, while synonymous variation is fivefold lower, suggesting a smaller effective population size in sorghum. Because we surveyed a species-wide sample, the mating system, which primarily affects population-level diversity, may not be primarily responsible for this difference. Comparisons of polymorphism and divergence suggest that both directional and diversifying selection have played important roles in shaping variation in the sorghum genome.
Project description:BACKGROUND: The circadian clock drives endogenous 24-h rhythms that allow organisms to adapt and prepare for predictable and repeated changes in their environment throughout the day-night (diurnal) cycle. Many components of the circadian clock in Arabidopsis thaliana have been functionally characterized, but comparatively little is known about circadian clocks in grass species including major crops like maize and sorghum. RESULTS: Comparative research based on protein homology and diurnal gene expression patterns suggests the function of some predicted clock components in grasses is conserved with their Arabidopsis counterparts, while others have diverged in function. Our analysis of diurnal gene expression in three panicoid grasses (sorghum, maize, and foxtail millet) revealed conserved and divergent evolution of expression for core circadian clock genes and for the overall transcriptome. We find that several classes of core circadian clock genes in these grasses differ in copy number compared to Arabidopsis, but mostly exhibit conservation of both protein sequence and diurnal expression pattern, with the notable exception of maize paralogous genes. We predict conserved cis-regulatory motifs shared between maize, sorghum, and foxtail millet through identification of diurnal co-expression clusters for a subset of 27,196 orthologous syntenic genes. In this analysis, a Cochran-Mantel-Haenszel based method to control for background variation identified significant enrichment for both expected and novel 6-8 nucleotide motifs in the promoter regions of genes with shared diurnal regulation predicted to function in common physiological activities. CONCLUSIONS: This study illustrates the divergence and conservation of circadian clocks and diurnal regulatory networks across syntenic orthologous genes in panicoid grass species.
Further, conserved local regulatory sequences contribute to the architecture of these diurnal regulatory networks that produce conserved patterns of diurnal gene expression.
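As an illustration of the Cochran-Mantel-Haenszel idea mentioned above, the following minimal sketch tests whether a motif is associated with membership in a co-expression cluster across several strata. The abstract does not say how strata or tables were defined, so the function name `cmh_test`, the 2x2 table layout, and the use of one stratum per background group are all assumptions made for this example. It uses the standard statistic without continuity correction.

```python
import math


def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each table is (a, b, c, d), one per stratum (hypothetical layout):
      a = promoters with motif, gene in cluster
      b = promoters with motif, gene not in cluster
      c = promoters without motif, gene in cluster
      d = promoters without motif, gene not in cluster
    Returns (statistic, p_value) with 1 degree of freedom,
    no continuity correction.
    """
    observed = 0.0
    expected = 0.0
    variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("no informative strata")
    statistic = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up counts for two strata, for illustration only.
stat, p = cmh_test([(20, 5, 10, 15), (18, 7, 9, 16)])
print(f"CMH statistic = {stat:.2f}, p = {p:.2g}")
```

Pooling the strata into a single table could let differences in background motif frequency between strata masquerade as enrichment; summing observed-minus-expected counts within strata avoids that.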
Project description:Sample sequence analysis was employed to investigate the repetitive DNAs that were most responsible for the evolved variation in genome content across seven panicoid grasses with >5-fold variation in genome size and different histories of polyploidy. In all cases, the most abundant repeats were LTR retrotransposons, but the particular families that had become dominant were found to be different in the Pennisetum, Saccharum, Sorghum and Zea lineages. One element family, Huck, has been very active in all of the studied species over the last few million years. This suggests the transmittal of an active or quiescent autonomous set of Huck elements to this lineage at the founding of the panicoids. Similarly, independent recent activity of Ji and Opie elements in Zea and of Leviathan elements in Sorghum and Saccharum species suggests that members of these families with exceptional activation potential were present in the genome(s) of the founders of these lineages. In a detailed analysis of the Zea lineage, the combined action of several families of LTR retrotransposons was observed to have approximately doubled the genome size of Zea luxurians relative to Zea mays and Zea diploperennis in just the last few million years. One of the LTR retrotransposon amplification bursts in Zea may have been initiated by polyploidy, but the great majority of transposable element activations were not. Instead, the results suggest random activation of a few or many LTR retrotransposon families in particular lineages over evolutionary time, with some families especially prone to future activation and hyper-amplification.
Project description:The architecture of a plant affects its ability to compete for light and to respond to environmental stresses, thus affecting overall fitness and productivity. Two components of architecture, branching and height, were studied in 182 F7 recombinant inbred lines (RILs) at the vegetative, flowering and mature developmental stages in the panicoid C4 model grass system, Setaria. The RIL population was derived from a cross between domesticated S. italica (foxtail millet) and its wild relative S. viridis (green foxtail). In both field and greenhouse trials the wild parent was taller initially, started branching earlier, and flowered earlier, while the domesticated parent was shorter initially, but flowered later, producing a robust tall plant architecture with more nodes and leaves on the main culm and few or no branches. Biomass was highly correlated with height of the plant and number of nodes on the main culm, and generally showed a negative relationship with branch number. However, several of the RILs with the highest biomass in both trials were significantly more branched than the domesticated parent of the cross. Quantitative trait loci (QTL) analyses indicate that both height and branching are controlled by multiple genetic regions, often with QTL for both traits colocalizing in the same genomic regions. Genomic positions of several QTL colocalize with QTL in syntenic regions in other species and contain genes known to control branching and height in sorghum, maize, and switchgrass. Included in these is the ortholog of the rice SD-1 semi-dwarfing gene, which underlies one of the major Setaria height QTL. Understanding the relationships between height and branching patterns in Setaria, and their genetic control, is an important step to gaining a comprehensive knowledge of the development and genetic regulation of panicoid grass architecture.
Project description:We report the first study on the genetic control of flowering in Setaria, a panicoid grass closely related to switchgrass, and in the same subfamily as maize and sorghum. A recombinant inbred line mapping population derived from a cross between domesticated Setaria italica (foxtail millet) and its wild relative Setaria viridis (green millet), was grown in eight trials with varying environmental conditions to identify a small number of quantitative trait loci (QTL) that control differences in flowering time. Many of the QTL across trials colocalize, suggesting that the genetic control of flowering in Setaria is robust across a range of photoperiod and other environmental factors. A detailed comparison of QTL for flowering in Setaria, sorghum, and maize indicates that several of the major QTL regions identified in maize and sorghum are syntenic orthologs with Setaria QTL, although the maize large effect QTL on chromosome 10 is not. Several Setaria QTL intervals had multiple LOD peaks and were composed of multiple syntenic blocks, suggesting that observed QTL represent multiple tightly linked loci. Candidate genes from flowering time pathways identified in rice and Arabidopsis were identified in Setaria QTL intervals, including those involved in the CONSTANS photoperiod pathway. However, only three of the approximately seven genes cloned for flowering time in maize colocalized with Setaria QTL. This suggests that variation in flowering time in separate grass lineages is controlled by a combination of conserved and lineage specific genes.
Project description:Of central importance in adapting plants of tropical origin to temperate cultivation has been selection of daylength-neutral genotypes that flower early in the temperate summer and take full advantage of its long days. A cross between tropical and temperate sorghums [Sorghum propinquum (Kunth) Hitchc.×S. bicolor (L.) Moench], revealed a quantitative trait locus (QTL), FlrAvgD1, accounting for 85.7% of variation in flowering time under long days. Fine-scale genetic mapping placed FlrAvgD1 on chromosome 6 within the physically largest centiMorgan in the genome. Forward genetic data from "converted" sorghums validated the QTL. Association genetic evidence from a diversity panel delineated the QTL to a 10-kb interval containing only one annotated gene, Sb06g012260, that was shown by reverse genetics to complement a recessive allele. Sb06g012260 (SbFT12) contains a phosphatidylethanolamine-binding (PEBP) protein domain characteristic of members of the "FT" family of flowering genes acting as a floral suppressor. Sb06g012260 appears to have evolved ∼40 Ma in a panicoid ancestor after divergence from oryzoid and pooid lineages. A species-specific Sb06g012260 mutation may have contributed to spread to temperate regions by S. halepense ("Johnsongrass"), one of the world's most widespread invasives. Alternative alleles for another family member, Sb02g029725 (SbFT6), mapping near another flowering QTL, also showed highly significant association with photoperiod response index (P = 1.53×10⁻⁶). The evolution of Sb06g012260 adds to evidence that single gene duplicates play large roles in important environmental adaptations. Increased knowledge of Sb06g012260 opens new doors to improvement of sorghum and other grain and cellulosic biomass crops.
Project description:The number of plant species with genomic and transcriptomic data has been increasing rapidly. The grasses (Poaceae) have been well represented among species with published reference genomes. However, the genomes of wild grasses are less frequently targeted by sequencing efforts. Sequence data from wild relatives of crop species in the grasses can aid the study of domestication, gene discovery for breeding and crop improvement, and improve our understanding of the evolution of C4 photosynthesis. Here, we used long-read sequencing technology to characterize the transcriptomes of three C3 panicoid grass species: Dichanthelium oligosanthes, Chasmanthium laxum, and Hymenachne amplexicaulis. Based on alignments to the sorghum genome, we estimate that assembled consensus transcripts from each species capture between 54.2% and 65.7% of the conserved syntenic gene space in grasses. Genes co-opted into C4 were also well represented in this dataset, despite concerns that because these genes might play roles unrelated to photosynthesis in the target species, they would be expressed at low levels and missed by transcript-based sequencing. A combined analysis using syntenic orthologous genes from grasses with published reference genomes and consensus long-read sequences from these wild species was consistent with previously published phylogenies. It is hoped that these data, targeting underrepresented classes of species within the PACMAD grasses (wild species and species utilizing C3 photosynthesis), will aid in future studies of domestication and C4 evolution by decreasing the evolutionary distance between C4 and C3 species within this clade, enabling more accurate comparisons associated with evolution of the C4 pathway.
Project description:Phylogenetic analysis of sequences from gene families and homologous genes from species of varying divergence can be used to identify conserved noncoding regulatory elements. In this study, phylogenetic analysis of 5'-noncoding sequences was optimized using rab17, a well-characterized ABA-responsive gene from maize, and five additional rab16/17 homologs from sorghum and rice. Conserved 5'-noncoding sequences among the maize, sorghum, and rice rab16/17 homologs were identified with the aid of the software program FootPrinter and by screening for known transcription-factor-binding sites. Searches for 7 of 8 (7/8) bp sequence matches within aligned 5'-noncoding segments of the rab genes identified many of the cis-elements previously characterized by biochemical analysis in maize rab17 plus several additional putative regulatory elements. Differences in the composition of conserved noncoding sequences among rab16/17 genes were related to variation in rab gene mRNA levels in different tissues and to response to ABA treatment using qRT-PCR. Absence of a GRA-like element in the promoter of sorghum dhn2 relative to maize rab17 was correlated with an approximately 85-fold reduction of dhn2 RNA in sorghum shoots. Overall, we conclude that phylogenetic analysis of gene families among rice, sorghum, and maize will help identify regulatory sequences in the noncoding regions of genes and contribute to our understanding of grass gene regulatory networks.
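The 7-of-8 bp search described above can be pictured as a sliding window over a pairwise alignment. The sketch below is only an illustration of that idea under stated assumptions: the function name `near_match_windows`, the treatment of gaps as mismatches, and the toy sequences are all hypothetical. It is not the FootPrinter algorithm or the authors' actual procedure.

```python
def near_match_windows(seq_a, seq_b, window=8, min_matches=7):
    """Return (start, matches) for every alignment window in which at
    least `min_matches` of `window` columns are identical.

    seq_a and seq_b must be two rows of the same alignment (equal
    length, '-' for gaps). Gap columns never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(seq_a) - window + 1):
        col_pairs = zip(seq_a[start:start + window],
                        seq_b[start:start + window])
        matches = sum(1 for a, b in col_pairs if a == b and a != "-")
        if matches >= min_matches:
            hits.append((start, matches))
    return hits


# Toy aligned segments with a single mismatch at column 7.
print(near_match_windows("ACGTACGTAA", "ACGTACGAAA"))
```

Overlapping hits such as these would normally be merged into one candidate element before comparison with known transcription-factor-binding sites.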
Project description:BACKGROUND AND AIMS: Inflorescence forms of panicoid grasses (Panicoideae s.s.) are remarkably diverse and they look very labile to human eyes; however, when performing a close inspection one can identify just a small subset of inflorescence types among a huge morphospace of possibilities. Consequently, some evolutionary constraints have restricted, to some extent, the diversification of their inflorescence. Developmental and genetic mechanisms, the photosynthetic type and plant longevity have been postulated as candidate constraints for angiosperms and panicoids in particular; however, it is not clear how these factors operate and which of these have played a key role during the grass inflorescence evolution. To gain insight into this matter the macroevolutionary aspects of panicoid inflorescences are investigated. METHODS: The inflorescence aspect (lax versus condensed), homogenization, truncation of the terminal spikelet, plant longevity and photosynthetic type were the traits selected for this study. Maximum likelihood and Bayesian Markov chain Monte Carlo methods were used to test different models of evolution and to evaluate the existence of evolutionary correlation among the traits. Both models and evolutionary correlation were tested and analysed in a phylogenetic context by plotting the characters on a series of trees. For those cases in which the correlation was confirmed, tests of contingency and order of trait acquisition were performed to explore further the patterns of such co-evolution. KEY RESULTS: The data reject the independent model of inflorescence trait evolution and confirm the existence of evolutionary contingency. The results support the general trend of homogenization being a prerequisite for the loss of the terminal spikelet of the main axis. There was no evidence for temporal order in the gain of homogenization and condensation; consequently, the homogenization and condensation could occur simultaneously.
The correlation of inflorescence traits with plant longevity and photosynthetic type was not confirmed. CONCLUSIONS: The findings indicate that the lability of the panicoid inflorescence is apparent, not real. The results indicate that the history of the panicoid inflorescence is a combination of inflorescence trait contingency and order of character acquisition. These findings indicate that developmental and genetic mechanisms may be important constraints that have limited the diversification of the inflorescence form in panicoid grasses.
Project description:Orthologous adh regions of the sorghum and maize genomes were sequenced and analyzed. Nine known or candidate genes, including adh1, were found in a 225-kilobase (kb) maize sequence. In a 78-kb space of sorghum, the nine homologues of the maize genes were identified in a colinear order, plus five additional genes. The major fraction of DNA in maize, occupying 166 kb (74%), is represented by 22 long terminal repeat (LTR) retrotransposons. About 6% of the sequence belongs to 33 miniature inverted-repeat transposable elements (MITEs), remnants of DNA transposons, 4 simple sequence repeats, and low-copy-number DNAs of unknown origin. In contrast, no LTR retroelements were detected in the orthologous sorghum region. The unconserved sorghum DNA is composed of 20 putative MITEs, transposon-like elements, 5 simple sequence repeats, and low-copy-number DNAs of unknown origin. No MITEs were discovered in the 166 kb of DNA occupied by the maize LTR retrotransposons. In both species, MITEs were found in the space between genes and inside introns, indicating specific insertion and/or retention for these elements. Two adjacent sorghum genes, including one gene missing in maize, had colinear homologues on Arabidopsis chromosome IV, suggesting two rearrangements in the sorghum and three in the maize genome in comparison to a four-gene region of Arabidopsis. Hence, multiple small rearrangements may be present even in largely colinear genomic regions. These studies revealed a much higher degree of diversity at a microstructural level than predicted by genetic mapping studies for closely related grass species, as well as for comparisons of monocots and dicots.
Project description:BACKGROUND: Foxtail millet (Setaria italica (L.) P. Beauv.), one of the most ancient domesticated crops, is becoming a model system for studying biofuel crops and comparative genomics in the grasses. However, knowledge of the level of genetic diversity and linkage disequilibrium (LD) is very limited in this crop and its wild ancestor, green foxtail (Setaria viridis (L.) P. Beauv.). Such information would help us to understand the domestication process of cultivated species and will allow further research in these species, including association mapping and identification of agriculturally significant genes involved in domestication. RESULTS: In this study, we surveyed DNA sequence for nine loci across 50 accessions of cultivated foxtail millet and 34 of its wild progenitor. We found a low level of genetic diversity in wild green foxtail (θ = 0.0059), where θ is Watterson's estimator. Despite a 55% loss of its wild diversity, foxtail millet still harbored a considerable level of diversity (θ = 0.0027) when compared to rice and sorghum (θ = 0.0024 and 0.0034, respectively). The level of LD in the domesticated foxtail millet extends to 1 kb, while it decayed rapidly to a negligible level within 150 bp in wild green foxtail. Using coalescent simulation, we estimated the bottleneck severity at k = 0.6095 when ρ/θ = 1. These results indicated that the domestication bottleneck of foxtail millet was more severe than that of maize but slightly less pronounced than that of rice. CONCLUSIONS: The results in this study establish a general framework for the domestication history of foxtail millet. The low level of genetic diversity and the increased level of LD in foxtail millet are mainly caused by a population bottleneck, although gene flow from foxtail millet to green foxtail is another factor that may have shaped the pattern of genetic diversity of these two related gene pools.
The knowledge provided in this study will benefit future population based studies in foxtail millet.
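For readers unfamiliar with the diversity statistic used in the foxtail millet abstract, the following minimal sketch computes a per-site Watterson's estimator from the number of segregating sites S, the sample size n, and the alignment length L, using θ = S / (a_n · L) with a_n = 1 + 1/2 + ... + 1/(n-1). The function names and the input counts are hypothetical; only the two θ values in the second call come from the abstract.

```python
def watterson_theta(segregating_sites, n_sequences, n_sites):
    """Per-site Watterson's estimator: S / (a_n * L),
    where a_n is the sum of 1/i for i = 1 .. n-1."""
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    if n_sites <= 0:
        raise ValueError("alignment length must be positive")
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * n_sites)


def diversity_loss(theta_wild, theta_domesticated):
    """Fraction of wild diversity absent from the domesticate."""
    return 1.0 - theta_domesticated / theta_wild


# Hypothetical counts: 10 segregating sites, 5 sequences, 1000 bp.
print(watterson_theta(10, 5, 1000))

# Rounded theta values reported in the abstract above.
print(round(diversity_loss(0.0059, 0.0027), 3))
```

With the rounded values from the abstract, the loss comes out at about 0.54, close to the reported 55%; the small gap presumably reflects rounding of the published estimates.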