Applying the latest advances in genomics and phenomics for trait discovery in polyploid wheat.
ABSTRACT: Improving traits in wheat has historically been challenging due to its large and polyploid genome, limited genetic diversity and in-field phenotyping constraints. However, within recent years many of these barriers have been lowered. The availability of a chromosome-level assembly of the wheat genome now facilitates a step-change in wheat genetics and provides a common platform for resources, including variation data, gene expression data and genetic markers. The development of sequenced mutant populations and gene-editing techniques now enables the rapid assessment of gene function in wheat directly. The ability to alter gene function in a targeted manner will unmask the effects of homoeolog redundancy and allow the hidden potential of this polyploid genome to be discovered. New techniques to identify and exploit the genetic diversity within wheat wild relatives now enable wheat breeders to take advantage of these additional sources of variation to address challenges facing food production. Finally, advances in phenomics have unlocked rapid screening of populations for many traits of interest both in greenhouses and in the field. Looking forwards, integrating diverse data types, including genomic, epigenetic and phenomics data, will take advantage of big data approaches including machine learning to understand trait biology in wheat in unprecedented detail.
Project description:Bread wheat (Triticum aestivum L.) is an allopolyploid species containing three ancestral genomes. Therefore, three homoeologous copies exist for the majority of genes in the wheat genome. Whether different homoeologs are differentially expressed (homoeolog expression bias) in response to biotic and abiotic stresses is poorly understood. In this study, we applied an RNA-seq approach to analyse homoeolog-specific global gene expression patterns in wheat during infection by the fungal pathogen Fusarium pseudograminearum, which causes crown rot disease in cereals. To ensure specific detection of homoeologs, we first optimized read alignment methods and validated the results experimentally on genes with known patterns of subgenome-specific expression. Our global analysis identified widespread patterns of differential expression among homoeologs, indicating that homoeolog expression bias underpins a large proportion of the wheat transcriptome. In particular, genes differentially expressed in response to Fusarium infection were disproportionately contributed by the B and D subgenomes. In addition, we found differences in the degree of responsiveness to pathogen infection among homoeologous genes, with B and D homoeologs exhibiting stronger responses to pathogen infection than A genome copies. We call this latter phenomenon 'homoeolog induction bias'. Understanding how homoeolog expression and induction biases operate may assist the improvement of biotic stress tolerance in wheat and other polyploid crop species.
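The two quantities named in the abstract above, homoeolog expression bias and homoeolog induction bias, can be illustrated with a minimal sketch. This is not the study's pipeline: the function names, the pseudocount, and the use of a log2 fold change as the measure of responsiveness are all assumptions made here for illustration.

```python
import math


def homoeolog_bias(a, b, d):
    """Relative contribution of the A, B and D homoeologs of one triad.

    Inputs are normalised expression values (hypothetical units, e.g. TPM).
    Returns None when the triad is not expressed, because the proportions
    are then undefined.
    """
    total = a + b + d
    if total == 0:
        return None
    return (a / total, b / total, d / total)


def induction_log2fc(control, infected, pseudocount=1.0):
    """Responsiveness of one homoeolog to infection as a log2 fold change.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for homoeologs that are silent in one condition.
    """
    return math.log2((infected + pseudocount) / (control + pseudocount))


# Expression bias within a single condition.
print(homoeolog_bias(10, 30, 60))

# Induction bias: compare the response of each homoeolog across conditions.
responses = {
    "A": induction_log2fc(control=20, infected=25),
    "B": induction_log2fc(control=20, infected=80),
    "D": induction_log2fc(control=20, infected=70),
}
print(max(responses, key=responses.get))  # most strongly induced homoeolog
```

Expression bias is computed within one condition, whereas induction bias compares how strongly each homoeolog responds between conditions, so a triad can show one without the other.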
Project description:BACKGROUND: Interaction between parental genomes is accompanied by global changes in gene expression which, eventually, contribute to growth vigor and the broader phenotypic diversity of allopolyploid species. In order to gain a better understanding of the effects of allopolyploidization on the regulation of diverged gene networks, we performed a genome-wide analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat created by the hybridization of a tetraploid derivative of hexaploid wheat with the diploid ancestor of the wheat D genome Ae. tauschii. RESULTS: Affymetrix wheat genome arrays were used for both the discovery of divergent homoeolog-specific mutations and analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat. More than 34,000 detectable parent-specific features (PSF) distributed across the wheat genome were used to assess AB genome (could not differentiate A and B genome contributions) and D genome parental expression in the allopolyploid transcriptome. In the re-synthesized polyploid, 81% of PSFs detected mid-parent levels of gene expression, and only 19% of PSFs showed evidence of non-additive expression. Non-additive expression in both AB and D genomes was strongly biased toward up-regulation of the parental type of gene expression, with only 6% and 11% of genes, respectively, being down-regulated. Of all the non-additive gene expression, 84% can be explained by differences in the parental genotypes used to make the allopolyploid. Homoeolog-specific co-regulation of several functional gene categories was found, particularly genes involved in photosynthesis and protein biosynthesis in wheat. CONCLUSIONS: Here, we have demonstrated that the establishment of interactions between the diverged regulatory networks in allopolyploids is accompanied by massive homoeolog-specific up- and down-regulation of gene expression.
This study provides insights into interactions between homoeologous genomes and their role in growth vigor, development, and fertility of allopolyploid species.
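The distinction between mid-parent (additive) and non-additive expression used in the re-synthesized wheat study above can be sketched as follows. The fixed tolerance is an assumption made for illustration; the study itself does not state a threshold here, and a real analysis would use a statistical test on replicated array data rather than a cut-off.

```python
def classify_expression(parent_ab, parent_d, polyploid, tolerance=0.2):
    """Classify a feature as additive or non-additive.

    parent_ab, parent_d: expression in the tetraploid (AB) and diploid (D)
    parents. polyploid: expression in the re-synthesized allohexaploid.
    tolerance: hypothetical relative deviation from the mid-parent value
    that is still treated as additive.
    """
    mid_parent = (parent_ab + parent_d) / 2.0
    if mid_parent == 0:
        # No parental expression: any signal in the polyploid is non-additive.
        return "additive" if polyploid == 0 else "non-additive (up)"
    deviation = (polyploid - mid_parent) / mid_parent
    if abs(deviation) <= tolerance:
        return "additive"
    return "non-additive (up)" if deviation > 0 else "non-additive (down)"


for value in (150, 300, 60):
    print(value, classify_expression(100, 200, value))
```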
Project description:BACKGROUND: Bread wheat is one of the world's most important food crops and considerable efforts have been made to develop genomic resources for this species. This includes an on-going project by the International Wheat Genome Sequencing Consortium to assemble its large and complex genome, which is hexaploid and contains three closely related 'homoeologous' copies for each chromosome. This multi-national effort avoids the complications polyploidy entails for correct assembly of the genome by sequencing flow-sorted chromosome arms one at a time. Here we report on an alternate approach, a direct homoeolog-specific assembly of the expressed portion of the genome, the transcriptome. RESULTS: After assessment of the ability of various assemblers to generate homoeolog-specific assemblies, we employed a two-stage assembly process to produce a high-quality assembly of the transcriptome of hexaploid wheat from Roche-454 and Illumina GAIIx paired-end sequence reads. The assembly process made use of a rapid partitioning of expressed sequences into homoeologous clusters, followed by a parallel high-fidelity assembly of each cluster on a 1150-processor compute cloud. We assessed assembly quality through comparison to known wheat gene sequences and found that in ca. 98.5% of cases the assembly was sufficiently accurate for homoeologous triplets to be cleanly separated into either two or three separate contigs. Comparison to publicly available transcript collections suggests that the assembly covers ~75-80% of the complete transcriptome. CONCLUSIONS: This work therefore describes the first homoeolog-specific sequence assembly of the wheat transcriptome and provides a reference transcriptome for future wheat research. Furthermore, our assembly methodology is transferable to other polyploid organisms.
Project description:Highlights: • Phenome mirrors the expression of a genome, metabolic traits rely on the phenotype. • Phenomics may provide data to depict the microbial genotypic-phenotypic landscape. • Phenotype switching tracks short-term environmental pressure on microbial metabolism. • Meta-phenomics studies the physiological state of microbial meta-communities. • The application of novel data analysis approaches for phenomics has been limited. The phenotype-genotype landscape is a projection coming from detailed phenotypic and genotypic data under environmental pressure. Although the phenome of microbes or microbial consortia mirrors the functional expression of a genome or set of genomes, metabolic traits rely on the phenotype. Phenomics has the potential to revolutionize functional genomics. In this review, we discuss why and how phenomics was developed. We describe how phenomics may extend our understanding of the assembly of microbial consortia and their functionality, and then we outline the novel applications within the study of phenomes using the Omnilog platform, together with a revision of its current application to study lactic acid bacteria (LAB) metabolic traits during food processing. LAB are proposed as a suitable model system to analyze and discuss the implementation and exploitation of this emerging omics approach. We introduce 'phenotype switching' as a new phenotype microarray approach to gain insight into bacterial physiology. An overview of methodologies and tools to manage and analyze the generated data is provided. Finally, the pros and cons of the pipelines developed so far, including the most innovative ones, are critically analyzed. We propose an R pipeline, recently deposited, which automatically analyzes Omnilog data, integrating the latest approaches and implementing the new concepts described here.
Project description:Phenomics is the comprehensive study of phenotypes at every level of biology: from metabolites to organisms. With high throughput technologies increasing the scope of biological discoveries, the field of phenomics has been developing rapid and precise methods to collect, catalog, and analyze phenotypes. Such methods have allowed phenotypic data to be widely used in medical applications, from assisting clinical diagnoses to prioritizing genomic diagnoses. To channel the benefits of phenomics into the field of inborn errors of metabolism (IEM), we have recently launched IEMbase, an expert-curated knowledgebase of IEM and their disease-characterizing phenotypes. While our efforts with IEMbase have realized benefits, taking full advantage of phenomics requires a comprehensive curation of IEM phenotypes in core phenomics projects, which is dependent upon contributions from the IEM clinical and research community. Here, we assess the inclusion of IEM biochemical phenotypes in a core phenomics project, the Human Phenotype Ontology. We then demonstrate the utility of biochemical phenotypes using a text-based phenomics method to predict gene-disease relationships, showing that the prediction of IEM genes is significantly better using biochemical rather than clinical profiles. The findings herein provide a motivating goal for the IEM community to expand the computationally accessible descriptions of biochemical phenotypes associated with IEM in phenomics resources.
Project description:In polyploid genomes, homoeologs are a specific subtype of homologs, and can be thought of as orthologs between subgenomes. In Orthologous MAtrix, we infer homoeologs in three polyploid plant species: upland cotton (Gossypium hirsutum), rapeseed (Brassica napus), and bread wheat (Triticum aestivum). While we can typically recognize the features of a "good" homoeolog prediction (a consistent evolutionary distance, high synteny, and a one-to-one relationship), none of them is a hard-and-fast criterion. We devised a novel fuzzy logic-based method to assign confidence scores to each pair of predicted homoeologs. We inferred homoeolog pairs and used the new and improved method to assign confidence scores, which ranged from 0 to 100. Most confidence scores were between 70 and 100, but the distribution varied between genomes. The new confidence scores show an improvement over our previous method and were manually evaluated using a subset from various confidence ranges.
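The idea of scoring a homoeolog pair on several soft criteria, as described in the abstract above, can be sketched with simple fuzzy membership functions. This is not the Orthologous MAtrix implementation: the membership shapes, the breakpoints, the parameter names and the choice to average the three memberships are all assumptions made for illustration.

```python
def ramp(x, low, high):
    """Fuzzy membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def confidence(distance_deviation, synteny_fraction, n_candidates):
    """Score a predicted homoeolog pair from 0 to 100.

    distance_deviation: how far the pair's evolutionary distance lies from
        the typical distance between the two subgenomes (hypothetical
        units; 0 is perfectly consistent, 2 or more is fully inconsistent).
    synteny_fraction: fraction of neighbouring genes that are also paired.
    n_candidates: number of candidate partners (1 means one-to-one).
    """
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")
    distance_ok = 1.0 - ramp(distance_deviation, 0.0, 2.0)
    synteny_ok = ramp(synteny_fraction, 0.0, 1.0)
    one_to_one = 1.0 / n_candidates
    # Averaging keeps every criterion soft: a weak value lowers the score
    # without vetoing the pair outright.
    return 100.0 * (distance_ok + synteny_ok + one_to_one) / 3.0


print(confidence(0.0, 1.0, 1))
print(confidence(1.0, 0.5, 2))
```

Averaging is used here instead of a fuzzy AND (the minimum) because the abstract stresses that no single criterion is decisive.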
Project description:BACKGROUND: Polyploidy (whole-genome duplication) is an important speciation mechanism, particularly in plants. Gene loss, silencing, and the formation of novel gene complexes are some of the consequences that the new polyploid genome may experience. Despite the recurrent nature of polyploidy, little is known about the genomic outcome of independent polyploidization events. Here, we analyze the fate of genes duplicated by polyploidy (homoeologs) in multiple individuals from ten natural populations of Tragopogon miscellus (Asteraceae), all of which formed independently from T. dubius and T. pratensis less than 80 years ago. RESULTS: Of the 13 loci analyzed in 84 T. miscellus individuals, 11 showed loss of at least one parental homoeolog in the young allopolyploids. Two loci were retained in duplicate for all polyploid individuals included in this study. Nearly half (48%) of the individuals examined lost a homoeolog of at least one locus, with several individuals showing loss at more than one locus. Patterns of loss were stochastic among individuals from the independently formed populations, except that the T. dubius copy was lost twice as often as T. pratensis. CONCLUSION: This study represents the most extensive survey of the fate of genes duplicated by allopolyploidy in individuals from natural populations. Our results indicate that the road to genome downsizing and ultimate genetic diploidization may occur quickly through homoeolog loss, but with some genes consistently maintained as duplicates. Other genes consistently show evidence of homoeolog loss, suggesting repetitive aspects to polyploid genome evolution.
Project description:Allopolyploidy is an evolutionary and mechanistically intriguing process, in that it entails the reconciliation of two or more sets of diverged genomes and regulatory interactions. In this study, we explored gene expression patterns in interspecific hybrid F(1), and synthetic and natural allopolyploid cotton using RNA-Seq reads from leaf transcriptomes. We determined how the extent and direction of expression level dominance (total level of expression for both homoeologs) and homoeolog expression bias (relative contribution of homoeologs to the transcriptome) changed from hybridization through evolution at the polyploid level and following cotton domestication. Genome-wide expression level dominance was biased toward the A-genome in the diploid hybrid and natural allopolyploids, whereas the direction was reversed in the synthetic allopolyploid. This biased expression level dominance was mainly caused by up- or downregulation of the homoeolog from the 'non-dominant' parent. Extensive alterations in homoeolog expression bias and expression level dominance accompany the initial merger of two diverged diploid genomes, suggesting a combination of regulatory (cis or trans) and epigenetic interactions that may arise and propagate through the transcriptome network. The extent of homoeolog expression bias and expression level dominance increases over time, from genome merger through evolution at the polyploid level. Higher rates of transgressive and novel gene expression patterns as well as homoeolog silencing were observed in natural allopolyploids than in F(1) hybrid and synthetic allopolyploid cottons. These observations suggest that natural selection reconciles the regulatory mismatches caused by initial genomic merger, while new gene expression conditions are generated for evaluation by selection.
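The cotton abstract above separates expression level dominance (the total expression of both homoeologs, compared with each parent) from homoeolog expression bias (the share contributed by each homoeolog). A minimal sketch of the difference follows; the tolerance-based comparison is an assumption standing in for the statistical tests a real RNA-Seq analysis would use.

```python
def close(x, y, tolerance=0.2):
    """True if x and y differ by at most `tolerance` of the larger value."""
    return abs(x - y) <= tolerance * max(x, y, 1e-9)


def expression_level_dominance(parent_a, parent_d, polyploid_total):
    """Compare the summed homoeolog expression with each diploid parent."""
    if close(parent_a, parent_d):
        return "parents equal"
    like_a = close(polyploid_total, parent_a)
    like_d = close(polyploid_total, parent_d)
    if like_a and not like_d:
        return "A-dominant"
    if like_d and not like_a:
        return "D-dominant"
    return "no dominance"


def homoeolog_expression_bias(a_homoeolog, d_homoeolog):
    """Share of the total contributed by the A homoeolog (None if silent)."""
    total = a_homoeolog + d_homoeolog
    return None if total == 0 else a_homoeolog / total


# The total resembles the A parent even though the D homoeolog supplies
# most of the transcripts, so the two measures are independent.
print(expression_level_dominance(100, 50, 60 + 45))
print(homoeolog_expression_bias(45, 60))
```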
Project description:The phylogenies of allopolyploids take the shape of networks and cannot be adequately represented as bifurcating trees. Especially for high polyploids (i.e., organisms with more than six sets of nuclear chromosomes), the signatures of gene homoeolog loss, deep coalescence, and polyploidy may become confounded, with the result that gene trees may be congruent with more than one species network. Herein, we obtained the most parsimonious species network by objective comparison of competing scenarios involving polyploidization and homoeolog loss in a high-polyploid lineage of violets (Viola, Violaceae) mostly or entirely restricted to North America, Central America, or Hawaii. We amplified homoeologs of the low-copy nuclear gene, glucose-6-phosphate isomerase (GPI), by single-molecule polymerase chain reaction (PCR) and the chloroplast trnL-F region by conventional PCR for 51 species and subspecies. Topological incongruence among GPI homoeolog subclades, owing to deep coalescence and two instances of putative loss (or lack of detection) of homoeologs, was reconciled by applying the maximum tree topology for each subclade. The most parsimonious species network and the fossil-based calibration of the homoeolog tree favored monophyly of the high polyploids, which has resulted from allodecaploidization 9-14 Ma, involving sympatric ancestors from the extant Viola sections Chamaemelanium (diploid), Plagiostigma (paleotetraploid), and Viola (paleotetraploid). Although two of the high-polyploid lineages (Boreali-Americanae, Pedatae) remained decaploid, recurrent polyploidization with tetraploids of section Plagiostigma within the last 5 Ma has resulted in two 14-ploid lineages (Mexicanae, Nosphinium) and one 18-ploid lineage (Langsdorffianae).
This implies a more complex phylogenetic and biogeographic origin of the Hawaiian violets (Nosphinium) than that previously inferred from rDNA data and illustrates the necessity of considering polyploidy in phylogenetic and biogeographic reconstruction.
Project description:Bread wheat (Triticum aestivum, 2n = 6x = 42, AABBDD) has a complex allohexaploid genome, which makes it difficult to differentiate between the homoeologous sequences and assign them to the A, B, or D subgenomes. The chromosome-based draft genome sequence of the 'Chinese Spring' common wheat cultivar enables the large-scale development of polymerase chain reaction (PCR)-based markers specific for homoeologs. Based on high-confidence 'Chinese Spring' genes with known functions, we developed 183 putative homoeolog-specific markers for chromosomes 4B and 7B. These markers were used in PCR assays for the 4B and 7B nullisomes and their euploid synthetic hexaploid wheat (SHW) line that was newly generated from a hybridization between Triticum turgidum (AABB) and the wild diploid species Aegilops tauschii (DD). Up to 64% of the markers for chromosomes 4B or 7B in the SHW background were confirmed to be homoeolog-specific. Thus, these markers were highly transferable between the 'Chinese Spring' bread wheat and SHW lines. Homoeolog-specific markers designed using genes with known functions may be useful for genetic investigations involving homoeologous chromosome tracking and homoeolog expression and interaction analyses.
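The logic of confirming a marker against a nullisomic line, as in the study above, can be sketched in a few lines: a marker designed for chromosome 4B is taken as homoeolog-specific when it amplifies in the euploid line but not in the line lacking 4B. The scoring rule and the example results below are hypothetical and are not data from the study.

```python
def is_homoeolog_specific(band_in_euploid, band_in_nullisomic):
    """A marker is specific if the PCR product disappears in the nullisomic."""
    return band_in_euploid and not band_in_nullisomic


def specific_fraction(results):
    """Fraction of markers confirmed as specific.

    results: list of (band_in_euploid, band_in_nullisomic) pairs.
    """
    if not results:
        return 0.0
    confirmed = sum(1 for eu, nul in results if is_homoeolog_specific(eu, nul))
    return confirmed / len(results)


# Hypothetical gel scores for four markers.
scores = [(True, False), (True, True), (False, False), (True, False)]
print(specific_fraction(scores))
```

A band that persists in the nullisomic line indicates amplification from another homoeolog, and a marker with no band in the euploid line has simply failed, so neither case counts as confirmed.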