Genome duplication and gene loss affect the evolution of heat shock transcription factor genes in legumes.
ABSTRACT: Whole-genome duplication events (polyploidy events) and gene loss events have played important roles in the evolution of legumes. Here we show that the vast majority of Hsf gene duplications resulted from whole genome duplication events rather than tandem duplication, and significant differences in gene retention exist between species. By searching for intraspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found that genome duplications accounted for 42 of 46 Hsf-containing segments in Glycine max, while paired segments were rarely identified in Lotus japonicus, Medicago truncatula and Cajanus cajan. However, by comparing interspecies microsynteny, we determined that the great majority of Hsf-containing segments in Lotus japonicus, Medicago truncatula and Cajanus cajan show extensive conservation with the duplicated regions of Glycine max. These segments formed 17 groups of orthologous segments. These results suggest that these regions shared ancient genome duplication with Hsf genes in Glycine max, but more than half of the copies of these genes were lost. On the other hand, the Glycine max Hsf gene family retained approximately 75% and 84% of duplicated genes produced from the ancient genome duplication and recent Glycine-specific genome duplication, respectively. Continuous purifying selection has played a key role in the maintenance of Hsf genes in Glycine max. Expression analysis of the Hsf genes in Lotus japonicus revealed their putative involvement in multiple tissues and developmental stages and in responses to various abiotic stimuli. This study traces the evolution of Hsf genes in legume species and demonstrates that the rates of gene gain and loss are far from equilibrium in different species.
Project description:Lipoxygenase (LOX) genes are widely distributed in plants and play crucial roles in resistance to biotic and abiotic stress. Although they have been characterized in various plants, little is known about the evolution of legume LOX genes. In this study, we identified 122 full-length LOX genes in Arachis duranensis, Arachis ipaënsis, Cajanus cajan, Cicer arietinum, Glycine max, Lotus japonicus and Medicago truncatula. In total, 64 orthologous and 36 paralogous genes were identified. The full-length, polycystin-1, lipoxygenase, alpha-toxin (PLAT) and lipoxygenase domain sequences from orthologous and paralogous genes exhibited a signature of purifying selection. However, purifying selection influenced orthologues more than paralogues, indicating greater functional conservation of orthologues than paralogues. Neutrality and effective number of codons plot results showed that natural selection primarily shapes codon usage, except for C. arietinum, L. japonicus and M. truncatula LOX genes. GCG, ACG, UCG, CGG and CCG codons exhibited low relative synonymous codon usage (RSCU) values, while CCA, GGA, GCU, CUU and GUU had high RSCU values, indicating that the latter codons are strongly preferred. LOX expression patterns differed significantly between wild-type peanut and cultivated peanut infected with Aspergillus flavus, which could explain the divergent disease resistance of wild progenitor and cultivars.
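The RSCU values mentioned above compare the observed count of a codon with the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `rscu` and the example sequence are hypothetical, and the standard nuclear genetic code is assumed. Codons are reported in the DNA alphabet (GCT rather than GCU).

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence.

    RSCU = observed codon count / (total count of the synonymous family
    divided by the family size). Stop codons, single-codon amino acids
    (Met, Trp) and amino acids absent from the sequence are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    # Hypothetical toy sequence: four alanine codons, three GCT and one GCA.
    for codon, value in sorted(rscu("GCTGCTGCTGCA").items()):
        print(codon, round(value, 2))
```

Values above 1 indicate codons used more often than expected under equal usage and values below 1 indicate avoided codons, which is the sense in which the abstract calls some codons "preferred".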
Project description:Single-nucleotide polymorphisms (SNPs, >2000) were discovered by using RNA-seq and allele-specific sequencing approaches in pigeonpea (Cajanus cajan). For making the SNP genotyping cost-effective, successful competitive allele-specific polymerase chain reaction (KASPar) assays were developed for 1616 SNPs and referred to as PKAMs (pigeonpea KASPar assay markers). Screening of PKAMs on 24 genotypes [23 from cultivated species and 1 wild species (Cajanus scarabaeoides)] defined a set of 1154 polymorphic markers (77.4%) with a polymorphism information content (PIC) value from 0.04 to 0.38. One thousand and ninety-four PKAMs showed polymorphisms between parental lines of the reference mapping population (C. cajan ICP 28 × C. scarabaeoides ICPW 94). By using high-quality marker genotyping data on 167 F(2) lines from the population, a comprehensive genetic map comprising 875 PKAMs with an average inter-marker distance of 1.11 cM was developed. Previously mapped 35 simple sequence repeat markers were integrated into the PKAM map and an integrated genetic map of 996.21 cM was constructed. Mapped PKAMs showed a higher degree of synteny with the genome of Glycine max followed by Medicago truncatula and Lotus japonicus and least with Vigna unguiculata. These PKAMs will be useful for genetics research and breeding applications in pigeonpea and for utilizing genome information from other legume species.
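The polymorphism information content reported above can be illustrated with the commonly used formula of Botstein et al., PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i are allele frequencies. The sketch below assumes that definition (the abstract does not state which formula was used), and the function name `pic` is hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from a list of allele frequencies.

    Assumes the Botstein-style formula:
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # A biallelic SNP with equal allele frequencies gives the maximum PIC.
    print(pic([0.5, 0.5]))  # 0.375
```

Under this formula a biallelic marker cannot exceed 0.375, which is consistent with the upper end of the reported range (0.38 after rounding).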
Project description:Plant bZIP proteins characteristically harbor a highly conserved bZIP domain with two structural features: a DNA-binding basic region and a leucine (Leu) zipper dimerization region. They have been shown to be diverse transcriptional regulators, playing crucial roles in plant development, physiological processes, and biotic/abiotic stress responses. Despite the availability of six completely sequenced legume genomes, a comprehensive investigation of bZIP family members in legumes has yet to be presented. In this study, we identified 428 bZIP genes encoding 585 distinct proteins in six legumes, Glycine max, Medicago truncatula, Phaseolus vulgaris, Cicer arietinum, Cajanus cajan, and Lotus japonicus. The legume bZIP genes were categorized into 11 groups according to their phylogenetic relationships with genes from Arabidopsis. Four kinds of intron patterns (a-d) within the basic and hinge regions were defined and additional conserved motifs were identified, both presenting high group specificity and supporting the group classification. We predicted the DNA-binding patterns and the dimerization properties, based on the characteristic features in the basic and hinge regions and the Leu zipper, respectively, which indicated that some highly conserved amino acid residues existed across each major group. The chromosome distribution and analysis for WGD-derived duplicated blocks revealed that the legume bZIP genes have expanded mainly by segmental duplication rather than tandem duplication. Expression data further revealed that the legume bZIP genes were expressed constitutively or in an organ-specific, development-dependent manner, playing roles in multiple seed developmental stages and tissues.
We also detected several key legume bZIP genes involved in drought- and salt-responses by comparing fold changes of expression values in drought-stressed or salt-stressed roots and leaves. In summary, this genome-wide identification, characterization and expression analysis of legume bZIP genes provides valuable information for understanding the molecular functions and evolution of the legume bZIP transcription factor family, and highlights potential legume bZIP genes involved in regulating tissue development and abiotic stress responses.
Project description:Legumes play an important role as food and forage crops in international agriculture, especially in developing countries. Legumes have a unique biological process called nitrogen fixation (NF) by which they convert atmospheric nitrogen to ammonia. Although legume genomes have undergone polyploidization, duplication and divergence, NF-related genes, because of their essential functional role for legumes, might have remained conserved. To understand the relationship of divergence and evolutionary processes in legumes, this study analyzes orthologs and paralogs for 20 selected NF-related genes by using comparative genomic approaches in six legumes, i.e., Medicago truncatula (Mt), Cicer arietinum, Lotus japonicus, Cajanus cajan (Cc), Phaseolus vulgaris (Pv), and Glycine max (Gm). Subsequently, sequence distances, numbers of synonymous substitutions per synonymous site (Ks) and non-synonymous substitutions per non-synonymous site (Ka) between orthologs and paralogs were calculated and compared across legumes. These analyses suggest the closest relationship between Gm and Cc and the highest distance between Mt and Pv among the six legumes. Ks proportional plots clearly showed an ancient genome duplication in all legumes, a whole genome duplication event in Gm and also speciation patterns in different legumes. This study also reports some interesting observations, e.g., no peak at Ks 0.4 in Gm-Gm, location of two independent genes next to each other in Mt and low Ks values for outparalogs for three genes as compared to other 12 genes. In summary, this study underlines the importance of NF-related genes and provides important insights into the genome organization and evolutionary aspects of the six legume species analyzed.
Project description:Highly polymorphic and transferable microsatellites (SSRs) are important for comparative genomics, genome analysis and phylogenetic studies. Development of novel species-specific microsatellite markers remains a costly and labor-intensive project. Therefore, interest has shifted from genomic to genic markers owing to their high inter-species transferability, as they are developed from conserved coding regions of the genome. This study concentrates on comparative analysis of genic microsatellites in nine important legumes (Arachis hypogaea, Cajanus cajan, Cicer arietinum, Glycine max, Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Pisum sativum and Vigna unguiculata) and two model plant species (Oryza sativa and Arabidopsis thaliana). Screening of a total of 228090 putative unique sequences spanning 219610522 bp using a microsatellite search tool, MISA, identified 12.18% of the unigenes containing 36248 microsatellite motifs excluding mononucleotide repeats. Frequency of legume unigene-derived SSRs was one SSR in every 6.0 kb of analyzed sequences. The trinucleotide repeats were predominant in all the unigenes with the exception of C. cajan, which showed prevalence of dinucleotide repeats over trinucleotide repeats. Dinucleotide repeats along with trinucleotides accounted for more than 90% of the total microsatellites. Among dinucleotide and trinucleotide repeats, AG and AAG motifs, respectively, were the most frequent. Microsatellite-positive chickpea unigenes were assigned Gene Ontology (GO) terms to identify the possible role of unigenes in various molecular and biological functions. These unigene-based microsatellite markers will prove valuable for recording allelic variance across germplasm collections, gene tagging and searching for putative candidate genes.
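The kind of motif search described above can be sketched with a regular expression that looks for a short unit repeated in tandem. This is an illustrative stand-in, not MISA itself: the function name `find_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are assumptions, since the abstract does not give the search parameters. Mononucleotide repeats are excluded, as in the study.

```python
import re

# Assumed thresholds: unit length -> minimum number of tandem repeats.
# These are illustrative values, not the settings used in the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A unit of `unit_len` bases followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (e.g. "AAA" or "ATAT"), so each locus is reported only once.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence with one (AG)8 and one (AAG)5 repeat.
    toy = "TTT" + "AG" * 8 + "CCC" + "AAG" * 5 + "TT"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Only perfect repeats are detected in this sketch; compound or interrupted microsatellites would need additional handling.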
Project description:The Nictaba family groups all proteins that show homology to Nictaba, the tobacco lectin. So far, Nictaba and an Arabidopsis thaliana homologue have been shown to be implicated in the plant stress response. The availability of more than 50 sequenced plant genomes provided the opportunity for a genome-wide identification of Nictaba-like genes in 15 species, representing members of the Fabaceae, Poaceae, Solanaceae, Musaceae, Arecaceae, Malvaceae and Rubiaceae. Additionally, phylogenetic relationships between the different species were explored. Furthermore, this study included domain organization analysis, searching for orthologous genes in the legume family and transcript profiling of the Nictaba-like lectin genes in soybean. Using a combination of BLASTp, InterPro analysis and hidden Markov models, the genomes of Medicago truncatula, Cicer arietinum, Lotus japonicus, Glycine max, Cajanus cajan, Phaseolus vulgaris, Theobroma cacao, Solanum lycopersicum, Solanum tuberosum, Coffea canephora, Oryza sativa, Zea mays, Sorghum bicolor, Musa acuminata and Elaeis guineensis were searched for Nictaba-like genes. Phylogenetic analysis was performed using RAxML and additional protein domains in the Nictaba-like sequences were identified using InterPro. Expression of the soybean Nictaba-like genes was investigated using microarray data. Nictaba-like genes were identified in all studied species and analysis of the duplication events demonstrated that both tandem and segmental duplication contributed to the expansion of the Nictaba gene family in angiosperms.
The single-domain Nictaba protein and the multi-domain F-box Nictaba architectures are ubiquitous among all analysed species and microarray analysis revealed differential expression patterns for all soybean Nictaba-like genes. Taken together, the comparative genomics data contribute to our understanding of the Nictaba-like gene family in species for which the occurrence of Nictaba domains had not yet been investigated. Given the ubiquitous nature of these genes, they have probably acquired new functions over time and are expected to take on various roles in plant development and defence.
Project description:The narrow-leafed lupin (Lupinus angustifolius) was recently considered as a legume reference species. Genetic resources have been developed, including a draft genome sequence, linkage maps, nuclear DNA libraries, and cytogenetic chromosome-specific landmarks. Here, we used a complex approach, involving DNA fingerprinting, sequencing, genetic mapping, and molecular cytogenetics, to localize and analyze L. angustifolius gene-rich regions (GRRs). A L. angustifolius genomic bacterial artificial chromosome (BAC) library was screened with short sequence repeat (SSR)-based probes. Selected BACs were fingerprinted and assembled into contigs. BAC-end sequence (BES) annotation allowed us to choose clones for sequencing, targeting GRRs. Additionally, BESs were aligned to the scaffolds of the genome sequence. The genetic map was supplemented with 35 BES-derived markers, distributed in 14 linkage groups and tagging 37 scaffolds. The identified GRRs had an average gene density of 19.6 genes/100 kb and physical-to-genetic distance ratios of 11 to 109 kb/cM. Physical and genetic mapping was supported by multi-BAC-fluorescence in situ hybridization (FISH), and five new linkage groups were assigned to the chromosomes. Syntenic links to the genome sequences of five legume species (Medicago truncatula, Glycine max, Lotus japonicus, Phaseolus vulgaris, and Cajanus cajan) were identified. The comparative mapping of the two largest lupin GRRs provides novel evidence for ancient duplications in all of the studied species. These regions are conserved among representatives of the main clades of Papilionoideae. Furthermore, despite the complex evolution of legumes, some segments of the nuclear genome were not substantially modified and retained their quasi-ancestral structures. Cytogenetic markers anchored in these regions constitute a platform for heterologous mapping of legume genomes.
Project description:The mitogen-activated protein kinase (MAPK)-mediated phosphorylation cascade is a vital component of plant cellular signalling. Despite this, the MAPK signalling cascade is less well characterized in crop legumes. To fill this void, we present here a comprehensive phylogeny of MAPK kinases (MKKs) and MAPKs identified from 16 legume species belonging to genistoid (Lupinus angustifolius), dalbergioid (Arachis spp.), phaseoloid (Glycine max, Cajanus cajan, Phaseolus vulgaris, and Vigna spp.), and galegoid (Cicer arietinum, Lotus japonicus, Medicago truncatula, Pisum sativum, Trifolium spp., and Vicia faba) clades. Using the genes of the diploid crop chickpea (C. arietinum), an exhaustive interaction analysis was performed between MKKs and MAPKs by split-ubiquitin based yeast two-hybrid (Y2H). Twenty-seven interactions of varying strengths were identified between chickpea MKKs and MAPKs. These interactions were verified in planta by bimolecular fluorescence complementation (BiFC). As a first report in plants, four intra-molecular interactions of weak strength were identified within chickpea MKKs. Additionally, two TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors of class I were identified as novel down-stream interacting partners of seven MAPKs. We propose that this highly reliable MAPK interaction network, presented here for chickpea, can be utilized as a reference for legumes and thus will help in deciphering their role in legume-specific events.
Project description:The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)(4×), were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci, which was anchored to 20 consensus LGs corresponding to the A and B genomes. Comparative genomics with the genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has a segmented synteny relationship to the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding.
Project description:BACKGROUND: On nitrogen-deprived soils, legumes can establish a symbiotic interaction with Rhizobia bacteria, leading to the formation of nitrogen-fixing root nodules. Cytokinin phytohormones are critical for triggering root cortical cell divisions at the onset of nodule initiation. Cytokinin signaling is based on a Two-Component System (TCS) phosphorelay cascade, involving successively Cytokinin-binding Histidine Kinase receptors, phosphorelay proteins shuttling between the cytoplasm and the nucleus, and Type-B Response Regulator (RRB) transcription factors activating the expression of cytokinin primary response genes. Among those, Type-A Response Regulators (RRA) exert a negative feedback on the TCS signaling. To determine whether the legume plant nodulation capacity is linked to specific features of TCS proteins, a genome-wide identification was performed in six legume genomes (Cajanus cajan, pigeonpea; Cicer arietinum, chickpea; Glycine max, soybean; Phaseolus vulgaris, common bean; Lotus japonicus; Medicago truncatula). The diversity of legume TCS proteins was compared to the one found in two non-nodulating species, Arabidopsis thaliana and Vitis vinifera, which are references for functional analyses of TCS components and phylogenetic analyses, respectively. RESULTS: A striking expansion of non-canonical RRBs was identified, notably leading to the emergence of proteins where the conserved phospho-accepting aspartate residue is replaced by a glutamate or an asparagine. M. truncatula genome-wide expression datasets additionally revealed that only a limited subset of cytokinin-related TCS genes is highly expressed in different organs, namely MtCHK1/MtCRE1, MtHPT1, and MtRRB3, suggesting that this "core" module potentially acts in most plant organs including nodules.
CONCLUSIONS: Further functional analyses are required to determine the relevance of these numerous non-canonical TCS RRBs in symbiotic nodulation, as well as of canonical MtHPT1 and MtRRB3 core signaling elements.