Project description:BACKGROUND: Plant polyphenol oxidases (PPOs) are enzymes that typically use molecular oxygen to oxidize ortho-diphenols to ortho-quinones. These commonly cause browning reactions following tissue damage, and may be important in plant defense. Some PPOs function as hydroxylases or in cross-linking reactions, but in most plants their physiological roles are not known. To better understand the importance of PPOs in the plant kingdom, we surveyed PPO gene families in 25 sequenced genomes from chlorophytes, bryophytes, lycophytes, and flowering plants. The PPO genes were then analyzed in silico for gene structure, phylogenetic relationships, and targeting signals. RESULTS: Many previously uncharacterized PPO genes were uncovered. The moss, Physcomitrella patens, contained 13 PPO genes and Selaginella moellendorffii (spike moss) and Glycine max (soybean) each had 11 genes. Populus trichocarpa (poplar) contained a highly diversified gene family with 11 PPO genes, but several flowering plants had only a single PPO gene. By contrast, no PPO-like sequences were identified in several chlorophyte (green algae) genomes or Arabidopsis (A. lyrata and A. thaliana). We found that many PPOs contained one or two introns often near the 3' terminus. Furthermore, N-terminal amino acid sequence analysis using ChloroP and TargetP 1.1 predicted that several putative PPOs are synthesized via the secretory pathway, a unique finding as most PPOs are predicted to be chloroplast proteins. Phylogenetic reconstruction of these sequences revealed that large PPO gene repertoires in some species are mostly a consequence of independent bursts of gene duplication, while the lineage leading to Arabidopsis must have lost all PPO genes. CONCLUSION: Our survey identified PPOs in gene families of varying sizes in all land plants except in the genus Arabidopsis. 
While we found variation in intron numbers and positions, overall PPO gene structure is congruent with the phylogenetic relationships based on primary sequence data. The dynamic nature of this gene family differentiates PPO from other oxidative enzymes, and is consistent with a protein important for a diversity of functions relating to environmental adaptation.
Project description:Gene duplication, expansion, and subsequent diversification are features of the evolutionary process. Duplicated genes can be lost, modified, or altered to generate novel functions over evolutionary timescales. These features make gene duplication a powerful engine of evolutionary change. In this study, we explore these features in the MADF-BESS family of transcriptional regulators. In Drosophila melanogaster, the family contains 16 similar members, each containing an N-terminal, DNA-binding MADF domain and a C-terminal, protein-interacting BESS domain. Phylogenetic analysis shows that members of the MADF-BESS family are expanded in the Drosophila lineage. Three members, which we name hinge1, hinge2, and hinge3, are required for wing development, with a critical role in the wing hinge. hinge1 is a negative regulator of Wingless expression and interacts with core wing-hinge patterning genes such as teashirt, homothorax, and jing. Double knockdowns along with heterologous rescue experiments are used to demonstrate that members of the MADF-BESS family retain function in the wing hinge, in spite of expansion and diversification for over 40 million years. The wing hinge connects the blade to the thorax and has critical roles in fluttering during flight. MADF-BESS family genes appear to retain redundant functions to shape and form elements of the wing hinge in a robust and fail-safe manner.
Project description:We developed the CLfinder-OrthNet pipeline that detects co-linearity among multiple closely related genomes, finds orthologous gene groups, and encodes the evolutionary history of each orthologue group into a representative network (OrthNet). Using a search based on network topology, we identified 1,394 OrthNets that included gene transposition-duplication (tr-d) events, out of 17,432 identified in six Brassicaceae genomes. Occurrences of tr-d shared by subsets of Brassicaceae genomes mirrored the divergence times between the genomes and their repeat contents. The majority of tr-d events resulted in truncated open reading frames (ORFs) in the duplicated loci. However, the duplicates with complete ORFs were significantly more frequent than expected from random events. These were derived from older tr-d events and had a higher chance of being expressed. We also found an enrichment of tr-d events with complete loss of intergenic sequence conservation between the original and duplicated loci. Finally, we identified tr-d events uniquely found in two extremophytes among the six Brassicaceae genomes, including tr-d of SALT TOLERANCE 32 and ZINC TRANSPORTER 3 that relate to their adaptive evolution. CLfinder-OrthNet provides a flexible toolkit to compare gene order, visualize evolutionary paths among orthologues as networks, and identify gene loci that share an evolutionary history.
Project description:Background: Papain-like cysteine proteases (PLCPs), a large group of cysteine proteases structurally related to papain, play important roles in plant development, senescence, and defense responses. Papain, the first cysteine protease whose structure was determined by X-ray crystallography, plays a crucial role in protecting papaya from herbivorous insects. Except for the four major PLCPs purified and characterized from papaya latex, the remaining PLCPs in the papaya genome are largely unknown. Results: We identified 33 PLCP genes in the papaya genome. Phylogenetic analysis clearly separated plant PLCP genes into nine subfamilies. PLCP genes are not equally distributed among the nine subfamilies, and the number of PLCPs in each subfamily does not increase or decrease proportionally among the seven selected plant species. Papaya showed clear lineage-specific gene expansion in subfamily III. Interestingly, all four major PLCPs purified from papaya latex (papain, chymopapain, glycyl endopeptidase, and caricain) were grouped into the lineage-specific expansion branch in subfamily III. Mapping PLCP genes on the chromosomes of five plant species revealed that lineage-specific expansions of PLCP genes were mostly derived from tandem duplications. We estimated the divergence time of papaya PLCP genes of subfamily III. The major duplication events leading to lineage-specific expansion of papaya PLCP genes in subfamily III were estimated at 48 MYA, 34 MYA, and 16 MYA. The gene expression patterns of the papaya PLCP genes in different tissues were assessed by transcriptome sequencing and qRT-PCR. Most of the papaya PLCP genes of subfamily III were expressed at high levels in leaf and green fruit tissues. Conclusions: Tandem duplications played the dominant role in affecting the copy number of PLCPs in plants. Significant variations in the size of the PLCP subfamilies among species may reflect genetic adaptation of plant species to different environments.
The lineage-specific expansion of papaya PLCPs of subfamily III might have been promoted by the continuous reciprocal selective effects of herbivore attack and plant defense.
Project description:In mammals, IFIT (Interferon [IFN]-induced proteins with Tetratricopeptide Repeat [TPR] motifs) family genes are involved in many cellular and viral processes, which are tightly related to the mammalian IFN response. However, little is known about non-mammalian IFIT genes. In the present study, IFIT genes are identified in the genome databases of jawed vertebrates, including the cartilaginous elephant shark, but not in those of non-vertebrates such as lancelet, sea squirt and acorn worm, suggesting that the IFIT gene family originated from a vertebrate ancestor about 450 million years ago. IFIT family genes show conserved gene structure and gene arrangements. Phylogenetic analyses reveal that this gene family has expanded through lineage-specific and species-specific gene duplication. Interestingly, the IFN gene family seems to share a common ancestor and a similar evolutionary mechanism; the functional link of IFIT genes to the IFN response has been present since the origin of both gene families, as evidenced by the finding that zebrafish IFIT genes are upregulated by fish IFNs, poly(I:C) and two transcription factors IRF3/IRF7, likely via the IFN-stimulated response elements (ISRE) within the promoters of vertebrate IFIT family genes. These coevolutionary features create a functional association between the two gene families to fulfill a common biological process, which was likely selected by viral infection during the evolution of vertebrates. Our results are helpful for understanding the evolution of the vertebrate IFN system.
Project description:Rosids are a monophyletic group that includes approximately 70,000 species in 140 families, and they are found in a variety of habitats and life forms. Many important crops such as fruit trees and legumes are rosids. The evolutionary success of this group may have been influenced by their ability to produce flavonoids, secondary metabolites that are synthesized through a branch of the phenylpropanoid pathway in which chalcone synthase is a key enzyme. In this work, we studied the evolution of the chalcone synthase gene family in 12 species belonging to the rosid clade. Our results show that the last common ancestor of the rosid clade possessed six chalcone synthase gene lineages that were differentially retained during the evolutionary history of the group. In fact, of the six gene lineages that were present in the last common ancestor, 7 species retained 2 of them, whereas the other 5 retained only one gene lineage. We also show that one of the gene lineages was disproportionately expanded in species that belong to the order Fabales (soybean, barrel medic and Lotus japonicus). Based on the available literature, we suggest that this gene lineage possesses stress-related biological functions (e.g., response to UV light, pathogen defense). We propose that the observed expansion of this clade was a result of selective pressure to increase the amount of enzymes involved in the production of phenylpropanoid pathway-derived secondary metabolites, which is consistent with the hypothesis that lineage-specific expansions fuel plant adaptation.
Project description:Background: The threespine stickleback (Gasterosteus aculeatus) has a characteristic reproductive mode; mature males build nests using a secreted glue-like protein called spiggin. Although recent studies reported multiple occurrences of genes that encode this glue-like protein spiggin in threespine and ninespine sticklebacks, it is still unclear how many genes compose the spiggin multi-gene family. Results: Genome sequence analysis of threespine stickleback showed that there are at least five spiggin genes and two pseudogenes, whereas a single spiggin homolog occurs in the genomes of other fishes. Comparative genome sequence analysis demonstrated that Muc19, a single-copy mucous gene in human and mouse, is an ortholog of spiggin. Phylogenetic and molecular evolutionary analyses of these sequences suggested that an ancestral spiggin gene originated from a member of the mucin gene family as a single gene in the common ancestor of teleosts, and gene duplications of spiggin have occurred in the stickleback lineage. There was inter-population variation in the copy number of spiggin genes and positive selection on some codons, indicating that additional gene duplication/deletion events and adaptive evolution at some amino acid sites may have occurred in each stickleback population. Conclusion: A number of spiggin genes exist in the threespine stickleback genome. Our results provide insight into the origin and dynamic evolutionary process of the spiggin multi-gene family in the threespine stickleback lineage. The dramatic evolution of genes for mucous substrates may have contributed to the generation of distinct characteristics such as "bio-glue" in vertebrates.
Project description:Ammonia-oxidising archaea of the phylum Thaumarchaeota are important organisms in the nitrogen cycle, but the mechanisms driving their radiation into diverse ecosystems remain underexplored. Here, existing thaumarchaeotal genomes are complemented with 12 genomes belonging to the previously under-sampled Nitrososphaerales to investigate the impact of lateral gene transfer (LGT), gene duplication and loss across thaumarchaeotal evolution. We reveal a major role for gene duplication in driving genome expansion subsequent to early LGT. In particular, two large LGT events are identified into Nitrososphaerales and the fate of these gene families is highly lineage-specific, being lost in some descendant lineages, but undergoing extensive duplication in others, suggesting niche-specific roles. Notably, some genes involved in carbohydrate transport or coenzyme metabolism were duplicated, likely facilitating niche specialisation in soils and sediments. Overall, our results suggest that LGT followed by gene duplication drives Nitrososphaerales evolution, highlighting a previously under-appreciated mechanism of genome expansion in archaea.
Project description:The comparative analysis of plant gene families in a phylogenetic framework has greatly accelerated due to advances in next generation sequencing. In this study, we provide an evolutionary analysis of the L-type lectin receptor kinase and L-type lectin domain proteins (L-type LecRKs and LLPs) that are considered as components in plant immunity, in the plant family Brassicaceae and related outgroups. We combine several lines of evidence provided by sequence homology, HMM-driven protein domain annotation, phylogenetic analysis, and gene synteny for large-scale identification of L-type LecRK and LLP genes within nine core-eudicot genomes. We show that both polyploidy and local duplication events (tandem duplication and gene transposition duplication) have played a major role in L-type LecRK and LLP gene family expansion in the Brassicaceae. We also find significant differences in rates of molecular evolution based on the mode of duplication. Additionally, we show that LLPs share a common evolutionary origin with L-type LecRKs and provide a consistent gene family nomenclature. Finally, we demonstrate that the largest and most diverse L-type LecRK clades are lineage-specific. Our evolutionary analyses of these plant immune components provide a framework to support future plant resistance breeding.
Project description:Background: Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes. Results: Here, using a Markov clustering algorithm-directed approach, we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes.
The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish. Conclusions: We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting that they originated from independent and continuous duplication. This is particularly true for the zebrafish genome. Further analysis of the duplicated gene sets indicated that a significant portion of duplicated genes in the zebrafish genome derived from recent, lineage-specific duplication events. Most strikingly, a subset of duplicated genes is enriched among the recently duplicated genes involved in immune or sensory response pathways. Such findings demonstrate the significance of continuous gene duplication as well as that of whole genome duplication in the course of genome evolution.