Identification and characterization of insect-specific proteins by genome data analysis.
ABSTRACT: BACKGROUND: Insects constitute the vast majority of known species and are important for biodiversity, agriculture, and human health. It is likely that the successful adaptation of the Insecta clade depends on specific components in its proteome that give rise to specialized features. However, proteome determination is an intensive undertaking. Here we present results from a computational method that uses genome analysis to characterize insect and eukaryote proteomes as an approximation complementary to experimental approaches. RESULTS: Homologs common to Drosophila melanogaster, Anopheles gambiae, Bombyx mori, Tribolium castaneum, and Apis mellifera were compared to the complete genomes of three non-insect eukaryotes (opisthokonts): Homo sapiens, Caenorhabditis elegans and Saccharomyces cerevisiae. This comparison identified 154 groups of orthologous proteins in Drosophila as insect-specific homologs; 466 groups were determined to be common to eukaryotes (represented by three opisthokonts). ESTs from the hemimetabolous insect Locusta migratoria were also considered in order to approximate their corresponding genes in the insect-specific homologs. Stress and stimulus response proteins were found to constitute a higher fraction in the insect-specific homologs than in the homologs common to eukaryotes. CONCLUSION: The significant representation of stress response and stimulus response proteins in proteins determined to be insect-specific, along with specific cuticle and pheromone/odorant binding proteins, suggests that communication and adaptation to environments may distinguish insect evolution relative to other eukaryotes. The tendency for low Ka/Ks ratios in the insect-specific protein set suggests purifying selection pressure. The generally larger number of paralogs in the insect-specific proteins may indicate adaptation to environmental changes.
Several members of our insect-specific protein set have also been identified independently in experiments reported in the literature, supporting the accuracy of our approach.
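The comparison described in this abstract can be pictured as a set operation over a presence/absence table. The sketch below is only an illustration under stated assumptions: the group names and the toy table are hypothetical, and the rule shown (present in all five insects and in none, or all, of the three non-insect opisthokonts) is one plausible reading of the abstract, not the authors' actual pipeline or thresholds.

```python
# Species sets named in the abstract (abbreviated genus names).
INSECTS = {"D. melanogaster", "A. gambiae", "B. mori", "T. castaneum", "A. mellifera"}
NON_INSECTS = {"H. sapiens", "C. elegans", "S. cerevisiae"}


def classify(species_with_homolog):
    """Label one ortholog group from the set of species that have a member.

    Assumed rule (not taken from the paper): a group must be present in
    all five insects; it is "insect-specific" if absent from every
    non-insect genome and "common to eukaryotes" if present in all three.
    """
    present = set(species_with_homolog)
    if not INSECTS <= present:
        return "other"
    outside = present & NON_INSECTS
    if not outside:
        return "insect-specific"
    if outside == NON_INSECTS:
        return "common to eukaryotes"
    return "other"


# Hypothetical toy table: ortholog group -> species with a detected homolog.
groups = {
    "group_A": INSECTS,
    "group_B": INSECTS | NON_INSECTS,
    "group_C": INSECTS | {"H. sapiens"},
    "group_D": {"D. melanogaster", "A. gambiae"},
}

labels = {name: classify(species) for name, species in groups.items()}
for name, label in sorted(labels.items()):
    print(name, label)
```

In practice the presence/absence calls would come from a homology search with explicit score and coverage cutoffs, which this sketch leaves out.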
Project description:Comparative studies of the mitochondrial proteome have identified a conserved core of proteins descended from the α-proteobacterial endosymbiont that gave rise to the mitochondrion and was the source of the mitochondrial genome in contemporary eukaryotes. A surprising result of phylogenetic analyses is the relatively small proportion (10-20%) of the mitochondrial proteome displaying a clear α-proteobacterial ancestry. A large fraction of mitochondrial proteins typically has detectable homologs only in other eukaryotes and is presumed to represent proteins that emerged specifically within eukaryotes. A further significant fraction of the mitochondrial proteome consists of proteins with homologs in prokaryotes, but without a robust phylogenetic signal affiliating them with specific prokaryotic lineages. The presumptive evolutionary source of these proteins is quite different in contending models of mitochondrial origin.
Project description:BACKGROUND: The establishment of the nuclear membrane resulted in the physical separation of transcription and translation, and presented early eukaryotes with a formidable challenge: how to shuttle RNA from the nucleus to the locus of protein synthesis. In prokaryotes, mRNA is translated as it is being synthesized, whereas in eukaryotes mRNA is synthesized and processed in the nucleus, and it is then exported to the cytoplasm. In metazoa and fungi, the different RNA species are exported from the nucleus by specialized pathways. For example, tRNA is exported by exportin-t in a RanGTP-dependent fashion. By contrast, mRNAs are associated with ribonucleoproteins (RNPs) and exported by an essential shuttling complex (TAP-p15 in human, Mex67-mtr2 in yeast) that transports them through the nuclear pore. The different RNA export pathways appear to be well conserved among members of Opisthokonta, the eukaryotic supergroup that includes Fungi and Metazoa. However, it is not known whether RNA export in the other eukaryotic supergroups follows the same export routes as in opisthokonts. METHODS: Our objective was to reconstruct the evolutionary history of the different RNA export pathways across eukaryotes. To do so, we screened an array of eukaryotic genomes for the presence of homologs of the proteins involved in RNA export in Metazoa and Fungi, using human and yeast proteins as queries. RESULTS: Our genomic comparisons indicate that the basic components of the RanGTP-dependent RNA pathways are conserved across eukaryotes, and thus we infer that these are traceable to the last eukaryotic common ancestor (LECA). On the other hand, several of the proteins involved in RanGTP-independent mRNA export pathways are less conserved, which would suggest that they represent innovations that appeared later in the evolution of eukaryotes.
CONCLUSIONS: Our analyses suggest that the LECA possessed the basic components of the different RNA export mechanisms found today in opisthokonts, and that these mechanisms became more specialized throughout eukaryotic evolution.
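The inference step in this abstract (homologs found across eukaryotes are traced back to the LECA) can be sketched as a count of supergroups in which a query protein has a detectable homolog. Everything specific here is an assumption made for illustration: the protein names and hit lists are hypothetical toy data, the species-to-supergroup table is a small hand-picked subset, and the threshold of three supergroups is arbitrary rather than the authors' criterion.

```python
# Small illustrative subset of species and their eukaryotic supergroups.
SUPERGROUP = {
    "Homo sapiens": "Opisthokonta",
    "Saccharomyces cerevisiae": "Opisthokonta",
    "Dictyostelium discoideum": "Amoebozoa",
    "Arabidopsis thaliana": "Archaeplastida",
    "Trypanosoma brucei": "Excavata",
}


def supergroups_with_homolog(species_hits):
    """Return the set of supergroups containing at least one species with a hit."""
    return {SUPERGROUP[s] for s in species_hits if s in SUPERGROUP}


def inferred_in_leca(species_hits, min_supergroups=3):
    """Crude call: treat a protein as ancestral if its homologs span
    at least `min_supergroups` supergroups (an assumed threshold)."""
    return len(supergroups_with_homolog(species_hits)) >= min_supergroups


# Hypothetical search results: query protein -> species with a detected homolog.
hits = {
    "protein_X": ["Homo sapiens", "Dictyostelium discoideum",
                  "Arabidopsis thaliana", "Trypanosoma brucei"],
    "protein_Y": ["Homo sapiens", "Saccharomyces cerevisiae"],
}

calls = {protein: inferred_in_leca(species) for protein, species in hits.items()}
for protein, ancestral in calls.items():
    print(protein, "LECA candidate" if ancestral else "possible later innovation")
```

A simple count like this ignores lateral gene transfer and differential loss, so a real analysis would check each candidate against a phylogeny before drawing conclusions.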
Project description:Endosomal sorting complexes required for transport (ESCRTs) are heteromeric protein complexes required for multivesicular body (MVB) morphogenesis. ESCRTs I, II, III and III-associated are ubiquitous in eukaryotes and presumably ancient in origin. ESCRT 0 recruits cargo to the MVB and appears to be opisthokont-specific, bringing into question aspects of the current model of ESCRT mechanism. One caveat to the restricted distribution of ESCRT 0 was the previous limited availability of amoebozoan genomes, the supergroup closest to opisthokonts. Here, we significantly expand the sampling of ESCRTs in Amoebozoa. Our electron micrographic and bioinformatics evidence confirm the presence of MVBs in the amoeboflagellate Breviata anathema. Searches of genomic databases of amoebozoans confirm the ubiquitous nature of ESCRTs I-III-associated and the restriction of ESCRT 0 to opisthokonts. Recently, an alternate ESCRT 0 complex, centering on Tom1 proteins, has been proposed. We determine the distribution of Tom1 family proteins across eukaryotes and show that the Tom1, Tom1L1 and Tom1L2 proteins are a vertebrate-specific expansion of the single Tom1 family ancestor, which has indeed been identified in at least one member of each of the major eukaryotic supergroups. This implies a more widely conserved and ancient role for the Tom1 family in endocytosis than previously suspected.
Project description:The haloarchaeal-type tyrosyl tRNA synthetase (tyrRS) has previously been proposed to be a molecular synapomorphy of the opisthokonts. To re-evaluate this we have performed a taxon-wide genomic survey of tyrRS in eukaryotes and prokaryotes. Our phylogenetic trees group eukaryotes with archaea, with all opisthokonts sharing the haloarchaeal-type tyrRS. However, this type of tyrRS is not exclusive to opisthokonts, since it is also encoded by two amoebozoans. Whether this is a consequence of lateral gene transfer or lineage sorting remains unresolved, but in any case haloarchaeal-type tyrRS is not a synapomorphy of opisthokonts. This demonstrates that molecular markers should be re-evaluated once better taxon sampling becomes available.
Project description:BACKGROUND: Coronins belong to the superfamily of the eukaryotic-specific WD40-repeat proteins and play a role in several actin-dependent processes like cytokinesis, cell motility, phagocytosis, and vesicular trafficking. Two major types of coronins are known: First, the short coronins consisting of an N-terminal coronin domain, a unique region and a short coiled-coil region, and secondly the tandem coronins comprising two coronin domains. RESULTS: 723 coronin proteins from 358 species have been identified by analyzing the whole-genome assemblies of all available sequenced eukaryotes (March 2011). The organisms analyzed represent most eukaryotic kingdoms but also cover every taxon several times to provide a better statistical sampling. The phylogenetic tree of the coronin domains based on the Bayesian method is in accordance with the most recent grouping of the major kingdoms of the eukaryotes and also with the grouping of more recently separated branches. Based on this "holistic" approach the coronins group into four classes: class-1 (Type I) and class-2 (Type II) are metazoan/choanoflagellate specific classes, class-3 contains the tandem-coronins (Type III), and the new class-4 represents the coronins fused to villin (Type IV). Short coronins from non-metazoans are equally related to class-1 and class-2 coronins and thus remain unclassified. CONCLUSIONS: The coronin class distribution suggests that the last common eukaryotic ancestor possessed a single and a tandem-coronin, and most probably a class-4 coronin of which homologs have been identified in Excavata and Opisthokonts although most of these species subsequently lost the class-4 homolog. The most ancient short coronin already contained the trimerization motif in the coiled-coil domain.
Project description:The organelle paralogy hypothesis is one model for the acquisition of nonendosymbiotic organelles, generated from molecular evolutionary analyses of proteins encoding specificity in the membrane traffic system. GTPase activating proteins (GAPs) for the ADP-ribosylation factor (Arf) GTPases are additional regulators of the kinetics and fidelity of membrane traffic. Here we describe molecular evolutionary analyses of the Arf GAP protein family. Of the 10 subfamilies previously defined in humans, we find that 5 were likely present in the last eukaryotic common ancestor. Of the 3 most recently derived subfamilies, 1 was likely present in the ancestor of opisthokonts (animals and fungi) and apusomonads (flagellates classified as the sister lineage to opisthokonts), while 2 arose in the holozoan lineage. We also report a putative novel ancient subfamily (ArfGAPC2), present in diverse eukaryotes but frequently lost, including in the opisthokonts. Surprisingly few ancient domains accompanying the ArfGAP domain were identified, in marked contrast to the extensively decorated human Arf GAPs. Phylogenetic analyses of the subfamilies reveal patterns of single and multiple gene duplications specific to the Holozoa, to some degree mirroring evolution of Arf GAP targets, the Arfs. Conservation, and lack thereof, of various residues in the ArfGAP structure provide contextualization of previously identified functional amino acids and their application to Arf GAP biology in general. Overall, our results yield insights into current Arf GAP biology, reveal complexity in the ancient eukaryotic ancestor and integrate the Arf GAP family into a proposed mechanism for the evolution of nonendosymbiotic organelles.
Project description:Given the large number of RNA-binding proteins and regulatory RNAs within genomes, posttranscriptional regulation may be an underappreciated aspect of cis-regulatory evolution. Here, we focus on nematode germ cells, which are known to rely heavily upon translational control to regulate meiosis and gametogenesis. GLD-1 belongs to the STAR-domain family of RNA-binding proteins, conserved throughout eukaryotes, and functions in Caenorhabditis elegans as a germline-specific translational repressor. A phylogenetic analysis across opisthokonts shows that GLD-1 is most closely related to Drosophila How and deuterostome Quaking, both implicated in alternative splicing. We identify messenger RNAs associated with C. briggsae GLD-1 on a genome-wide scale and provide evidence that many participate in aspects of germline development. By comparing our results with published C. elegans GLD-1 targets, we detect nearly 100 that are conserved between the two species. We also detected several hundred Cbr-GLD-1 targets whose homologs have not been reported to be associated with C. elegans GLD-1 in either of two independent studies. Low expression in C. elegans may explain the failure to detect most of them, but a highly expressed subset are strong candidates for Cbr-GLD-1-specific targets. We examine GLD-1-binding motifs among targets conserved in C. elegans and C. briggsae and find that most, but not all, display evidence of shared ancestral binding sites. Our work illustrates both the conservative and the dynamic character of evolution at the posttranscriptional level of gene regulation, even between congeners.
Project description:The nuclear lamina is a protein meshwork associated with the inner side of the nuclear envelope contributing structural, signalling and regulatory functions. Here, I report on the evolution of an important component of the lamina, the lamin intermediate filament proteins, across the eukaryotic tree of life. The lamins show a variety of protein domain and sequence motif architectures beyond the classical α-helical rod, nuclear localisation signal, immunoglobulin domain and CaaX motif organisation, suggesting extension and adaptation of functions in many species. I identified lamin genes not only in metazoa and Amoebozoa as previously described, but also in other opisthokonts including Ichthyosporea and choanoflagellates, in oomycetes, a sub-family of Stramenopiles, and in Rhizaria, implying that they must have been present very early in eukaryotic evolution if not even the last common ancestor of all extant eukaryotes. These data considerably extend the current perception of lamin evolution and have important implications with regard to the evolution of the nuclear envelope.
Project description:The Kinetoplastida are flagellated protozoa evolutionarily distant and divergent from yeast and humans. Kinetoplastida include trypanosomatids, and a number of important pathogens. Trypanosoma brucei, Trypanosoma cruzi and Leishmania spp. inflict significant morbidity and mortality on humans and livestock as the etiological agents of human African trypanosomiasis, Chagas' disease and leishmaniasis respectively. For all of these organisms, intracellular trafficking is vital for maintenance of the host-pathogen interface, modulation/evasion of host immune system responses and nutrient uptake. Soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNAREs) are critical components of the intracellular trafficking machinery in eukaryotes, mediating membrane fusion and contributing to organelle specificity. We asked how the SNARE complement evolved across the trypanosomatids. An in silico search of the predicted proteomes of T. b. brucei and T. cruzi was used to identify candidate SNARE sequences. Phylogenetic analysis, including comparisons with yeast and human SNAREs, allowed assignment of trypanosomatid SNAREs to the Q or R subclass, as well as identification of several SNAREs orthologous with those of opisthokonts. Only limited variation in number and identity of SNAREs was found, with Leishmania major having 27 and T. brucei 26, suggesting a stable SNARE complement post-speciation. Expression analysis of T. brucei SNAREs revealed significant differential expression between mammalian and insect infective forms, especially within R and Qb-SNARE subclasses, suggesting possible roles in adaptation to different environments. For trypanosome SNAREs with clear orthologs in opisthokonts, the subcellular localization of TbVAMP7C is endosomal while both TbSyn5 and TbSyn16B are at the Golgi complex, which suggests conservation of localization and possibly also function.
Despite highly distinct lifestyles, the complement of trypanosomatid SNAREs is quite stable between the three pathogenic lineages, suggesting establishment in the last common ancestor of trypanosomes and Leishmania. Developmental changes to SNARE mRNA levels between bloodstream and procyclic life stages suggest that trypanosomes modulate SNARE functions via expression. Finally, the locations of some conserved SNAREs have been retained across the eukaryotic lineage.
Project description:BACKGROUND: Bacterial penicillin-binding proteins and beta-lactamases (PBP-betaLs) constitute a large family of serine proteases that perform essential functions in the synthesis and maintenance of peptidoglycan. Intriguingly, genes encoding PBP-betaL homologs occur in many metazoan genomes including humans. The emerging role of LACTB, a mammalian mitochondrial PBP-betaL homolog, in metabolic signaling prompted us to investigate the evolutionary history of metazoan PBP-betaL proteins. RESULTS: Metazoan PBP-betaL homologs including LACTB share unique structural features with bacterial class B low molecular weight penicillin-binding proteins. The amino acid residues necessary for enzymatic activity in bacterial PBP-betaL proteins, including the catalytic serine residue, are conserved in all metazoan homologs. Phylogenetic analysis indicated that metazoan PBP-betaL homologs comprise four alloparalogous protein lineages that derive from alpha-proteobacteria. CONCLUSION: While most components of the peptidoglycan synthesis machinery were discarded by early eukaryotes, a few PBP-betaL proteins were conserved and are found in metazoans including humans. Metazoan PBP-betaL homologs are active-site-serine enzymes that probably have distinct functions in the metabolic circuitry. We hypothesize that PBP-betaL proteins in the early eukaryotic cell enabled the degradation of peptidoglycan from ingested bacteria, thereby maximizing the yield of nutrients and streamlining the cell for effective phagocytotic feeding.