Dataset Information


Identification and characterization of insect-specific proteins by genome data analysis.

ABSTRACT: BACKGROUND: Insects constitute the vast majority of known species and are important for biodiversity, agriculture, and human health. It is likely that the successful adaptation of the Insecta clade depends on specific components in its proteome that give rise to specialized features. However, proteome determination is an intensive undertaking. Here we present results from a computational method that uses genome analysis to characterize insect and eukaryote proteomes as an approximation complementary to experimental approaches.

RESULTS: Homologs common to Drosophila melanogaster, Anopheles gambiae, Bombyx mori, Tribolium castaneum, and Apis mellifera were compared to the complete genomes of three non-insect eukaryotes (opisthokonts): Homo sapiens, Caenorhabditis elegans, and Saccharomyces cerevisiae. This comparison identified 154 groups of orthologous proteins in Drosophila as insect-specific homologs; 466 groups were determined to be common to eukaryotes (represented by the three opisthokonts). ESTs from the hemimetabolous insect Locusta migratoria were also considered in order to approximate their corresponding genes in the insect-specific homologs. Stress and stimulus response proteins were found to constitute a higher fraction of the insect-specific homologs than of the homologs common to eukaryotes.

CONCLUSION: The significant representation of stress response and stimulus response proteins among the proteins determined to be insect-specific, along with specific cuticle and pheromone/odorant binding proteins, suggests that communication and adaptation to environments may distinguish insect evolution relative to other eukaryotes. The tendency for low Ka/Ks ratios in the insect-specific protein set suggests purifying selection pressure. The generally larger number of paralogs in the insect-specific proteins may indicate adaptation to environmental changes. Instances of proteins in our insect-specific set have also been identified through experiments reported in the literature, supporting the accuracy of our approach.


PROVIDER: S-EPMC1852559 | BioStudies | 2007-01-01T00:00:00Z

REPOSITORIES: biostudies
