Dataset Information


Plant Proteins Are Smaller Because They Are Encoded by Fewer Exons than Animal Proteins.

ABSTRACT: Protein size is an important biochemical feature since longer proteins can harbor more domains and therefore can display more biological functionalities than shorter proteins. We found remarkable differences in protein length, exon structure, and domain count among different phylogenetic lineages. While eukaryotic proteins have an average size of 472 amino acid residues (aa), average protein sizes in plant genomes are smaller than those of animals and fungi. Proteins unique to plants are ~81 aa shorter than plant proteins conserved among other eukaryotic lineages. The smaller average size of plant proteins could neither be explained by endosymbiosis nor subcellular compartmentation nor exon size, but rather due to exon number. Metazoan proteins are encoded on average by ~10 exons of small size [~176 nucleotides (nt)]. Streptophyta have on average only ~5.7 exons of medium size (~230 nt). Multicellular species code for large proteins by increasing the exon number, while most unicellular organisms employ rather larger exons (>400 nt). Among subcellular compartments, membrane proteins are the largest (~520 aa), whereas the smallest proteins correspond to the gene ontology group of ribosome (~240 aa). Plant genes are encoded by half the number of exons and also contain fewer domains than animal proteins on average. Interestingly, endosymbiotic proteins that migrated to the plant nucleus became larger than their cyanobacterial orthologs. We thus conclude that plants have proteins larger than bacteria but smaller than animals or fungi. Compared to the average of eukaryotic species, plants have ~34% more but ~20% smaller proteins. This suggests that photosynthetic organisms are unique and deserve therefore special attention with regard to the evolutionary forces acting on their genomes and proteomes.
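
The exon-number argument in the abstract can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not the authors' method: the function name `approx_protein_length` is hypothetical, and it assumes that every exonic nucleotide is coding (three nucleotides per residue). That assumption ignores untranslated regions and stop codons, so the results are upper-bound estimates that only show the direction of the effect, using the approximate averages quoted in the abstract.

```python
def approx_protein_length(mean_exon_count: float, mean_exon_size_nt: float) -> float:
    """Upper-bound estimate of protein length in amino acids (aa).

    Assumption (not from the source): all exonic nucleotides are coding,
    so length = exon count * exon size / 3.
    """
    return mean_exon_count * mean_exon_size_nt / 3


# Approximate averages quoted in the abstract.
lineages = {
    "Metazoa": (10, 176),         # ~10 exons of ~176 nt
    "Streptophyta": (5.7, 230),   # ~5.7 exons of ~230 nt
}

for name, (exon_count, exon_size_nt) in lineages.items():
    estimate = approx_protein_length(exon_count, exon_size_nt)
    print(f"{name}: about {estimate:.0f} aa")
```

Under these assumptions the Streptophyta estimate comes out lower than the Metazoa estimate even though the plant exons are larger, which is consistent with the abstract's claim that exon number, not exon size, accounts for the difference.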

SUBMITTER: Ramirez-Sanchez O 

PROVIDER: S-EPMC5200936 | BioStudies | 2016-01-01T00:00:00Z

REPOSITORIES: biostudies

Similar Datasets

1000-01-01 | S-EPMC148551 | BioStudies
1000-01-01 | S-EPMC2636830 | BioStudies
2001-01-01 | S-EPMC1222108 | BioStudies
2019-01-01 | S-EPMC6612939 | BioStudies
2002-01-01 | S-EPMC155282 | BioStudies
2000-01-01 | S-EPMC16942 | BioStudies
1000-01-01 | S-EPMC3296660 | BioStudies
1000-01-01 | S-EPMC4108047 | BioStudies
2003-01-01 | S-EPMC403649 | BioStudies
2020-01-01 | S-EPMC7642382 | BioStudies