<HashMap><database>biostudies-literature</database><scores/><additional><submitter>Sharma S</submitter><funding>NLM NIH HHS</funding><funding>NIGMS NIH HHS</funding><pagination>e0152731</pagination><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/S-EPMC4822787</full_dataset_link><repository>biostudies-literature</repository><omics_type>Unknown</omics_type><volume>11(4)</volume><pubmed_abstract>All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new "C-terminome" database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3-10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com.</pubmed_abstract><journal>PloS one</journal><pubmed_title>The Functional Human C-Terminome.</pubmed_title><pmcid>PMC4822787</pmcid><funding_grant_id>R15 GM107983</funding_grant_id><funding_grant_id>R01 GM079689</funding_grant_id><funding_grant_id>R01 LM010101</funding_grant_id><pubmed_authors>Thapar V</pubmed_authors><pubmed_authors>Schiller MR</pubmed_authors><pubmed_authors>Sharma S</pubmed_authors><pubmed_authors>Hedden M</pubmed_authors><pubmed_authors>Lyon KF</pubmed_authors><pubmed_authors>Toledo O</pubmed_authors><pubmed_authors>Williams SR</pubmed_authors><pubmed_authors>Rajasekaran S</pubmed_authors><pubmed_authors>Novakovic N</pubmed_authors><pubmed_authors>Brooks SB</pubmed_authors><pubmed_authors>David RP</pubmed_authors><pubmed_authors>Limtong J</pubmed_authors><pubmed_authors>Newsome JM</pubmed_authors></additional><is_claimable>false</is_claimable><name>The Functional Human C-Terminome.</name><description>All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new "C-terminome" database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3-10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com.</description><dates><release>2016-01-01T00:00:00Z</release><publication>2016</publication><modification>2024-11-21T08:50:58.503Z</modification><creation>2019-03-26T22:57:43Z</creation></dates><accession>S-EPMC4822787</accession><cross_references><pubmed>27050421</pubmed><doi>10.1371/journal.pone.0152731</doi></cross_references></HashMap>