Proteomics

Dataset Information

0

Phosphorylation in the Plasmodium falciparum proteome: A meta-analysis of publicly available data sets


ABSTRACT: Malaria is a deadly disease caused by Apicomplexan parasites of the Plasmodium genus. Several species of the Plasmodium genus are known to be infectious to human, of which P. falciparum is the deadliest. Post-translational modifications (PTMs) of proteins coordinate cell signalling and hence, regulate many biological processes in P. falciparum homeostasis and host infection, of which the most highly studied is phosphorylation. Phosphosites on proteins can be identified by tandem mass spectrometry (MS) performed on enriched samples (phosphoproteomics), followed by downstream computational analyses. We have performed a large-scale meta-analysis of 11 publicly available phosphoproteomics datasets, to build a comprehensive atlas of phosphosites in the P. falciparum proteome, using robust pipelines aimed at strict control of false identifications. We identified a total of 28,495 phosphorylated sites on P. falciparum proteins at 5% false localisation rate (FLR) and, of those, 18,100 at 1% FLR. We identified significant sequence motifs, likely indicative of different groups of kinases, responsible for different groups of phosphosites. Conservation analysis identified clusters of phosphoproteins that are highly conserved, and others that are evolving faster within the Plasmodium genus, and implicated in different pathways. We also explored the structural context of phosphosites, identifying a strong enrichment for phosphosites on fast evolving (low conservation) intrinsically disordered regions (IDRs) of proteins. In other species, IDRs have been shown to have an important role in modulating protein-protein interactions, particularly in signalling, and thus warranting further study for their roles in host-pathogen interactions. All data has made available via UniProt, PRIDE and PeptideAtlas, with visualisation interfaces for exploring phosphosites in the context of other data on Plasmodium proteins.We have re-analysed publicly available mass spectrometry (MS) data sets enriched for phosphopeptides from Asian rice (Oryza sativa). In total we have identified, 15522 phosphosites on Serine, Threonine and Tyrosine residues on rice proteins. The data has been loaded into UniProtKB, enabling researchers to visualise the sites alongside other stored data on rice proteins, including structural models from AlphaFold2, and into PeptideAtlas, enabling visualisation of the source evidence for each site, including scores and source mass spectra. We identified sequence motifs for phosphosites, and link motifs to enrichment of different biological processes, indicating different downstream regulation caused by different kinase groups. We cross-referenced phosphosites against single amino acid variation (SAAV) data sourced from the rice 3000 genomes data, to identify SAAVs within or proximal to phosphosites that could cause loss of a particular site in a given rice variety. The data was further clustered to identify groups of sites with similar patterns across rice family groups, allowing us to identify sites highly conserved in Japonica, but mostly absent in, for example, Aus type rice varieties - known to have different responses to drought. These resources can assist rice researchers to discover alleles with significantly different functional effects across rice varieties.

INSTRUMENT(S): LTQ Orbitrap Elite, Q Exactive

ORGANISM(S): Plasmodium Falciparum

SUBMITTER: Yasset Perez-Riverol  

LAB HEAD: Andrew R Jones

PROVIDER: PXD046874 | Pride | 2023-11-22

REPOSITORIES: Pride

Similar Datasets

2023-11-08 | PXD046188 | Pride
2016-02-15 | GSE35859 | GEO
2016-02-15 | E-GEOD-35859 | biostudies-arrayexpress
2015-09-23 | E-GEOD-63369 | biostudies-arrayexpress
2010-10-28 | E-GEOD-23865 | biostudies-arrayexpress
2019-10-01 | GSE113582 | GEO
2016-02-15 | E-GEOD-35860 | biostudies-arrayexpress
2015-09-23 | GSE63369 | GEO
2016-02-15 | GSE35860 | GEO
2016-02-15 | E-GEOD-35858 | biostudies-arrayexpress