Proteomics

Dataset Information


Enhancing Peptide Identification in Metaproteomics through Curriculum Learning in Deep Learning


ABSTRACT: Metaproteomics offers a powerful window into the active functions of microbial communities, but accurately identifying peptides remains challenging because protein databases derived from metagenomes are large and incomplete. These databases often contain vastly more sequences than those from single organisms, creating a computational bottleneck in peptide-spectrum match (PSM) filtering. Here we present WinnowNet, a deep learning-based method for PSM filtering, available in two versions: one using transformers and the other using convolutional neural networks. Both variants are designed to handle the unordered nature of PSM data and are trained with a curriculum learning strategy that moves from simple to complex examples. At equivalent false discovery rates, WinnowNet consistently achieves more true identifications than leading tools, including Percolator, MS²Rescore, and DeepFilter, and it outperforms the filters integrated into popular analysis pipelines. It also uncovers more gut microbiome biomarkers related to diet and health, highlighting its potential to support advances in personalized medicine.

INSTRUMENT(S):

ORGANISM(S): Soil Metagenome, Human Gut Metagenome, Marine Metagenome

SUBMITTER: Shichao Feng  

LAB HEAD: Xuan Guo

PROVIDER: PXD067277 | Pride | 2025-10-20

REPOSITORIES: Pride

Dataset's files

Source:
Marine_shuffled.fasta Fasta
Mock_Comm_RefDB_V3_shuffled.fasta Fasta
OSU_D2_FASP_Elite_02262014_01.pepXML Pepxml
OSU_D2_FASP_Elite_02262014_01.raw Raw
OSU_D2_FASP_Elite_02262014_01.txt Txt
Showing files 1 - 5 of 160

Publications

Enhancing peptide identification in metaproteomics through curriculum learning in deep learning.

Feng S, Zhang B, Wang H, Xiong Y, Tian A, Yuan X, Pan C, Guo X

Nature Communications 2025-10-08 (1)



Similar Datasets

2022-09-13 | PXD018996 | Pride
| PRJEB26278 | ENA
2025-02-11 | PXD056560 | Pride
2024-06-14 | PXD027675 | Pride
2016-01-05 | PXD002857 | Pride
2021-05-11 | MSV000087402 | MassIVE
2022-05-26 | MTBLS2841 | MetaboLights
2021-04-26 | PXD021398 | Pride
2025-01-29 | E-MTAB-14786 | biostudies-arrayexpress
2023-07-05 | E-MTAB-13124 | biostudies-arrayexpress