Dataset Information

A proteomics sample metadata representation for multiomics integration and big data analysis.


ABSTRACT: The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.

SUBMITTER: Dai C 

PROVIDER: S-EPMC8494749 | biostudies-literature | 2021 Oct

REPOSITORIES: biostudies-literature

Publications

A proteomics sample metadata representation for multiomics integration and big data analysis.

Chengxin Dai, Anja Füllgrabe, Julianus Pfeuffer, Elizaveta M Solovyeva, Jingwen Deng, Pablo Moreno, Selvakumar Kamatchinathan, Deepti Jaiswal Kundu, Nancy George, Silvie Fexova, Björn Grüning, Melanie Christine Föll, Johannes Griss, Marc Vaudel, Enrique Audain, Marie Locard-Paulet, Michael Turewicz, Martin Eisenacher, Julian Uszkoreit, Tim Van Den Bossche, Veit Schwämmle, Henry Webel, Stefan Schulze, David Bouyssié, Savita Jayaram, Vinay Kumar Duggineni, Patroklos Samaras, Mathias Wilhelm, Meena Choi, Mingxun Wang, Oliver Kohlbacher, Alvis Brazma, Irene Papatheodorou, Nuno Bandeira, Eric W Deutsch, Juan Antonio Vizcaíno, Mingze Bai, Timo Sachsenberg, Lev I Levitsky, Yasset Perez-Riverol

Nature Communications, 2021-10-06, issue 1


Similar Datasets

S-EPMC10373112 | biostudies-literature
S-EPMC7116434 | biostudies-literature
S-EPMC6779380 | biostudies-literature
S-EPMC11284499 | biostudies-literature
S-EPMC4476132 | biostudies-literature
S-EPMC8008424 | biostudies-literature
S-EPMC10834361 | biostudies-literature
S-EPMC10068743 | biostudies-literature
S-EPMC10481254 | biostudies-literature
S-EPMC8728291 | biostudies-literature