
Dataset Information


Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd.


ABSTRACT: Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.

SUBMITTER: Wang Z 

PROVIDER: S-EPMC5052684 | biostudies-literature | 2016 Sep

REPOSITORIES: biostudies-literature


Publications

Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd.

Wang Zichen, Monteiro Caroline D, Jagodnik Kathleen M, Fernandez Nicolas F, Gundersen Gregory W, Rouillard Andrew D, Jenkins Sherry L, Feldmann Axel S, Hu Kevin S, McDermott Michael G, Duan Qiaonan, Clark Neil R, Jones Matthew R, Kou Yan, Goff Troy, Woodland Holly, Amaral Fabio M R, Szeto Gregory L, Fuchs Oliver, Schüssler-Fiorenza Rose Sophia M, Sharma Shvetank, Schwartz Uwe, Bausela Xabier Bengoetxea, Szymkiewicz Maciej, Maroulis Vasileios, Salykin Anton, Barra Carolina M, Kruth Candice D, Bongio Nicholas J, Mathur Vaibhav, Todoric Radmila D, Rubin Udi E, Malatras Apostolos, Fulp Carl T, Galindo John A, Motiejunaite Ruta, Jüschke Christoph, Dishuck Philip C, Lahl Katharina, Jafari Mohieddin, Aibar Sara, Zaravinos Apostolos, Steenhuizen Linda H, Allison Lindsey R, Gamallo Pablo, de Andres Segura Fernando, Dae Devlin Tyler, Pérez-García Vicente, Ma'ayan Avi

Nature Communications, 2016-09-26



Similar Datasets

S-EPMC4944384 | biostudies-literature
S-EPMC8728209 | biostudies-literature
S-EPMC1619900 | biostudies-literature
S-EPMC6333964 | biostudies-literature
S-EPMC1619899 | biostudies-other
S-EPMC5604135 | biostudies-literature
S-EPMC5928629 | biostudies-literature
S-EPMC5643580 | biostudies-literature
S-EPMC7336680 | biostudies-literature
S-EPMC8061458 | biostudies-literature