
Dataset Information


integRATE: a desirability-based data integration framework for the prioritization of candidate genes across heterogeneous omics and its application to preterm birth.


ABSTRACT:

Background

The integration of high-quality, genome-wide analyses offers a robust approach to elucidating genetic factors involved in complex human diseases. Although several methods exist to integrate heterogeneous omics data, most biologists still select candidate genes manually, by intersecting lists of candidates from analyses of different types of omics data. Because these lists are generated by imposing hard (strict) thresholds on quantitative variables, such as P-values and fold changes, this practice increases the chance of missing potentially important candidates.

Methods

To better facilitate the unbiased integration of heterogeneous omics data collected from diverse platforms and samples, we propose a desirability function framework for identifying candidate genes with strong evidence across data types as targets for follow-up functional analysis. Our approach is targeted towards disease systems with sparse, heterogeneous omics data, so we tested it on one such pathology: spontaneous preterm birth (sPTB).
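As an illustration of the idea of a desirability function, the following minimal Python sketch maps a quantitative variable onto a score between 0 and 1 instead of applying a pass/fail cutoff. The function names, the linear shape, the cut points, and the example gene names and P-values are all assumptions made for this illustration; they are not taken from the integRATE package or from the study data.

```python
def desirability_low(value, low, high, scale=1.0):
    """Score in [0, 1] when smaller values are better (e.g. P-values).

    Hypothetical helper: values at or below `low` score 1, values at or
    above `high` score 0, and values in between decay smoothly.
    """
    if value <= low:
        return 1.0
    if value >= high:
        return 0.0
    return ((high - value) / (high - low)) ** scale


def desirability_high(value, low, high, scale=1.0):
    """Score in [0, 1] when larger values are better (e.g. |fold change|)."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return ((value - low) / (high - low)) ** scale


# Made-up P-values for three placeholder genes.
p_values = {"GENE_A": 0.001, "GENE_B": 0.049, "GENE_C": 0.051}

# Hard threshold: GENE_C is discarded although it is nearly identical to GENE_B.
hard_threshold_hits = [gene for gene, p in p_values.items() if p < 0.05]

# Desirability: every gene keeps a graded score, so near misses are retained.
soft_scores = {
    gene: desirability_low(p, low=0.0, high=0.1) for gene, p in p_values.items()
}

print(hard_threshold_hits)
print(soft_scores)
```

In this toy example the hard threshold separates GENE_B from GENE_C, while their desirability scores differ only slightly, which is the behavior a threshold-free ranking is meant to preserve.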

Results

We developed the software integRATE, which uses desirability functions to rank genes both within and across studies, identifying well-supported candidate genes according to the cumulative weight of biological evidence rather than by imposing hard thresholds on key variables. Integrating 10 sPTB omics studies identified both genes in pathways previously suspected to be involved in sPTB and novel genes never before linked to this syndrome. integRATE is available as an R package on GitHub (https://github.com/haleyeidem/integRATE).
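Ranking genes across studies by cumulative evidence can be sketched in Python as follows. This is not the integRATE API (the package itself is written in R); the function `integrate_studies`, the weighted arithmetic mean, the choice to score a gene absent from a study as 0, and all study names, gene names, and scores are assumptions made for illustration only.

```python
def integrate_studies(study_scores, weights=None):
    """Combine per-study desirability scores into one ranked gene list.

    study_scores: dict mapping study name -> {gene: desirability in [0, 1]}.
    weights: optional dict mapping study name -> non-negative weight.
    Assumption: a gene missing from a study contributes a score of 0.
    """
    if weights is None:
        weights = {study: 1.0 for study in study_scores}
    total_weight = sum(weights[study] for study in study_scores)

    # Collect every gene seen in at least one study.
    genes = set()
    for scores in study_scores.values():
        genes.update(scores)

    combined = {}
    for gene in genes:
        weighted_sum = sum(
            weights[study] * scores.get(gene, 0.0)
            for study, scores in study_scores.items()
        )
        combined[gene] = weighted_sum / total_weight

    # Highest combined score first; ties broken alphabetically.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Made-up per-study desirability scores for three placeholder genes.
studies = {
    "gwas": {"GENE_A": 0.9, "GENE_B": 0.4},
    "transcriptomics": {"GENE_A": 0.6, "GENE_B": 0.45, "GENE_C": 0.95},
    "methylation": {"GENE_B": 0.5, "GENE_C": 0.1},
}

ranking = integrate_studies(studies)
for gene, score in ranking:
    print(gene, round(score, 3))
```

Here GENE_B never achieves a top score in any single study, yet its moderate support in all three places it above GENE_C, which scores very highly in only one. An intersection of thresholded lists would not capture that ordering.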

Conclusions

Desirability-based data integration is most applicable in biological research areas where omics data are especially heterogeneous and sparse; it allows the prioritization of candidate genes that can inform more targeted downstream functional analyses.

SUBMITTER: Eidem HR 

PROVIDER: S-EPMC6245874 | biostudies-literature | 2018 Nov

REPOSITORIES: biostudies-literature


Publications

integRATE: a desirability-based data integration framework for the prioritization of candidate genes across heterogeneous omics and its application to preterm birth.

Eidem HR, Steenwyk JL, Wisecaver JH, Capra JA, Abbot P, Rokas A

BMC Medical Genomics, 2018-11-19, issue 1



Similar Datasets

| S-EPMC8514333 | biostudies-literature
| S-EPMC8853556 | biostudies-literature
| S-EPMC9034518 | biostudies-literature
| S-EPMC6434278 | biostudies-literature
| S-EPMC4657017 | biostudies-literature
| S-EPMC9061712 | biostudies-literature
| S-EPMC6010767 | biostudies-literature
| S-EPMC8696097 | biostudies-literature
| S-EPMC11602161 | biostudies-literature
| S-EPMC11229337 | biostudies-literature