Dataset Information

Empirical evaluation of gene and environment interactions: methods and potential.

SUBMITTER: Prentice RL

PROVIDER: S-EPMC3156805 | biostudies-literature | 2011 Aug

REPOSITORIES: biostudies-literature

Publications

Empirical evaluation of gene and environment interactions: methods and potential.

Prentice, Ross L.

Journal of the National Cancer Institute, 2011-07-26, issue 16

PMID: 21791675

Similar Datasets

Project description:

Background: We address the problem of integratively analyzing multiple gene expression microarray datasets in order to reconstruct gene-gene interaction networks. Integrating multiple datasets is generally believed to provide increased statistical power and to lead to a better characterization of the system under study. However, the presence of systematic variation across different studies makes network reverse-engineering tasks particularly challenging. We contrast two approaches that have been frequently used in the literature for addressing systematic biases: meta-analysis methods, which first calculate appropriate statistics on each dataset and then summarize them, and data-merging methods, which directly analyze the pooled data after removing any biases. This comparative evaluation is performed on both synthetic and real data, the latter consisting of two manually curated microarray compendia comprising several E. coli and yeast studies, respectively. Furthermore, the reconstruction of the regulatory network of the transcription factor Ikaros in human Peripheral Blood Mononuclear Cells (PBMCs) is presented as a case study.

Results: The meta-analysis and data-merging methods included in our experiments provided comparable performance on both synthetic and real data. Furthermore, both approaches outperformed (a) the naïve solution of merging data together while ignoring possible biases, and (b) the results expected when only one of the available datasets is analyzed in isolation. Using correlation statistics proved to be more effective than using p-values for correctly ranking candidate interactions. The results from the PBMC case study indicate that the findings of the present study generalize to different types of network reconstruction algorithms.

Conclusions: Ignoring the systematic variations that differentiate heterogeneous studies can produce results that are statistically indistinguishable from random guessing. Meta-analysis and data-merging methods have proved equally effective in addressing this issue, and thus researchers may safely select the approach that best suits their specific application.
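
The contrast between meta-analysis, data merging, and naïve pooling described above can be illustrated with a minimal Python sketch for a single candidate gene pair. The description does not say which statistics or bias-removal procedures the study used, so everything here is an assumption made for illustration: the Fisher z-transform with (n - 3) weights stands in for the meta-analysis summary, per-study mean-centering stands in for bias removal, and the two toy studies and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def meta_correlation(datasets):
    """Meta-analysis: correlate within each study, then combine the results.

    Assumption: Fisher z-transform with (n - 3) weights as the summary rule,
    so every study needs more than 3 samples.
    """
    num = den = 0.0
    for x, y in datasets:
        # Clamp so that atanh stays finite for perfectly correlated data.
        r = max(min(pearson(x, y), 0.999999), -0.999999)
        w = len(x) - 3
        num += w * math.atanh(r)
        den += w
    return math.tanh(num / den)


def center(values):
    """Subtract the mean; a crude stand-in for study-level bias removal."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def merged_correlation(datasets):
    """Data merging: remove each study's offset, pool, then correlate."""
    pooled_x, pooled_y = [], []
    for x, y in datasets:
        pooled_x += center(x)
        pooled_y += center(y)
    return pearson(pooled_x, pooled_y)


def naive_correlation(datasets):
    """Naive pooling: concatenate the raw values and ignore study offsets."""
    pooled_x = [v for x, _ in datasets for v in x]
    pooled_y = [v for _, y in datasets for v in y]
    return pearson(pooled_x, pooled_y)


# Hypothetical expression values for one gene pair in two studies.
# Within each study the genes rise together, but the studies have
# opposite baseline offsets.
study_a = ([1, 2, 3, 4, 5], [11, 12, 13, 14, 16])
study_b = ([11, 12, 13, 14, 15], [1, 2, 3, 4, 6])
datasets = [study_a, study_b]

print("meta-analysis:", meta_correlation(datasets))
print("data merging: ", merged_correlation(datasets))
print("naive pooling:", naive_correlation(datasets))
```

In this toy setting the first two estimates recover the strong positive within-study association, while naïve pooling reports a negative correlation driven entirely by the study offsets.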
