Dataset Information

Power and sample size estimation in microarray studies.

ABSTRACT:

Background

Before conducting a microarray experiment, one must determine the number of arrays required to have adequate power to identify differentially expressed genes. This paper discusses crucial issues in the problem formulation, the parameter specifications, and the approaches commonly proposed for sample size estimation in microarray experiments. Common methods formulate the problem as the minimum sample size necessary to achieve a specified sensitivity (the proportion of truly differentially expressed genes that are detected) on average, at a specified false discovery rate (FDR) level and a specified expected proportion (pi1) of truly differentially expressed genes in the array. Unfortunately, the probability of achieving the specified sensitivity under such a formulation can be low. We formulate the sample size problem as the number of arrays needed to achieve a specified sensitivity with 95% probability at the specified significance level. A permutation method that uses a small pilot dataset to estimate sample size is proposed. This method accounts for correlation and effect size heterogeneity among genes.
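
The two formulations contrasted above can be written side by side. The notation here is assumed for illustration and does not come from the source: m_1 is the number of truly differentially expressed genes, S_n is the (random) number of them detected with n arrays per group, and lambda_0 is the specified sensitivity. The error-rate control (FDR or significance level) is taken as fixed and left out of the notation.

```latex
\begin{align*}
n_{\text{avg}} &= \min\Bigl\{\, n : \mathbb{E}\!\left[ S_n / m_1 \right] \ge \lambda_0 \Bigr\}, \\
n_{95} &= \min\Bigl\{\, n : \Pr\!\left( S_n / m_1 \ge \lambda_0 \right) \ge 0.95 \Bigr\}.
\end{align*}
```

The first criterion constrains only the mean of the achieved sensitivity S_n / m_1, so a sizeable share of experiments run at n_avg may still fall short of lambda_0. The second criterion constrains the lower tail of the same quantity directly.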

Results

A sample size estimate based on the common formulation, which achieves the desired sensitivity on average, can be calculated using a univariate method without taking the correlation among genes into consideration. This formulation of the sample size problem is inadequate because the probability of achieving the specified sensitivity can be lower than 50%. In contrast, the sample size calculated by the proposed permutation method ensures that at least the desired sensitivity is achieved with 95% probability. The method is shown to perform well on a real example dataset using a small pilot dataset with 4-6 samples per group.
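
The gap between the "on average" criterion and the 95%-probability criterion can be illustrated with a small simulation. The sketch below is not the permutation method proposed in the paper. It is a toy model under stated assumptions: every truly differentially expressed gene has the same standardized effect size, each gene is tested with a two-sided z-test, and a single shared factor induces a common pairwise correlation `rho` between test statistics. All function names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # standard normal, used for critical values and power


def per_gene_power(n, effect_size, alpha):
    """Approximate power of a two-sided two-sample z-test, n arrays per group."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n / 2) ** 0.5  # mean of the z-statistic under the alternative
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def simulate_sensitivity(n, m1, effect_size, alpha, rho, n_sim, rng):
    """Return n_sim simulated sensitivities (detected fraction of m1 DE genes).

    Assumption: a shared factor gives every pair of z-statistics
    correlation rho; rho = 0 corresponds to independent genes.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n / 2) ** 0.5
    w_common, w_own = rho ** 0.5, (1 - rho) ** 0.5
    sensitivities = []
    for _ in range(n_sim):
        common = rng.gauss(0, 1)  # one draw shared by all genes in this experiment
        detected = sum(
            abs(shift + w_common * common + w_own * rng.gauss(0, 1)) > z_crit
            for _ in range(m1)
        )
        sensitivities.append(detected / m1)
    return sensitivities


def required_sample_sizes(target, m1=50, effect_size=1.0, alpha=0.001,
                          rho=0.0, n_sim=500, max_n=200, seed=1):
    """Smallest n meeting the average criterion and the 95%-probability criterion."""
    rng = random.Random(seed)
    n_average = n_probability = None
    for n in range(2, max_n + 1):
        sens = simulate_sensitivity(n, m1, effect_size, alpha, rho, n_sim, rng)
        # Average criterion: mean sensitivity reaches the target.
        if n_average is None and mean(sens) >= target:
            n_average = n
        # Probability criterion: at least 95% of simulated experiments reach it.
        if sum(s >= target for s in sens) / n_sim >= 0.95:
            n_probability = n
            break
    return n_average, n_probability


if __name__ == "__main__":
    for rho in (0.0, 0.3):
        n_avg, n_prob = required_sample_sizes(target=0.8, rho=rho)
        print(f"rho={rho}: n for average criterion={n_avg}, "
              f"n for 95% criterion={n_prob}")
```

In this toy model the average criterion depends only on the per-gene power, so it is insensitive to `rho`, whereas the probability criterion depends on the spread of the achieved sensitivity and therefore on the correlation. A pilot-data permutation approach, as described in the abstract, would estimate that spread from real data instead of assuming a correlation structure.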

Conclusions

We recommend that the sample size problem be formulated to detect a specified proportion of differentially expressed genes with 95% probability. This formulation ensures that the desired proportion of true positives is found with high probability. The proposed permutation method takes the correlation structure and effect size heterogeneity into consideration and works well using only a small pilot dataset.

SUBMITTER: Lin WJ

PROVIDER: S-EPMC2837028 | biostudies-literature | 2010 Jan

REPOSITORIES: biostudies-literature

Publications

Power and sample size estimation in microarray studies.

Lin Wei-Jiun, Hsueh Huey-Miin, Chen James J

BMC Bioinformatics, 2010 Jan 25

PMID: 20100337

Similar Datasets

Power analysis and sample size estimation for sequence-based association studies.
| S-EPMC4133582 | biostudies-literature

pwrBRIDGE: a user-friendly web application for power and sample size estimation in batch-confounded microarray studies with dependent samples.
| S-EPMC9550194 | biostudies-literature

Simple Power and Sample Size Estimation for Non-Randomized Longitudinal Difference in Differences Studies.
| S-EPMC6663085 | biostudies-literature

Power analysis and sample size estimation for RNA-Seq differential expression.
| S-EPMC4201821 | biostudies-literature

Power and sample size for observational studies of point exposure effects.
| S-EPMC8141060 | biostudies-literature

Designing image segmentation studies: Statistical power, sample size and reference standard quality.
| S-EPMC5666910 | biostudies-literature

Power and sample size estimation for epigenome-wide association scans to detect differential DNA methylation.
| S-EPMC4588864 | biostudies-literature

AdmixPower: Statistical Power and Sample Size Estimation for Mapping Genetic Loci in Admixed Populations.
| S-EPMC5676228 | biostudies-literature

Empirical power and sample size calculations for cluster-randomized and cluster-randomized crossover studies.
| S-EPMC3338707 | biostudies-literature

Estimation after blinded sample size reassessment.
| S-EPMC6047306 | biostudies-literature