Dataset Information

Fast randomization of large genomic datasets while preserving alteration counts.


ABSTRACT:

Motivation

Studying combinatorial patterns in cancer genomic datasets has recently emerged as a tool for identifying novel cancer driver networks. Approaches have been devised to quantify, for example, the tendency of a set of genes to be mutated in a 'mutually exclusive' manner. The significance of the proposed metrics is usually evaluated by computing P-values under appropriate null models. To this end, a Monte Carlo method (the switching-algorithm) is used to sample simulated datasets under a null model that preserves patient- and gene-wise mutation rates. In this method, a genomic dataset is represented as a bipartite network, to which Markov chain updates (switching-steps) are applied. These steps modify the network topology, and a minimal number of them must be executed to draw simulated datasets independently under the null model. This number has previously been deduced empirically to be a linear function of the total number of variants, making this process computationally expensive.
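The switching-step described above can be illustrated with a minimal Python sketch. This is not the BiRewire implementation: the function name `switching_steps` and the list-of-lists matrix representation are assumptions made for illustration, and the number of steps is left to the caller because the abstract does not state the bound. Rows stand for genes and columns for patients (or vice versa); each step picks two edges of the bipartite network and swaps their endpoints when doing so creates no duplicate edge, which leaves every row sum and column sum unchanged.

```python
import random


def switching_steps(matrix, n_steps, seed=None):
    """Apply n_steps attempted switching-steps to a binary matrix.

    Hypothetical helper for illustration only. The matrix is a list of
    equal-length lists containing 0/1 values; a new matrix is returned.
    """
    rng = random.Random(seed)
    # Edges of the bipartite network: (row, col) pairs whose entry is 1.
    edges = [(i, j) for i, row in enumerate(matrix)
             for j, value in enumerate(row) if value]
    present = set(edges)

    for _ in range(n_steps):
        if len(edges) < 2:
            break
        a, b = rng.sample(range(len(edges)), 2)
        (r1, c1), (r2, c2) = edges[a], edges[b]
        # A swap is only meaningful if the edges share no endpoint.
        if r1 == r2 or c1 == c2:
            continue
        # Reject the step if it would create an edge that already exists.
        if (r1, c2) in present or (r2, c1) in present:
            continue
        # (r1, c1), (r2, c2) -> (r1, c2), (r2, c1): degrees are preserved.
        present.discard((r1, c1))
        present.discard((r2, c2))
        present.add((r1, c2))
        present.add((r2, c1))
        edges[a] = (r1, c2)
        edges[b] = (r2, c1)

    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    rewired = [[0] * n_cols for _ in range(n_rows)]
    for i, j in present:
        rewired[i][j] = 1
    return rewired


if __name__ == "__main__":
    observed = [[1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 1, 0, 0]]
    simulated = switching_steps(observed, n_steps=200, seed=1)
    # Row and column sums (gene- and patient-wise alteration counts) match.
    print([sum(r) for r in observed] == [sum(r) for r in simulated])
    print([sum(c) for c in zip(*observed)] == [sum(c) for c in zip(*simulated)])
```

Rejected steps still count as attempts in this sketch, so `n_steps` is the number of attempted switches rather than the number of successful ones.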

Results

We present a novel approximate lower bound for the number of switching-steps, derived analytically. Additionally, we have developed the R package BiRewire, including new efficient implementations of the switching-algorithm. We illustrate the performance of BiRewire by applying it to large real cancer genomics datasets. We report vast reductions in time requirements, with respect to existing implementations/bounds and equivalent P-value computations. Thus, we propose BiRewire to study statistical properties in genomic datasets, and other data that can be modeled as bipartite networks.

Availability and implementation

BiRewire is available on BioConductor at http://www.bioconductor.org/packages/2.13/bioc/html/BiRewire.html.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Gobbi A 

PROVIDER: S-EPMC4147926 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5754031 | biostudies-literature
| S-EPMC2654044 | biostudies-literature
| S-EPMC4464544 | biostudies-other
| S-EPMC8580269 | biostudies-literature
| S-EPMC4221126 | biostudies-literature
| S-EPMC4433499 | biostudies-literature
| S-EPMC3052304 | biostudies-literature
| S-EPMC6635410 | biostudies-literature
| S-EPMC5519076 | biostudies-literature
| S-EPMC6586841 | biostudies-literature