Shiny-SoSV: A web-based performance calculator for somatic structural variant detection.
ABSTRACT: Somatic structural variants are an important contributor to cancer development and evolution. Accurate detection of these complex variants from whole genome sequencing data is influenced by a multitude of parameters. However, there are currently no tools for guiding study design, nor are there applications that can predict the performance of somatic structural variant detection. To address this gap, we developed Shiny-SoSV, a user-friendly web-based calculator for determining the impact of common variables on the sensitivity, precision and F1 score of somatic structural variant detection, including choice of variant detection tool, sequencing depth of coverage, variant allele fraction, and variant breakpoint resolution. Using simulation studies, we determined the singular and combinatoric effects of these variables and modelled the results using a generalised additive model, allowing structural variant detection performance to be predicted for any combination of predictors. Shiny-SoSV provides an interactive and visual platform for users to easily compare the individual and combined impact of different parameters. It predicts the performance of a proposed study design for somatic structural variant detection prior to the commencement of benchwork. Shiny-SoSV is freely available at https://hcpcg.shinyapps.io/Shiny-SoSV with an accompanying user's guide and example use-cases.
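The three performance measures named in the abstract above have standard definitions in terms of true-positive, false-positive and false-negative call counts. The following is a minimal sketch of those definitions only; the function name `detection_metrics` and the example counts are illustrative assumptions, and it does not reproduce the generalised additive model that Shiny-SoSV uses for prediction.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Return sensitivity, precision and F1 score from call counts.

    tp: true variants that were called
    fp: calls that do not match any true variant
    fn: true variants that were missed
    """
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom > 0 else 0.0
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for one simulated tumour/normal pair.
    print(detection_metrics(tp=90, fp=10, fn=30))
```

In a simulation study, these quantities would be computed once per simulated data set and then modelled as a function of the design variables.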
Project description:BACKGROUND: The phenotypes of cancer cells are driven in part by somatic structural variants. Structural variants can initiate tumors, enhance their aggressiveness, and provide unique therapeutic opportunities. Whole-genome sequencing of tumors can allow exhaustive identification of the specific structural variants present in an individual cancer, facilitating both clinical diagnostics and the discovery of novel mutagenic mechanisms. A plethora of somatic structural variant detection algorithms have been created to enable these discoveries; however, there are no systematic benchmarks of them. Rigorous performance evaluation of somatic structural variant detection methods has been challenged by the lack of gold standards, extensive resource requirements, and difficulties arising from the need to share personal genomic information. RESULTS: To facilitate structural variant detection algorithm evaluations, we create a robust simulation framework for somatic structural variants by extending the BAMSurgeon algorithm. We then organize and enable a crowdsourced benchmarking within the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (SMC-DNA). We report here the results of structural variant benchmarking on three different tumors, comprising 204 submissions from 15 teams. In addition to ranking methods, we identify characteristic error profiles of individual algorithms and general trends across them. Surprisingly, we find that ensembles of analysis pipelines do not always outperform the best individual method, indicating a need for new ways to aggregate somatic structural variant detection approaches. CONCLUSIONS: The synthetic tumors and somatic structural variant detection leaderboards remain available as a community benchmarking resource, and BAMSurgeon is available at https://github.com/adamewing/bamsurgeon.
Project description:Interactive applications, developed using Shiny for the R programming language, have the potential to revolutionize the sharing and communication of pharmacometric model simulations. Shiny allows customization of the application's user interface to provide an elegant environment for displaying user-input controls and simulation output, where the latter updates simultaneously with changing input. The flexible nature of the R language makes simulations of population variability possible, thus promoting the combination of Shiny with R in model visualization.
Project description:It has long been recognized that sample size calculations for cluster randomized trials require consideration of the correlation between multiple observations within the same cluster. When measurements are taken at anything other than a single point in time, these correlations depend not only on the cluster but also on the time separation between measurements and additionally, on whether different participants (cross-sectional designs) or the same participants (cohort designs) are repeatedly measured. This is particularly relevant in trials with multiple periods of measurement, such as the cluster cross-over and stepped-wedge designs, but also to some degree in parallel designs. Several papers describing sample size methodology for these designs have been published, but this methodology might not be accessible to all researchers. In this article we provide a tutorial on sample size calculation for cluster randomized designs with particular emphasis on designs with multiple periods of measurement and provide a web-based tool, the Shiny CRT Calculator, to allow researchers to easily conduct these sample size calculations. We consider both cross-sectional and cohort designs and allow for a variety of assumed within-cluster correlation structures. We consider cluster heterogeneity in treatment effects (for designs where treatment is crossed with cluster), as well as individually randomized group-treatment trials with differential clustering between arms, for example designs where clustering arises from interventions being delivered in groups. The calculator will compute power or precision, as a function of cluster size or number of clusters, for a wide variety of designs and correlation structures. We illustrate the methodology and the flexibility of the Shiny CRT Calculator using a range of examples.
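The role of within-cluster correlation described above can be illustrated with the simplest case: a parallel, cross-sectional design measured at a single time point, where the individually randomized sample size is inflated by the design effect 1 + (m - 1) * ICC for cluster size m. This is a minimal sketch under those assumptions; the function names and example values are hypothetical, and it does not cover the multi-period, cohort or stepped-wedge correlation structures handled by the Shiny CRT Calculator.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation for equal-sized clusters with a common ICC."""
    return 1.0 + (cluster_size - 1) * icc


def clusters_per_arm(delta: float, sd: float, cluster_size: int, icc: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Clusters needed per arm to detect a mean difference `delta`.

    Uses the normal-approximation two-sample formula for individuals,
    then inflates by the design effect (parallel design, one period).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n_individual = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    n_clustered = n_individual * design_effect(cluster_size, icc)
    return math.ceil(n_clustered / cluster_size)


if __name__ == "__main__":
    # Hypothetical inputs: effect of 0.5 SD, 20 participants per cluster, ICC 0.05.
    print(clusters_per_arm(delta=0.5, sd=1.0, cluster_size=20, icc=0.05))
```

Even a small ICC roughly doubles the required sample size here, which is why designs with more complex correlation structures need dedicated methodology.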
Project description:SynRio is a Shiny and R based web analysis portal, with data analysis capabilities, for viewing the genome of the cyanobacterium Synechocystis PCC 6803. The web-based user interface is created using the R programming language and the Shiny package. This interface helps create interactive genome visualizations based on user-provided data selections, along with selective data download options. SynRio is freely available to download from GitHub (https://github.com/NFMC/SynRio) or from http://www.nfmc.res.in/synrio/. In addition, an online version of the platform is hosted at nfmc.res.in/synrio, using a Shiny Server (open source edition) installation.
Project description:SUMMARY: Single-cell RNA-sequencing (scRNA-seq) technology enables studying gene expression programs from individual cells. However, these data are subject to diverse sources of variation, including 'unwanted' variation that needs to be removed in downstream analyses (e.g. batch effects) and 'wanted' or biological sources of variation (e.g. variation associated with a cell type) that need to be precisely described. Surrogate variable analysis (SVA)-based algorithms are commonly used for batch correction and, more recently, for studying 'wanted' variation in scRNA-seq data. However, interpreting whether these variables are biologically meaningful or stem from technical sources remains a challenge. To facilitate the interpretation of surrogate variables detected by algorithms including IA-SVA, SVA or ZINB-WaVE, we developed an R Shiny application [Visual Surrogate Variable Analysis (V-SVA)] that provides a web-browser interface for the identification and annotation of hidden sources of variation in scRNA-seq data. This interactive framework includes tools for discovery of genes associated with detected sources of variation, gene annotation using publicly available databases and gene sets, and data visualization using dimension reduction methods. AVAILABILITY AND IMPLEMENTATION: The V-SVA Shiny application is publicly hosted at https://vsva.jax.org/ and the source code is freely available at https://github.com/nlawlor/V-SVA. CONTACT: email@example.com or firstname.lastname@example.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Project description:Scent is one of the most important economic traits in Freesia hybrida. "Shiny Gold", a popular cultivar in South Korea, is widely cultivated for its scent. The relative scent intensity of "Shiny Gold" was approximately 16% higher in fully bloomed flowers than at the yellow bud stage, while among tissues, tepals showed higher intensity in electronic-nose (e-nose) analysis. E-nose analysis also showed that the scent intensity of "Shiny Gold" was higher than that of "10C3-424", lower than that of "10C3-894", and similar to that of "Yvonne". These results correlated with those of the olfactory tests. In total, 19 volatile compounds, including linalool, β-ocimene, D-limonene and trans-β-ionone, were detected by gas chromatography-mass spectrometry analysis. Among these, linalool was the major volatile compound, accounting for 38.7% in "Shiny Gold". Linalool synthase and TPS gene expression corresponded to the scent intensity of the four cultivars, with the lowest expression in "10C3-424". TPS 2, TPS 3, TPS 5, TPS 6 and TPS 8 were highly expressed in both bud and flower in "Shiny Gold", while the expression of TPS 4 was lower relative to other TPS genes at both flowering stages. These results may aid in enhancing scent composition in Freesia cultivars using marker-assisted selection.
Project description:BACKGROUND: Precision medicine is a medical approach that takes into account individual genetic variability and often requires Next Generation Sequencing data in order to predict new treatments. Here we present GMIEC, Genomic Modules Identification et Characterization for genomics medicine, an application that is able to identify specific drugs at the level of a single patient by integrating multi-omics data such as RNA-sequencing, copy-number variation, methylation, Chromatin Immuno-Precipitation and Exome/Whole Genome sequencing. It is also possible to include clinical data related to each patient. GMIEC has been developed as a web-based R-Shiny platform and outputs a table that is easy to use and explore. RESULTS: We present GMIEC, a Shiny application for genomics medicine. The tool allows users to integrate two or more omics datasets (e.g. gene-expression, copy-number), at sample level, to identify groups of genes that share common genomic and corresponding drugs. We demonstrate the characteristics of our application by using it to analyze a prostate cancer data set. CONCLUSIONS: GMIEC provides a simple interface for genomics medicine. GMIEC was developed with Shiny to provide an application that does not require advanced programming skills. GMIEC consists of three sub-applications, for the analysis (GMIEC-AN), the visualization (GMIEC-VIS) and the exploration of results (GMIEC-RES). GMIEC is open source software and is available at https://github.com/guidmt/GMIEC-shiny.
Project description:Health economic evaluation models have traditionally been built in Microsoft Excel, but more sophisticated tools are increasingly being used as model complexity and computational requirements increase. Of all the programming languages, R is the most popular amongst health economists because it has a plethora of user-created packages and is highly flexible. However, even with an integrated development environment such as R Studio, R lacks a simple point-and-click user interface and therefore requires some programming ability. This might make the switch from Microsoft Excel to R seem daunting, and it might make it difficult to communicate results directly to decision makers and other stakeholders. The R package Shiny has the potential to resolve this limitation. It allows programmers to embed health economic models developed in R into interactive, web-browser-based user interfaces. Users can specify their own assumptions about model parameters and run different scenario analyses, which, in the case of a regular Markov model, can be computed within seconds. This paper provides a tutorial on how to wrap a health economic model built in R into a Shiny application. We use a four-state Markov model developed by the Decision Analysis in R for Technologies in Health (DARTH) group as a case study to demonstrate the main principles and basic functionality. A more extensive tutorial, all code, and data are provided in a GitHub repository.
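To make the idea of a four-state Markov cohort model concrete, the sketch below propagates a cohort through a transition matrix one cycle at a time. It is written in Python rather than R purely for illustration; the state names and all transition probabilities are hypothetical placeholders, not the DARTH model's parameters, and costs, utilities and discounting are omitted.

```python
# Illustrative four-state cohort model; all numbers below are assumptions.
STATES = ["Healthy", "Sick", "Sicker", "Dead"]

# TRANSITIONS[i][j] = probability of moving from state i to state j per cycle.
TRANSITIONS = [
    [0.845, 0.150, 0.000, 0.005],  # from Healthy
    [0.500, 0.380, 0.105, 0.015],  # from Sick
    [0.000, 0.000, 0.950, 0.050],  # from Sicker
    [0.000, 0.000, 0.000, 1.000],  # Dead is absorbing
]


def run_cohort(transitions, start, n_cycles):
    """Return the cohort trace: state occupancy at cycle 0..n_cycles."""
    n_states = len(start)
    trace = [list(start)]
    for _ in range(n_cycles):
        current = trace[-1]
        # Occupancy of state j next cycle is the sum of inflows from every state i.
        nxt = [
            sum(current[i] * transitions[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        trace.append(nxt)
    return trace


if __name__ == "__main__":
    # Everyone starts in the Healthy state.
    trace = run_cohort(TRANSITIONS, [1.0, 0.0, 0.0, 0.0], n_cycles=30)
    for cycle in (0, 10, 30):
        row = ", ".join(f"{s}={p:.3f}" for s, p in zip(STATES, trace[cycle]))
        print(f"cycle {cycle}: {row}")
```

Because the whole computation is a short loop over a small matrix, rerunning it whenever a user changes a parameter in a web interface is cheap, which is what makes this kind of model a natural fit for an interactive application.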
Project description:Differential network analysis is an important way to understand network rewiring involved in disease progression and development. Building differential networks from multiple 'omics data provides insight into the holistic differences of the interactive system under different patient-specific groups. DINGO was developed to infer group-specific dependencies and build differential networks. However, DINGO and other existing tools are limited to analyzing data arising from a single platform, and modeling each of the multiple 'omics data independently does not account for the hierarchical structure of the data. We developed the iDINGO R package to estimate group-specific dependencies and make inferences on the integrative differential networks, considering the biological hierarchy among the platforms. A Shiny application has also been developed to facilitate easier analysis and visualization of results, including integrative differential networks and hub gene identification across platforms. The R package is available on CRAN (https://cran.r-project.org/web/packages/iDINGO) and the Shiny application at https://github.com/MinJinHa/iDINGO. Contact: email@example.com. Supplementary data are available at Bioinformatics online.
Project description:BACKGROUND: In randomised controlled trials with only a few randomisation units, treatment allocation may be challenging if balanced distributions of many covariates or baseline outcome measures are desired across all treatment groups. Both traditional approaches, stratified randomisation and allocation by minimisation, have their own limitations. A third method for achieving balance consists of randomly choosing from a preselected list of sufficiently balanced allocations. As with minimisation, this method requires that heterogeneity between treatment groups is measured by specified imbalance metrics. Although certain imbalance measures are more commonly used than others, to the author's knowledge there is no generally accepted "gold standard", neither for categorical nor, even less so, for continuous variables. METHODS: An intuitive and easily accessible web-based software tool was developed which allows for balancing multiple variables of different types and using various imbalance metrics. Different metrics were compared in a simulation study. RESULTS: Using simulated data, it could be shown that for categorical variables, χ²-based imbalance measures seem to be viable alternatives to the established "quadratic imbalance" metric. For continuous variables, using the area between the empirical cumulative distribution functions or the largest difference in the three pairs of quartiles is recommended to measure imbalance. Another imbalance metric suggested in the literature for continuous variables, the (symmetrised) Kullback-Leibler divergence, should be used with caution. CONCLUSION: The Shiny Balancer offers the possibility to visually explore the balancing properties of several well-established or newly suggested imbalance metrics, and its use is particularly advocated in clinical studies with few randomisation units, as is typically the case in cluster randomised trials.
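The allocation method described above (score candidate allocations with an imbalance metric, keep a shortlist of sufficiently balanced ones, then pick one at random) can be sketched for the simplest setting of two equal-sized arms and one continuous covariate. The metric here is the area between the two empirical cumulative distribution functions; the function names, the `keep_fraction` threshold and the example data are assumptions for illustration and are not taken from the Shiny Balancer itself.

```python
import random
from bisect import bisect_right
from itertools import combinations


def ecdf_area(a, b):
    """Area between the empirical CDFs of samples a and b."""
    sa, sb = sorted(a), sorted(b)
    points = sorted(set(sa) | set(sb))
    area = 0.0
    for left, right in zip(points, points[1:]):
        # Both ECDFs are constant on [left, right), so the area is a rectangle.
        fa = bisect_right(sa, left) / len(sa)
        fb = bisect_right(sb, left) / len(sb)
        area += abs(fa - fb) * (right - left)
    return area


def balanced_allocations(values, keep_fraction=0.2):
    """Score every equal split into two arms; return the best-balanced share.

    Exhaustive enumeration is only feasible for a small number of units.
    Returns a list of (imbalance, indices_of_arm_A), best first.
    """
    n = len(values)
    scored = []
    for arm_a in combinations(range(n), n // 2):
        in_a = set(arm_a)
        a = [values[i] for i in arm_a]
        b = [values[i] for i in range(n) if i not in in_a]
        scored.append((ecdf_area(a, b), arm_a))
    scored.sort(key=lambda item: item[0])
    n_keep = max(1, int(len(scored) * keep_fraction))
    return scored[:n_keep]


if __name__ == "__main__":
    # Hypothetical baseline measurements for eight clusters.
    baseline = [12.1, 15.4, 9.8, 20.3, 14.7, 11.0, 18.2, 16.5]
    shortlist = balanced_allocations(baseline)
    rng = random.Random(2024)  # fixed seed so the draw is reproducible
    imbalance, arm_a = rng.choice(shortlist)
    print("arm A units:", arm_a, "imbalance:", round(imbalance, 3))
```

Drawing at random from the shortlist, rather than always taking the single best allocation, keeps the assignment unpredictable while still excluding badly imbalanced splits.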