EINVis: a visualization tool for analyzing and exploring genetic interactions in large-scale association studies.
ABSTRACT: Epistasis (gene-gene interaction) detection in large-scale genetic association studies has recently drawn extensive research interest, as many complex traits are likely caused by the joint effect of multiple genetic factors. The large number of possible interactions poses both statistical and computational challenges. A variety of approaches have been developed to address the analytical challenges in epistatic interaction detection. These methods usually output the identified genetic interactions and store them in flat file formats. It is highly desirable to develop an effective visualization tool to further investigate the detected interactions and unravel hidden interaction patterns. We have developed EINVis, a novel visualization tool that is specifically designed to analyze and explore genetic interactions. EINVis displays interactions among genetic markers as a network. It utilizes a circular layout (specifically, a tree ring view) to simultaneously visualize the hierarchical interactions between single nucleotide polymorphisms (SNPs), genes, and chromosomes, and the network structure formed by these interactions. Using EINVis, the user can distinguish marginal effects from interactions, track interactions involving more than two markers, visualize interactions at different levels, and detect proxy SNPs based on linkage disequilibrium. EINVis is an effective and user-friendly free visualization tool for analyzing and exploring genetic interactions. It is publicly available with detailed documentation and an online tutorial on the web at http://filer.case.edu/yxw407/einvis/.
Project description:<h4>Motivation</h4>Visualization of multiple genomic data generally requires the use of public or commercially hosted browsers. Flexible visualization of chromatin interaction data as genomic features and network components offers informative insights into gene expression. An open source application for visualizing HiC and chromatin conformation-based data as 2D-arcs accompanied by interactive network analyses is valuable.<h4>Results</h4>DNA Rchitect is a new tool created to visualize HiC and chromatin conformation-based contacts at high (Kb) and low (Mb) genomic resolutions. The user can upload their pre-filtered HiC experiment in bedpe format to the DNA Rchitect web app that we have hosted or to a version they themselves have deployed. With the uploaded data, DNA Rchitect lets the user visualize the different interactions in their sample, perform simple network analyses, and visualize other genomic data types. The user can then download their results to use the additional network functionality offered by network-based programs such as Cytoscape.<h4>Availability and implementation</h4>DNA Rchitect is freely available both as a web application written primarily in R at http://shiny.immgen.org/DNARchitect/ and as open source software released under an MIT license at https://github.com/alosdiallo/DNA_Rchitect.
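As a rough illustration of how bedpe contacts can be treated as network components, the following Python sketch reads the six required BEDPE columns (chrom1, start1, end1, chrom2, start2, end2) and builds an undirected contact network. It is not DNA Rchitect's code (the tool itself is written primarily in R); the function names `parse_bedpe`, `build_network` and `degrees` and the example coordinates are hypothetical.

```python
from collections import defaultdict


def parse_bedpe(lines):
    """Yield (anchor_a, anchor_b) pairs from tab-separated BEDPE lines.

    Only the six required columns are read; optional columns
    (name, score, strands) are ignored in this sketch.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        anchor_a = (fields[0], int(fields[1]), int(fields[2]))
        anchor_b = (fields[3], int(fields[4]), int(fields[5]))
        yield anchor_a, anchor_b


def build_network(pairs):
    """Build an undirected adjacency map: anchor -> set of contacted anchors."""
    adjacency = defaultdict(set)
    for anchor_a, anchor_b in pairs:
        adjacency[anchor_a].add(anchor_b)
        adjacency[anchor_b].add(anchor_a)
    return adjacency


def degrees(adjacency):
    """Number of distinct contact partners per anchor."""
    return {anchor: len(partners) for anchor, partners in adjacency.items()}


# Hypothetical example contacts (illustrative coordinates only).
example = [
    "chr1\t1000\t2000\tchr1\t50000\t51000",
    "chr1\t1000\t2000\tchr1\t90000\t91000",
    "chr1\t50000\t51000\tchr2\t3000\t4000",
]
network = build_network(parse_bedpe(example))
deg = degrees(network)
```

An adjacency map of this kind is straightforward to write out as an edge list for network programs such as Cytoscape.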
Project description:<h4>Background</h4>The increasing availability of HIV-host interaction datasets, including both physical and genetic interactions, has created a need for software tools to integrate and visualize the data. Because these host-pathogen interactions are extensive and interactions between human proteins are found within many different databases, it is difficult to generate integrated HIV-human interaction networks.<h4>Results</h4>We have developed a web-based platform, termed GPS-Prot (http://www.gpsprot.org), that allows for facile integration of different HIV interaction data types as well as inclusion of interactions between human proteins derived from publicly available databases, including MINT, BioGRID and HPRD. The software can group proteins into functional modules or protein complexes, generating more intuitive network representations, and also allows users to upload their own data.<h4>Conclusions</h4>GPS-Prot is a software tool that allows users to easily create comprehensive and integrated HIV-host networks. A major advantage of this platform compared to other visualization tools is its web-based format, which requires no software installation or data downloads. GPS-Prot allows novice users to quickly generate networks that combine both genetic and protein-protein interactions between HIV and its human host into a single representation. Ultimately, the platform is extendable to other host-pathogen systems.
Project description:Reactome (http://www.reactome.org) is an open-source, expert-authored, peer-reviewed, manually curated database of reactions, pathways and biological processes. We provide an intuitive web-based user interface to pathway knowledge and a suite of data analysis tools. The Pathway Browser is a Systems Biology Graphical Notation-like visualization system that supports manual navigation of pathways by zooming, scrolling and event highlighting, and that exploits PSI Common Query Interface web services to overlay pathways with molecular interaction data from the Reactome Functional Interaction Network and interaction databases such as IntAct, ChEMBL and BioGRID. Pathway and expression analysis tools employ web services to provide ID mapping, pathway assignment and over-representation analysis of user-supplied data sets. By applying Ensembl Compara to curated human proteins and reactions, Reactome generates pathway inferences for 20 other species. The Species Comparison tool provides a summary of results for each of these species as a table showing numbers of orthologous proteins found by pathway from which users can navigate to inferred details for specific proteins and reactions. Reactome's diverse pathway knowledge and suite of data analysis tools provide a platform for data mining, modeling and analysis of large-scale proteomics data sets. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP 8).
Project description:BACKGROUND: To understand individual genomes it is necessary to look at the variations that lead to changes in phenotype and possibly to disease. However, genotype information alone is often not sufficient, and additional knowledge regarding the phase of the variation is needed to make correct interpretations. Interactive visualizations that allow the user to explore the data in various ways can be of great assistance in the process of making well-informed decisions. Currently, however, there is a lack of visualizations that are able to deal with phased haplotype data. RESULTS: We present inPHAP, an interactive visualization tool for genotype and phased haplotype data. inPHAP features a variety of interaction possibilities such as zooming, sorting, filtering and aggregation of rows in order to explore patterns hidden in large genetic data sets. As a proof of concept, we apply inPHAP to the phased haplotype data set of Phase 1 of the 1000 Genomes Project. Thereby, inPHAP's ability to show genetic variations at the population level as well as at the individual level is demonstrated for several disease-related loci. CONCLUSIONS: As of today, inPHAP is the only visual analytical tool that allows the user to explore unphased and phased haplotype data interactively. Due to its highly scalable design, inPHAP can be applied to large datasets with up to 100 GB of data, enabling users to visualize even large-scale input data. inPHAP closes the gap between common visualization tools for unphased genotype data and introduces several new features, such as the visualization of phased data. inPHAP is available for download at http://bit.ly/1iJgKmX.
Project description:BACKGROUND: Plastid genomes typically display a circular, quadripartite structure with two inverted repeat regions, which challenges automatic assembly procedures. The correct assembly of plastid genomes is a prerequisite for the validity of subsequent analyses on genome structure and evolution. The average coverage depth of a genome assembly is often used as an indicator of assembly quality. Visualizing coverage depth across a draft genome is a critical step, which allows users to inspect the quality of the assembly and, where applicable, identify regions of reduced assembly confidence. Despite the interplay between genome structure and assembly quality, no contemporary, user-friendly software tool can visualize the coverage depth of a plastid genome assembly while taking its quadripartite genome structure into account. A software tool is needed that fills this void. RESULTS: We introduce 'PACVr', an R package that visualizes the coverage depth of a plastid genome assembly in relation to the circular, quadripartite structure of the genome as well as the individual plastome genes. By using a variable window approach, the tool allows visualizations on different calculation scales. It also confirms sequence equality of, as well as visualizes gene synteny between, the inverted repeat regions of the input genome. As a tool for plastid genomics, PACVr provides the functionality to identify regions of coverage depth above or below user-defined threshold values and helps to identify non-identical IR regions. To allow easy integration into bioinformatic workflows, PACVr can be invoked from a Unix shell, facilitating its use in automated quality control. We illustrate the application of PACVr on four empirical datasets and compare visualizations generated by PACVr with those of alternative software tools. CONCLUSIONS: PACVr provides a user-friendly tool to visualize (a) the coverage depth of a plastid genome assembly on a circular, quadripartite plastome map and in relation to individual plastome genes, and (b) gene synteny across the inverted repeat regions. It contributes to optimizing plastid genome assemblies and increasing the reliability of publicly available plastome sequences. The software, example datasets, technical documentation, and a tutorial are available with the package at https://cran.r-project.org/package=PACVr.
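The windowed coverage calculation and threshold check described for PACVr can be sketched in a few lines of Python. This is only an illustration of the general idea, not PACVr's implementation (PACVr is an R package): the function names `window_means` and `flag_windows`, the non-overlapping window scheme, and the example depths and thresholds are all assumptions.

```python
def window_means(depth, window):
    """Return (start, end, mean_depth) for consecutive non-overlapping windows.

    `depth` is a list of per-base coverage values; the last window may be
    shorter than `window` if the sequence length is not a multiple of it.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    result = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]
        result.append((start, start + len(chunk), sum(chunk) / len(chunk)))
    return result


def flag_windows(windows, low, high):
    """Label windows whose mean depth is below `low` or above `high`."""
    flagged = []
    for start, end, mean_depth in windows:
        if mean_depth < low:
            flagged.append((start, end, "low"))
        elif mean_depth > high:
            flagged.append((start, end, "high"))
    return flagged


# Hypothetical per-base depths: a normal, a poorly covered and an
# over-covered stretch of four bases each.
depth = [10] * 4 + [2] * 4 + [30] * 4
windows = window_means(depth, 4)
flags = flag_windows(windows, low=5, high=25)
```

Changing the `window` argument changes the calculation scale, which corresponds loosely to the variable window approach mentioned in the abstract.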
Project description:Rapidly increasing amounts of (physical and genetic) protein-protein interaction (PPI) data are produced by various high-throughput techniques, and interpretation of these data remains a major challenge. To gain insight into the organization and structure of the resulting large, complex networks formed by interacting molecules, we developed ModuleRole, a user-friendly web server tool that uses simulated annealing, a method based on node connectivity, to find modules in a PPI network, define the role of every node, and produce files for visualization in Cytoscape and Pajek. For given proteins, it analyzes the PPI network from the BioGRID database, finds and visualizes the modules these proteins form, and then defines the role every node plays in this network based on two topological parameters, the Participation Coefficient and the Z-score. This is the first program that provides an interactive and friendly interface for biologists to find and visualize modules and roles of proteins in a PPI network. It can be tested online at http://www.bioinfo.org/modulerole/index.php, which is free and open to all users with no login requirement; demo data are provided in the "User Guide" under the Help menu. A non-server application of this program is considered for high-throughput data with more than 200 nodes or for the user's own interaction datasets. Users are able to bookmark the web link to the result page and access it at a later time. As an interactive and highly customizable application, ModuleRole requires no expert knowledge of graph theory on the user's side and can be used on both Linux and Windows systems, making it a very useful tool for biologists to analyze and visualize PPI networks from databases such as BioGRID. ModuleRole is implemented in Java and C, and is freely available at http://www.bioinfo.org/modulerole/index.php. Supplementary information (user guide, demo data) is also available at this website. The API for ModuleRole used for this program can be obtained upon request.
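To make the two topological parameters concrete, the sketch below computes a within-module degree z-score and a participation coefficient for each node, given a module assignment. It assumes the commonly used definitions (z-score of a node's within-module degree relative to its module, and P = 1 - sum over modules of (k_s / k)^2, where k_s is the number of the node's links into module s and k its total degree); the abstract does not state ModuleRole's exact formulas, and the function name `node_roles` and the toy network are hypothetical. Module detection by simulated annealing is not shown.

```python
from collections import defaultdict
from statistics import mean, pstdev


def node_roles(edges, module_of):
    """Return (z_score, participation) dictionaries keyed by node.

    edges     : iterable of (u, v) undirected links
    module_of : dict mapping every node to its module label
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    within = {}         # links from a node into its own module
    participation = {}  # spread of a node's links across modules
    for node, module in module_of.items():
        neighbours = adjacency[node]
        k = len(neighbours)
        links_per_module = defaultdict(int)
        for neighbour in neighbours:
            links_per_module[module_of[neighbour]] += 1
        within[node] = links_per_module.get(module, 0)
        if k:
            participation[node] = 1 - sum(
                (count / k) ** 2 for count in links_per_module.values()
            )
        else:
            participation[node] = 0.0  # isolated node: no links to spread

    by_module = defaultdict(list)
    for node, module in module_of.items():
        by_module[module].append(within[node])

    z_score = {}
    for node, module in module_of.items():
        spread = pstdev(by_module[module])
        # If all nodes in a module have the same within-module degree,
        # the z-score is defined here as 0 to avoid division by zero.
        z_score[node] = (
            (within[node] - mean(by_module[module])) / spread if spread else 0.0
        )
    return z_score, participation


# Hypothetical toy network with two modules.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "F"), ("C", "D"), ("D", "E")]
module_of = {"A": 1, "B": 1, "C": 1, "F": 1, "D": 2, "E": 2}
z, p = node_roles(edges, module_of)
```

In this toy network, node A has the highest within-module z-score, while C and D, which link the two modules, have non-zero participation coefficients.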
Project description:PiNGO is a tool to screen biological networks for candidate genes, i.e. genes predicted to be involved in a biological process of interest. The user can narrow the search to genes with particular known functions or exclude genes belonging to particular functional classes. PiNGO provides support for a wide range of organisms and Gene Ontology classification schemes, and it can easily be customized for other organisms and functional classifications. PiNGO is implemented as a plugin for Cytoscape, a popular network visualization platform. PiNGO is distributed as an open-source Java package under the GNU General Public License (http://www.gnu.org/), and can be downloaded via the Cytoscape plugin manager. A detailed user guide and tutorial are available on the PiNGO website (http://www.psb.ugent.be/esb/PiNGO).
Project description:The Distributed Annotation System (DAS) is a protocol for easy sharing and integration of biological annotations. In order to visualize feature annotations in a genomic context a client is required. Here we present myKaryoView, a simple light-weight DAS tool for visualization of genomic annotation. myKaryoView has been specifically configured to help analyse data derived from personal genomics, although it can also be used as a generic genome browser visualization. Several well-known data sources are provided to facilitate comparison of known genes and normal variation regions. The navigation experience is enhanced by simultaneous rendering of different levels of detail across chromosomes. A simple interface is provided to allow searches for any SNP, gene or chromosomal region. User-defined DAS data sources may also be added when querying the system. We demonstrate myKaryoView capabilities for adding user-defined sources with a set of genetic profiles of family-related individuals downloaded directly from 23andMe. myKaryoView is a web tool for visualization of genomic data specifically designed for direct-to-consumer genomic data that uses publicly available data distributed throughout the Internet. It does not require data to be held locally and it is capable of rendering any feature as long as it conforms to DAS specifications. Configuration and addition of sources to myKaryoView can be done through the interface. Here we show a proof of principle of myKaryoView's ability to display personal genomics data with 23andMe genome data sources. The tool is available at: http://mykaryoview.com.
Project description:SilkDB is an open-access database and powerful platform that provides comprehensive information on the silkworm (Bombyx mori) genome. Since SilkDB 2.0 was released 10 years ago, vast quantities of data about multiple aspects of the silkworm have been generated, including genome, transcriptome, Hi-C and pangenome data. To visualize data at these different biological levels, we present SilkDB 3.0 (https://silkdb.bioinfotoolkits.net), a visual analytic tool for exploring silkworm data through an interactive user interface. The database contains a high-quality chromosome-level assembly of the silkworm genome, and its coding sequences and gene sets are more accurate than those in the previous version. SilkDB 3.0 provides a view of the information for each gene at the levels of sequence, protein structure, gene family, orthology, synteny and genome organization, and gives access to gene expression information, genetic variation and a genome interaction map. A set of visualization tools is available to display the abundant information in the above datasets. With an improved interactive user interface for the integration of large data sets, the updated SilkDB 3.0 database will be a valuable resource for the silkworm and insect research community.
Project description:BACKGROUND: Expression levels for 47294 transcripts in lymphoblastoid cell lines from all 270 HapMap phase II individuals, and genotypes (both HapMap phase II and III) of 3.96 million single nucleotide polymorphisms (SNPs) in the same individuals are publicly available. We aimed to generate a user-friendly web-based tool for visualization of the correlation between SNP genotypes within a specified genomic region and a gene of interest, an analysis also known as expression quantitative trait locus (eQTL) analysis. RESULTS: SNPexp is implemented as a server-side script and is publicly available at http://tinyurl.com/snpexp. Correlations between genotype and transcript expression levels are calculated by performing linear regression and the Wald test as implemented in PLINK and are visualized using the UCSC Genome Browser. Validation of SNPexp using previously published eQTLs yielded comparable results. CONCLUSIONS: SNPexp provides a convenient and platform-independent way to calculate and visualize the correlation between HapMap genotypes within a specified genetic region anywhere in the genome and gene expression levels. This allows for investigation of both cis and trans effects. The web interface and the use of publicly available and widely used software resources make it an attractive supplement to more advanced bioinformatic tools. For the advanced user, the program can be run on a local computer with custom datasets.
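The regression step can be illustrated with a minimal single-SNP example in Python. This is a sketch of ordinary least squares with a Wald statistic (slope divided by its standard error), assuming genotypes are coded as minor-allele counts 0, 1 or 2; it does not reproduce PLINK or SNPexp, and the function name `eqtl_regression` and the data are made up for illustration.

```python
from math import sqrt


def eqtl_regression(genotypes, expression):
    """Regress expression on genotype dosage for one SNP-transcript pair.

    Returns (slope, standard_error, wald_statistic), where the Wald
    statistic is slope / standard_error.
    """
    n = len(genotypes)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares around the fitted line.
    rss = sum((e - intercept - slope * g) ** 2 for g, e in zip(genotypes, expression))
    standard_error = sqrt(rss / (n - 2) / sxx)
    return slope, standard_error, slope / standard_error


# Hypothetical data: six individuals, expression rising with allele count.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, se, wald = eqtl_regression(genotypes, expression)
```

Repeating this calculation for every SNP in a region and plotting the resulting statistics along the genome gives the kind of regional association view the abstract describes.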