Project description:Alzheimer's disease (AD) is the form of dementia whose underlying mechanism is least understood. AD has no single decisive genetic factor. In the past, there were no reliable techniques for identifying the genetic risk factors associated with AD, and most of the available data came from brain images. Recently, however, high-throughput techniques in bioinformatics have advanced dramatically, which has led to focused research into the genetic risk factors that cause AD. Recent analyses have produced considerable prefrontal cortex data with which classification and prediction models can be developed for AD. We have developed a Deep Belief Network-based prediction model using DNA methylation and gene expression microarray data, which suffer from the High Dimension Low Sample Size (HDLSS) problem. To overcome the HDLSS challenge, we performed a two-layer feature selection that also considers the biological aspects of the features. In the first layer, the differentially expressed genes and differentially methylated positions are identified, and the two datasets are then combined using the Jaccard similarity measure. In the second layer, an ensemble-based feature selection approach is applied to further narrow down the gene selection. The results show that the proposed feature selection technique outperforms commonly used feature selection techniques such as Support Vector Machine Recursive Feature Elimination (SVM-RFE) and Correlation-based Feature Selection (CBS). Furthermore, the Deep Belief Network-based prediction model performs better than widely used machine learning models, and the multi-omics dataset shows promising results compared to single-omics data.
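The abstract above names the Jaccard similarity measure but does not say exactly how it is used to combine the two datasets. As a minimal sketch of the measure itself, the example below computes the Jaccard similarity between a set of differentially expressed genes and a set of genes annotated to differentially methylated positions. The gene names and both sets are hypothetical and used only for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity |A intersect B| / |A union B| of two iterables treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets have no defined ratio; return 0.0 by convention here.
    return len(a & b) / len(union) if union else 0.0


# Hypothetical gene symbols, not taken from the study.
deg_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}  # differentially expressed
dmp_genes = {"GENE_B", "GENE_C", "GENE_E"}            # mapped from methylated positions

similarity = jaccard(deg_genes, dmp_genes)
shared = sorted(deg_genes & dmp_genes)  # genes supported by both omics layers
print(similarity, shared)
```

Here the two sets share two of five distinct genes, so the similarity is 0.4. Whether the study keeps the intersection, thresholds the similarity, or applies it per gene is not stated in the abstract.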
Project description:Researchers are increasingly seeking to interpret molecular data within a multi-omics context to gain a more comprehensive picture of their study system. OmicsNet (www.omicsnet.ca) is a web-based tool developed to allow users to easily build, visualize, and analyze multi-omics networks to study rich relationships among lists of 'omics features of interest. Three major improvements have been introduced in OmicsNet 2.0, which include: (i) enhanced network visual analytics with eleven 2D graph layout options and a novel 3D module layout; (ii) support for three new 'omics types: single nucleotide polymorphism (SNP) list from genetic variation studies; taxon list from microbiome profiling studies, as well as liquid chromatography-mass spectrometry (LC-MS) peaks from untargeted metabolomics; and (iii) measures to improve research reproducibility by coupling R command history with the release of the companion OmicsNetR package, and generation of persistent links to share interactive network views. We performed a case study using the multi-omics data obtained from a recent large-scale investigation on inflammatory bowel disease (IBD) and demonstrated that OmicsNet was able to quickly create meaningful multi-omics context to facilitate hypothesis generation and mechanistic insights.
Project description:It is essential to reveal the associations between various omics data for a comprehensive understanding of the altered biological process in human wellness and disease. To date, very few studies have focused on collecting and exhibiting multi-omics associations in a single database. Here, we present iNetModels, an interactive database and visualization platform of Multi-Omics Biological Networks (MOBNs). This platform describes the associations between the clinical chemistry, anthropometric parameters, plasma proteomics, plasma metabolomics, as well as metagenomics for oral and gut microbiome obtained from the same individuals. Moreover, iNetModels includes tissue- and cancer-specific Gene Co-expression Networks (GCNs) for exploring the connections between the specific genes. This platform allows the user to interactively explore a single feature's association with other omics data and customize its particular context (e.g. male/female specific). The users can also register their data for sharing and visualization of the MOBNs and GCNs. Moreover, iNetModels allows users who do not have a bioinformatics background to facilitate human wellness and disease research. iNetModels can be accessed freely at https://inetmodels.com without any limitation.
Project description:Multi-omics integration is key to fully understanding complex biological processes in a holistic manner. Furthermore, multi-omics combined with new longitudinal experimental designs can reveal dynamic relationships between omics layers and identify key players or interactions in system development or complex phenotypes. However, integration methods have to address various experimental designs and do not guarantee interpretable biological results. The new challenge of multi-omics integration is to solve the interpretation problem and unlock the hidden knowledge within multi-omics data. In this paper, we go beyond integration and propose a generic approach to address the interpretation problem. From multi-omics longitudinal data, this approach builds and explores hybrid multi-omics networks composed of both inferred and known relationships within and between omics layers. With smart node labelling and propagation analysis, this approach predicts regulation mechanisms and multi-omics functional modules. We applied the method to three case studies with various multi-omics designs and identified new multi-layer interactions involved in key biological functions that could not be revealed with single-omics analysis. Moreover, we highlighted interplay in the kinetics that could help identify novel biological mechanisms. This method is available as an R package, netOmics, to readily suit any application.
Project description:Together with various hosts and environments, ubiquitous microbes interact closely with each other, forming an intertwined system or community. Of interest, shifts in the relationships between microbes and their hosts or environments are associated with critical diseases and ecological changes. While advances in high-throughput omics technologies offer a great opportunity for understanding the structures and functions of the microbiome, it is still challenging to analyse and interpret the omics data. Specifically, the heterogeneity and diversity of microbial communities, compounded by the large size of the datasets, make it very difficult to elucidate these complex communities mechanistically. Fortunately, network analyses provide an efficient way to tackle this problem, and several network approaches have recently been proposed to improve this understanding. Here, we systematically illustrate the network theories that have been used in biological and biomedical research. Then, we review existing network modelling methods for microbial studies at multiple layers, from metagenomics to metabolomics and further to multi-omics. Lastly, we discuss the limitations of present studies and provide a perspective on future directions in support of the understanding of microbial communities.
Project description:Motivation: Several molecular events are known to be cancer-related, including genomic aberrations, hypermethylation of gene promoter regions and differential expression of microRNAs. These aberration events are very heterogeneous across tumors and it is poorly understood how they affect the molecular makeup of the cell, including the transcriptome and proteome. Protein interaction networks can help decode the functional relationship between aberration events and changes in gene and protein expression. Results: We developed NetICS (Network-based Integration of Multi-omics Data), a new graph diffusion-based method for prioritizing cancer genes by integrating diverse molecular data types on a directed functional interaction network. NetICS prioritizes genes by their mediator effect, defined as the proximity of the gene to upstream aberration events and to downstream differentially expressed genes and proteins in an interaction network. Genes are prioritized for individual samples separately and integrated using a robust rank aggregation technique. NetICS provides a comprehensive computational framework that can aid in explaining the heterogeneity of aberration events by their functional convergence to common differentially expressed genes and proteins. We demonstrate NetICS' competitive performance in predicting known cancer genes and in generating robust gene lists using TCGA data from five cancer types. Availability and implementation: NetICS is available at https://github.com/cbg-ethz/netics. Supplementary information: Supplementary data are available at Bioinformatics online.
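The idea of scoring a gene by its proximity to upstream aberration events and to downstream differentially expressed genes can be illustrated with a generic random walk with restart on a directed graph. This is a minimal sketch, not the NetICS implementation: the toy network, the restart probability, and the choice to combine the two diffusion scores by multiplication are all assumptions made for illustration.

```python
def random_walk_with_restart(edges, seeds, restart=0.4, iters=200):
    """Diffuse seed scores over a directed graph by power iteration.

    edges: dict mapping node -> list of successor nodes.
    seeds: list of nodes where the walk restarts.
    Returns a dict node -> visiting probability (values sum to 1).
    """
    nodes = sorted(set(edges) | {v for vs in edges.values() for v in vs})
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(iters):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            succ = edges.get(u, [])
            if succ:
                share = (1.0 - restart) * p[u] / len(succ)
                for v in succ:
                    nxt[v] += share
            else:
                # Dangling node: return its walking mass to the seeds.
                for s in seeds:
                    nxt[s] += (1.0 - restart) * p[u] / len(seeds)
        p = nxt
    return p


def reverse(edges):
    """Return the graph with every edge direction flipped."""
    rev = {}
    for u, vs in edges.items():
        rev.setdefault(u, [])
        for v in vs:
            rev.setdefault(v, []).append(u)
    return rev


# Hypothetical toy network: M carries an aberration, D is differentially expressed.
network = {"M": ["X", "Y"], "X": ["D"], "Y": [], "D": []}

forward = random_walk_with_restart(network, ["M"])            # from upstream events
backward = random_walk_with_restart(reverse(network), ["D"])  # from downstream changes
# Assumed combination rule: the product rewards genes close to both ends.
mediator = {g: forward[g] * backward[g] for g in network}
print(mediator)
```

In this toy network, X lies on the path from M to D and receives a positive combined score, whereas Y is reachable from M but has no route to D, so its combined score is zero.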
Project description:An increasingly common method for predicting gene activity is genome-wide chromatin immunoprecipitation of ‘active’ chromatin modifications followed by massively parallel sequencing (ChIP-seq). Using a novel ChIP-seq quantification method (cRPKM), we tested the power of such ChIP-seq strategies to predict relative protein and RNA levels at the pre-pro-B and pro-B differentiation stages in early B cell lymphopoiesis. Using a multi-omics approach that compares promoter chromatin status (ChIP-seq; published in GSE:21978) with ongoing active transcription (GRO-seq; published in GSE:40173), steady state mRNA (RNA-seq), inferred mRNA stability, and relative proteome abundance measurements (iTRAQ), we demonstrate that active chromatin modifications at promoters are a good indicator of transcription and steady state mRNA levels. Moreover, we found that promoters with active chromatin modifications exclusively in one of these cell states frequently predicted differentially expressed proteins. However, we found that many genes whose promoters have non-differential but active chromatin modifications also displayed changes in expression of their cognate proteins. This large class of developmentally and differentially regulated proteins that was uncoupled from chromatin status used mostly post-transcriptional mechanisms. Interestingly, the most differentially expressed protein in our B-cell development system, 2410004B18Rik, was regulated by a post-transcriptional mechanism, which further analyses indicated was mediated by an identified miRNA. These data provide a striking example of how our integrated multi-omics data set can be useful in uncovering regulatory mechanisms. Total RNA from mouse pre-pro-B and pro-B cells, depleted of rRNA and small RNAs, was sequenced using a strand specific, single end sequencing strategy.
Project description:Renal cell carcinoma (RCC) ranks among the most prevalent cancers worldwide, with both incidence and mortality rates increasing annually. The heterogeneity among RCC patients presents considerable challenges for developing universally effective treatment strategies, emphasizing the necessity of in-depth research into RCC's molecular mechanisms, understanding the variations among RCC patients and further identifying distinct molecular subtypes for precise treatment. We proposed a metagene-based similarity network fusion (Meta-SNF) method for RCC subtype identification with multi-omics data, using a non-negative matrix factorization technique to capture alternative structures inherent in the dataset as metagenes. These latent metagenes were then integrated to construct a fused network under the Similarity Network Fusion (SNF) framework for more precise subtyping. We conducted simulation studies and analyzed real-world data from two RCC datasets, namely kidney renal clear cell carcinoma (KIRC) and kidney renal papillary cell carcinoma (KIRP) to demonstrate the utility of Meta-SNF. The simulation studies indicated that Meta-SNF achieved higher accuracy in subtype identification compared with the original SNF and other state-of-the-art methods. In analyses of real data, Meta-SNF produced more distinct and well-separated clusters, classifying both KIRC and KIRP into four subtypes with significant differences in survival outcomes. Subsequently, we performed comprehensive bioinformatics analyses focused on subtypes with poor prognoses in KIRC and KIRP and identified several potential biomarkers. Meta-SNF offers a novel strategy for subtype identification using multi-omics data, and its application to RCC datasets has yielded diverse biological insights which are highly valuable for informing clinical decision-making processes in the treatment of RCC.
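To illustrate what extracting metagenes with non-negative matrix factorization can look like, the sketch below factorizes a small non-negative feature-by-sample matrix V into W (feature loadings) and H (metagene profiles per sample) using the standard multiplicative update rules. It is a generic textbook NMF under assumed settings (rank, iteration count, random initialization, toy data), not the Meta-SNF code, and it does not cover the subsequent Similarity Network Fusion step.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factorize non-negative V (features x samples) as W @ H with W, H >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy matrix: 4 features x 4 samples with two sample groups.
V = [[5, 5, 0, 0],
     [4, 4, 0, 0],
     [0, 0, 3, 3],
     [0, 0, 2, 2]]

W, H = nmf(V, rank=2)
error = frob_error(V, W, H)
print(error)
```

Each row of H can be read as one metagene's activity across samples. In a pipeline like the one described, such low-dimensional profiles would presumably replace the raw features when building per-omics similarity networks, though the abstract does not spell out those details.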
Project description:Although genome-wide association studies (GWASs) have successfully identified thousands of risk variants for human complex diseases, understanding the biological function and molecular mechanisms of the associated SNPs remains challenging. Here we developed a framework named integrative multi-omics network-based approach (IMNA), which aims to identify potential key genes in regulatory networks by integrating molecular interactions across multiple biological scales, including GWAS signals, gene expression-based signatures, chromatin interactions and protein interactions from the network topology. We applied this approach to breast cancer and prioritized key genes involved in regulatory networks. We also developed an abnormal gene expression score (AGES) signature based on the gene expression deviation of the top 20 rank-ordered genes in breast cancer. The AGES values are associated with genetic variants, tumor properties and patient survival outcomes. Among the top 20 genes, RNASEH2A was identified as a new candidate gene for breast cancer. Thus, our integrative network-based approach provides a genetic-driven framework to unveil tissue-specific interactions from multiple biological scales and reveal potential key regulatory genes for breast cancer. This approach can also be applied to other complex diseases, such as ovarian cancer, to unravel underlying mechanisms and help develop therapeutic targets.
Project description:Motivation: Accurate disease phenotype prediction plays an important role in the treatment of heterogeneous diseases like cancer in the era of precision medicine. With the advent of high-throughput technologies, more comprehensive multi-omics data are now available that can effectively link genotype to phenotype. However, the interactive relation of multi-omics datasets makes it particularly challenging to incorporate different biological layers to discover coherent biological signatures and predict phenotypic outcomes. In this study, we introduce omicsGAN, a generative adversarial network model to integrate two omics datasets and their interaction network. The model captures information from the interaction network as well as the two omics datasets and fuses them to generate synthetic data with better predictive signals. Results: Large-scale experiments on The Cancer Genome Atlas breast cancer, lung cancer and ovarian cancer datasets validate that (i) the model can effectively integrate two omics datasets (e.g. mRNA and microRNA expression data) and their interaction network (e.g. microRNA-mRNA interaction network). The synthetic omics data generated by the proposed model perform better on cancer outcome classification and patient survival prediction than the original omics datasets. (ii) The integrity of the interaction network plays a vital role in the generation of synthetic data with higher predictive quality. Using a random interaction network does not allow the framework to learn meaningful information from the omics datasets and therefore results in synthetic data with weaker predictive signals. Availability and implementation: Source code is available at: https://github.com/CompbioLabUCF/omicsGAN. Supplementary information: Supplementary data are available at Bioinformatics online.