An integrated, high-throughput strategy for multi-omic systems level analysis
ABSTRACT: This report details the automation, benchmarking, and application of a strategy for transcriptomic, proteomic, and metabolomic analyses from a common sample. The approach, Sample Preparation for multi-Omics Technologies (SPOT), provides equivalent performance to typical individual omic preparation methods, but it greatly enhances throughput and minimizes the resources required for multi-omic experiments. SPOT was applied to a multi-omics time course experiment for zinc-treated HL60 cells.
Project description:Proteomics, metabolomics, and transcriptomics generate comprehensive data sets, and current biocomputational capabilities allow their efficient integration for systems biology analysis. Published multiomics studies cover methodological advances as well as applications to biological questions. However, few studies have focused on the development of a high-throughput, unified sample preparation approach to complement high-throughput omic analytics. This report details the automation, benchmarking, and application of a strategy for transcriptomic, proteomic, and metabolomic analyses from a common sample. The approach, sample preparation for multi-omics technologies (SPOT), provides equivalent performance to typical individual omic preparation methods but greatly enhances throughput and minimizes the resources required for multiomic experiments. SPOT was applied to a multiomics time course experiment for zinc-treated HL-60 cells. The data reveal Zn effects on NRF2 antioxidant and NFkappaB signaling. High-throughput approaches such as these are critical for the acquisition of temporally resolved, multicondition, large multiomic data sets such as those necessary to assess complex clinical and biological concerns. Ultimately, this type of approach will provide an expanded understanding of challenging scientific questions across many fields.
Project description:Mass spectrometry (MS)-based integrated metaproteomic, metabolomic, and lipidomic (multi-omic) studies are transforming our ability to understand and characterize microbial communities in environmental and biological systems. These measurements are even enabling enhanced analyses of complex soil microbial communities, which are the most complex microbial systems known to date. Multi-omic analyses, however, do have sample preparation challenges, since separate extractions are typically needed for each omic study, thereby greatly amplifying the preparation time and amount of sample required. To address this limitation, a 3-in-1 method for the simultaneous extraction of metabolites, proteins, and lipids (MPLEx) from the same soil sample was created by adapting a solvent-based approach. This MPLEx protocol has proven to be both simple and robust for many sample types, even when utilized for limited quantities of complex soil samples. The MPLEx method also greatly enabled the rapid multi-omic measurements needed to gain a better understanding of the members of each microbial community, while evaluating the changes taking place upon biological and environmental perturbations.
Project description:Recent advances in experimental biology allow creation of datasets where several genome-wide data types (called omics) are measured per sample. Integrative analysis of multi-omic datasets in general, and clustering of samples in such datasets specifically, can improve our understanding of biological processes and discover different disease subtypes. In this work we present MONET (Multi Omic clustering by Non-Exhaustive Types), which presents a unique approach to multi-omic clustering. MONET discovers modules of similar samples, such that each module is allowed to have a clustering structure for only a subset of the omics. This approach differs from most existent multi-omic clustering algorithms, which assume a common structure across all omics, and from several recent algorithms that model distinct cluster structures. We tested MONET extensively on simulated data, on an image dataset, and on ten multi-omic cancer datasets from TCGA. Our analysis shows that MONET compares favorably with other multi-omic clustering methods. We demonstrate MONET's biological and clinical relevance by analyzing its results for Ovarian Serous Cystadenocarcinoma. We also show that MONET is robust to missing data, can cluster genes in multi-omic dataset, and reveal modules of cell types in single-cell multi-omic data. Our work shows that MONET is a valuable tool that can provide complementary results to those provided by existent algorithms for multi-omic analysis.
Project description:Multi-omic studies combine measurements at different molecular levels to build comprehensive models of cellular systems. The success of a multi-omic data analysis strategy depends largely on the adoption of adequate experimental designs, and on the quality of the measurements provided by the different omic platforms. However, the field lacks a comparative description of performance parameters across omic technologies and a formulation for experimental design in multi-omic data scenarios. Here, we propose a set of harmonized Figures of Merit (FoM) as quality descriptors applicable to different omic data types. Employing this information, we formulate the MultiPower method to estimate and assess the optimal sample size in a multi-omics experiment. MultiPower supports different experimental settings, data types and sample sizes, and includes graphical for experimental design decision-making. MultiPower is complemented with MultiML, an algorithm to estimate sample size for machine learning classification problems based on multi-omic data.
Project description:Recent high throughput experimental methods have been used to collect large biomedical omics datasets. Clustering of single omic datasets has proven invaluable for biological and medical research. The decreasing cost and development of additional high throughput methods now enable measurement of multi-omic data. Clustering multi-omic data has the potential to reveal further systems-level insights, but raises computational and biological challenges. Here, we review algorithms for multi-omics clustering, and discuss key issues in applying these algorithms. Our review covers methods developed specifically for omic data as well as generic multi-view methods developed in the machine learning community for joint clustering of multiple data types. In addition, using cancer data from TCGA, we perform an extensive benchmark spanning ten different cancer types, providing the first systematic comparison of leading multi-omics and multi-view clustering algorithms. The results highlight key issues regarding the use of single- versus multi-omics, the choice of clustering strategy, the power of generic multi-view methods and the use of approximated p-values for gauging solution quality. Due to the growing use of multi-omics data, we expect these issues to be important for future progress in the field.
Project description:MOTIVATION:Cancer subtypes were usually defined based on molecular characterization of single omic data. Increasingly, measurements of multiple omic profiles for the same cohort are available. Defining cancer subtypes using multi-omic data may improve our understanding of cancer, and suggest more precise treatment for patients. RESULTS:We present NEMO (NEighborhood based Multi-Omics clustering), a novel algorithm for multi-omics clustering. Importantly, NEMO can be applied to partial datasets in which some patients have data for only a subset of the omics, without performing data imputation. In extensive testing on ten cancer datasets spanning 3168 patients, NEMO achieved results comparable to the best of nine state-of-the-art multi-omics clustering algorithms on full data and showed an improvement on partial data. On some of the partial data tests, PVC, a multi-view algorithm, performed better, but it is limited to two omics and to positive partial data. Finally, we demonstrate the advantage of NEMO in detailed analysis of partial data of AML patients. NEMO is fast and much simpler than existing multi-omics clustering algorithms, and avoids iterative optimization. AVAILABILITY AND IMPLEMENTATION:Code for NEMO and for reproducing all NEMO results in this paper is in github: https://github.com/Shamir-Lab/NEMO. SUPPLEMENTARY INFORMATION:Supplementary data are available at Bioinformatics online.
Project description:The multi-omic integration of microbiota data with metabolomics has gained popularity. This protocol is based on a human multi-omics study, integrating cervicovaginal microbiota, HPV status and neoplasia, with urinary metabolites. Indeed, to understand the biology of the infections and to develop adequate interventions for cervical cancer prevention, studies are needed to characterize in detail the cervical microbiota and understand the systemic metabolome. This article is a detailed protocol for the multi-omic integration of cervical microbiota and urine metabolome to shed light on the systemic effects of cervical dysbioses associated with Human Papillomavirus (HPV) infections. This methods article suggests detailed sample collection and laboratory processes of metabolomics, DNA extraction for microbiota, HPV typing, and the bioinformatic analyses of the data, both to characterize the metabolome, the microbiota, and joint multi-omic analyses, useful for the development of new point-of-care diagnostic tests based on these approaches.
Project description:Two important challenges in the analysis of molecular biology information are data (multi-omic information) integration and the detection of patterns across large scale molecular networks and sequences. They are are actually coupled beause the integration of omic information may provide better means to detect multi-omic patterns that could reveal multi-scale or emerging properties at the phenotype levels.Here we address the problem of integrating various types of molecular information (a large collection of gene expression and sequence data, codon usage and protein abundances) to analyse the E.coli metabolic response to treatments at the whole network level. Our algorithm, MORA (Multi-omic relations adjacency) is able to detect patterns which may represent metabolic network motifs at pathway and supra pathway levels which could hint at some functional role. We provide a description and insights on the algorithm by testing it on a large database of responses to antibiotics. Along with the algorithm MORA, a novel model for the analysis of oscillating multi-omics has been proposed. Interestingly, the resulting analysis suggests that some motifs reveal recurring oscillating or position variation patterns on multi-omics metabolic networks. Our framework, implemented in R, provides effective and friendly means to design intervention scenarios on real data. By analysing how multi-omics data build up multi-scale phenotypes, the software allows to compare and test metabolic models, design new pathways or redesign existing metabolic pathways and validate in silico metabolic models using nearby species.The integration of multi-omic data reveals that E.coli multi-omic metabolic networks contain position dependent and recurring patterns which could provide clues of long range correlations in the bacterial genome.
Project description:The increasing availability of multi-omic platforms poses new challenges to data analysis. Joint visualization of multi-omics data is instrumental in better understanding interconnections across molecular layers and in fully utilizing the multi-omic resources available to make biological discoveries. We present here PaintOmics 3, a web-based resource for the integrated visualization of multiple omic data types onto KEGG pathway diagrams. PaintOmics 3 combines server-end capabilities for data analysis with the potential of modern web resources for data visualization, providing researchers with a powerful framework for interactive exploration of their multi-omics information. Unlike other visualization tools, PaintOmics 3 covers a comprehensive pathway analysis workflow, including automatic feature name/identifier conversion, multi-layered feature matching, pathway enrichment, network analysis, interactive heatmaps, trend charts, and more. It accepts a wide variety of omic types, including transcriptomics, proteomics and metabolomics, as well as region-based approaches such as ATAC-seq or ChIP-seq data. The tool is freely available at www.paintomics.org.
Project description:BACKGROUND:Patients with rare diseases face unique challenges in obtaining a diagnosis, appropriate medical care and access to support services. Whole genome and exome sequencing have increased identification of causal variants compared to single gene testing alone, with diagnostic rates of approximately 50% for inherited diseases, however integrated multi-omic analysis may further increase diagnostic yield. Additionally, multi-omic analysis can aid the explanation of genotypic and phenotypic heterogeneity, which may not be evident from single omic analyses. MAIN BODY:This scoping review took a systematic approach to comprehensively search the electronic databases MEDLINE, EMBASE, PubMed, Web of Science, Scopus, Google Scholar, and the grey literature databases OpenGrey / GreyLit for journal articles pertaining to multi-omics and rare disease, written in English and published prior to the 30th December 2018. Additionally, The Cancer Genome Atlas publications were searched for relevant studies and forward citation searching / screening of reference lists was performed to identify further eligible articles. Following title, abstract and full text screening, 66 articles were found to be eligible for inclusion in this review. Of these 42 (64%) were studies of multi-omics and rare cancer, two (3%) were studies of multi-omics and a pre-cancerous condition, and 22 (33.3%) were studies of non-cancerous rare diseases. The average age of participants (where known) across studies was 39.4?years. There has been a significant increase in the number of multi-omic studies in recent years, with 66.7% of included studies conducted since 2016 and 33% since 2018. Fourteen combinations of multi-omic analyses for rare disease research were returned spanning genomics, epigenomics, transcriptomics, proteomics, phenomics and metabolomics. CONCLUSIONS:This scoping review emphasises the value of multi-omic analysis for rare disease research in several ways compared to single omic analysis, ranging from the provision of a diagnosis, identification of prognostic biomarkers, distinct molecular subtypes (particularly for rare cancers), and identification of novel therapeutic targets. Moving forward there is a critical need for collaboration of multi-omic rare disease studies to increase the potential to generate robust outcomes and development of standardised biorepository collection and reporting structures for multi-omic studies.