ABSTRACT: Circadian rhythms play a fundamental role at all levels of biological organization. Understanding the mechanisms and implications of circadian oscillations continues to be the focus of intense research. However, there has been no comprehensive and integrated way of accessing and mining all circadian omic datasets. The latest release of CircadiOmics (http://circadiomics.ics.uci.edu) fills this gap by providing the most comprehensive web server for studying circadian data. The newly updated version contains 227 high-throughput omic datasets corresponding to over 74 million measurements sampled over 24 h cycles. Users can visualize and compare oscillatory trajectories across species, tissues and conditions. Periodicity statistics (e.g. period, amplitude, phase, P-value and q-value) obtained from BIO_CYCLE and other methods are provided for all samples in the repository and can easily be downloaded in the form of publication-ready figures and tables. New features and substantial improvements in performance and data volume make CircadiOmics a powerful web portal for integrated analysis of circadian omic data.
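As an illustration of what periodicity statistics such as amplitude and phase mean, the following minimal sketch fits a single cosine with a fixed 24 h period to a time series. It is a classical cosinor-style estimate, not BIO_CYCLE and not the CircadiOmics pipeline; the function name `cosinor_fit` and the example values are hypothetical, and the closed form used here assumes evenly spaced samples covering a whole number of periods.

```python
import math


def cosinor_fit(times_h, values, period_h=24.0):
    """Estimate mesor, amplitude and peak time for a fixed period.

    Assumption: samples are evenly spaced and cover a whole number of
    periods, so the cosine and sine regressors are orthogonal and the
    least-squares solution reduces to the sums below.
    """
    n = len(values)
    omega = 2.0 * math.pi / period_h
    mesor = sum(values) / n  # rhythm-adjusted mean
    a = 2.0 / n * sum(y * math.cos(omega * t) for t, y in zip(times_h, values))
    b = 2.0 / n * sum(y * math.sin(omega * t) for t, y in zip(times_h, values))
    amplitude = math.hypot(a, b)
    # Time of peak expression within one period, in hours.
    peak_h = (math.atan2(b, a) / omega) % period_h
    return mesor, amplitude, peak_h


# Hypothetical example: one 24 h cycle sampled every 4 h, peaking at hour 8.
times = [0, 4, 8, 12, 16, 20]
values = [10 + 3 * math.cos(2 * math.pi * (t - 8) / 24) for t in times]
mesor, amplitude, peak_h = cosinor_fit(times, values)
print(round(mesor, 3), round(amplitude, 3), round(peak_h, 3))
```

A P-value or q-value would require an additional significance test and multiple-testing correction, which this sketch does not attempt.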
Project description:Recent advances in experimental biology allow creation of datasets where several genome-wide data types (called omics) are measured per sample. Integrative analysis of multi-omic datasets in general, and clustering of samples in such datasets specifically, can improve our understanding of biological processes and discover different disease subtypes. In this work we present MONET (Multi Omic clustering by Non-Exhaustive Types), which takes a unique approach to multi-omic clustering. MONET discovers modules of similar samples, such that each module is allowed to have a clustering structure for only a subset of the omics. This approach differs from most existing multi-omic clustering algorithms, which assume a common structure across all omics, and from several recent algorithms that model distinct cluster structures. We tested MONET extensively on simulated data, on an image dataset, and on ten multi-omic cancer datasets from TCGA. Our analysis shows that MONET compares favorably with other multi-omic clustering methods. We demonstrate MONET's biological and clinical relevance by analyzing its results for Ovarian Serous Cystadenocarcinoma. We also show that MONET is robust to missing data, can cluster genes in multi-omic datasets, and can reveal modules of cell types in single-cell multi-omic data. Our work shows that MONET is a valuable tool that can provide complementary results to those provided by existing algorithms for multi-omic analysis.
Project description:To better understand dynamic disease processes, integrated multi-omic methods are needed, yet comparing different types of omic data remains difficult. Integrative solutions benefit experimenters by eliminating potential biases that come with single omic analysis. We have developed the methods needed to explore whether a relationship exists between co-expression network models built from transcriptomic and proteomic data types, and whether this relationship can be used to improve the disease signature discovery process. A naïve, correlation-based method is utilized for comparison. Using publicly available infectious disease time series data, we analyzed the related co-expression structure of the transcriptome and proteome in response to SARS-CoV infection in mice. Transcript and peptide expression data were filtered using quality scores and subset by taking the intersection on mapped Entrez IDs. Using this data set, independent co-expression networks were built. The networks were integrated by constructing a bipartite module graph based on module member overlap, module summary correlation, and correlation to phenotypes of interest. Compared to the module level results, the naïve approach is hindered by a lack of correlation across data types, less significant enrichment results, and little functional overlap across data types. Our module graph approach avoids these problems, resulting in an integrated omic signature of disease progression, which allows prioritization across data types for downstream experiment planning. Integrated modules exhibited related functional enrichments and could suggest novel interactions in response to infection. These disease and platform-independent methods can be used to realize the full potential of multi-omic network signatures. The data (experiment SM001) are publicly available through the NIAID Systems Virology (https://www.systemsvirology.org) and PNNL (http://omics.pnl.gov) web portals.
Phenotype data is found in the supplementary information. The ProCoNA package is available as part of Bioconductor 2.13.
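The module-overlap part of the bipartite module graph described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ProCoNA implementation: the Jaccard index as the overlap score, the `min_overlap` threshold, the function names and the toy Entrez-like IDs are all hypothetical, and the module summary correlation and phenotype correlation mentioned in the abstract are left out.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard index between two collections of gene IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bipartite_module_edges(transcript_modules, peptide_modules, min_overlap=0.2):
    """Link transcript modules to peptide modules by member overlap.

    Both inputs map a module name to the set of gene IDs it contains.
    Returns (transcript_module, peptide_module, score) tuples for pairs
    whose overlap reaches the (assumed) threshold, strongest first.
    """
    edges = []
    pairs = product(transcript_modules.items(), peptide_modules.items())
    for (t_name, t_genes), (p_name, p_genes) in pairs:
        score = jaccard(t_genes, p_genes)
        if score >= min_overlap:
            edges.append((t_name, p_name, round(score, 3)))
    return sorted(edges, key=lambda edge: -edge[2])


# Toy modules with made-up IDs, for illustration only.
transcript_modules = {"T1": {101, 102, 103, 104}, "T2": {201, 202}}
peptide_modules = {"P1": {102, 103, 104, 105}, "P2": {201, 301, 302, 303, 304, 305}}
print(bipartite_module_edges(transcript_modules, peptide_modules))
```

In a fuller version, each edge would also carry the correlation between module summaries and the correlation of each module with the phenotypes of interest.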
Project description:Researchers now generate large multi-omic datasets using increasingly mature mass spectrometry techniques at an astounding pace, facing new challenges of "Big Data" dissemination, visualization, and exploration. Conveniently, web-based data portals accommodate the complexity of multi-omic experiments and the many experts involved. However, developing these tailored companion resources requires programming expertise and knowledge of web server architecture, a substantial burden for most. Here, we describe Argonaut, a simple, code-free, and user-friendly platform for creating customizable, interactive data-hosting websites. Argonaut carries out real-time statistical analyses of the data, which it organizes into easily sharable projects. Collaborating researchers worldwide can explore the results, visualized through popular plots, and modify them to streamline data interpretation. Increasing the pace and ease of access to multi-omic data, Argonaut aims to propel discovery of new biological insights. We showcase the capabilities of this tool using a published multi-omics dataset on the large mitochondrial protease deletion collection.
Project description:BACKGROUND: In order to understand microarray data reasonably in the context of other existing biological knowledge, it is necessary to conduct a thorough examination of the data utilizing every aspect of available omic knowledge libraries. So far, a number of bioinformatics tools have been developed. However, each of them is restricted to dealing with one type of omic knowledge, e.g., pathways, interactions or gene ontology. Now that the varieties of omic knowledge are expanding, analysis tools need a way to deal with any type of omic knowledge. Hence, we have designed the Omic Space Markup Language (OSML), which can represent a wide range of omic knowledge, and we have developed a tool named GSCope3, which can statistically analyze microarray data in comparison with the OSML-formatted omic knowledge data. RESULTS: In order to test the applicability of OSML to represent a variety of omic knowledge specifically useful for analysis of Arabidopsis thaliana microarray data, we have constructed a Biological Knowledge Library (BiKLi) by converting eight different types of omic knowledge into OSML-formatted datasets. We applied GSCope3 and BiKLi to previously reported A. thaliana microarray data, so as to extract any additional insights from the data. As a result, we gained the new insight that lignin formation resists drought stress and activates transcription of many water channel genes to oppose drought stress, and that most of the 20S proteasome subunit genes show similar expression profiles under drought stress. In addition to this novel discovery, similar findings previously reported were also quickly confirmed using GSCope3 and BiKLi. CONCLUSION: GSCope3 can statistically analyze microarray data in the context of any OSML-represented omic knowledge. OSML is not restricted to a specific data type structure, but it can represent a wide range of omic knowledge.
It allows us to convert new types of omic knowledge into datasets that can be used for microarray data analysis with GSCope3. In addition to BiKLi, by collecting various types of omic knowledge as OSML libraries, it becomes possible for us to conduct detailed thorough analysis from various biological viewpoints. GSCope3 and BiKLi are available for academic users at our web site http://omicspace.riken.jp.
Project description:BACKGROUND:Patients with rare diseases face unique challenges in obtaining a diagnosis, appropriate medical care and access to support services. Whole genome and exome sequencing have increased identification of causal variants compared to single gene testing alone, with diagnostic rates of approximately 50% for inherited diseases; however, integrated multi-omic analysis may further increase diagnostic yield. Additionally, multi-omic analysis can aid the explanation of genotypic and phenotypic heterogeneity, which may not be evident from single omic analyses. MAIN BODY:This scoping review took a systematic approach to comprehensively search the electronic databases MEDLINE, EMBASE, PubMed, Web of Science, Scopus, Google Scholar, and the grey literature databases OpenGrey / GreyLit for journal articles pertaining to multi-omics and rare disease, written in English and published prior to 30 December 2018. Additionally, The Cancer Genome Atlas publications were searched for relevant studies and forward citation searching / screening of reference lists was performed to identify further eligible articles. Following title, abstract and full text screening, 66 articles were found to be eligible for inclusion in this review. Of these, 42 (64%) were studies of multi-omics and rare cancer, two (3%) were studies of multi-omics and a pre-cancerous condition, and 22 (33.3%) were studies of non-cancerous rare diseases. The average age of participants (where known) across studies was 39.4 years. There has been a significant increase in the number of multi-omic studies in recent years, with 66.7% of included studies conducted since 2016 and 33% since 2018. Fourteen combinations of multi-omic analyses for rare disease research were returned spanning genomics, epigenomics, transcriptomics, proteomics, phenomics and metabolomics.
CONCLUSIONS:This scoping review emphasises the value of multi-omic analysis for rare disease research in several ways compared to single omic analysis, ranging from the provision of a diagnosis, identification of prognostic biomarkers, distinct molecular subtypes (particularly for rare cancers), and identification of novel therapeutic targets. Moving forward there is a critical need for collaboration of multi-omic rare disease studies to increase the potential to generate robust outcomes and development of standardised biorepository collection and reporting structures for multi-omic studies.
Project description:Background:Cancer classification is of great importance to understanding its pathogenesis, making diagnoses and developing treatments. The accumulation of extensive omics data from a large number of cancer cell lines provides a basis for large-scale classification of cancer at low cost. However, the reliability of cell lines as in vitro models of cancer has been controversial. Methods:In this study, we explore the classification of pan-cancer cell lines using single and integrated multiple omics data from the Cancer Cell Line Encyclopedia (CCLE) database. Representative cancer omics data (mRNA, miRNA, copy number variation, DNA methylation and reverse-phase protein array data) were included in the analysis. The TumorMap web tool was used to illustrate the landscape of molecular classification. The molecular classification of patient samples was compared with that of cancer cell lines. Results:Eighteen molecular clusters were identified using integrated multiple omics clustering. Three pan-cancer clusters were found in integrated multiple omics clustering. By comparing with single omics clustering, we found that integrated clustering could capture both shared and complementary information from each omics data type. Omics contribution analysis for clustering indicated that, although all five omics data types were of value, mRNA and proteomics data were particularly important. While the classifications were generally consistent, samples from cancer patients were more diverse than cancer cell lines. Conclusions:The clustering analysis based on integrated omics data provides a novel multi-dimensional map of cancer cell lines that can reflect the extent to which pan-cancer cell lines represent primary tumors, and an approach to evaluate the importance of omic features in cancer classification.
Project description:Soybean production is greatly influenced by abiotic stresses imposed by environmental factors such as drought, water submergence, salt, and heavy metals. A thorough understanding of plant response to abiotic stress at the molecular level is a prerequisite for its effective management. The molecular mechanism of stress tolerance is complex and requires information at the omic level to understand it effectively. In this regard, enormous progress has been made in the omics field in the areas of genomics, transcriptomics, and proteomics. The emerging field of ionomics is also being employed for investigating abiotic stress tolerance in soybean. Omic approaches generate a huge amount of data, and adequate advancements in computational tools have been achieved for effective analysis. However, the integration of omic-scale information to address complex genetic and physiological questions is still a challenge. In this review, we describe advances in omic tools in view of the conventional and modern approaches being used to dissect abiotic stress tolerance in soybean. Emphasis is given to approaches such as quantitative trait loci (QTL) mapping, genome-wide association studies (GWAS), and genomic selection (GS). Comparative genomics and candidate gene approaches are also discussed with regard to the identification of potential genomic loci, genes, and biochemical pathways involved in stress tolerance mechanisms in soybean. This review also provides a comprehensive catalog of available online omic resources for soybean and their effective utilization. We also address the significance of phenomics in the integrated approaches and recognize high-throughput multi-dimensional phenotyping as a major limiting factor for the improvement of abiotic stress tolerance in soybean.
Project description:BACKGROUND:Analysis of large genomic datasets along with their accompanying clinical information has shown great promise in cancer research over the last decade. Such datasets typically include thousands of samples, each measured by one or several high-throughput technologies ('omics') and annotated with extensive clinical information. While instrumental for fulfilling the promise of personalized medicine, the analysis and visualization of such large datasets is challenging and necessitates programming skills and familiarity with a large array of software tools to be used for the various steps of the analysis. RESULTS:We developed PROMO (Profiler of Multi-Omic data), a friendly, fully interactive stand-alone software for analyzing large genomic cancer datasets together with their associated clinical information. The tool provides an array of built-in methods and algorithms for importing, preprocessing, visualizing, clustering, clinical label enrichment testing, and survival analysis that can be performed on a single or multi-omic dataset. The tool can be used for quick exploration and stratification of tumor samples taken from patients into clinically significant molecular subtypes. Identification of prognostic biomarkers and generation of simple subtype classifiers are additional important features. We review PROMO's main features and demonstrate its analysis capabilities on a breast cancer cohort from TCGA. CONCLUSIONS:PROMO provides a single integrated solution for swiftly performing a complete analysis of cancer genomic data for subtype discovery and biomarker identification without writing a single line of code, and can, therefore, make the analysis of these data much easier for cancer biologists and biomedical researchers. PROMO is freely available for download at http://acgt.cs.tau.ac.il/promo/.
Project description:Recent high throughput experimental methods have been used to collect large biomedical omics datasets. Clustering of single omic datasets has proven invaluable for biological and medical research. The decreasing cost and development of additional high throughput methods now enable measurement of multi-omic data. Clustering multi-omic data has the potential to reveal further systems-level insights, but raises computational and biological challenges. Here, we review algorithms for multi-omics clustering, and discuss key issues in applying these algorithms. Our review covers methods developed specifically for omic data as well as generic multi-view methods developed in the machine learning community for joint clustering of multiple data types. In addition, using cancer data from TCGA, we perform an extensive benchmark spanning ten different cancer types, providing the first systematic comparison of leading multi-omics and multi-view clustering algorithms. The results highlight key issues regarding the use of single- versus multi-omics, the choice of clustering strategy, the power of generic multi-view methods and the use of approximated p-values for gauging solution quality. Due to the growing use of multi-omics data, we expect these issues to be important for future progress in the field.
Project description:Multi-omic studies combine measurements at different molecular levels to build comprehensive models of cellular systems. The success of a multi-omic data analysis strategy depends largely on the adoption of adequate experimental designs, and on the quality of the measurements provided by the different omic platforms. However, the field lacks a comparative description of performance parameters across omic technologies and a formulation for experimental design in multi-omic data scenarios. Here, we propose a set of harmonized Figures of Merit (FoM) as quality descriptors applicable to different omic data types. Employing this information, we formulate the MultiPower method to estimate and assess the optimal sample size in a multi-omics experiment. MultiPower supports different experimental settings, data types and sample sizes, and includes graphical tools for experimental design decision-making. MultiPower is complemented with MultiML, an algorithm to estimate sample size for machine learning classification problems based on multi-omic data.
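To make the idea of sample size estimation concrete, here is a minimal sketch of a textbook power calculation for a two-group comparison of one feature, using the normal approximation. It is not the MultiPower method, whose formulation is not given in this abstract. The function names, the per-omic effect sizes, and the rule of taking the largest per-layer requirement are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Samples per group for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size (mean difference / SD).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


def samples_for_all_layers(effect_sizes, alpha=0.05, power=0.8):
    """Assumed combination rule: size the study for the most demanding layer."""
    per_layer = {
        omic: samples_per_group(d, alpha, power) for omic, d in effect_sizes.items()
    }
    return per_layer, max(per_layer.values())


# Hypothetical standardized effect sizes per omic layer.
effect_sizes = {"transcriptomics": 1.0, "proteomics": 0.8, "metabolomics": 0.5}
per_layer, overall = samples_for_all_layers(effect_sizes)
print(per_layer, overall)
```

The sketch ignores multiple-testing correction across features and differences in noise between platforms, both of which a realistic multi-omic design would need to account for.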