Project description:Reconstruction of gene regulatory networks (GRNs) from expression data is a significant open problem. Common approaches train a machine learning (ML) model to predict a gene's expression using transcription factors' (TFs') expression as features and designate important features/TFs as regulators of the gene. Here, we present an entirely different paradigm, where GRN edges are directly predicted by the ML model. The new approach, named "SPREd", is a simulation-supervised neural network for GRN inference. Its inputs comprise expression relationships (e.g., correlation, mutual information) between the target gene and each TF and between pairs of TFs. The output comprises binary labels indicating whether each TF regulates the target gene. We train the neural network model using synthetic expression data generated by a biophysics-inspired simulation model that incorporates linear as well as non-linear TF-gene relationships and diverse GRN configurations. We show that SPREd outperforms the state-of-the-art GRN reconstruction tools GENIE3, ENNET, PORTIA and TIGRESS on synthetic datasets with high co-expression among TFs, similar to that seen in real data. A key advantage of the new approach is its robustness to relatively small numbers of conditions (columns) in the expression matrix, which is a common problem faced by existing methods. Finally, we evaluate SPREd on real datasets in yeast that represent gold-standard benchmarks of GRN reconstruction and show that it performs significantly better than or comparably to existing methods. In addition to its high accuracy and speed, SPREd marks a first step towards incorporating biophysics principles of gene regulation into ML-based approaches to GRN reconstruction.
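The kind of input described above (expression relationships between the target gene and each TF, and between pairs of TFs) can be sketched as follows. This is a minimal illustration using Pearson correlation only. The function name `pairwise_features`, the array layout and the toy data are assumptions made for the example and are not taken from the SPREd implementation.

```python
import numpy as np

def pairwise_features(tf_expr, target_expr):
    """Correlation features for one target gene (hypothetical helper).

    tf_expr:     array of shape (n_tfs, n_conditions)
    target_expr: array of shape (n_conditions,)
    Returns (TF-target correlations, TF-TF correlation matrix).
    """
    tf_expr = np.asarray(tf_expr, dtype=float)
    target_expr = np.asarray(target_expr, dtype=float)
    n_tfs = tf_expr.shape[0]
    # Stack the target as the last row so one corrcoef call covers all pairs.
    stacked = np.vstack([tf_expr, target_expr[None, :]])
    corr = np.corrcoef(stacked)           # shape (n_tfs + 1, n_tfs + 1)
    tf_target = corr[:n_tfs, n_tfs]       # correlation of each TF with the target
    tf_tf = corr[:n_tfs, :n_tfs]          # correlation between pairs of TFs
    return tf_target, tf_tf

# Toy data (assumed): 4 TFs, 50 conditions, target driven mostly by TF 0.
rng = np.random.default_rng(0)
tfs = rng.normal(size=(4, 50))
target = 2.0 * tfs[0] + 0.1 * rng.normal(size=50)
tf_target, tf_tf = pairwise_features(tfs, target)
```

In a setup like the one described, such feature arrays would be the input to the network and the per-TF binary labels would be its output; mutual information could be added as a further feature in the same layout.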
Project description:Precise control of developmental processes is encoded in the genome in the form of gene regulatory networks (GRNs). Such multi-factorial systems are difficult to decode in vertebrates owing to their complex gene hierarchies and dynamic molecular interactions. Here we present a genome-wide in vivo reconstruction of the GRN underlying development of the multipotent neural crest (NC) embryonic cell population. By coupling NC-specific epigenomic and transcriptional profiling at population and single-cell levels with genome/epigenome engineering in vivo, we identify multiple regulatory layers governing NC ontogeny, including NC-specific enhancers and super-enhancers, novel trans-factors, and cis-signatures allowing reverse engineering of the NC-GRN at unprecedented resolution. Furthermore, identification and dissection of divergent upstream combinatorial regulatory codes has afforded new insights into opposing gene circuits that define canonical and neural NC fates early during NC ontogeny. Our integrated approach, allowing dissection of cell-type-specific regulatory circuits in vivo, has broad implications for GRN discovery and investigation.
Project description:The task of gene regulatory network reconstruction from high-throughput data has received increasing attention in recent years. As a consequence, many inference methods for solving this task have been proposed in the literature. It has been recently observed, however, that no single inference method performs optimally across all datasets. It has also been shown that the integration of predictions from multiple inference methods is more robust and shows high performance across diverse datasets. Inspired by this research, in this paper, we propose a machine learning solution which learns to combine predictions from multiple inference methods. While this approach adds complexity to the inference process, we expect it would also carry substantial benefits. These would come from the automatic adaptation to patterns in the outputs of individual inference methods, so that it is possible to identify regulatory interactions more reliably when these patterns occur. This article demonstrates the benefits (in terms of accuracy of the reconstructed networks) of the proposed method, which exploits an iterative, semi-supervised ensemble-based algorithm. The algorithm learns to combine the interactions predicted by many different inference methods in the multi-view learning setting. The empirical evaluation of the proposed algorithm on a prokaryotic model organism (E. coli) and on a eukaryotic model organism (S. cerevisiae) clearly shows improved performance over the state-of-the-art methods. The results indicate that gene regulatory network reconstruction for the real datasets is more difficult for S. cerevisiae than for E. coli. The software, all the datasets used in the experiments and all the results are available for download at the following link: http://figshare.com/articles/Semi_supervised_Multi_View_Learning_for_Gene_Network_Reconstruction/1604827.
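To make the idea of integrating predictions from multiple inference methods concrete, the sketch below averages the per-method ranks of candidate edges. This is a simple unsupervised baseline, not the semi-supervised multi-view algorithm proposed in the abstract above. The function name `average_rank` and the example scores are assumptions made for illustration.

```python
import numpy as np

def average_rank(score_lists):
    """Combine edge scores from several methods by averaging their ranks.

    score_lists: array-like of shape (n_methods, n_edges); higher score
                 means the method is more confident in the edge.
    Returns the mean rank per edge (rank 1 = most confident).
    """
    scores = np.asarray(score_lists, dtype=float)
    # Per method, sort edges from highest to lowest score.
    order = np.argsort(-scores, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(scores.shape[0])[:, None]
    # Assign rank 1..n_edges to the edges in sorted order.
    ranks[rows, order] = np.arange(1, scores.shape[1] + 1)
    return ranks.mean(axis=0)

# Three hypothetical methods scoring the same three candidate edges.
method_scores = [
    [0.9, 0.1, 0.5],
    [0.8, 0.3, 0.2],
    [0.4, 0.6, 0.7],
]
consensus = average_rank(method_scores)
best_edge = int(np.argmin(consensus))  # edge with the best (lowest) mean rank
```

Rank averaging treats every method equally; a learned combiner, as proposed above, could instead adapt to patterns in the outputs of the individual methods.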
Project description:Background: Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge about a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. Results: We propose to learn a Markov Logic Network, i.e., a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the proliferation/differentiation switch of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes, all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on a balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally, our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a pairwise SVM while providing relevant insights on the predictions. Conclusions: The numerical studies show that MLN achieves very good predictive performance while opening the door to some interpretability of the decisions. Besides the ability to suggest new regulations, such an approach allows experimental data to be cross-validated against existing knowledge.
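The asymmetric bagging step mentioned above can be sketched as a sampling routine: each bag keeps every example of the minority class and adds an equally sized random subset of the majority class. The sketch assumes that "regulation" (label 1) is the minority class, which the abstract does not state explicitly. The function name `asymmetric_bags` and the toy labels are hypothetical.

```python
import numpy as np

def asymmetric_bags(labels, n_bags, seed=0):
    """Build balanced training bags from unbalanced binary labels.

    Assumes label 1 is the minority class and label 0 the majority class.
    Each bag holds all minority indices plus an equal number of majority
    indices drawn without replacement.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == 0)
    bags = []
    for _ in range(n_bags):
        sampled_neg = rng.choice(neg, size=len(pos), replace=False)
        bags.append(np.concatenate([pos, sampled_neg]))
    return bags

# Toy labels (assumed): 2 known regulations among 10 ordered gene pairs.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
bags = asymmetric_bags(labels, n_bags=5)
```

One classifier would then be trained per bag, and the final score for a candidate pair would be the mean of the individual classifiers' predictions, matching the averaging described above.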
Project description:Precise control of developmental processes is encoded in the genome in the form of gene regulatory networks (GRNs). Such multi-factorial systems are difficult to decode in vertebrates owing to their complex gene hierarchies and transient dynamic molecular interactions. Here we present a genome-wide in vivo reconstruction of the GRN underlying development of neural crest (NC), an emblematic embryonic multipotent cell population. By coupling NC-specific epigenomic and single-cell transcriptome profiling with genome/epigenome engineering in vivo, we identify multiple regulatory layers governing NC ontogeny, including NC-specific enhancers and super-enhancers, novel trans-factors and cis-signatures. Assembling the NC regulome has allowed the comprehensive reverse engineering of the NC-GRN at unprecedented resolution. Furthermore, identification and dissection of divergent upstream combinatorial regulatory codes has afforded new insights into opposing gene circuits that define canonical and neural NC fates. Our integrated approach, allowing dissection of cell-type-specific regulatory circuits in vivo, has broad implications for GRN discovery and investigation.
Project description:Background: Positron-emission tomography (PET) simulators are frequently used for development and performance evaluation of segmentation methods or quantitative uptake metrics. To date, most PET simulation tools are based on Monte Carlo simulations, which are computationally demanding. Other analytical simulation tools lack the implementation of time of flight (TOF) or resolution modelling (RM). In this study, a fast and easy-to-use PET simulation-reconstruction package, SiMulAtion and ReconsTruction (SMART)-PET, is developed and validated, which includes both TOF and RM. SMART-PET, its documentation and instructions to calibrate the tool to a specific PET/CT system are available on Zenodo. SMART-PET allows the fast generation of 3D PET images. As input, it requires one image representing the activity distribution and one representing the corresponding CT image/attenuation map. It allows the user to adjust different parameters, such as reconstruction settings (TOF/RM), noise level or scan duration. Furthermore, a random spatial shift can be included, representing patient repositioning. To evaluate the tool, simulated images were compared with real scan data of the NEMA NU 2 image quality phantom. The scan was acquired as a 60-min list-mode scan and reconstructed with and without TOF and/or RM. For every reconstruction setting, ten statistically equivalent images, representing 30, 60, 120 and 300 s scan duration, were generated. Simulated and real-scan data were compared regarding coefficient of variation in the phantom background and activity recovery coefficients (RCs) of the spheres. Furthermore, standard deviation images of each of the ten statistically equivalent images were compared. Results: SMART-PET produces images comparable to actual phantom data. The image characteristics of simulated and real PET images varied in similar ways as a function of reconstruction protocols and noise levels. The change in image noise with variation of simulated TOF settings followed the theoretically expected behaviour. RC as a function of sphere size agreed within 0.3-11% between simulated and actual phantom data. Conclusions: SMART-PET allows for rapid and easy simulation of PET data. The user can change various acquisition and reconstruction settings (including RM and TOF) and noise levels. The images obtained show image characteristics similar to those seen in actual phantom data.
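The two comparison metrics named above can be sketched as follows, using common textbook definitions: coefficient of variation as standard deviation over mean in the background, and recovery coefficient as measured over true activity in a sphere. Whether SMART-PET's evaluation used the mean or the maximum sphere value is not stated here, so the mean-based form below is an assumption, as are the function names and the toy voxel values.

```python
import numpy as np

def coefficient_of_variation(background_voxels):
    """Sample standard deviation divided by mean of background voxel values."""
    v = np.asarray(background_voxels, dtype=float)
    return v.std(ddof=1) / v.mean()

def recovery_coefficient(sphere_voxels, true_activity):
    """Mean measured activity in a sphere divided by the known true activity."""
    return float(np.mean(sphere_voxels)) / true_activity

# Toy values (assumed), in arbitrary activity concentration units.
cov = coefficient_of_variation([9.0, 10.0, 11.0])
rc = recovery_coefficient([7.0, 8.0, 9.0], true_activity=10.0)
```

Computing both metrics on simulated and measured images with identical regions of interest would give the kind of side-by-side comparison the abstract reports.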