Project description: The cell type composition of many biological tissues varies widely across samples. Such sample heterogeneity hampers efforts to probe the role of each cell type in the tissue microenvironment. Current approaches that address this issue have drawbacks. Cell sorting and single-cell experimental techniques disrupt in situ interactions and alter the physiological status of cells in tissues. Computational methods are flexible and promising, but they often estimate either sample-specific proportions of each cell type or cell-type-specific gene expression profiles, not both, because they require the other as input. We introduce CDSeq, a computational complete deconvolution method that estimates both sample-specific proportions of each cell type and cell-type-specific gene expression profiles simultaneously using bulk RNA-Seq data only. We assessed our method’s performance using several synthetic and experimental mixtures of varied but known cell-type composition and compared it with the performance of two state-of-the-art deconvolution methods on the same mixtures. The results showed that CDSeq can estimate both sample-specific proportions of each component cell type and cell-type-specific gene expression profiles with high accuracy. CDSeq holds promise for computationally deciphering complex mixtures of cell types, each with differing expression profiles, using RNA-Seq data measured in bulk tissue.
Project description: Cell-to-cell interactions between tumor cells and their microenvironment are critical determinants of tumor tissue biology and therapeutic responses. Interactions between glioblastoma (GBM) cells and endothelial cells (ECs) establish a purported stem cell niche. We hypothesized that genes that mediate these interactions would be important, particularly as therapeutic targets. Using a novel computational approach to deconvoluting expression data from mixed physical coculture of GBM cells and ECs, we identified upregulation of the cAMP-specific phosphodiesterase PDE7B in GBM cells in response to ECs. We further found that elevated PDE7B expression occurs in most GBM cases and has a negative effect on survival. PDE7B overexpression resulted in the expansion of a stem-like cell subpopulation, increased tumor aggressiveness, and increased growth in an intracranial GBM model. This deconvolution algorithm provides a new tool for cancer biology, and these results identify PDE7B as a therapeutic target in GBM. The dataset comprises 3 replicates from U87 monocultures, 3 replicates from HBMEC monocultures, and 3 replicates from U87-HBMEC cocultures.
Project description: Deconvolution is the problem of estimating proportions of mixed cell types from tissue samples. DNA methylation is commonly used for deconvolution because individual CpG sequences can reflect cell type identity and can be accurately measured at either the population or single-molecule level. Genomic sequencing techniques can profile multiple CpGs on a single DNA molecule, but few deconvolution models have been developed to exploit these single-molecule methylation haplotypes for cell type deconvolution. We use simulated whole-genome methylation data and in silico mixtures of real data to compare existing computational tools with two new models developed here. We find that adapting an existing model, CelFiE, to incorporate methylation haplotype information improves deconvolution accuracy by ~30% over the original CelFiE and the next best tool. Our new tool, CelFiE Integrated Single-molecule Haplotypes (CelFiE-ISH), also outperforms other tools in detecting rare cell types present at 0.1% and below, which can be used to improve detection of rare cell types in circulating DNA. CelFiE-ISH is publicly available under a permissive open-source license. Finally, we investigate the selection of cell-type-specific marker regions from the genome, a prerequisite for all of these tools. We find that the selection method used has a strong effect on accuracy in our benchmarks, and we identify specific marker features that contribute to deconvolution accuracy. We show that tailoring the marker selection method to specific attributes of the deconvolution model, such as the use of DNA methylation haplotypes, can improve deconvolution accuracy.
Project description: RNA profiling technologies at single-cell resolution, including single-cell and single-nuclei RNA sequencing (scRNA-Seq and snRNA-Seq, scnRNA-Seq for short), can help characterize the composition of tissues and reveal cells that influence key functions in health and disease. However, the use of these technologies is challenging because of their relatively high costs and exacting sample collection requirements. Computational deconvolution methods that infer the composition of RNA-Seq-profiled samples using scnRNA-Seq-characterized cell types can expand the benefit of these technologies, but their effectiveness remains controversial. We produced the first systematic evaluation of deconvolution methods on datasets either with known compositions or based on concurrent RNA-Seq and scnRNA-Seq profiles. Our analyses revealed biases that are common to scnRNA-Seq 10X Genomics assays and illustrated the importance of accurate and properly controlled data preprocessing, method selection, and optimization. Moreover, our results suggested that concurrent RNA-Seq and scnRNA-Seq profiles can help improve the accuracy of both scnRNA-Seq preprocessing and the deconvolution methods that employ them. Indeed, our proposed method, Single-cell RNA Quantity Informed Deconvolution (SQUID), combined RNA-Seq transformation and a dampened weighted least squares deconvolution approach to consistently outperform other methods in predicting the composition of cell mixtures and tissue samples. Moreover, our analysis suggested that only SQUID could identify outcomes-predictive cancer cell subtypes in pediatric acute myeloid leukemia and neuroblastoma datasets.
Project description: Deconvolution is a methodology for estimating immune cell proportions from the transcriptome. It is mainly applied to blood-derived samples and tumor tissues. However, the influence of tissue-specific modeling on the estimation results has rarely been investigated. In this study, we constructed a system to evaluate the performance of the deconvolution method on liver transcriptome data. Correspondence: Tadahaya Mizuno
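As a rough illustration of what reference-based deconvolution computes, the sketch below models a bulk expression vector as a non-negative mixture of cell-type signature profiles and recovers the mixing proportions. It is a generic non-negative least squares fit solved by projected gradient descent, not the method used in any of the studies listed here. The function name `deconvolve`, the toy signature matrix, and the iteration count are all assumptions made for illustration.

```python
def deconvolve(signature, bulk, n_iter=2000):
    """Estimate cell-type proportions from one bulk expression vector.

    signature: list of rows, one per gene; each row holds that gene's
               expression in every cell type (genes x cell types).
    bulk:      list of bulk expression values, one per gene.
    Returns proportions that are non-negative and sum to 1.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    # Conservative step size: 1 / squared Frobenius norm of the signature,
    # which never exceeds 1 / largest eigenvalue of S^T S.
    step = 1.0 / sum(v * v for row in signature for v in row)
    weights = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual of the current fit: S w - b
        resid = [sum(s * w for s, w in zip(row, weights)) - b
                 for row, b in zip(signature, bulk)]
        # Gradient of 0.5 * ||S w - b||^2 is S^T (S w - b)
        grad = [sum(signature[g][k] * resid[g] for g in range(n_genes))
                for k in range(n_types)]
        # Gradient step, then projection onto non-negative weights
        weights = [max(0.0, w - step * d) for w, d in zip(weights, grad)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all estimated weights are zero")
    return [w / total for w in weights]


# Toy example (hypothetical numbers): 4 genes, 3 cell types.
toy_signature = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [2.0, 2.0, 2.0],
]
true_props = [0.5, 0.3, 0.2]
toy_bulk = [sum(s * p for s, p in zip(row, true_props)) for row in toy_signature]
estimated = deconvolve(toy_signature, toy_bulk)
```

On this noise-free toy mixture the estimate should match the true proportions closely. Real data would need normalization, marker selection, and a tissue-appropriate signature matrix, which is the kind of tissue-specific modeling choice the description above refers to.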
Project description: Differences in RNA content between cell types introduce bias into gene expression deconvolution methods. If ERCC spike-ins are added to samples, the proportions predicted by deconvolution methods can be corrected.
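The arithmetic behind such a correction can be sketched as follows, assuming a relative RNA content per cell has already been estimated for each cell type (for example from spike-in normalization; how this study derives it is not stated here). Deconvolution of expression data tends to return RNA-based fractions, so dividing each fraction by its cell type's RNA content and renormalizing gives cell-count fractions. The function name `correct_for_rna_content` and the numbers are hypothetical.

```python
def correct_for_rna_content(rna_fractions, rna_per_cell):
    """Convert RNA-based fractions into cell-count fractions.

    rna_fractions: dict mapping cell type -> fraction of total RNA
                   attributed to that type by deconvolution.
    rna_per_cell:  dict mapping cell type -> relative RNA content of
                   one cell of that type (assumed known, must be > 0).
    """
    # A cell type with more RNA per cell needs fewer cells to contribute
    # the same share of RNA, so scale each fraction down by its content.
    scaled = {ct: rna_fractions[ct] / rna_per_cell[ct] for ct in rna_fractions}
    total = sum(scaled.values())
    return {ct: value / total for ct, value in scaled.items()}


# Hypothetical example: two cell types contribute equal RNA,
# but type "A" carries twice as much RNA per cell as type "B".
corrected = correct_for_rna_content(
    {"A": 0.5, "B": 0.5},
    {"A": 2.0, "B": 1.0},
)
```

In this example the corrected fractions are one third for "A" and two thirds for "B", which shows how an uncorrected estimate would overstate the abundance of the RNA-rich cell type.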