Project description:Human red blood cells (RBCs), or erythrocytes, are the most abundant blood cells and are responsible for gas exchange. RBC diseases affect hundreds of millions of people and impose enormous financial and personal burdens. One well-recognized but poorly understood feature of RBC populations within the same individual is their phenotypic heterogeneity. The granular characterization of phenotypic RBC variation in normative and disease states may allow us to identify the genetic determinants of red cell diseases and reveal novel therapeutic approaches for their treatment. Previously, we discovered diverse RNA transcripts in RBCs that have allowed us to dissect the phenotypic heterogeneity and malaria resistance of sickle red cells. However, these analyses failed to capture the heterogeneity found in RBC sub-populations. To overcome this limitation, we performed single-cell RNA-Seq to analyze the transcriptional heterogeneity of RBCs from three healthy adult donors, stored under blood bank conditions and assayed at day 1 and day 15. The expression pattern clearly separated RBCs into seven distinct clusters, including one RBC cluster that expresses HBG2 and a small population of RBCs that express fetal hemoglobin (HbF), which we annotated as F cells. Almost all HBG2-expressing cells also express HBB, suggesting bi-allelic expression from the HBG2/HBB loci in single RBCs. We annotated another cluster as reticulocytes based on canonical gene expression. Additional RBC clusters were annotated based on the enriched expression of NIX, ACVR2B and HEMGN, which were previously shown to be involved in erythropoiesis. Finally, we found that RBC storage was associated with an increase in the ACVR2B and F-cell clusters. Collectively, these data demonstrate the power of single-RBC RNA-Seq to capture known and discover unexpected heterogeneity in RBC populations.
Project description:Background: Viral infection causes multiple forms of human cancer, and HPV infection is the primary factor in cervical carcinomas. Recent single-cell RNA-seq studies highlight the tumor heterogeneity present in most cancers, but virally induced tumors have not been studied. HeLa is a well-characterized HPV+ cervical cancer cell line. Results: We developed a new high-throughput platform to prepare single-cell RNA on a nanoliter scale based on a customized microwell chip. Using this method, we successfully amplified full-length transcripts of 669 single HeLa S3 cells, 40 of which were randomly selected for single-cell RNA sequencing. Based on these data, we obtained a comprehensive understanding of the heterogeneity of HeLa S3 cells in gene expression, alternative splicing and fusions. Furthermore, we identified a high diversity of HPV-18 expression and splicing at the single-cell level. By co-expression analysis we identified 283 genes co-regulated with E6 and E7, including CDC25, PCNA, PLK4, BUB1B and IRF1, which are known to interact with HPV viral proteins. Conclusion: Our results reveal the heterogeneity of a virus-infected cell line. They not only provide a transcriptome characterization of HeLa S3 cells at the single-cell level, but also demonstrate the power of single-cell RNA-seq analysis of virally infected cells and cancers.
Project description:The specific interactions between RNA-binding proteins and their target RNAs are an essential level of gene expression control. By combining ultraviolet cross-linking and immunoprecipitation (CLIP) with massive SOLiD sequencing, we identified the RNAs bound by the RNA-binding protein CELF1 in human HeLa cells. The CELF1 binding sites deduced from the sequence data allow us to characterize specific features of CELF1-RNA association. We therefore present the first map of CELF1 binding sites in human cells.
Project description:Purpose: Corneal epithelial homeostasis is maintained by coordinated gene expression across distinct cell populations, but the gene regulatory programs underlying this cellular diversity remain to be characterized. Here we applied single-cell multi-omics analysis to delineate the gene regulatory profile of mouse corneal epithelial cells under normal homeostasis. Methods: Single cells isolated from the corneal epithelium (with marginal conjunctiva) of adult mice were subjected to scRNA-seq and scATAC-seq using the 10x Genomics platform. Cell types were clustered using the graph-based visualization method uniform manifold approximation and projection (UMAP) and unbiased computational informatics analysis. The scRNA-seq and scATAC-seq datasets were integrated following the integration pipeline described in ArchR and Seurat. Results: We characterized diverse corneal epithelial cell types based on gene expression signatures and chromatin accessibility. We found that cell type-specific accessible regions were mainly located at distal regions, suggesting essential roles of distal regulatory elements in determining corneal epithelial cell diversity. Trajectory analyses revealed a continuum of cell state transitions and higher coordination between transcription factor (TF) motif accessibility and gene expression during corneal epithelial cell differentiation. By integrating transcriptomic and chromatin accessibility analyses, we identified cell type-specific and shared gene regulation programs. We also uncovered critical TFs driving corneal epithelial cell differentiation, such as nuclear factor I (NFI) family members, Rarg, and Elf3. We found that nuclear factor-κB (NF-κB) family members were positive TFs in limbal cells and some superficial cells, but they were involved in regulating distinct biological processes. Conclusions: Our study presents a comprehensive gene regulatory landscape of mouse corneal epithelial cells and provides a valuable foundation for future investigation of corneal epithelial homeostasis in the context of corneal pathologies and regenerative medicine.
Project description:The ubiquitin-specific protease 22 (USP22) is an oncogene whose expression is upregulated in many types of cancer. In the nucleus, USP22 functions as a subunit of the SAGA complex to regulate gene transcription. However, the genome-wide USP22 binding sites and its direct target genes are not yet clear. In this study, we characterized the potential genomic binding sites of USP22 and GCN5 by ChIP-seq using specific antibodies in HeLa cells. There were 408 overlapping putative target genes bound by both USP22 and GCN5. Motif analysis showed that the sequences bound by USP22 and GCN5 shared two common motifs. Gene ontology (GO) and pathway analysis indicated that the genes targeted by USP22 and GCN5 were involved in different physiological processes and pathways. Further RNA-seq, GO and pathway analyses revealed that knockdown of USP22 induced differential expression of many genes that participate in diverse physiological processes, such as metabolic processes. Integration of ChIP-seq and RNA-seq data revealed that USP22 bound to the promoters of 56 genes. These findings may provide new insights into how USP22 regulates gene expression during the development of cervical cancer.
Project description:Single-cell RNA sequencing (scRNA-seq) has broad applications across biomedical research. One of the key challenges is to ensure that only single, live cells are included in downstream analysis, as the inclusion of compromised cells inevitably affects data interpretation. Here, we present a generic approach for processing scRNA-seq data and detecting low-quality cells, using a curated set of over 20 biological and technical features. Our approach improves classification accuracy by over 30% compared to traditional methods when tested on over 5,000 cells, including CD4+ T cells, bone marrow dendritic cells, and mouse embryonic stem cells.
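To make the idea of per-cell quality features concrete, the following is a minimal sketch and not the approach described above. It computes three commonly used features (total counts, number of detected genes, mitochondrial fraction) and applies a plain threshold filter. The function names, the `MT-` prefix convention, the thresholds, and the toy counts are all assumptions made for illustration.

```python
def qc_features(counts, mito_prefix="MT-"):
    """Compute simple per-cell QC features from a gene -> count mapping.

    These three features are illustrative; they are not the curated
    feature set referred to in the description above.
    """
    total = sum(counts.values())
    detected = sum(1 for n in counts.values() if n > 0)
    mito = sum(n for gene, n in counts.items() if gene.startswith(mito_prefix))
    return {
        "total_counts": total,
        "detected_genes": detected,
        "mito_fraction": mito / total if total else 0.0,
    }


def is_low_quality(features, min_genes=3, max_mito_fraction=0.5):
    """Flag a cell using hypothetical thresholds (assumed values)."""
    return (
        features["detected_genes"] < min_genes
        or features["mito_fraction"] > max_mito_fraction
    )


# Toy cells with made-up counts.
healthy = {"ACTB": 40, "GAPDH": 35, "CD4": 20, "MT-CO1": 5}
damaged = {"ACTB": 2, "GAPDH": 0, "CD4": 0, "MT-CO1": 18}

healthy_features = qc_features(healthy)
damaged_features = qc_features(damaged)
```

In practice such features would feed a trained classifier rather than fixed cut-offs; the threshold filter here only shows what the inputs to such a model might look like.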
Project description:Single-cell RNA sequencing has become a valuable tool for investigating cell types in complex tissues, where clustering of cells enables the identification and comparison of cell populations. Although many studies have sought to develop and compare different clustering approaches, a deeper investigation into the properties of the resulting populations is lacking. Specifically, the presence of misclassified cells can influence downstream analyses, highlighting the need to assess subpopulation purity and to detect such cells. We developed DSAVE (Down-SAmpling based Variation Estimation), a method to evaluate the purity of single-cell transcriptome clusters and to identify misclassified cells. The method utilizes down-sampling to eliminate differences in sampling noise and uses a log-likelihood based metric to help identify misclassified cells. In addition, DSAVE estimates the number of cells needed in a population to achieve a stable average gene expression profile within a certain gene expression range. We show that DSAVE can be used to find potentially misclassified cells that are not detectable by similar tools and reveal the cause of their divergence from the other cells, such as differing cell state or cell type. With the growing use of single-cell RNA-seq, we foresee that DSAVE will be an increasingly useful tool for comparing and purifying subpopulations in single-cell RNA-Seq datasets.
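The down-sampling step mentioned above can be illustrated with a short sketch. This is not DSAVE's implementation; it only shows the general technique of drawing a fixed number of UMIs per cell without replacement, so that all cells share the same sequencing depth and hence comparable sampling noise. The function name, the seed handling, and the toy counts are assumptions.

```python
import random
from collections import Counter


def downsample_counts(counts, target_total, seed=0):
    """Draw `target_total` UMIs without replacement from one cell.

    `counts` maps gene name -> UMI count. Hypothetical helper for illustration.
    """
    total = sum(counts.values())
    if target_total > total:
        raise ValueError("target_total exceeds the cell's total UMI count")
    # One pool entry per UMI, labelled by its gene.
    pool = [gene for gene, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    sampled = rng.sample(pool, target_total)
    return dict(Counter(sampled))


# Toy cells with made-up counts and different depths.
cells = {
    "cell_a": {"HBB": 50, "HBA1": 30, "ACTB": 20},
    "cell_b": {"HBB": 10, "HBA1": 6, "ACTB": 4},
}

# Bring every cell down to the depth of the shallowest cell.
depth = min(sum(c.values()) for c in cells.values())
downsampled = {name: downsample_counts(c, depth) for name, c in cells.items()}
```

Expanding the counts into a pool is simple but memory-hungry for deep cells; a multivariate hypergeometric draw would be the usual choice at scale.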
Project description:Aplastic anemia (AA) is an autoimmune disease characterized by peripheral blood pancytopenia and bone marrow failure. A recent study verified that bone marrow failure in AA patients results from an attack on hematopoietic stem and progenitor cells (HSPCs) by active T cells. Nonetheless, whether B cells, as important immune cells, also impair hematopoiesis is still unclear. Here, large-scale single-cell transcriptomic sequencing of 20,000 bone marrow cells from AA patients and healthy donors was performed. A total of 17 clusters were identified, and the genes differentially expressed in each cluster relative to the other clusters were considered potential marker genes for that cluster. The top differentially expressed genes in HSPCs (S100A8, RETN, and TNFAIP3), monocytes (CXCL8, JUN, and IL1B), and neutrophils and granulocytes (CXCL8, NFKBIA, and MT-CYB) were related to immune and inflammatory injury. Then, the B-cell receptor (BCR) diversities and pairing frequencies of V and J genes were analyzed. The highest pairing frequencies in AA patients were IGHV3-20-IGKJ2, IGHV3-20-IGKJ4, and IGHV3-20-IGHLJ2. Meanwhile, 3 V genes, IGHV3-7, IGHV3-33, and IGLV2-11, showed elevated expression in B cells from AA patients. Cell type-specific ligand-receptor pairs were further identified in the interaction of B cells with hematopoietic cells in the bone marrow. The changed ligand-receptor pairs were involved in antigen presentation, inflammation, apoptosis, and proliferation of B cells. These data show the transcriptomic landscape of hematopoiesis in AA at single-cell resolution, providing new insights into hematopoietic failure related to B-cell aberrations and suggesting potential treatment targets for AA.
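As a sketch of how V-J pairing frequencies of the kind reported above might be tabulated, the snippet below counts (V gene, J gene) pairs and normalizes by the total. It assumes each record has already been reduced to one V call and one J call; the function name and the toy records are made up for illustration and do not reproduce the study's data or pipeline.

```python
from collections import Counter


def vj_pairing_frequencies(records):
    """Return the fraction of records carrying each (V gene, J gene) pair.

    `records` is an iterable of (v_gene, j_gene) tuples (assumed input format).
    """
    pairs = Counter(records)
    total = sum(pairs.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in pairs.items()}


# Made-up records; gene names are borrowed from the text only as labels.
toy_records = [
    ("IGHV3-20", "IGKJ2"),
    ("IGHV3-20", "IGKJ2"),
    ("IGHV3-20", "IGKJ4"),
    ("IGHV3-7", "IGKJ2"),
]

frequencies = vj_pairing_frequencies(toy_records)
# The pair with the highest pairing frequency in the toy data.
top_pair = max(frequencies, key=frequencies.get)
```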
Project description:Single-cell RNA-seq data contain many dropouts, caused by the low number and inefficient capture of mRNAs in individual cells, which hamper downstream analyses. Here, we present Epi-Impute, a computational method for dropout imputation that reconciles expression and epigenomic data. Epi-Impute leverages single-cell ATAC-seq data as an additional source of information about gene activity to reduce the number of dropouts. We demonstrate that Epi-Impute outperforms existing methods, especially for very sparse single-cell RNA-seq data sets, significantly reducing imputation error. At the same time, Epi-Impute accurately captures the primary distribution of gene expression across cells while preserving the gene-gene and cell-cell relationships in the data. Moreover, Epi-Impute allows for the discovery of functionally relevant cell clusters as a result of the increased resolution of scRNA-seq data due to imputation.
Project description:Single-cell RNA-seq data contain a large proportion of zeros for expressed genes. Such dropout events present a fundamental challenge for various types of data analyses. Here, we describe the SCRABBLE algorithm to address this problem. SCRABBLE leverages bulk data as a constraint and reduces unwanted bias towards expressed genes during imputation. Using both simulation and several types of experimental data, we demonstrate that SCRABBLE outperforms existing methods in recovering dropout events, capturing the true distribution of gene expression across cells, and preserving the gene-gene and cell-cell relationships in the data.