Project description: We report a new unsupervised clustering tool for single-cell RNA-seq data called SC3. We show that biologically relevant information can be obtained from preneoplastic cells of patients with myeloproliferative disease.

2017-02-16 | GSE79102 | GEO

Unsupervised hierarchical clustering of iHPCs induced by 9 or 10 TFs

Project description: To clarify the gene expression profile of iHep, microarray analysis was performed using iHeps induced by 10 TFs (Foxg1, Lcor, Hnf3b, Hnf4a, Foxo6, Cdx2, Tcf1, Foxa3, Tcf2, Onecut1) and by 9 TFs (the same set with Onecut1 omitted). Unsupervised hierarchical clustering indicated that iHep expresses a global transcriptional profile more similar to that of HPCs than to that of NPCs, and suggested that the TFs present in the pool acted as inducing TFs. HPCs (HB1 and HNG2) were established from fetal liver (E13.5) of C57BL6J and STOCK Tg(Nanog-GFP, Puro)1 Yam, respectively. iHeps were induced from NPCs (NSBAg2, established from the ES cell line BAg73C2 carrying a beta-geo knock-in allele in Afp) using retroviral vectors (pMXs without drug-selection markers) encoding the 9 or 10 transcription factors. Three weeks after infection, G418 was added and the cells were cultured for 1 week before harvest. NSBAg2 and NSEB5-2C were used for the NPC data. GSM396240 and GSM336010 were used for the ESC data.

2011-06-07 | E-GEOD-29730 | biostudies-arrayexpress

Unsupervised hierarchical clustering of iNPCs induced by 6 or 5 TFs

Project description: To clarify the gene expression profile of iNPC, microarray analysis was performed using iNPCs induced by 6 TFs (Pax6, Hmga2, Etv6, Gatad2b, Nfxl1, and Esx1) and by 5 TFs (the same set with Esx1 omitted). Unsupervised hierarchical clustering indicated that iNPC expresses a global transcriptional profile more similar to that of NPCs than to that of MEFs, and suggested that the TFs present in the pool acted as inducing TFs. iNPCs were induced from MEFs (MEFSH, derived from mice carrying IRES-Hygro in the Sox allele) using retroviral vectors (pMXs-IRESNeo) encoding the 6 or 5 transcription factors. Four weeks after infection, hygromycin was added and the cells were cultured for 1 week before harvest. NSBAg2 and NSEB5-2C were used for the NPC data. GSM396240 and GSM336010 were used for the ESC data. GSM651349 and GSM336011 were used for the MEF data.

2011-06-07 | E-GEOD-29724 | biostudies-arrayexpress

SECANT: a biology-guided semi-supervised method for clustering, classification, and annotation of single-cell multi-omics

Project description: Recent advances in single-cell sequencing (scRNA-seq) technology, such as Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq), allow researchers to quantify cell surface protein abundance and RNA expression simultaneously at single-cell resolution. Although CITE-seq and similar technologies have quickly gained enormous popularity, novel methods for analyzing this new type of single-cell multi-omics data are still urgently needed. The limited number of available tools use a data-driven approach, which may undermine the biological importance of surface protein data. In this study, we developed SECANT, a biology-guided SEmi-supervised method for Clustering, classification, and ANnoTation of single-cell multi-omics. SECANT can be used to analyze CITE-seq data alone, or to jointly analyze CITE-seq and scRNA-seq data. The novelties of SECANT include 1) using confident cell type labels identified from surface protein data as guidance for cell clustering, 2) providing general annotation of confident cell types for each cell cluster, 3) fully utilizing cells with uncertain or missing cell type labels to increase performance, and 4) accurately predicting, for scRNA-seq data, the confident cell types identified from surface protein data. In addition, as a model-based approach, SECANT can quantify the uncertainty of its results, and the framework can be easily extended to handle other types of multi-omics data. We demonstrated the validity and advantages of SECANT via simulation studies and analysis of public and in-house real datasets. We believe this new method will greatly help researchers characterize novel cell types and make new biological discoveries using single-cell multi-omics data.

2022-09-03 | GSE168264 | GEO

Unsupervised Analysis of Array Comparative Genomic Hybridization Data from Early-Onset Colorectal Cancer Reveals Equivalence with Molecular Classification and Phenotypes

Project description: AIM: To investigate whether chromosomal instability (CIN) is associated with tumor phenotypes and/or with global genomic status based on MSI (microsatellite instability) and CIMP (CpG island methylator phenotype) in early-onset colorectal cancer (EOCRC). METHODS: Taking as a starting point our previous work, in which tumors from 60 EOCRC cases (≤45 years at the time of diagnosis) were analyzed by array comparative genomic hybridization (aCGH), in the present study we performed an unsupervised hierarchical clustering analysis of those aCGH data in order to unveil possible associations between the CIN profile and the clinical features of the tumors. In addition, we evaluated the MSI and CIMP statuses of the samples with the aim of investigating a possible relationship between copy number alterations (CNAs) and the MSI/CIMP condition in EOCRC. RESULTS: Based on the similarity of the CNAs detected, the unsupervised analysis stratified samples into two main clusters (A, B) and four secondary clusters (A1, A2, B3, B4). The different subgroups showed a certain correspondence with the molecular classification of colorectal cancer (CRC), which enabled us to outline an algorithm to categorize tumors according to their CIMP status. Each subcluster showed some distinctive clinicopathological features. More interestingly, the CIN of each subcluster mainly affected particular chromosomes, allowing us to define the chromosomal regions more specifically affected depending on the CIMP/MSI status of the samples. CONCLUSIONS: Our findings may provide a basis for a new form of classifying EOCRC according to the genomic status of the tumors.

2019-12-20 | GSE108220 | GEO

Classification and functional identification by gene expression patterns from various muscle tissues in a single livestock

Project description: Purpose: RNA-seq was performed on 18 skeletal muscle tissues obtained from one Jejunal. In this study, we aimed to group multiple tissues by unsupervised clustering based on expression pattern alone, and to find the criteria that divide the classified tissues and the causative genes. Method: Raw counts were obtained from STAR 2-pass alignment. The log2-normalized raw counts of each sample were classified by hierarchical clustering. The eighteen samples were classified into two groups of nine each. Differential expression analysis of the groups with the R package DESeq2 revealed a total of 1104 differentially expressed genes (DEGs). Result and Conclusion: Based on myosin heavy chain family gene expression among the identified DEGs, the criterion dividing the two groups was determined to be the classification into slow-twitch and fast-twitch muscle according to the content of myosin heavy chain fiber. In addition, we confirmed the up-regulation of genes that are characteristic of slow-twitch muscle, and identified that the central genes expressing these features play a role in maintaining cardiovascular homeostasis.

2020-11-17 | GSE113147 | GEO
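
The clustering step described in the record above (log2 normalization of raw counts, then hierarchical clustering of samples into two groups) can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the pseudocount of 1, the Euclidean distance, the average linkage, and the synthetic toy counts are all assumptions that the description does not specify, and the DESeq2 step is not reproduced.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_samples(raw_counts, n_groups=2):
    """Cluster samples from a genes x samples matrix of raw counts.

    The pseudocount, metric, and linkage method are assumptions made
    for this sketch; the source record does not state them.
    """
    # log2-transform the counts (pseudocount of 1 avoids log2(0))
    log_counts = np.log2(np.asarray(raw_counts, dtype=float) + 1.0)
    # linkage() treats rows as observations, so transpose to samples x genes
    tree = linkage(log_counts.T, method="average", metric="euclidean")
    # cut the tree so that at most n_groups clusters remain
    return fcluster(tree, t=n_groups, criterion="maxclust")


# Synthetic toy data (not from GSE113147): 200 genes, 18 samples,
# with the last 9 samples over-expressing the first 50 genes.
rng = np.random.default_rng(0)
n_genes = 200
base = rng.poisson(50, size=(n_genes, 1)).astype(float)
shifted = base.copy()
shifted[:50] *= 8
group_a = rng.poisson(base, size=(n_genes, 9))
group_b = rng.poisson(shifted, size=(n_genes, 9))
counts = np.hstack([group_a, group_b])

labels = cluster_samples(counts, n_groups=2)
print(labels)
```

On real data the resulting group labels would then be passed to a differential expression tool as the design factor; here they only show that the two synthetic groups separate.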
