Transcriptomics

Dataset Information


Bulk RNA-Seq gene expression profiles from 13 TCGA cancer types and normal tissues for tumor classification


ABSTRACT: This dataset contains bulk RNA-Seq gene expression profiles of 6,310 samples spanning 13 cancer types and normal tissues, obtained from The Cancer Genome Atlas (TCGA) and processed with a standardized pipeline. The raw count matrix, initially containing 60,660 genes, was filtered to retain only genes with valid Entrez gene identifiers, matched via the org.Hs.eg.db annotation package. Genes with zero variance across samples were removed. Expression values were normalized to log2 counts per million (CPM) using the edgeR::cpm() function with log transformation enabled. To make all values non-negative and suitable for downstream modeling, a global shift was applied by adding the absolute value of the matrix minimum to every entry. Gene Ontology (GO) analysis was then performed across all three categories (Biological Process, Molecular Function, Cellular Component) using the topGO package. Significant genes were selected, Benjamini–Hochberg correction was applied to the p-values, and the results were validated using both Fisher’s and Kolmogorov–Smirnov tests. The resulting expression matrix comprises 18,564 functionally relevant genes across 6,310 samples and forms the basis for clustering, classification, and metric evaluation within hybrid modeling frameworks for cancer diagnostics.
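The normalization steps described in the abstract (zero-variance filtering, log2 CPM, global shift) can be sketched as follows. This is a minimal, standard-library Python illustration and not the pipeline used to build the dataset, which relied on edgeR in R. The function names, the toy gene labels, and the toy counts are hypothetical. The log2 CPM formula is an approximation of what edgeR::cpm() computes with log transformation and a prior count of 2, and computing library sizes after filtering is an assumption.

```python
import math
from statistics import pvariance


def drop_zero_variance(counts):
    """Keep only genes (rows) whose counts differ across samples."""
    return {gene: row for gene, row in counts.items() if pvariance(row) > 0}


def log2_cpm(counts, prior_count=2.0):
    """Approximate edgeR-style log2 CPM (assumed formula, not edgeR itself).

    The prior count is scaled by each sample's library size relative to the
    mean library size, then added to the counts before taking log2.
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    lib_sizes = [sum(counts[g][j] for g in genes) for j in range(n_samples)]
    mean_lib = sum(lib_sizes) / n_samples
    result = {}
    for g in genes:
        row = []
        for j, lib in enumerate(lib_sizes):
            prior = prior_count * lib / mean_lib
            cpm = (counts[g][j] + prior) / (lib + 2.0 * prior) * 1e6
            row.append(math.log2(cpm))
        result[g] = row
    return result


def global_shift(matrix):
    """Add the absolute value of the global minimum to every entry."""
    offset = abs(min(min(row) for row in matrix.values()))
    return {g: [v + offset for v in row] for g, row in matrix.items()}


# Toy raw counts: rows are genes, columns are three samples (made-up values).
raw = {
    "GENE_A": [10, 0, 5],
    "GENE_B": [7, 7, 7],  # constant across samples, so it is filtered out
    "GENE_C": [4_000_000, 6_000_000, 5_000_000],
}

filtered = drop_zero_variance(raw)
logcpm = log2_cpm(filtered)
shifted = global_shift(logcpm)
```

With realistic library sizes, a zero count yields a negative log2 CPM, which is why the shift is needed; when the minimum is negative, the shifted matrix has a minimum of exactly zero.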

ORGANISM(S): Homo sapiens

PROVIDER: GSE304485 | GEO | 2026/02/12

REPOSITORIES: GEO

Similar Datasets

2017-01-31 | GSE73111 | GEO
2009-09-19 | E-MTAB-144 | biostudies-arrayexpress
2012-07-05 | E-GEOD-33236 | biostudies-arrayexpress
2008-06-16 | E-GEOD-9511 | biostudies-arrayexpress
2012-06-19 | GSE38795 | GEO
2012-07-06 | GSE33236 | GEO
2022-04-01 | GSE173661 | GEO
2021-11-12 | GSE184869 | GEO
2024-06-25 | GSE268217 | GEO
2020-03-28 | GSE123917 | GEO