Dataset Information

EGAS00001000978-sc-20141017 - samples

ABSTRACT: Genomic characterisation of a large series of cancer cell lines.

PROVIDER: EGAD00001001039 | EGA

REPOSITORIES: EGA

Similar Datasets

Project description: A variety of newly developed next-generation sequencing technologies are rapidly making their way into research and clinical applications, for which accuracy and cross-lab reproducibility are critical and reference standards are much needed. However, there is still a lack of well-characterized reference materials that include epigenomic and proteomic data. Our previous multicenter studies under the SEQC-2 umbrella, using a breast cancer cell line with a paired B-cell line, produced a large amount of diverse genomic data, including whole genome sequencing (Illumina, PacBio, Nanopore), HiC, and scRNA-seq, with detailed analyses of somatic mutations, single-nucleotide variations (SNVs), and structural variations (SVs). Here we further performed ATAC-seq, Methyl-seq, RNA-seq, and proteomic analyses and provided a comprehensive catalog of the epigenomic landscape, which overlapped with the transcriptomes and proteomes for the two cell lines. We identified >7,700 peptide isoforms, with the majority (95%) of genes having a single peptide isoform, and found that the protein expression levels of transcripts overlapping CpG islands (CGIs) were much higher than those of non-CGI transcripts in both cell lines. We observed that open chromatin regions had low methylation while closed chromatin regions had high methylation, a pattern largely regulated by CG density: CG-rich regions had more accessible chromatin, low methylation, and higher gene and protein expression. The CG-poor regions showed stronger repressive epigenetic regulation (less open chromatin and higher DNA methylation), resulting in cell-line-specific methylation and gene expression patterns.
Our studies provide well-defined reference materials consisting of two cell lines with genomic, epigenomic, transcriptomic, scRNA-seq, and proteomic characterizations, which can serve as standards for validating and benchmarking not only various omics assays but also bioinformatics methods. These materials will be a valuable resource for both the research and clinical communities.


