Bulk RNA sequencing of colorectal tumors for molecular subtyping
ABSTRACT: We performed bulk RNA sequencing on colorectal tumors to determine their transcriptome-based molecular subtypes. RNA was isolated from fresh frozen human colorectal tumor specimens by Trizol extraction. RNA libraries were prepared with the KAPA stranded mRNA sequencing kit, with mRNA selected using polyT capture beads. Sequencing was performed on an Illumina HiSeq4000 as single-end 51 bp reads. File names include the dataset name (KUL3), patient code (SCXXX), sample region (core or border) and sample code (EXTXXX): DATASETNAME_PATIENTCODE_SAMPLEREGION_rSAMPLE_CODE_...fastq.gz
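The file-naming convention above can be split into its fields with a regular expression. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that the XXX placeholders are alphanumeric, that the sample code is preceded by a literal `r`, and that whatever follows the sample code (elided as `...` in the description) can be ignored. The function name `parse_fastq_name` and the example file name are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SampleName(NamedTuple):
    dataset: str
    patient: str
    region: str
    sample: str


# Assumed layout: KUL3_SC<id>_<core|border>_rEXT<id>_<anything>fastq.gz
# The part after the sample code is elided in the description, so it is
# matched loosely and not interpreted.
_NAME_PATTERN = re.compile(
    r"^(?P<dataset>KUL3)"
    r"_(?P<patient>SC[0-9A-Za-z]+)"
    r"_(?P<region>core|border)"
    r"_r(?P<sample>EXT[0-9A-Za-z]+)"
    r"_.*fastq\.gz$"
)


def parse_fastq_name(name: str) -> Optional[SampleName]:
    """Return the fields encoded in a file name, or None if it does not match."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        return None
    return SampleName(**match.groupdict())


# Hypothetical file name, used only to illustrate the call.
example = parse_fastq_name("KUL3_SC001_core_rEXT001_S1.fastq.gz")
```

Returning `None` for non-matching names keeps files that follow another convention from being silently mislabeled.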
Project description:Whole genome sequencing of 10 HCLc tumors and matched germline T cells. Genomic DNA from highly purified HCLc tumor and T cell populations was used for library preparation with the NEBNext Ultra DNA library prep kit. Sequencing was performed as 150 bp paired-end sequencing on four lanes of an Illumina HiSeq4000 to an average depth of 12X. Reads from each library were aligned to the human reference genome GRCh37 using BWA-MEM (v0.7.12). The analysis of somatic genetic alterations in WGS data from the tumor-germline HCLc sample pairs was divided according to the type of mutation, as follows: single-nucleotide variants (SNVs), indels, CNAs and SVs. In addition, COSMIC mutational signatures and subclonal architecture were inferred for each tumor.
Project description:Artificial intelligence (AI) applications in biomedical settings face challenges such as data privacy and regulatory compliance. Federated Deep Learning (FDL) effectively addresses these issues. We developed ProCanFDL, in which local models were trained on simulated sites using proteomic data drawn from a pan-cancer cohort (n = 1,260) and 29 other cohorts (n = 6,265), representing 4,956 patients and 19,930 mass spectrometry (MS) runs, all held behind private firewalls. Local parameter updates were aggregated to build the global model, which achieved a 43% performance gain over local models on the hold-out test set (n = 625) in 14 cancer subtyping tasks. Additionally, ProCanFDL preserved data privacy while matching centralized model performance. External validation assessed generalization by retraining the global model with data from two external cohorts (n = 55) and from eight cohorts (n = 832) acquired with a different MS technology. ProCanFDL presents a solution for internationally collaborative machine learning initiatives using proteomic data while maintaining data privacy.
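The aggregation step described above, in which local parameter updates are combined into a global model, can be illustrated with sample-weighted averaging (often called federated averaging). The description does not state which aggregation rule ProCanFDL uses, so the rule shown here is an assumption, and the function name `federated_average` and the toy parameters are hypothetical.

```python
from typing import Dict, List, Sequence

# A model is represented as a mapping from parameter name to a flat list of values.
Params = Dict[str, List[float]]


def federated_average(local_params: Sequence[Params], n_samples: Sequence[int]) -> Params:
    """Average per-site parameters, weighting each site by its sample count.

    Only parameters leave each site; the training data never does.
    """
    if not local_params or len(local_params) != len(n_samples):
        raise ValueError("need one sample count per local model")
    total = sum(n_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_params: Params = {}
    for name, reference in local_params[0].items():
        global_params[name] = [
            sum(site[name][i] * n for site, n in zip(local_params, n_samples)) / total
            for i in range(len(reference))
        ]
    return global_params


# Two hypothetical sites holding 1 and 3 samples respectively.
site_a = {"w": [1.0, 2.0]}
site_b = {"w": [3.0, 6.0]}
global_model = federated_average([site_a, site_b], [1, 3])
```

Weighting by sample count lets larger sites contribute proportionally more, which is one common design choice; an unweighted mean is the alternative when sites should count equally.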
Project description:Genome-wide mRNA expression profiles of 70 primary gastric tumors from the Australian patient cohort. Like many cancers, gastric adenocarcinomas (gastric cancers) show considerable heterogeneity between patients. Thus, there is intense interest in using gene expression profiles to discover subtypes of gastric cancers with particular biological properties or therapeutic vulnerabilities. Identification of such subtypes could generate insights into the mechanisms of cancer progression or lay the foundation for personalized treatments. Here we report a robust gene-expression-based clustering of a large collection of gastric adenocarcinomas from Singaporean patients [GSE34942 and GSE15459]. We developed a classifier for the three subtypes and validated it in an Australian patient cohort. The 70 primary gastric tumors were profiled on the Affymetrix GeneChip Human Genome U133 Plus 2.0 Array. All tumors were collected with approvals from Peter MacCallum Cancer Center, Australia; the Research Ethics Review Committee; and signed patient informed consent.
Project description:In mammals, brain evolution, specifically that of the cerebellum, is a focus of evolutionary change. Here we functionally characterize the phylogenetically-restricted novel gene, piggyBac transposable element. The ChIP-exonuclease assay protocol was performed as in Serandour's method. The libraries were quantified using the KAPA library quantification kit for Illumina sequencing platforms (KAPA Biosystems, KK4824) and sequenced on a HiSeq following the manufacturer’s protocol.
Project description:Transposon insertion site sequencing (TIS) is a powerful method for associating genotype with phenotype. However, all TIS methods described to date use short nucleotide sequence reads, which cannot uniquely determine the locations of transposon insertions within repeating genomic sequences where the repeat units are longer than the sequence read length. To overcome this limitation, we have developed a TIS method using Oxford Nanopore sequencing technology that generates and uses long nucleotide sequence reads; we have called this method LoRTIS (Long Read Transposon Insertion-site Sequencing). The experimental data comprise sequence files generated on the Nanopore and Illumina platforms. Biotin1308.fastq.gz and Biotin2508.fastq.gz are fastq files generated with Nanopore technology. Rep1-Tn.fastq.gz and Rep1-Tn.fastq.gz are fastq files generated on the Illumina platform. In this study, we compared the efficiency of the two methods in identifying transposon insertion sites.
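The limitation of short reads described above can be shown on a toy sequence. The sketch below uses exact string matching instead of a real aligner, and the genome, reads, and the function name `find_all` are all invented for illustration. A read that lies entirely inside one repeat unit matches every copy of the unit, while a longer read that extends into unique flanking sequence has a single placement.

```python
from typing import List


def find_all(genome: str, read: str) -> List[int]:
    """Return every 0-based position at which `read` occurs exactly in `genome`."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Step by one so overlapping occurrences are also reported.
        start = genome.find(read, start + 1)
    return positions


# Toy genome: unique left flank, three copies of a 9-base repeat unit, unique right flank.
LEFT_FLANK = "ACGTTGCA"
REPEAT_UNIT = "GATTACAGG"
RIGHT_FLANK = "TTCCGGAA"
GENOME = LEFT_FLANK + REPEAT_UNIT * 3 + RIGHT_FLANK

# A 6-base read shorter than the repeat unit, taken from inside it.
SHORT_READ = "TTACAG"
# A 26-base read that starts in the left flank and runs through two repeat units.
LONG_READ = GENOME[4:30]

short_hits = find_all(GENOME, SHORT_READ)  # one hit per repeat copy
long_hits = find_all(GENOME, LONG_READ)    # anchored by the unique flank
```

In this toy case the short read has three equally good placements and the long read has one, which is the ambiguity that long reads are intended to resolve.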
Project description:This study aims to investigate differentially expressed proteins in tumor pericytes derived from colorectal cancer patients with or without liver metastasis. Tumor pericytes were isolated from the tumors of both patient groups, cultured, and subjected to proteomic analysis. TCAF2 was significantly increased in tumor pericytes from patients with liver metastasis.
Project description:Single Gland Whole-exome sequencing: building on our prior description of multi-region WES of colorectal tumors and targeted single gland sequencing (E-MTAB-2247), we performed WES of multiple single glands from different sides (right: A and left: B) of two tumors in this study (tumor O and U) on the Illumina platform using the Agilent SureSelect 2.0 or Illumina Nextera Rapid Capture Exome kit (SureSelect or NRCE, as indicated in the naming of fastq files). Colorectal Cancer Xenograft Whole-exome sequencing: The HCT116 and LoVo Mismatch-Repair-deficient colorectal adenocarcinoma cell lines were obtained from the ATCC and cultured under standard conditions. For both cell lines, a single "founding" cell was cloned and expanded in vitro to ~6M cells. Two aliquots of ~1M cells were subcutaneously injected into opposite flanks (right and left) of a nude mouse and tumors allowed to reach a size of ~1B cells (1 cm3) before the animal was sacrificed. Tumor tissue was collected separately from the right and left lesions and DNA was extracted for WES using the Illumina TruSeq Exome kit or Nextera Rapid Capture Exome expanded Kits (Truseq or NRCEe), as was DNA from the first passage population (a polyclonal tissue culture for HCT116 and a polyclonal xenograft sample for LoVo), which were employed as a control to study mutation accumulation in culture and post xenotransplantation.
Project description:Protocol: Total RNA was extracted from mouse embryos and DNase treated. Fragmented RNA was enriched for the 3′ ends by pull down using an anchored polyT oligo attached to magnetic beads. An RNA oligo comprising part of the Illumina adapter 1 was ligated to the 5′ end of the captured RNA and the RNA was eluted from the beads. Reverse transcription was primed with an anchored polyT oligo with part of Illumina adapter 2 at the 5′ end followed by 10 bases HBDVHBDVHB (using the single base code), then one of 96 eight base indexing tags, then CG and 14 T bases. An Illumina library with full adapter sequence was produced by 20 cycles of PCR. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
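The primer layout described above (10 degenerate bases HBDVHBDVHB, an eight-base index, CG, then 14 T bases) can be expressed as a pattern by expanding the IUPAC single-base codes: H is A, C or T; B is C, G or T; D is A, G or T; V is A, C or G. The sketch below is an illustration only and is not the pipeline used for this dataset. It checks a sequence written in the primer's own 5′ to 3′ orientation, without the adapter portion, and the names `iupac_to_regex` and `split_primer_tail` and the example sequence are hypothetical.

```python
import re
from typing import Optional, Tuple

# IUPAC single-base codes used in the primer description, plus the plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "H": "ACT",  # not G
    "B": "CGT",  # not A
    "D": "AGT",  # not C
    "V": "ACG",  # not T
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC pattern into an equivalent regular expression."""
    parts = []
    for code in pattern:
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


# Degenerate 10-mer, 8-base index, literal CG, then 14 T bases.
PRIMER_TAIL = re.compile(
    "(?P<random>" + iupac_to_regex("HBDVHBDVHB") + ")"
    + "(?P<index>[ACGT]{8})"
    + "CG"
    + "T{14}"
)


def split_primer_tail(seq: str) -> Optional[Tuple[str, str]]:
    """Return (degenerate bases, index tag) if `seq` starts with the layout, else None."""
    match = PRIMER_TAIL.match(seq)
    if match is None:
        return None
    return match.group("random"), match.group("index")


# Hypothetical sequence that follows the layout.
example = split_primer_tail("ACAGTCTCAT" + "AACCGGTT" + "CG" + "T" * 14)
```

Because each position of HBDVHBDVHB excludes one base, a sequence that violates the pattern at any of the ten positions is rejected.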