Dataset Information

Remapping the SRA: Drosophila melanogaster RNA-Seq data from the Sequence Read Archive

ABSTRACT: The Sequence Read Archive (SRA) contains over 52 terabases, or 482 billion reads, from Drosophila melanogaster (as of June 2018). These data are massively underused by the community and include 14,423 RNA-Seq samples, roughly 7 times the size of modENCODE. Currently, the major challenge is finding high-quality datasets that are suitable for inclusion in new studies. To help the community overcome this hurdle, we re-processed all D. melanogaster RNA-Seq SRA experiments (SRXs) using an identical workflow. This workflow uses a data-driven approach to identify technical metadata (i.e., strandedness and layout) for each sample in order to optimize mapping parameters. The workflow generates QC metrics and coverage tracks based on the dm6 assembly, and calculates gene-level, junction-level, and intergenic counts against FlyBase r6.11. This resource will allow any researcher to visualize browser tracks for any publicly available dataset, quickly identify high-quality datasets for use in their own research, and download identically processed counts tables. There is a treasure trove of underused data sitting in the SRA, and this work addresses the first challenge in making data integration a common laboratory practice.
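The abstract does not say how strandedness and layout are inferred from the data. As a rough illustration only, the sketch below shows one way such a call could be made from simple read tallies. The function names, the category labels, and the 0.75 cutoff are all hypothetical and are not taken from the authors' workflow.

```python
def infer_strandedness(sense: int, antisense: int, cutoff: float = 0.75) -> str:
    """Classify a library from counts of reads on the sense vs. antisense strand.

    `cutoff` is an assumed threshold, not a value from the original workflow.
    """
    total = sense + antisense
    if total == 0:
        return "unknown"  # no informative reads
    sense_fraction = sense / total
    if sense_fraction >= cutoff:
        return "same_strand"
    if sense_fraction <= 1 - cutoff:
        return "opposite_strand"
    return "unstranded"


def infer_layout(mate1_reads: int, mate2_reads: int) -> str:
    """Classify layout from the number of first-mate and second-mate reads found."""
    if mate1_reads == 0 and mate2_reads == 0:
        return "unknown"
    if mate2_reads == 0:
        return "single_end"
    if mate1_reads == mate2_reads:
        return "paired_end"
    return "inconsistent"  # mate counts disagree; flag for manual review


if __name__ == "__main__":
    print(infer_strandedness(950, 50))
    print(infer_layout(1000, 1000))
```

In a sketch like this, the inferred labels would then select the matching aligner and counting options per sample, rather than relying on submitter-provided metadata.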

ORGANISM(S): Drosophila melanogaster

PROVIDER: GSE117217 | GEO | 2018/07/18

REPOSITORIES: GEO

Similar Datasets

Homo sapiens

Project description:NCBI-generated Human raw gene counts from SRA RNA-seq data

| PRJNA1023937 | ENA

Mus musculus

Project description:NCBI-generated Mouse raw gene counts from SRA RNA-seq data

| PRJNA1023938 | ENA

Track normalization and averaging of H3K4me3 ChIP-seq data across various cell and tissue types from Mouse ENCODE.

Project description:Data tracks from chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq) experiments were processed using an in-house algorithm that provides normalization functionality followed by generation of a track average.

2015-10-09 | GSE73834 | GEO

The lateral septum orchestrates state-dependent modulation of associative threat memory dynamics across the ovarian hormone cycle

Project description:Single-nucleus ATAC- and RNA-sequencing data from adult female proestrus C57Bl/6J mouse brain lateral septum, threat conditioned and naive control animals. This GEO submission contains the processed RNA-seq/single-nucleus RNA-seq data associated with existing SRA records.

2026-06-07 | GSE333543 | GEO

Track normalization and averaging of bisulfite-seq DNA methylation data across various cell and tissue types

Project description:Data tracks from bisulfite sequencing (BS-seq) experiments were sorted by tissue or cell types and processed using an in-house algorithm that provides normalization functionality followed by generation of a track average.

2015-01-20 | GSE64577 | GEO

Re-analysis of PRO-seq data from HCT116 cells (GSE129501)

Project description:This submission provides processed PRO-seq coverage tracks (bigWig files) from the re-analysis of public data (GSE129501, samples GSM3714462 and GSM3714463) from HCT116 human colorectal carcinoma cells. Data were aligned to the hg38 reference genome (GENCODE release 33; https://www.gencodegenes.org/human/release_33.html) and normalized using bamCoverage (PMID: 27079975). These processed files are intended for visualization and downstream analysis of nascent transcription.

2026-03-09 | GSE300323 | GEO

Track normalization and averaging of H3K4me3 ChIP-seq data across various cell and tissue types.

Project description:Data tracks from chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq) experiments were sorted by tissue or cell types and processed using an in-house algorithm that provides normalization functionality followed by generation of a track average.

2015-01-20 | GSE65049 | GEO

Re-analysis of PRO-cap data from HCT116 cells (GSE219977)

Project description:This submission provides processed PRO-cap coverage tracks (bigWig files) from the re-analysis of public data (GSE219977, samples GSM6783554 and GSM6783555) from Homo sapiens HCT116 genetically modified (insertion) using CRISPR inserting O. sativa LOC4335696, (insertion) using CRISPR targeting H. sapiens SUPT16H. Data were aligned to the hg38 reference genome (GENCODE release 33; https://www.gencodegenes.org/human/release_33.html) and normalized using bamCoverage (PMID: 27079975). These processed files are intended for visualization and downstream analysis of nascent transcription.

2026-03-09 | GSE300325 | GEO

CpG island mediated linear and spatial gene partitioning (pA+ RNA-seq profile)

Project description:To test the global effects of CpG island-centered gene regulation on gene expression profiles, pA+ RNA-seq data from diverse tissues and cell lines were gathered and profiled. All available mouse poly-A positive RNA-seq data (3,818 samples) were summarized and downloaded on May 5, 2015. After excluding single-cell RNA-seq data and experiments with few expressed genes (fewer than 5,000 genes with RPKM 0.5 or higher), 1,524 high-quality RNA-seq datasets were used. Raw data were downloaded from the Sequence Read Archive (SRA) in the National Center for Biotechnology Information (NCBI) database. FASTQ files were extracted with the SRA Toolkit version 2.5.5 and aligned using STAR 2.4.2 onto the mouse and human genome (mm9 and hg19, respectively). Gene expression was calculated as RPKM values using rpkmforgenes.py (Ramsköld et al., 2009).

2018-02-15 | GSE80797 | GEO

Acute effects of active breaks during prolonged sitting on subcutaneous adipose tissue gene expression

Project description:Breaking up prolonged periods of time spent sitting has a range of beneficial impacts on cardiometabolic risk biomarkers. The molecular mechanisms include regulation of skeletal muscle gene and protein expression controlling metabolic, inflammatory and cell development pathways. An active communication network exists between adipose and muscle tissue, but the effect of active breaks in prolonged sitting on adipose tissue has yet to be investigated. This study characterised the acute transcriptional events induced in adipose tissue by regular active breaks during prolonged sitting. In a subset of 8 overweight/obese adults participating in an acute randomised three-intervention crossover trial, subcutaneous adipose tissue biopsies were obtained after each condition. The three experimental conditions were conducted in the postprandial state and included: i) prolonged uninterrupted sitting; or prolonged sitting interrupted with 2-minute bouts of ii) light- or iii) moderate-intensity treadmill walking every 20 minutes. Microarrays identified 36 differentially expressed genes between the three conditions (fold change ≥ 0.5 in either direction; p < 0.05). Pathway analysis indicated that breaking up prolonged sitting led to differential regulation of adipose tissue metabolic networks and inflammatory pathways, increased insulin signalling, increased adipocyte turnover, and facilitated cross-talk between adipose tissue and other organs. This study provides insight into the adipose tissue regulatory systems and transcriptional processes that contribute to the physiological benefits of interrupting prolonged sitting.

2018-06-13 | GSE115645 | GEO
