Dataset Information

RNA-seq from ENCODE/Caltech (Mouse)


ABSTRACT: RNA-seq is a method for mapping and quantifying the transcriptome of any organism that has a genomic DNA sequence assembly (Mortazavi et al., 2008). RNA-seq is performed by reverse-transcribing an RNA sample into cDNA, followed by high-throughput DNA sequencing, which was done here on the Illumina HiSeq sequencer. The transcriptome measurements shown on these tracks were performed on polyA-selected RNA (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?term=longPolyA&type=rnaExtract) from total cellular RNA (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?term=cell&type=localization). PolyA-selected RNA was fragmented by magnesium-catalyzed hydrolysis, converted into cDNA by random priming, and amplified. Paired-end 2x100 bp reads were obtained from each end of a cDNA fragment. Reads were aligned to the mm9 mouse reference genome using TopHat (Trapnell et al., 2009), a program specifically designed to align RNA-seq reads and discover splice junctions de novo. All sequence and alignment files are available at http://hgwdev.cse.ucsc.edu/cgi-bin/hgFileUi?db=mm9&g=wgEncodeCaltechRnaSeq.
Cells were grown according to the approved ENCODE cell culture protocols (http://hgwdev.cse.ucsc.edu/ENCODE/protocols/cell/mouse). Cells were lysed in RLT buffer (Qiagen RNeasy kit) and processed on RNeasy midi columns according to the manufacturer's protocol, with the inclusion of the "on-column" DNase digestion step to remove residual genomic DNA. From each preparation, 75 µg of total RNA was selected twice with oligo-dT beads (Dynal) according to the manufacturer's protocol to isolate mRNA. Then 100 ng of mRNA was processed according to the protocol in Mortazavi et al. (2008) and prepared for sequencing on the Illumina GAIIx or HiSeq platforms according to the protocol for the ChIP-Seq DNA genomic DNA kit (Illumina). Paired-end libraries were size-selected around 200 bp (fragment length).
Libraries were sequenced with the Illumina HiSeq according to the manufacturer's recommendations, yielding paired-end reads of 100 bp. Reads were mapped to the reference mouse genome (version mm9, with or without the Y chromosome depending on the sex of the cell line, and without the random chromosomes in all cases) using TopHat (version 1.3.1) (http://tophat.cbcb.umd.edu/). TopHat was run with default settings, except that an empirically determined mean inner-mate distance was specified and known Ensembl version 63 splice junctions were supplied.
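
The two non-default TopHat settings mentioned above can be illustrated with a minimal Python sketch. It is not the submitters' pipeline. It assumes that the inner-mate distance is the fragment length minus the two read lengths, that the value is passed with TopHat's --mate-inner-dist option, and that the known junctions are passed as a junctions file with --raw-juncs (the record does not say which option was actually used). The function names, index prefix, and file names are hypothetical.

```python
from statistics import mean


def mean_inner_mate_distance(fragment_lengths, read_length=100):
    """Mean gap between the two mates of a pair.

    Assumes inner distance = fragment length - 2 * read length,
    with fragment lengths measured empirically from the library.
    """
    if not fragment_lengths:
        raise ValueError("need at least one fragment length")
    return round(mean(f - 2 * read_length for f in fragment_lengths))


def tophat_command(index_prefix, reads_1, reads_2, inner_dist, junctions_file):
    """Build (but do not run) a paired-end TopHat argument list.

    Everything not listed here is left at TopHat's defaults.
    Passing junctions via --raw-juncs is an assumption of this sketch.
    """
    return [
        "tophat",
        "--mate-inner-dist", str(inner_dist),
        "--raw-juncs", junctions_file,
        index_prefix,
        reads_1,
        reads_2,
    ]


if __name__ == "__main__":
    # Hypothetical measured fragment lengths, in bp.
    inner = mean_inner_mate_distance([230, 240, 250], read_length=100)
    # Hypothetical index prefix for a female cell line (mm9 without chrY).
    cmd = tophat_command(
        "mm9_female", "reads_1.fastq", "reads_2.fastq",
        inner, "ensembl63.juncs",
    )
    print(" ".join(cmd))
```

Building the argument list separately from running it keeps the sex-specific index choice and the measured distance visible before any alignment is launched.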

ORGANISM(S): Mus musculus

SUBMITTER: UCSC ENCODE DCC 

PROVIDER: E-GEOD-37909 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Similar Datasets

2012-05-10 | GSE37909 | GEO
2012-05-23 | E-GEOD-38163 | biostudies-arrayexpress
2011-12-16 | GSE34448 | GEO
2011-11-10 | GSE33600 | GEO
2012-05-24 | GSE38163 | GEO
2011-12-16 | E-GEOD-34448 | biostudies-arrayexpress
2012-06-07 | GSE35583 | GEO
2012-07-20 | E-GEOD-39524 | biostudies-arrayexpress
2012-07-20 | GSE39524 | GEO
2012-06-06 | E-GEOD-35585 | biostudies-arrayexpress