Project description: Transcript mapping on Affymetrix ENCODE arrays for 3 biological replicates of placental poly(A)+ RNA (each with 2 or 3 technical replicates). Keywords: Transcript Mapping, Human, Affymetrix, Genome Tiling Arrays, other
Project description: Transcript mapping on Affymetrix ENCODE arrays for 10 biological replicates of neutrophil (PMN) total RNA (each with 2 technical replicates). Each biological sample is from a different individual. Keywords: Transcript Mapping, Human, Affymetrix, Genome Tiling Arrays, other
Project description: Transcript mapping on Affymetrix ENCODE arrays for total RNA from 4 different NB4 samples (2 or 3 technical replicates of each). All four NB4 samples were also treated with retinoic acid (RA) (2 or 3 technical replicates of each), and three of the four were also treated with TPA (2 or 3 technical replicates of each). RA-treated NB4 cells can be partially differentiated to neutrophils, and TPA-treated NB4 cells can be differentiated to monocytes. Keywords: Transcript Mapping, Human, Affymetrix, Genome Tiling Arrays, parallel sample
Project description: BACKGROUND: Proteogenomic mapping is an approach that uses mass spectrometry data from proteins to directly map protein-coding genes and could aid in locating translational regions in the human genome. In concert with the ENcyclopedia of DNA Elements (ENCODE) project, we applied proteogenomic mapping to produce proteogenomic tracks for the UCSC Genome Browser, to explore which putative translational regions may be missing from the human genome. RESULTS: We generated ~1 million high-resolution tandem mass (MS/MS) spectra for Tier 1 ENCODE cell lines K562 and GM12878 and mapped them against the UCSC hg19 human genome and the GENCODE V7 annotated protein and transcript sets. We then compared the results from the three searches to identify the best-matching peptide for each MS/MS spectrum, thereby increasing the confidence of the putative new protein-coding regions found via the whole genome search. At a 1% false discovery rate, we identified 26,472, 24,406, and 13,128 peptides from the protein, transcript, and whole genome searches, respectively; of these, 481 were found solely via the whole genome search. The proteogenomic mapping data are available on the UCSC Genome Browser at http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUncBsuProt. CONCLUSIONS: The whole genome search revealed that ~4% of the uniquely mapping identified peptides were located outside GENCODE V7 annotated exons. The comparison of the results from the disparate searches also identified 15% more spectra than would have been found solely from a protein database search. Therefore, whole genome proteogenomic mapping is a complementary method for genome annotation when performed in conjunction with other searches.
Project description: RNA-seq data can be mined for sequence differences relative to the reference genome to identify both genomic SNPs and RNA editing events. We analyzed the long, polyA-selected, unstranded, deeply sequenced RNA-seq data from the ENCODE Project across 14 human cell lines for candidate RNA editing events. On average, 43% of the RNA sequencing variants that are not in dbSNP and are within gene boundaries are A-to-G(I) RNA editing candidates. The vast majority of A-to-G(I) edits are located in introns and 3' UTRs, with only 123 located in protein-coding sequence. In contrast, the majority of non-A-to-G variants (60%-80%) map near exon boundaries and have the characteristics of splice-mapping artifacts. After filtering out all candidates with evidence of private genomic variation using genome resequencing or ChIP-seq data, we find that up to 85% of the high-confidence RNA variants are A-to-G(I) editing candidates. Genes with A-to-G(I) edits are enriched in Gene Ontology terms involving cell division, viral defense, and translation. The distribution and character of the remaining non-A-to-G variants closely resemble known SNPs. We find no reproducible A-to-G(I) edits that result in nonsynonymous substitutions in all three lymphoblastoid cell lines in our study, unlike RNA editing in the brain. The fact that only a fraction of sites are reproducibly edited in multiple cell lines, and that we find a stronger association of editing with specific genes, suggests that the editing of the transcript is more important than the editing of any individual site.
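As an illustration of the A-to-G(I) candidate fraction reported in the RNA editing description, the following is a minimal sketch and not the authors' pipeline. It assumes each variant is given as a (reference base, observed base, gene strand) tuple on the genomic plus strand, with the strand taken from the gene annotation because the reads are unstranded. Under that assumption, an A-to-G change in a minus-strand gene appears as T-to-C. The function names and the example variants are hypothetical.

```python
# Hypothetical helpers; the input format (ref, alt, gene_strand) is an assumption.

def is_a_to_g_candidate(ref, alt, gene_strand):
    """Return True if a plus-strand mismatch reads as A-to-G on the transcript."""
    ref, alt = ref.upper(), alt.upper()
    if gene_strand == "+":
        return (ref, alt) == ("A", "G")
    if gene_strand == "-":
        # A-to-G on the minus strand is seen as T-to-C on the plus strand.
        return (ref, alt) == ("T", "C")
    raise ValueError("gene_strand must be '+' or '-'")


def a_to_g_fraction(variants):
    """Fraction of variants that are A-to-G(I) editing candidates."""
    variants = list(variants)
    if not variants:
        return 0.0
    hits = sum(is_a_to_g_candidate(ref, alt, strand) for ref, alt, strand in variants)
    return hits / len(variants)


# Made-up example variants, for illustration only.
example_variants = [
    ("A", "G", "+"),  # candidate
    ("T", "C", "-"),  # candidate (minus-strand gene)
    ("C", "T", "+"),  # not a candidate
    ("T", "C", "+"),  # not a candidate (plus-strand gene)
]
fraction = a_to_g_fraction(example_variants)
```

A real analysis would also need the dbSNP and genomic-variation filters described in the record, which this sketch omits.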
Project description: The elucidation of the largely unknown transcriptome of small RNAs is crucial for the understanding of genome and cellular function. We report here the results of the analysis of small RNAs (< 50 nt) in the ENCODE regions of the human genome. Size-fractionated RNAs from four different cell lines (HepG2, HeLaS3, GM06990, SK-N-SH) were mapped with the forward and reverse ENCODE high-density resolution tiling arrays. The top 1% of hybridization signals are termed SmRfrags (Small RNA fragments). Eight percent of SmRfrags overlap the GENCODE genes (CDS), whereas the majority map to intergenic regions (34%), intronic regions (53%), and untranslated regions (UTRs) (5%). In addition, 9.6% and 16.8% of SmRfrags in the 5' UTR regions overlap significantly with His/Pol II/TAF250 binding sites and DNase I hypersensitive sites, respectively (compared to the 5.3% and 9% expected). Interestingly, 17%-24% (depending on the cell line) of SmRfrags are sense-antisense strand pairs that show evidence of overlapping transcription. Only 3.4% and 7.2% of SmRfrags in intergenic regions overlap transcribed fragments (Txfrags) in HeLa and GM06990 cell lines, respectively. We hypothesized that a fraction of the identified SmRfrags corresponded to microRNAs. We tested by Northern blot a set of 15 high-likelihood predictions of microRNA candidates that overlap with SmRfrags and validated three potential microRNAs (~20 nt in length). Notably, most of the remaining candidates showed a larger hybridizing band (~100 nt) that could be a microRNA precursor. The small RNA transcriptome is emerging as an important and abundant component of genome function.
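The "top 1% of hybridization signals" criterion used to define SmRfrags can be sketched as a simple rank cutoff. This is an illustrative assumption about how such a cutoff might be computed, not the authors' method: the function name, the plain list of signal values, and the rounding rule are all hypothetical, and the record does not describe any normalization or probe-level processing.

```python
# Hypothetical helper; the input is assumed to be a flat list of signal values.

def select_top_fraction(signals, fraction=0.01):
    """Return the highest-ranked signals making up the given fraction of the input."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(signals, reverse=True)
    if not ranked:
        return []
    # Keep at least one value so small inputs still return a result.
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k]


# Made-up signal values 1..200; the top 1% is the two largest values.
top_signals = select_top_fraction(range(1, 201))
```

Ties at the cutoff are broken by sort order here; a real analysis would need an explicit rule for them.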