Project description: Protein-coding small open reading frames (smORFs) are emerging as an important class of genes; however, the coding capacity of smORFs in the human genome is unclear. By integrating de novo transcriptome assembly and Ribo-Seq, we confidently annotate thousands of novel translated smORFs in three human cell lines. We find that prediction of smORF translation is noisier than that of annotated coding sequences, underscoring the importance of analyzing multiple experiments and footprinting conditions. These smORFs are located within non-coding and antisense transcripts, the UTRs of mRNAs, and unannotated transcripts. Analysis of RNA levels and translation efficiency during cellular stress identifies regulated smORFs and provides an approach for selecting smORFs for further investigation. Sequence conservation and signatures of positive selection indicate that the encoded microproteins are likely functional. Additionally, proteomics data from enriched human leukocyte antigen complexes validate the translation of hundreds of smORFs and position them as a source of novel antigens. Thus, smORFs represent a significant number of important, yet unexplored, human genes.
Project description: Open reading frames (ORFs) are genomic DNA sequences that have the potential to be translated. Genome annotation pipelines dismiss translation products of small ORFs (smORFs) of less than 100 codons (<300 nucleotides) as unlikely to have a biological function. In this study, we systematically characterized smORFs in mouse B and T cells under different conditions and predicted a total of 5744 unique actively translated smORFs. We then extended our analysis to ORFs of 101-200 codons in length and predicted 945 such longer translation products. Our results also suggest the existence of candidate secreted micropeptides. Verifying the existence of these translation products and identifying their functions will be essential and could lead to useful applications.
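The length cutoff in the description above (smORFs of less than 100 codons, with a second class of 101-200 codons) can be illustrated with a minimal sketch. This is not the pipeline used in the study, which predicts translation from ribosome profiling data. The function name `find_orfs_below`, the restriction to ATG start codons, the forward-strand-only scan, and the choice to count the start codon but not the stop codon are all assumptions made for illustration.

```python
# Hypothetical illustration of a codon-length cutoff for ORFs.
# Assumptions: ATG start codons only, forward strand only,
# codon count includes the start codon and excludes the stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs_below(seq, max_codons=100):
    """Return (start, end, n_codons) for ORFs with fewer than max_codons codons.

    start is the 0-based index of the A in ATG; end is the index just
    past the stop codon.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG in the current open stretch
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons < max_codons:
                    hits.append((start, i + 3, n_codons))
                start = None  # look for the next ORF in this frame
    return hits

short_hits = find_orfs_below("CCATGAAATTTTAGGG")
long_seq = "ATG" + "AAA" * 100 + "TAA"  # 101 codons before the stop
long_hits_strict = find_orfs_below(long_seq)
long_hits_relaxed = find_orfs_below(long_seq, max_codons=201)
```

With the default cutoff the 101-codon ORF is skipped, and raising `max_codons` to 201 recovers it, mirroring the extension to ORFs of 101-200 codons mentioned above. A sequence scan like this only enumerates candidates; evidence of translation would still have to come from the profiling data.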
Project description: We present a genome-wide assessment of small open reading frame (smORF) translation by ribosomal profiling of polysomal fractions in Drosophila S2 cells. In this way, mRNAs bound by multiple ribosomes, and hence actively translated, can be isolated and distinguished from mRNAs bound by sporadic, putatively non-productive single ribosomes or ribosomal subunits. Ribosomal profiling of large and small polysomal fractions in Drosophila S2 cells to assess translation of smORFs.
Project description: We present a genome-wide assessment of small open reading frame (smORF) translation by ribosomal profiling of polysomal fractions in Drosophila S2 cells. In this way, mRNAs bound by multiple ribosomes, and hence actively translated, can be isolated and distinguished from mRNAs bound by sporadic, putatively non-productive single ribosomes or ribosomal subunits.
Project description: Ribosome profiling has revealed pervasive but largely uncharacterized translation outside of canonical coding sequences (CDSs). Here, we exploit a systematic CRISPR-based screening strategy to identify hundreds of non-canonical CDSs that are essential for cellular growth and whose disruption elicits specific, robust transcriptomic and phenotypic changes in human cells. Functional characterization of the encoded microproteins reveals distinct cellular localizations, specific protein binding partners, and hundreds that are presented by the HLA system. Interestingly, we find multiple microproteins encoded in upstream open reading frames that form stable complexes with the main, canonical protein encoded on the same mRNA, thus revealing the diverse use of functional bicistronic operons in mammals. Together, our results point to a family of functional human microproteins that play critical and diverse cellular roles.