Transcriptomics

Dataset Information

Long-read cDNA sequencing identifies functional pseudogenes in the human transcriptome


ABSTRACT: Pseudogenes are gene copies presumed to mainly be functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. When transcribed, pseudogenes may encode proteins or enact RNA-intrinsic regulatory mechanisms. However, the extent, characteristics and functional relevance of the human pseudogene transcriptome are unclear. Short-read sequencing platforms have limited power to resolve and accurately quantify pseudogene transcripts owing to the high sequence similarity of pseudogenes and their parent genes. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes. Pseudogene transcripts are expressed in tissue-specific patterns, exhibit complex splicing patterns and contribute to the coding sequences of known genes. We survey pseudogene transcripts encoding intact open reading frames (ORFs), representing potential unannotated protein-coding genes, and demonstrate their efficient translation in cultured cells. To assess the impact of noncoding pseudogenes on the cellular transcriptome, we delete the nucleus-enriched pseudogene PDCL3P4 transcript from HAP1 cells and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the transcriptional landscape underpinning human biology and disease.

ORGANISM(S): Homo sapiens

PROVIDER: GSE160383 | GEO | 2021/04/26

REPOSITORIES: GEO

Similar Datasets

2021-08-01 | GSE155510 | GEO
2017-07-26 | GSE101837 | GEO
2021-05-27 | GSE172219 | GEO
2022-02-18 | GSE176018 | GEO
| phs000525 | dbGaP
2010-06-24 | E-GEOD-17191 | biostudies-arrayexpress
2009-07-21 | GSE17191 | GEO
| phs000525.v1.p1 | EGA
2022-10-15 | GSE215459 | GEO
2008-03-28 | GSE10364 | GEO