Dataset Information

Small open reading frames: a comparative genetics approach to validation.


ABSTRACT: Open reading frames (ORFs) with fewer than 100 codons are generally not annotated in genomes, although bona fide genes of that size are known. Newer biochemical studies have suggested that thousands of small protein-coding ORFs (smORFs) may exist in the human genome, but the true number and the biological significance of the micropeptides they encode remain uncertain. Here, we used a comparative genomics approach to identify high-confidence smORFs that are likely protein-coding. We identified 3,326 high-confidence smORFs using constraint within human populations and evolutionary conservation as additional lines of evidence. Next, we validated that, as a group, our high-confidence smORFs are conserved at the amino-acid level rather than merely residing in highly conserved non-coding regions. Finally, we found that high-confidence smORFs are enriched among disease-associated variants from GWAS. Overall, our results highlight that smORF-encoded peptides likely have important functional roles in human disease.

SUBMITTER: Jain N 

PROVIDER: S-EPMC10152738 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature

Publications

Small open reading frames: a comparative genetics approach to validation.

Jain N, Richter F, Adzhubei I, Sharp AJ, Gelb BD

BMC Genomics, 2023-05-01, issue 1


Similar Datasets

Date | Accession | Repository
- | S-EPMC11602987 | biostudies-literature
2021-04-28 | GSE154491 | GEO
2019-07-03 | GSE125218 | GEO
2014-09-11 | E-GEOD-60384 | biostudies-arrayexpress
- | S-EPMC3334604 | biostudies-literature
- | S-EPMC7085969 | biostudies-literature
2014-09-11 | GSE60384 | GEO
2020-03-14 | GSE131650 | GEO
- | S-EPMC7856248 | biostudies-literature
- | S-EPMC4359375 | biostudies-literature