Dataset Information

Long-read transcriptomics of a diverse human cohort reveals widespread ancestry bias in gene annotation


ABSTRACT: Understanding gene expression diversity across human populations is essential for accurate genome annotation and disease interpretation. However, existing annotations are primarily based on European-derived transcriptomic data, potentially limiting their applicability to other populations. This study aims to assess population-specific transcript diversity and its impact on gene annotation. To achieve this, we performed long-read RNA sequencing on lymphoblastoid cell lines from 43 individuals across eight globally diverse populations. Our workflow included RNA extraction, cDNA synthesis, and sequencing using Oxford Nanopore long-read technology, followed by transcript assembly and comparison with existing gene annotations. We also integrated novel transcripts into reference annotations to evaluate their effect on allele-specific transcript usage detection. This work provides a critical step toward improving transcriptome annotation across diverse populations, ensuring a more comprehensive representation of human genetic variation. For each sample, we provide here unprocessed, unaligned BAMs (basecalled only; uploaded file names end with *.bam) along with preprocessed reads (ONT duplex read resolution, chimeric read splitting, UMI-based deduplication, adapter trimming, and basecalling quality >= 10; uploaded file names end with *preprocessed_Q10.fastq.gz).
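
The quality threshold mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the submitters' pipeline: the record does not say how the ">= 10" cutoff was computed, so the sketch assumes a per-read mean quality obtained by averaging error probabilities (a common convention for nanopore reads) and a Phred+33 encoding. The function names `mean_read_quality` and `passes_quality` are hypothetical.

```python
import math


def mean_read_quality(quality_string: str, phred_offset: int = 33) -> float:
    """Return the mean Phred quality of one read.

    Assumption: the mean is taken over per-base error probabilities and
    converted back to the Phred scale, rather than averaging the Phred
    scores directly. An empty quality string is treated as quality 0.
    """
    if not quality_string:
        return 0.0
    # Convert each quality character to an error probability: p = 10^(-Q/10).
    error_probs = [
        10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string
    ]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


def passes_quality(quality_string: str, min_quality: float = 10.0) -> bool:
    """True if the read's mean quality meets the (assumed) Q10 threshold."""
    return mean_read_quality(quality_string) >= min_quality
```

Under this convention a few low-quality bases pull the read mean down sharply: a read whose bases are half Q20 and half Q2 has a mean quality below 5, so it would not pass a Q10 filter even though the arithmetic mean of its Phred scores is 11.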

INSTRUMENT(S): GridION

ORGANISM(S): Homo sapiens

SUBMITTER: Pau Clavell-Revelles 

PROVIDER: E-MTAB-14935 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Similar Datasets

2024-11-06 | E-MTAB-14562 | biostudies-arrayexpress
2023-12-20 | E-MTAB-13063 | biostudies-arrayexpress
2017-01-20 | GSE93848 | GEO
2022-11-12 | PXD034107 | Pride
2021-05-17 | PXD025486 | Pride
2016-10-26 | PXD005034 | Pride
2022-01-08 | GSE189482 | GEO
| PRJNA1311745 | ENA
2003-10-30 | GSE636 | GEO
2022-04-27 | E-MTAB-8436 | biostudies-arrayexpress