Project description: Understanding gene expression diversity across human populations is essential for accurate genome annotation and disease interpretation. However, existing annotations are primarily based on European-derived transcriptomic data, potentially limiting their applicability to other populations. This study aims to assess population-specific transcript diversity and its impact on gene annotation. To achieve this, we performed long-read RNA sequencing on lymphoblastoid cell lines from 43 individuals across eight globally diverse populations. Our workflow included RNA extraction, cDNA synthesis, and sequencing using Oxford Nanopore long-read technology, followed by transcript assembly and comparison with existing gene annotations. We also integrated novel transcripts into reference annotations to evaluate their effect on allele-specific transcript usage detection. This work provides a critical step toward improving transcriptome annotation across diverse populations, ensuring a more comprehensive representation of human genetic variation.
Project description: In Europe, ticks are the most important vectors of diseases threatening humans, livestock, wildlife and companion animals. Nevertheless, genomic sequence information and functional annotation of proteins for the most important European tick, Ixodes ricinus, are limited. Here we present the first analysis of the I. ricinus genome and of the transcriptome of the unfed I. ricinus midgut. We combined and integrated data from genome, transcriptome and proteome. The de novo assembly of 1 billion paired-end sequences identified 6,415 putative genes, providing an unprecedented insight into the I. ricinus genome. Mapping our midgut mRNA reads to the assembled contigs let us estimate that we cover around two thirds of the unique genomic sequences. In addition, more than 10,000 transcripts from naïve midgut were annotated for function and/or localization. By combining an alignment-based with a motif-search-based annotation approach, we could double the number of annotations throughout all groups without shifting the dataset. Moreover, 1,175 proteins expressed in the naïve midgut were identified by mass spectrometry, confirming the high completeness of our transcriptome database, and 608 were significantly annotated for function and/or localization. This multi-omics study vastly extends the publicly available DNA, RNA and protein databases for I. ricinus and ticks in general.