Project description:The incomplete genome annotation of non-model organisms hampers molecular and proteomic studies. Proteomics informed by transcriptomics (PIT) is suited to non-model organisms because peptides are identified using transcriptomic, not genomic, data. Aedes aegypti is the mosquito vector for the (re-)emerging dengue, chikungunya, yellow fever and Zika viruses. An Ae. aegypti genome sequence is available; however, experimental evidence is lacking for >90% of the Ae. aegypti proteome and for the activity of the transposable elements (TEs) that constitute 50% of the Ae. aegypti genome. We used PIT to characterise the proteome of the Aedes aegypti-derived cell line Aag2. Hotspots of incomplete genome annotation were identified that are not explained by poor sequence and assembly quality. We developed criteria for the characterisation of proteomically active TEs and demonstrate that protein expression does not correlate with a TE’s genomic abundance. Finally, we identify Phasi Charoen-like virus as an unrecognised contaminant of Aag2 cells. We therefore present the first proteomic characterisation of mobile genetic elements, and provide proof-of-principle that PIT can evaluate a genome’s annotation to guide annotation efforts.
Project description:We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long-reads and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb, and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb of raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from three different tissue types from three other squid species (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein-coding genes supported by evidence, and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome.
Project description:This dataset includes RNAseq data of 7 tissues/developmental stages of Lathyrus sativus genotype LSWT11 and 2 tissues with drought- and well-watered treatments of Lathyrus sativus genotypes LS007 and Mahateora. These data were used in the functional annotation pipeline of the Rbp1.0 genome assembly of LS007. The multi-tissue transcriptome was also used to support gene candidate identification by mRNA abundance. Also included is Hi-C sequencing data used to scaffold the assembly into pseudochromosomes.
Project description:In this study, we dissect in detail the glial and immune responses associated with the early stages of cerebral amyloid angiopathy (CAA). To do so, RNAseq gene expression analyses were performed in a mouse model of Familial Danish Dementia (FDD), a neurodegenerative disease characterized by the accumulation of Danish amyloid (ADan) in the vasculature. Findings observed in this CAA mouse model were complemented with primary culture assays.
Project description:A new genome of Fraxinus excelsior (PRJNA865134) was assembled using a hybrid approach combining Nanopore and Illumina data. Gene expression in a panel of 182 Danish trees (Harper et al. 2016) was assessed using the new genome as reference (BioProject PRJNA865134, SAMN30100368, genome JANJPF000000000). Manuscript title: Fraxinus excelsior updated long-read genome reveals the importance of MADS-box genes in tolerance mechanisms against ash dieback, G3: Genes|Genomes|Genetics
Project description:Macaque species share over 93% genome homology with humans and develop many disease phenotypes similar to those of humans, making them valuable animal models for the study of human diseases (e.g., HIV and neurodegenerative diseases). However, the quality of genome assembly and annotation for several macaque species lags behind the human genome effort. To close this gap and enhance functional genomics approaches, we employed a combination of de novo linked-read assembly and scaffolding using a proximity ligation assay (HiC) to assemble the pig-tailed macaque (Macaca nemestrina) genome. This combinatorial method yielded large, chromosome-level scaffolds with a scaffold N50 of 127.5 Mb; the 23 largest scaffolds covered 90% of the entire genome. This assembly revealed large-scale rearrangements between pig-tailed macaque chromosomes 7, 12, and 13 and human chromosomes 2, 14, and 15. We subsequently annotated the genome using transcriptome and proteomics data from personalized induced pluripotent stem cells (iPSCs) derived from the same animal. Reconstruction of the evolutionary tree using whole genome annotation and orthologous comparisons among three macaque species, human and mouse genomes revealed extensive homology between human and pig-tailed macaques with regard to both pluripotent stem cell genes and innate immune gene pathways. Our results confirm that rhesus and cynomolgus macaques exhibit a closer evolutionary distance to each other than either species exhibits to humans or pig-tailed macaques. These findings demonstrate that pig-tailed macaques can serve as an excellent animal model for the study of many human diseases, particularly with regard to pluripotency and innate immune pathways.
Project description:We identified hankyphage prophages within B. thetaiotaomicron isolates gathered from French hospitals. We extracted genomic DNA from an overnight culture from a single colony of each strain and sequenced it by Nanopore sequencing on the Plasmidsaurus platform. This long-read approach aided the assembly of the phages and the determination of the hankyphage ends. We also improved the annotation of the reference hankyphage (hankyphage p00 from P. dorei HM719) using a structural prediction approach and annotated our B. thetaiotaomicron hankyphages using this new annotation. In this project, we upload the raw genomic Nanopore sequencing reads of our hankyphage-bearing B. thetaiotaomicron collection (jmh strains) and the processed, assembled hankyphages.
Project description:Proteomic analysis of six tissues (liver, kidney, blubber, brain, muscle, skin) provided experimental confirmation of 10,402 proteins from 4,711 protein groups, almost 1/3 of the possible predicted proteins in the Atlantic bottlenose dolphin (Tursiops truncatus) NCBI annotation (release 101), which is based on the recently completed NIST Tur_tru v1 genome assembly.