Project description: We applied Single Molecule Real-Time long-read whole-genome sequencing to a Dux knockout mouse and confirmed the success of our Dux knockout mouse model.
Project description: This repository contains whole-genome long-read sequencing data generated with Oxford Nanopore Technologies from a mouse of the Four Core Genotypes cross. Crossing mice that carry both a deletion of the sex-determining factor Sry on the Y chromosome and a transgenic insertion of Sry on Chromosome 3 generates four combinations of gonadal sex (testes or ovaries) and chromosomal sex (XX or XY): XYSry-Chr3Sry+ (gonadal and chromosomal males), XYSry- (gonadal females, chromosomal males), XX (gonadal and chromosomal females), and XXChr3Sry+ (gonadal males, chromosomal females). The transgenes were on a C57BL/6 genetic background, which was crossed with a CAST/EiJ female to allow the parental haplotypes to be distinguished. DNA sequencing was performed on a liver sample of the XYSry+ genotype.
Project description: Evaluation of short-read-only, long-read-only, and hybrid assembly approaches on metagenomic samples, demonstrating how they affect gene and protein prediction, which is relevant for downstream functional analyses. For a human gut microbiome sample, we use complementary metatranscriptomic and metaproteomic data to evaluate the metagenomic-based protein predictions.
Project description: Recent studies have demonstrated that the non-coding genome can produce unannotated proteins that act as antigens and induce an immune response. One major source of this activity is the aberrant epigenetic reactivation of transposable elements (TEs). In tumors, TEs often provide cryptic or alternate promoters, which can generate transcripts that encode tumor-specific unannotated proteins. Thus, TE-derived transcripts have the potential to produce tumor-specific, but recurrent, antigens shared among many tumors. Identification of TE-derived tumor antigens holds the promise of improving cancer immunotherapy approaches; however, current genomics and computational tools are not optimized for their detection. Here we combined CAGE technology with full-length long-read transcriptome sequencing (Long-Read CAGE, or LRCAGE) and developed a suite of computational tools to significantly improve immunopeptidome detection by incorporating TE-derived and other tumor transcripts into the proteome database. By applying our methods to data from the human lung cancer cell line H1299, we demonstrated that long-read technology significantly improves mapping of promoters with low mappability scores and that LRCAGE guarantees accurate construction of uncharacterized 5’ transcript structure. Unannotated peptides predicted from newly characterized transcripts were readily detectable in whole-cell lysate mass-spectrometry data. Incorporating unannotated peptides into the proteome database enabled us to detect non-canonical antigens in HLA-pulldown LC-MS/MS data. Finally, we showed that epigenetic treatment increased the number of non-canonical antigens, particularly those encoded by TE-derived transcripts, which might expand the pool of targetable antigens for cancers with low mutational burden.