Project description:Genome graphs, including the recently released draft human pangenome graph, can represent the breadth of genetic diversity and thus transcend the limits of traditional linear reference genomes. However, there are no genome-graph-compatible tools for analyzing whole genome bisulfite sequencing (WGBS) data. To close this gap, we introduce methylGrapher, a tool tailored for accurate DNA methylation analysis by mapping WGBS data to a genome graph. Notably, methylGrapher can reconstruct methylation patterns along haplotype paths precisely and efficiently. To demonstrate the utility of methylGrapher, we analyzed the WGBS data derived from five individuals whose genomes were included in the first Human Pangenome draft as well as WGBS data from ENCODE (EN-TEx). Along with standard performance benchmarking, we show that methylGrapher fully recapitulates DNA methylation patterns defined by classic linear genome analysis approaches. Importantly, methylGrapher captures a substantial number of CpG sites that are missed by linear methods, and improves overall genome coverage while reducing alignment reference bias. Thus, methylGrapher is a first step towards unlocking the full potential of Human Pangenome graphs in genomic DNA methylation analysis.
Project description:This dataset comprises spatial multi-omic profiling of malignant pleural mesothelioma (MPM) tissue, enabling simultaneous measurement of transcriptomic and protein expression at spatial resolution. To address the lack of spatially resolved proteomic measurements in standard spatial transcriptomics (ST) technologies, we generated this in-house MPM spatial transcriptomics and proteomics dataset and used it to train and evaluate DGAT (Dual-Graph Attention Network), a deep learning framework for imputing protein expression from ST data. DGAT integrates transcriptomic, proteomic, and spatial information using graph attention networks and jointly reconstructs mRNA and protein profiles through multi-task learning. This dataset serves both as a biological resource for understanding the tumor immune microenvironment in MPM and as a benchmark for spatial protein inference. DGAT’s predictions on this dataset enabled improved identification of immune phenotypes and tumor–stroma spatial organization, with implications for biomarker discovery and therapeutic targeting in mesothelioma. Spatial transcriptomics (ST) technologies provide genome-wide mRNA profiles in tissue context but lack direct protein-level measurements, which are critical for interpreting cellular function and microenvironmental organization. We present DGAT (Dual-Graph Attention Network), a deep learning framework that imputes spatial protein expression from transcriptomics-only ST data by learning RNA–protein relationships from spatial CITE-seq datasets. DGAT constructs heterogeneous graphs integrating transcriptomic, proteomic, and spatial information, encoded using graph attention networks. Task-specific decoders reconstruct mRNA and predict protein abundance from a shared latent representation.
Project description:Recent advances in chromatin architecture profiling technologies, such as single-cell Hi-C (scHi-C), allow us to dissect the heterogeneity of higher-order chromosome structures across diverse cell states and different individuals. However, scHi-C experiments are still expensive and not immediately available for population-scale profiling. Here, we present scENCORE, a computational method that reconstructs personalized and cell-type-specific higher-order chromatin structures, such as A/B compartments, directly from more cost-effective and widely available single-cell epigenetic data (e.g., scATAC-seq). We apply scENCORE to scATAC-seq data from post-mortem prefrontal cortex tissue and demonstrate its utility to 1) project megabase-scale chromatin regions into a lower-dimensional space by leveraging graph embedding techniques based on cell-type-specific co-variability patterns, 2) define A/B compartments via unsupervised clustering, and 3) align multi-graph embeddings to derive comparable chromatin representations and highlight dynamic chromatin compartments across cell states and individuals. Validated by Hi-C experiments using FACS-sorted cells, scENCORE can faithfully reconstruct cell-type-specific chromatin compartments. Furthermore, scENCORE uniformly constructs chromosome conformation across population-scale scATAC-seq data and discovers key 3D structural switching events associated with psychiatric disorders. In summary, scENCORE allows cost-effective, cell-type-specific, and personalized reconstruction that delineates higher-order chromatin structures.
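The general idea of deriving compartment-like groups from co-variability can be illustrated with a minimal sketch. This is not scENCORE's implementation: it swaps the graph embedding and clustering steps for a plain eigen-decomposition of a bin-by-bin correlation matrix, and the function name `call_compartments`, the input layout (cells by genomic bins), and the convention of labeling the more accessible group "A" are all assumptions made for illustration.

```python
import numpy as np


def call_compartments(accessibility):
    """Toy A/B assignment from per-bin accessibility of one cell type.

    accessibility: array of shape (n_cells, n_bins); hypothetical input layout.
    Returns the leading eigenvector (one value per bin) and "A"/"B" labels.
    """
    accessibility = np.asarray(accessibility, dtype=float)

    # Co-variability of genomic bins across cells (bins are columns).
    corr = np.corrcoef(accessibility, rowvar=False)
    corr = np.nan_to_num(corr)  # bins with zero variance yield NaN entries

    # One-dimensional embedding: eigenvector of the largest eigenvalue.
    # eigh returns eigenvalues in ascending order, so take the last column.
    _, eigvecs = np.linalg.eigh(corr)
    pc1 = eigvecs[:, -1]

    # The eigenvector sign is arbitrary; orient it so that positive values
    # track higher mean accessibility (assumed here to indicate "A").
    mean_acc = accessibility.mean(axis=0)
    if np.corrcoef(pc1, mean_acc)[0, 1] < 0:
        pc1 = -pc1

    # Two-group partition by sign, standing in for unsupervised clustering.
    labels = np.where(pc1 > 0, "A", "B")
    return pc1, labels


# Synthetic example: 5 open bins that co-vary, 5 closed bins that vary oppositely.
rng = np.random.default_rng(0)
factor = rng.normal(size=200)
open_bins = 5.0 + factor[:, None] + 0.1 * rng.normal(size=(200, 5))
closed_bins = 1.0 - factor[:, None] + 0.1 * rng.normal(size=(200, 5))
pc1, labels = call_compartments(np.hstack([open_bins, closed_bins]))
```

In the synthetic example the first five bins and the last five bins fall into separate groups because they are anti-correlated across cells; real scATAC-seq data would need normalization and a more robust embedding than this sketch provides.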
Project description:We present MultiEditR, the first algorithm specifically designed to detect and quantify RNA editing from Sanger sequencing (z.umn.edu/multieditr). Although RNA editing is routinely evaluated by measuring peak heights in Sanger sequencing traces, the accuracy and precision of this approach have yet to be evaluated against gold-standard next-generation sequencing methods. Through a comprehensive comparison to RNA-seq and amplicon-based deep sequencing, we show that MultiEditR is accurate, precise, and reliable for detecting endogenous and programmable RNA editing.
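The routine peak-height estimate mentioned above can be written down in a few lines. This is a sketch of the naive calculation only, not of MultiEditR's algorithm; the function name `editing_fraction` and the dictionary layout of peak heights are assumptions made for illustration.

```python
def editing_fraction(peak_heights, edited_base):
    """Naive editing estimate at one trace position.

    peak_heights: mapping of base ("A", "C", "G", "T") to peak height at the
        position of interest (hypothetical input layout).
    edited_base: the base that reports the edited state, e.g. "G" for
        A-to-I editing, which is read as G after reverse transcription.
    """
    total = sum(peak_heights.values())
    if total <= 0:
        raise ValueError("no signal at this position")
    # Fraction of total signal carried by the edited base.
    return peak_heights.get(edited_base, 0) / total


# Example: an A-to-I site where the G peak carries a quarter of the signal.
fraction = editing_fraction({"A": 600, "C": 0, "G": 200, "T": 0}, "G")
print(fraction)  # 0.25
```

This estimate treats all signal at the position as real and ignores background noise in the trace, which is one reason its accuracy needs to be checked against deep sequencing.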