Project description:Constructing high-quality haplotype-resolved genome assemblies has substantially improved the ability to detect and characterize genetic variants. A targeted approach that provides ready access to the rich information in haplotype-resolved genome assemblies will appeal to basic researchers and medical scientists focused on specific genomic regions. Here, using the 4.5-megabase, notoriously difficult-to-assemble major histocompatibility complex (MHC) region as an example, we demonstrate an approach for constructing a haplotype-resolved assembly of a targeted genomic region using CRISPR-based enrichment. Compared with haplotype-resolved genome assembly, our targeted approach achieved comparable completeness and accuracy while reducing computing complexity, sequencing cost, and the amount of starting material. Moreover, using the targeted, assembled personal MHC haplotypes as the reference both improves quantification accuracy for sequencing data and enables allele-specific functional genomics analyses of the MHC region. Given its highly efficient use of resources, our approach can greatly facilitate population genetic studies of targeted regions and may pave a new way to elucidate the molecular mechanisms of disease etiology.
Project description:Airbreathing catfish are stenohaline freshwater fish capable of breathing atmospheric oxygen and of withstanding a wide range of environmental conditions and farming practices. This unique ability has enabled them to thrive in semi-terrestrial habitats. However, the genomic mechanisms underlying their adaptation to adverse ecological environments remain largely unexplored, primarily due to the limited availability of high-quality genomic resources. Here, we present a haplotype-resolved and near telomere-to-telomere (T2T) genome assembly of the African catfish (Clarias gariepinus), generated using Oxford Nanopore, PacBio HiFi, Illumina, and Hi-C sequencing technologies. The primary assembly spans 969.62 Mb in only 47 contigs, with a contig N50 of 33.71 Mb. Terminal telomeric signals were detected in 22 of the 47 contigs, suggesting T2T-assembled chromosomes. BUSCO analysis confirmed gene space completeness of 99% against the Actinopterygii dataset, highlighting the high quality of the assembly. Genome annotation identified 25,655 protein-coding genes and estimated that repetitive elements make up 43.94% of the genome. These data provide valuable genomic resources to advance aquaculture practices and to explore the genomic underpinnings of the ecological resilience of airbreathing catfish and related teleosts.
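The contig N50 quoted above is a contiguity statistic: the length of the contig at which the running total, taken over contigs sorted from longest to shortest, first reaches half of the total assembly size. The sketch below shows this conventional definition in Python. The function name `n50` and the toy contig lengths are hypothetical illustrations, not the study's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Conventional definition: sort contigs from longest to shortest and
    return the length of the contig at which the cumulative sum first
    reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not the catfish contigs.
toy_contigs = [40, 30, 20, 10]
print(n50(toy_contigs))  # 30: 40 + 30 = 70 is the first running sum >= 50
```

Applied to the real contig lengths of an assembly, the same calculation would yield figures of the kind reported in the description.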
Project description:Allelic differences between the two sets of chromosomes can affect the propensity of inheritance in humans, but the extent of such differences in the human genome has yet to be fully explored. Here, we delineate allelic chromatin modifications and transcriptomes among a broad set of human tissues, enabled by a chromosome-spanning haplotype reconstruction strategy (ref. 1). The resulting haplotype-resolved epigenomic maps reveal extensive allelic bias in the transcription of human genes as well as in chromatin state, allowing us to infer cis-regulatory relationships between genes and their control sequences. These maps also uncover a new class of cis-regulatory elements and detail the activities of repetitive elements in various human tissues. The rich datasets described here will enhance our understanding of the mechanisms controlling tissue-specific gene expression programs. One replicate of the Hi-C experiment was performed in each of four human tissues from four different individuals (thymus, STL001; aorta, STL002; left ventricle, STL003; liver, STL011).