Source: biostudies-literature
Funders: NICHD NIH HHS; NIMH NIH HHS; NINDS NIH HHS; NIH HHS
Abstract: Autism spectrum disorders (ASDs) are genetically and phenotypically heterogeneous, and the majority of cases remain genetically unresolved. To better understand large-effect pathogenic variation, we generated long-read sequencing data to construct phased and near-complete genome assemblies (average contig N50=43 Mbp, QV=56) for 189 individuals from 51 families with unsolved cases of autism. We applied read- and assembly-based strategies to facilitate comprehensive characterization of de novo mutations (DNMs), structural variants (SVs), and DNA methylation profiles. Merging common SVs obtained from long-read pangenome controls, we efficiently filtered >97% of common SVs exclusive to 87 offspring. We find no evidence of increased autosomal SV burden for probands when compared to unaffected siblings, but note a trend toward increased SV burden on the X chromosome among affected females. We establish a workflow to prioritize potential pathogenic variants by integrating autism risk genes and putative noncoding regulatory elements defined from ATAC-seq and CUT&Tag data from the developing cortex. In total, we identified three pathogenic variants in TBL1XR1, MECP2, and SYNGAP1, as well as nine candidate de novo and biparental homozygous SVs, most of which were missed by short-read sequencing.
Our work highlights the potential of phased genomes to discover more complex pathogenic mutations and the power of the pangenome to restrict the focus to a progressively smaller number of SVs for clinical evaluation.
Journal: medRxiv : the preprint server for health sciences, 2025.07.21.25331932
URL: https://www.ebi.ac.uk/biostudies/studies/S-EPMC12330438
Title: Pangenome discovery of missing autism variants.
PMCID: PMC12330438
Grants: R01 NS057819; F32 HD116501; DP5 OD033357; R01 MH101221
Authors: Koundinya N; Wu M; Nowakowski TJ; Karunakaran D; Human Pangenome Reference Consortium (HPRC); Hennick K; Cherry F; Polo Prieto RA; Wang T; Kwon Y; Wong I; Suter B; Wertz J; Hoekzema K; Chakravarti A; Sui Y; Garcia GH; Berk-Rauch H; Munson KM; Pehlivan D; Zoghbi HY; Meyer-Schuman R; Harvey WT; Miller DE; Eichler EE; Lin J; Noyes MD; Gustafson JA; Knuth J
Published: 2025 Jul
Record timestamps: 2025-01-01T00:00:00Z; 2026-04-07T13:43:39.649Z; 2026-04-07T13:34:18.821Z
Identifiers: S-EPMC12330438; 40778144; 10.1101/2025.07.21.25331932