Open Chromatin Guided Interpretable Machine Learning Reveals Cancer-Specific Chromatin Features in Cell-free DNA
ABSTRACT: Cell-free DNAs (cfDNAs) are DNA fragments found in blood, originating mainly from immune cells in healthy individuals and from both immune and cancer cells in cancer patients. While cancer-derived cfDNAs carry mutations, they also retain epigenetic features such as DNA methylation and nucleosome positioning. In this study, we examine nucleosome enrichment patterns in cfDNAs from breast and pancreatic cancer patients and find significant enrichment at open chromatin regions. Differential enrichment is observed not only at cancer-cell-type-specific ATAC-seq peaks but also at CD4+ T-cell-specific peaks, suggesting both tumor- and immune-derived contributions to the cfDNA signal. To leverage these patterns, we apply an interpretable machine learning model (XGBoost) trained on cell-type-specific open chromatin regions. This approach improves cancer detection accuracy and highlights key genomic loci associated with the disease state. Our pipeline provides a robust and interpretable framework for cfDNA-based cancer detection.
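As an illustration of how cfDNA signal at open chromatin regions might be turned into model features, the following is a minimal sketch that counts fragment midpoints inside each peak and scales the count by peak length and library size. It is not the authors' pipeline: the function name `peak_features`, the midpoint-based counting, and the per-kilobase-per-million normalization are all assumptions chosen for illustration. The resulting per-peak values could plausibly serve as input columns for a gradient-boosted classifier such as the XGBoost model named in the abstract.

```python
from bisect import bisect_left
from collections import defaultdict


def peak_features(fragments, peaks):
    """Return {peak_name: fragment midpoints per kb of peak per million fragments}.

    Hypothetical helper for illustration; the normalization is an assumption,
    not the method used in the study.

    fragments: iterable of (chrom, start, end), half-open coordinates.
    peaks: iterable of (chrom, start, end, name), half-open coordinates.
    """
    # Collect fragment midpoints per chromosome and count the library size.
    midpoints = defaultdict(list)
    total = 0
    for chrom, start, end in fragments:
        midpoints[chrom].append((start + end) // 2)
        total += 1

    # Sort once so each peak can be queried by binary search.
    for positions in midpoints.values():
        positions.sort()

    features = {}
    for chrom, start, end, name in peaks:
        positions = midpoints.get(chrom, [])
        # Number of midpoints m with start <= m < end.
        count = bisect_left(positions, end) - bisect_left(positions, start)
        length_kb = (end - start) / 1000
        features[name] = count / length_kb / (total / 1e6) if total else 0.0
    return features
```

Each sample would yield one such feature vector over a fixed peak set (for example, cell-type-specific ATAC-seq peaks), so that samples can be stacked into a matrix with one row per sample and one column per peak.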
ORGANISM(S): Homo sapiens
PROVIDER: GSE279542 | GEO | 2025/09/08
REPOSITORIES: GEO