
Dataset Information


Digital profiling of cancer transcriptomes from histology images with grouped vision attention.


ABSTRACT: Cancer is a heterogeneous disease that demands precise molecular profiling for better understanding and management. Recently, deep learning has shown potential for cost-efficient prediction of molecular alterations from histology images. While transformer-based deep learning architectures have enabled significant progress in non-medical domains, their application to histology images remains limited due to small dataset sizes coupled with the explosion of trainable parameters. Here, we develop SEQUOIA, a transformer model to predict cancer transcriptomes from whole-slide histology images. To enable the full potential of transformers, we first pre-train the model using data from 1,802 normal tissues. Then, we fine-tune and evaluate the model in 4,331 tumor samples across nine cancer types. The prediction performance is assessed at the individual gene level and the pathway level through Pearson correlation analysis and root mean square error. The generalization capacity is validated across two independent cohorts comprising 1,305 tumors. In predicting the expression levels of 25,749 genes, the highest performance is observed in cancers of the breast, kidney and lung, where SEQUOIA accurately predicts the expression of 11,069, 10,086 and 8,759 genes, respectively. The accurately predicted genes are associated with the regulation of inflammatory response, the cell cycle and metabolism. While the model is trained at the tissue level, we showcase its potential in predicting spatial gene expression patterns using spatial transcriptomics datasets. Leveraging the prediction performance, we develop a digital gene expression signature that predicts the risk of recurrence in breast cancer. SEQUOIA deciphers clinically relevant gene expression patterns from histology images, opening avenues for improved cancer management and personalized therapies.
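
The gene-level evaluation named in the abstract (Pearson correlation and root mean square error per gene) can be illustrated with a minimal sketch. This is not the authors' code: the function name `per_gene_metrics`, the (samples x genes) array layout, and the handling of constant genes are assumptions made for illustration, and the criterion the paper uses to call a gene "accurately predicted" is not stated in this record.

```python
import numpy as np


def per_gene_metrics(y_true, y_pred):
    """Pearson r and RMSE for each gene (hypothetical helper).

    Both inputs are assumed to have shape (n_samples, n_genes), with
    measured expression in y_true and model output in y_pred.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")

    # Center each gene (column) before computing the correlation.
    t = y_true - y_true.mean(axis=0)
    p = y_pred - y_pred.mean(axis=0)
    denom = np.sqrt((t ** 2).sum(axis=0) * (p ** 2).sum(axis=0))

    # Genes with zero variance in either array have no defined
    # correlation; they are reported as NaN rather than dropped.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.where(denom > 0, (t * p).sum(axis=0) / denom, np.nan)

    rmse = np.sqrt(((y_true - y_pred) ** 2).mean(axis=0))
    return r, rmse
```

Calling `per_gene_metrics(measured, predicted)` returns two arrays of length `n_genes`, which could then be summarized per cancer type or aggregated over the genes of a pathway.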

SUBMITTER: Zheng Y 

PROVIDER: S-EPMC10557714 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature


Publications

Digital profiling of cancer transcriptomes from histology images with grouped vision attention.

Yuanning Zheng, Marija Pizurica, Francisco Carrillo-Perez, Humaira Noor, Wei Yao, Christian Wohlfart, Kathleen Marchal, Antoaneta Vladimirova, Olivier Gevaert

bioRxiv: the preprint server for biology, 2024-01-19



Similar Datasets

S-EPMC11564640 | biostudies-literature
EGAD00010001911 | EGA
S-EPMC8522581 | biostudies-literature
S-EPMC10771481 | biostudies-literature
S-EPMC10498032 | biostudies-literature
GSE287979 | GEO | 2025-01-29
S-EPMC7943565 | biostudies-literature
S-EPMC11761186 | biostudies-literature
S-EPMC3819321 | biostudies-literature
S-EPMC10536945 | biostudies-literature