Genomics

Dataset Information

DevCellPy: A machine learning-enabled pipeline for automated annotation of complex multilayered single-cell transcriptomic data


ABSTRACT: A major informatic challenge in single-cell RNA-sequencing analysis is the precise annotation of datasets where cells exhibit complex multilayered identities or transitory states. Here, we present devCellPy, a highly accurate and precise machine learning-enabled tool for automated prediction of cell types across complex annotation hierarchies. To demonstrate the power of devCellPy, we construct a murine cardiac developmental atlas from published datasets encompassing 104,199 cells from E6.5-E16.5 and train devCellPy to generate a cardiac prediction algorithm. Using this algorithm, we observe a high prediction accuracy (>90%) across multiple layers of annotation and across de novo murine developmental data. Furthermore, we conduct a cross-species prediction of cardiomyocyte subtypes from in vitro-derived human induced pluripotent stem cells and unexpectedly uncover a predominance of left ventricular (LV) identity, which we confirm using an LV-specific TBX5 lineage tracing system. Together, our results show devCellPy to be a powerful tool for automated cell prediction across complex cellular hierarchies, species, and experimental systems.

ORGANISM(S): Mus musculus, Homo sapiens

PROVIDER: GSE184943 | GEO | 2022/08/14

REPOSITORIES: GEO

Similar Datasets

2010-11-01 | E-MTAB-407 | biostudies-arrayexpress
2011-05-01 | E-MTAB-612 | biostudies-arrayexpress
2011-06-24 | E-GEOD-30194 | biostudies-arrayexpress
2022-01-08 | GSE189482 | GEO
2012-10-01 | E-MTAB-1309 | biostudies-arrayexpress
2009-11-24 | GSE15370 | GEO
2010-05-19 | E-GEOD-15370 | biostudies-arrayexpress
2011-06-24 | GSE30194 | GEO
2021-02-17 | GSE164767 | GEO
2019-06-04 | PXD000730 | Pride