Project description:High-content cellular imaging, transcriptomics, and proteomics data provide rich and complementary views on the molecular layers of biology that influence cellular states and function. However, the biological determinants through which changes in multi-omics measurements influence cellular morphology have not yet been systematically explored, and the degree to which cell imaging could enable the prediction of multi-omics directly from cell imaging data is therefore currently unclear. Here, we address the question of whether it is possible to predict bulk multi-omics measurements directly from cell images using Image2Omics -- a deep learning approach that predicts multi-omics in a cell population directly from high-content images stained with multiplexed fluorescent dyes. We perform an experimental evaluation in gene-edited macrophages derived from human induced pluripotent stem cells (hiPSCs) under multiple stimulation conditions and demonstrate that Image2Omics achieves significantly better performance in predicting transcriptomics and proteomics measurements directly from cell images than predictors based on the mean observed training set abundance. We observed significant predictability of abundances for 5903 (22.43%; 95% CI: 8.77%, 38.88%) and 5819 (22.11%; 95% CI: 10.40%, 38.08%) transcripts out of 26137 in M1- and M2-stimulated macrophages respectively and for 1933 (38.77%; 95% CI: 36.94%, 39.85%) and 2055 (41.22%; 95% CI: 39.31%, 42.42%) proteins out of 4986 in M1- and M2-stimulated macrophages respectively. Our results show that some transcript and protein abundances are predictable from cell imaging and that cell imaging may, in some settings and depending on the mechanisms of interest and desired performance threshold, even be a scalable and resource-efficient substitute for multi-omics measurements.
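The comparison against a predictor based on the mean observed training set abundance can be illustrated with a minimal sketch. The abundance values, the model predictions, and the choice of mean squared error as the metric are all assumptions made for illustration; the description above does not state which error metric or statistical test was used.

```python
def mean_baseline(train_values):
    """Baseline predictor: always predict the mean training abundance."""
    return sum(train_values) / len(train_values)


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted abundances."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical abundances for a single transcript (not data from the study).
train_abundance = [2.0, 4.0, 6.0]
test_abundance = [3.0, 5.0]
model_pred = [3.5, 4.5]  # hypothetical image-based predictions

baseline = mean_baseline(train_abundance)
baseline_mse = mse(test_abundance, [baseline] * len(test_abundance))
model_mse = mse(test_abundance, model_pred)

# A transcript counts as "predictable" here only if the model beats the baseline.
beats_baseline = model_mse < baseline_mse
```

In practice such a comparison would be repeated per transcript or protein and accompanied by a significance test, which this sketch omits.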
Project description:Accurate prediction of cell cycle phases is important in understanding how certain diseases (e.g. cancer) develop and respond to treatment. It is also essential for reducing confounding effects in single-cell RNA sequencing (scRNA-seq) data analysis. We evaluated both traditional machine learning techniques and deep learning approaches for predicting cell cycle phases from scRNA-seq data. Models were trained on consensus predictions from four popular cell cycle analysis tools, namely, CellCycleScore, ccAF, Revelio, and Tricycle, then applied to unlabeled scRNA-seq data generated from human leukemia cell lines. We evaluated model performance using publicly available datasets, GSE146773 (human leukemia cells, 1,151 cells) and GSE64016 (human embryonic stem cells, 213 cells), both experimentally labeled with their respective cell cycle phases. In addition to the four tools, we assessed traditional machine learning models (AdaBoost, Random Forest, and LightGBM) and deep learning models, including multiple dense networks (DNN 2-5), CNN, Hybrid CNN-Dense, Feature embedding, and ensemble models. DNN 3 achieved the highest accuracy, reaching 75.16% on the GSE146773 dataset and 70.85% on GSE64016. This outperformed existing tools such as Revelio (68.34% on GSE146773, 68.2% on GSE64016), CellCycleScore (66.54% on GSE146773, 53.84% on GSE64016), ccAF (45% on GSE146773, 40% on GSE64016), ccAFv2 (47.23% on GSE146773, 43.72% on GSE64016), Tricycle (53.23% on GSE146773, 58.7% on GSE64016), and Cyclum (38.52% on GSE146773, 26.3% on GSE64016). These results indicate the potential of deep learning models for robust cell cycle phase prediction in scRNA-seq data.
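Training on consensus predictions from four tools can be sketched as follows. The description does not say how the consensus was formed, so the majority-vote rule, the `min_agree` threshold, the phase names, and the per-cell calls below are all hypothetical.

```python
from collections import Counter


def consensus_label(labels, min_agree=3):
    """Return the phase that at least `min_agree` tools agree on, else None."""
    phase, count = Counter(labels).most_common(1)[0]
    return phase if count >= min_agree else None


# Hypothetical per-cell phase calls from four tools (not data from the study).
calls = {
    "cell_1": ["G1", "G1", "G1", "S"],
    "cell_2": ["S", "G2M", "G1", "S"],
    "cell_3": ["G2M", "G2M", "G2M", "G2M"],
}

# Keep only cells with a sufficiently supported consensus as training labels.
training_labels = {}
for cell, tool_calls in calls.items():
    phase = consensus_label(tool_calls)
    if phase is not None:
        training_labels[cell] = phase
```

Cells without sufficient agreement are dropped in this sketch, which trades training set size for cleaner labels; a different threshold or tie-breaking rule would be equally plausible.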
Project description:Post-translational modification (PTM) of proteins regulates cellular proteostasis by expanding protein functional diversity. This naturally leads to increased proteome complexity as the result of PTM crosstalk. Here, we used a heavily modified molecular chaperone, Heat shock protein-90 (Hsp90), to investigate this concept. Hsp90 is at the hub of proteostasis and cellular signaling networks in cancer and is, therefore, an attractive therapeutic target in cancer. We showed that deletion of HDAC3 and HDAC8 in human cells led to increased binding of Hsp90 to both ATP and drugs. When bound to its ATP-competitive inhibitor, Hsp90 from both HDAC3 and HDAC8 knockout human cells exhibited similar PTMs, mainly phosphorylation and acetylation, and created a common proteomic network signature. We used both a deep-learning artificial intelligence (AI) prediction model and data based on mass-spectrometry analysis of Hsp90 isolated from mammalian cells and bound to its drugs to decipher PTM crosstalk. The alignment of data from both methods demonstrates that the deep-learning prediction model offers a highly efficient and rapid approach for deciphering PTM crosstalk on complex proteins such as Hsp90.
Project description:Prediction of protein localization plays an important role in understanding protein function and mechanism. A deep learning-based localization prediction tool (“MULocDeep”) assessing each amino acid’s contribution to the localization process provides insights into the mechanism of protein sorting and localization motifs. A dataset with 45 sub-organellar localization annotations under 10 major sub-cellular compartments was produced and the tool was tested on an independent dataset of mitochondrial proteins that were extracted from Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots, and analyzed by shotgun mass spectrometry.
Project description:Fusarium head blight (FHB) incited by Fusarium graminearum Schwabe is a devastating disease of barley and other cereal crops worldwide. Fusarium head blight is associated with trichothecene mycotoxins such as deoxynivalenol (DON), and contaminated grains are unfit for the malting or animal feed industries. While genetically resistant cultivars offer the best economic and environmentally responsible means to mitigate disease, parent lines with adequate resistance are limited in barley. Resistance breeding based upon quantitative genetic gains has been slow to date, due to the intensive labour requirements of disease nurseries. The development of high-throughput genome-wide molecular markers allows their application in genomic prediction models. A diverse genomic panel consisting of 400 two-row spring barley lines was assembled to focus on Canadian barley breeding programs. The panel was evaluated for FHB and DON content in three environments and over two years. Moreover, it was genotyped using an Illumina Infinium HTS iSelect custom beadchip array of single nucleotide polymorphic molecular markers (50K SNP), where over 23K molecular markers were polymorphic. Genomic prediction has been successfully demonstrated for reducing FHB and DON content in cereals using various statistically-based models of different underlying assumptions. Herein, we have studied an alternative method based on machine learning and compared it with a statistical approach. Two encoding techniques were utilized (categorical or Hardy-Weinberg frequencies), followed by selecting essential genomic markers for phenotype prediction. Subsequently, we applied a transformer-based deep learning algorithm to predict FHB and DON. Apart from the transformer method, we also implemented a Residual Fully Connected Neural Network (RFCNN). Pearson correlation coefficients were calculated to compare true vs. predicted outputs. Under most model scenarios, the use of all markers vs. selected markers marginally improved prediction performance, except for the RFCNN method for FHB (27.6%). Hardy-Weinberg encoding generally improved correlation for FHB (6.9%) and DON (9.6%) for the transformer. This study suggests the potential of the transformer-based method for genomic prediction of complex traits such as FHB or DON, having performed better than or equally to existing machine learning and statistical methods. Genomic prediction in barley for Fusarium head blight and deoxynivalenol content used a custom Illumina Infinium array (BarleySNP50-JHI) (www.illumina.com). Sample types included leaves from 400 barley genotypes mostly of Canadian origin. This series includes 400 genotypes assayed on an Illumina Infinium HTS platform 50K BeadChip.
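The Pearson correlation used above to compare true and predicted phenotypes can be computed as in this minimal sketch. The phenotype values are hypothetical and serve only to show the calculation; they are not results from the study.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical observed vs. predicted trait scores for five barley lines.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.8, 3.2, 3.9, 5.1]
r = pearson_r(observed, predicted)
```

A value of `r` close to 1 indicates that the predicted ranking of lines tracks the observed one, which is usually what matters when selecting parents in a breeding program.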