Cancer in silico drug discovery: a systems biology tool for identifying candidate drugs to target specific molecular tumor subtypes.
ABSTRACT: Large-scale cancer datasets such as The Cancer Genome Atlas (TCGA) allow researchers to profile tumors based on a wide range of clinical and molecular characteristics. Subsequently, TCGA-derived gene expression profiles can be analyzed with the Connectivity Map (CMap) to find candidate drugs to target tumors with specific clinical phenotypes or molecular characteristics. This represents a powerful computational approach for candidate drug identification, but due to the complexity of TCGA and technology differences between CMap and TCGA experiments, such analyses are challenging to conduct and reproduce. We present Cancer in silico Drug Discovery (CiDD; scheet.org/software), a computational drug discovery platform that addresses these challenges. CiDD integrates data from TCGA, CMap, and Cancer Cell Line Encyclopedia (CCLE) to perform computational drug discovery experiments, generating hypotheses for the following three general problems: (i) determining whether specific clinical phenotypes or molecular characteristics are associated with unique gene expression signatures; (ii) finding candidate drugs to repress these expression signatures; and (iii) identifying cell lines that resemble the tumors being studied for subsequent in vitro experiments. The primary input to CiDD is a clinical or molecular characteristic. The output is a biologically annotated list of candidate drugs and a list of cell lines for in vitro experimentation. We applied CiDD to identify candidate drugs to treat colorectal cancers harboring mutations in BRAF. CiDD identified EGFR and proteasome inhibitors, while proposing five cell lines for in vitro testing. CiDD facilitates phenotype-driven, systematic drug discovery based on clinical and molecular data from TCGA.
Project description:Numerous clinical trials of drug candidates for Alzheimer's disease (AD) have failed, and computational drug repositioning approaches using omics data have been proposed as effective alternatives for discovering drug candidates. However, little multi-omics data is available for AD because brain tissue is scarce. Even where omics data exist, systematic drug repurposing studies for AD have suffered from a lack of big data, insufficient clinical information, and difficulty in data integration on account of sample heterogeneity arising from poor diagnosis or a shortage of qualified post-mortem tissue. In this study, we developed a proteotranscriptomic-based computational drug repositioning method named Drug Repositioning Perturbation Score/Class (DRPS/C), based on inverse associations between disease- and drug-induced gene and protein perturbation patterns and incorporating pharmacogenomic knowledge. We constructed a Drug-induced Gene Perturbation Signature Database (DGPSD) comprising 61,019 gene signatures perturbed by 1,520 drugs from the Connectivity Map (CMap) and the L1000 CMap. Drugs were classified into three DRPCs (High, Intermediate, and Low) according to DRPSs that were calculated using drug- and disease-induced gene perturbation signatures from DGPSD and The Cancer Genome Atlas (TCGA), respectively. The DRPS/C method was evaluated using the area under the ROC curve (AUC), with a prescribed drug list from TCGA as the gold standard. Glioblastoma had the highest AUC. To predict anti-AD drugs, DRPSs were calculated using DGPSD and AD-induced gene/protein perturbation signatures generated from RNA-seq, microarray and proteomic datasets in the Synapse database, and the drugs were classified into DRPCs. We predicted 31 potential anti-AD drug candidates that belonged to the High DRPC for both transcriptomic and proteomic signatures.
Of these, four drugs classified into the nervous system group of the Anatomical Therapeutic Chemical (ATC) system are voltage-gated sodium channel blockers (bupivacaine, topiramate) and monoamine oxidase inhibitors (selegiline, iproniazid), and their mechanisms of action were inferred from a potential anti-AD drug perspective. Our approach suggests a shortcut to discovering new therapeutic uses of existing drugs for AD.
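The inverse-association idea behind DRPS/C can be illustrated with a minimal sketch. The abstract does not give the actual DRPS formula or the class cut-offs, so everything here is an assumption made for illustration: the score is taken to be the negative Pearson correlation between a disease signature and a drug signature over their shared genes, and the hypothetical thresholds `high` and `low` stand in for whatever rule separates the High, Intermediate, and Low classes.

```python
from math import sqrt


def inverse_association(disease, drug):
    """Negative Pearson correlation over genes present in both signatures.

    `disease` and `drug` map gene symbols to perturbation values (e.g. log
    fold changes). A score near +1 means the drug reverses the disease
    pattern. This is an illustrative stand-in, not the published DRPS.
    """
    genes = sorted(set(disease) & set(drug))
    if len(genes) < 2:
        raise ValueError("need at least two shared genes")
    x = [disease[g] for g in genes]
    y = [drug[g] for g in genes]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat signature carries no directional information
    return -cov / (sd_x * sd_y)


def classify(score, high=0.5, low=0.0):
    """Bin a score into three classes; the cut-offs are hypothetical."""
    if score >= high:
        return "High"
    if score >= low:
        return "Intermediate"
    return "Low"


disease_sig = {"A": 2.0, "B": -1.0, "C": 0.5}
reversing_drug = {"A": -2.0, "B": 1.0, "C": -0.5}
mimicking_drug = {"A": 2.0, "B": -1.0, "C": 0.5}

print(classify(inverse_association(disease_sig, reversing_drug)))
print(classify(inverse_association(disease_sig, mimicking_drug)))
```

A drug whose signature mirrors the disease signature lands in the top class, while one that mimics the disease lands in the bottom class.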
Project description:Objective: The present study aims to identify the potential clinical application and molecular mechanism of plasmacytoma variant translocation 1 (PVT1) in patients with sarcoma by mining an RNA sequencing dataset from The Cancer Genome Atlas (TCGA) through multiple genome-wide analysis approaches. Methods: A genome-wide RNA sequencing dataset was downloaded from TCGA, and survival analysis was used to evaluate the prognostic value of PVT1 in sarcoma. The potential mechanism was investigated with multiple tools: Database for Annotation, Visualization, and Integrated Discovery v6.8, gene set enrichment analysis (GSEA), and Connectivity Map (CMap). Results: Comprehensive survival analysis indicated that overexpression of PVT1 was significantly associated with poor prognosis in patients with sarcoma, and a nomogram demonstrated that PVT1 contributed more than traditional clinical parameters to sarcoma survival prediction. Weighted gene co-expression network analysis identified ten hub differentially expressed genes (DEGs) between sarcoma tissues with low expression and overexpression of PVT1, and substantiated that these DEGs have a complex co-expression network relationship. CMap analysis identified antipyrine, ondansetron, and econazole as candidate targeted drugs for sarcoma patients with PVT1 overexpression. GSEA revealed that overexpression of PVT1 may be involved in the posttranscriptional regulation of gene expression, tumor invasiveness and metastasis, osteoblast differentiation and development, apoptosis, and nuclear factor kappa B, Wnt, and apoptosis-related signaling pathways. Conclusions: Our findings indicate that PVT1 may serve as a prognostic indicator in patients with sarcoma. Its underlying mechanism is revealed by GSEA, and CMap offers three candidate drugs for the individualized targeted therapy of sarcoma patients with overexpression of PVT1.
Project description:Colorectal cancer (CRC) is the third most commonly diagnosed type of cancer worldwide. The mechanisms leading to the progression of CRC involve both genetic and epigenetic regulation. In this study, we applied systems biology methods to identify potential biomarkers and to conduct drug discovery computationally. Using big database mining, we constructed a candidate protein-protein interaction network and a candidate gene regulatory network, combining them into a genome-wide genetic and epigenetic network (GWGEN). With the assistance of system identification and model selection approaches, we obtained real GWGENs for early-stage, mid-stage, and late-stage CRC. Subsequently, we extracted core GWGENs for each stage of CRC from their real GWGENs through a principal network projection method, and projected them onto the Kyoto Encyclopedia of Genes and Genomes pathways for further analysis. Finally, we compared these core pathways, which reveal different molecular mechanisms in each stage of CRC, and identified carcinogenic biomarkers for the design of multiple-molecule drugs to prevent the progression of CRC. Based on the identified gene expression signatures, we queried the Connectivity Map (CMap) and suggested potential compounds to be combined with known CRC drugs to prevent the progression of CRC.
Project description:Currently there is only one method of treatment for human schistosomiasis, the drug praziquantel. Strong selective pressure has raised serious concern about rising resistance to praziquantel, creating the need for additional pharmaceuticals with a distinctly different mechanism of action to be used in combination therapy with praziquantel. Previous treatment of Schistosoma mansoni included the use of oxamniquine (OXA), a prodrug that is enzymatically activated in S. mansoni but is ineffective against S. haematobium and S. japonicum. The oxamniquine-activating enzyme was identified as an S. mansoni sulfotransferase (SmSULT-OR). Structural data have allowed for directed drug development in reengineering oxamniquine to be effective against S. haematobium and S. japonicum. Guided by data from X-ray crystallographic studies and Schistosoma worm killing assays on oxamniquine, our structure-based drug design approach produced a robust SAR program that tested over 300 derivatives and identified several new lead compounds with effective worm killing in vitro. Previous studies resulted in the discovery of compound CIDD-0066790, which demonstrated broad-species activity in killing of schistosome species. As these compounds are racemic mixtures, we tested the enantiomers and demonstrate that the R enantiomer CIDD-007229 kills S. mansoni, S. haematobium and S. japonicum better than the parent drug (CIDD-0066790). The search for derivatives that kill better than CIDD-0066790 has resulted in a derivative (CIDD-149830) that kills 100% of S. mansoni, S. haematobium and S. japonicum adult worms within 7 days. We hypothesize that the difference in activation, and thus killing, by the derivatives is due to the ability of each derivative to fit in the binding pocket of each sulfotransferase (SmSULT-OR, ShSULT-OR, SjSULT-OR) and to be efficiently sulfated.
The purpose of this research is to develop a second drug to be used in conjunction with praziquantel to treat the major human species of Schistosoma. Collectively, our findings show that CIDD-00149830 and CIDD-0072229 are promising novel drugs for the treatment of human schistosomiasis and strongly support further development and in vivo testing.
Project description:Drug repurposing has become an increasingly attractive approach to drug development owing to the ever-growing cost of new drug discovery and frequent withdrawal of successful drugs caused by side effect issues. Here, we devised Functional Module Connectivity Map (FMCM) for the discovery of repurposed drug compounds for systems treatment of complex diseases, and applied it to colorectal adenocarcinoma. FMCM used multiple functional gene modules to query the Connectivity Map (CMap). The functional modules were built around hub genes identified, through a gene selection by trend-of-disease-progression (GSToP) procedure, from condition-specific gene-gene interaction networks constructed from sets of cohort gene expression microarrays. The candidate drug compounds were restricted to drugs exhibiting predicted minimal intracellular harmful side effects. We tested FMCM against the common practice of selecting drugs using a genomic signature represented by a single set of individual genes to query CMap (IGCM), and found FMCM to have higher robustness, accuracy, specificity, and reproducibility in identifying known anti-cancer agents. Among the 46 drug candidates selected by FMCM for colorectal adenocarcinoma treatment, 65% had literature support for association with anti-cancer activities, and 60% of the drugs predicted to have harmful effects on cancer had been reported to be associated with carcinogens/immune suppressors. Compounds were formed from the selected drug candidates where in each compound the component drugs collectively were beneficial to all the functional modules while no single component drug was harmful to any of the modules. In cell viability tests, we identified four candidate drugs: GW-8510, etacrynic acid, ginkgolide A, and 6-azathymine, as having high inhibitory activities against cancer cells. 
Through microarray experiments we confirmed the novel functional links predicted for three candidate drugs: phenoxybenzamine (broad effects), GW-8510 (cell cycle), and imipenem (immune system). We believe FMCM can be usefully applied to repurposed drug discovery for systems treatment of other types of cancer and other complex diseases.
Project description:Background: Existing drugs are far from sufficient for the treatment of rheumatoid arthritis (RA). Drug repositioning, which reuses marketed drugs and clinical candidates for new indications, has drawn broad attention. Purpose: This study attempted to predict candidate drugs for RA treatment by mining the similarities between pathway aberrance induced by the disease and by various drugs, on a personalized or customized basis. Methods: We first measured the individualized pathway aberrance induced by RA, based on microarray data, and by various drugs, based on the CMap database. Then, the similarities of pathway aberrance between RA and the drugs were calculated using a Kolmogorov-Smirnov weighted enrichment score algorithm. Results: Using this method, we identified 4 crucial pathways involved in RA development and predicted 9 candidate drugs for RA treatment. Some candidates with current indications for other diseases might be repurposed to treat RA and complement the existing drug group for RA. Conclusion: This study predicts candidate drugs for RA treatment by mining the similarities between pathway aberrance induced by the disease and by various drugs, on a personalized or customized basis. Our framework will provide novel insights into personalized drug discovery for RA and contribute to the future application of custom therapeutic decisions.
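A Kolmogorov-Smirnov weighted enrichment score of the kind named above can be sketched as a running-sum statistic. This is the generic GSEA-style formulation, offered as an assumption about the general technique rather than as the exact algorithm used in the study; the function name `enrichment_score`, the `weight` parameter, and the pathway labels are illustrative.

```python
def enrichment_score(ranked, item_set, weight=1.0):
    """Weighted KS-like running-sum enrichment score.

    `ranked` is a list of (item, score) pairs sorted from most to least
    perturbed. Walking down the list, the running sum increases at items in
    `item_set` (in proportion to |score| ** weight) and decreases at all
    other items. The returned value is the running-sum extreme with the
    largest magnitude: positive if the set clusters at the top of the list,
    negative if it clusters at the bottom.
    """
    hit_weights = [abs(s) ** weight if item in item_set else 0.0
                   for item, s in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for item, _ in ranked if item not in item_set)
    if total_hit == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses
    running, extreme = 0.0, 0.0
    for (item, _), w in zip(ranked, hit_weights):
        if item in item_set:
            running += w / total_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical drug-induced pathway aberrance, sorted in descending order.
drug_ranking = [("p1", 3.0), ("p2", 2.0), ("p3", 1.0), ("p4", -1.0)]
print(enrichment_score(drug_ranking, {"p1", "p2"}))
print(enrichment_score(drug_ranking, {"p3", "p4"}))
```

In a repositioning setting, a strongly negative score for the disease-elevated set would suggest that the drug pushes those pathways in the opposite direction.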
Project description:The aim of the present study was to identify potential molecular mechanisms and therapeutic targets for isocitrate dehydrogenase 2 (IDH2) R140Q-mutated acute myeloid leukemia (AML). An RNA sequencing dataset of IDH2 wild-type and R140Q-mutated adult de novo AML bone marrow samples was obtained from The Cancer Genome Atlas (TCGA) database. The edgeR package was used to screen for differentially expressed genes (DEGs), and the potential molecular mechanisms and therapeutic targets were identified using Database for Annotation, Visualization, and Integrated Discovery (DAVID) v6.8, the Biological Networks Gene Ontology tool, Connectivity Map (CMap), Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and GeneMANIA. A total of 230 DEGs were identified between the bone marrow tissues of IDH2 R140Q-mutated and wild-type AML patients, of which 31 were significantly associated with overall survival (OS). Functional assessment of the DEGs showed significant enrichment in multiple biological processes, including angiogenesis and cell differentiation. STRING and GeneMANIA were used to identify the hub genes among these DEGs. CMap analysis identified 13 potential small-molecule drugs against IDH2 R140Q-mutated adult de novo AML. Genome-wide co-expression network analysis identified several IDH2 R140Q co-expressed genes, of which 56 were significantly associated with AML OS. The differences in IDH2 mRNA expression levels and OS between IDH2 R140Q-mutated and wild-type AML were not statistically significant in our cohort. In conclusion, we identified several co-expressed genes and potential molecular mechanisms that are instrumental in IDH2 R140Q-mutated adult de novo AML, along with 13 candidate targeted therapeutic drugs.
Project description:Drug repositioning is a cost-efficient and time-saving approach to drug development compared to traditional techniques. A systematic approach to drug repositioning is to identify candidate drugs' gene expression profiles on target disease models and determine how similar these profiles are to those of approved drugs. Databases such as the CMAP have been developed recently to help with systematic drug repositioning. To overcome the limited data coverage of connectivity maps, we constructed a comprehensive in silico drug-protein connectivity map called DMAP, which contains directed drug-to-protein effects and effect scores. The drug-to-protein effect scores are compiled from all database entries in which an effect between the drug and the protein has previously been observed, and they provide a confidence measure on the quality of such drug-to-protein effects. In DMAP, we have compiled the direct effects between 24,121 PubChem Compound IDs (CIDs), which were mapped from 289,571 chemical entities recognized from public literature, and 5,196 reviewed UniProt proteins. DMAP compiles a total of 438,004 chemical-to-protein effect relationships. Compared to CMAP, DMAP shows a 221-fold increase in the number of chemicals and a 1.92-fold increase in the number of ATC codes. Furthermore, by overlapping DMAP chemicals with the approved drugs with known indications from the TTD database and literature, we obtained 982 drugs and 622 diseases, whereas we obtained only 394 drugs with known indications from CMAP. To validate the feasibility of applying DMAP for systematic drug repositioning, we compared the performance of DMAP and the well-known CMAP database on two popular computational techniques: a drug-drug-similarity-based method with leave-one-out validation and a Kolmogorov-Smirnov scoring-based method. In the drug-drug-similarity-based method, the drug repositioning prediction using DMAP achieved an Area-Under-Curve (AUC) score of 0.82, compared with an AUC of 0.64 using CMAP.
For the Kolmogorov-Smirnov scoring-based method, with DMAP we were able to retrieve several drug indications that could not be retrieved using CMAP. DMAP data can be queried using the existing C2MAP server or downloaded freely at: http://bio.informatics.iupui.edu/cmaps. Reliable measurements of how drugs affect disease-related proteins are critical to ongoing drug development in the genome medicine era. We demonstrated that DMAP can help drug development professionals assess drug-to-protein relationship data and improve the chances of success for systematic drug repositioning efforts.
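The AUC figures used to compare DMAP with CMAP can be read as the probability that a true drug-indication pair is scored above a false one. The following is a minimal sketch of that calculation using the pairwise (Mann-Whitney) formulation; the scores and labels are invented for illustration and are not taken from the study, and the leave-one-out loop that would produce such scores is not shown.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    For every (positive, negative) pair, count 1 if the positive is scored
    higher, 0.5 for a tie, and 0 otherwise, then average over all pairs.
    """
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores; label 1 marks a known drug indication.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
print(auc(example_scores, example_labels))  # 3 of 4 pairs ordered correctly: 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch but would normally be replaced by a rank-based computation on large prediction sets.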
Project description:Gene-set-based analysis (GSA), which uses the relative importance of functional gene-sets, or molecular signatures, as units for analysis of genome-wide gene expression data, has exhibited major advantages in accuracy, robustness, and biological relevance over individual gene analysis (IGA), which uses log-ratios of individual genes for analysis. Yet IGA remains the dominant mode of analysis of gene expression data. The Connectivity Map (CMap), an extensive database of genomic profiles of the effects of drugs and small molecules that is widely used for studies related to repurposed drug discovery, has mostly been employed in IGA mode. Here, we constructed a GSA-based version of CMap, Gene-Set Connectivity Map (GSCMap), in which all the genomic profiles in CMap are converted, using gene-sets from the Molecular Signatures Database, to functional profiles. We showed that GSCMap essentially eliminated cell-type dependence, a weakness of CMap in IGA mode, and yielded significantly better performance on sample clustering and drug-target association. As a first application of GSCMap, we constructed the platform Gene-Set Local Hierarchical Clustering (GSLHC) for discovering insights into the coordinated actions of biological functions and for facilitating classification of heterogeneous subtypes based on drug-driven responses. GSLHC was shown to tightly cluster drugs with known similar properties. We used GSLHC to identify the therapeutic properties and putative targets of 18 compounds of previously unknown characteristics listed in CMap, eight of which show suggested anti-cancer activities. The GSLHC website http://cloudr.ncu.edu.tw/gslhc/ contains 1,857 local hierarchical clusters accessible by querying 555 of the 1,309 drugs and small molecules listed in CMap. We expect GSCMap and GSLHC to be widely useful in providing new insights into the biological effects of bioactive compounds, in drug repurposing, and in function-based classification of complex diseases.
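The conversion from a gene-level profile to a functional (gene-set-level) profile can be sketched in a few lines. The abstract does not state how GSCMap scores a gene-set, so the mean of member-gene values used here is purely an assumption, and the gene-set names and values are illustrative.

```python
def to_gene_set_profile(gene_profile, gene_sets):
    """Collapse per-gene values into one value per gene-set.

    `gene_profile` maps gene symbols to values such as log-ratios;
    `gene_sets` maps a set name to its member genes. Each set is scored as
    the mean over the members that were actually measured (an assumed
    scoring rule); sets with no measured members are omitted.
    """
    profile = {}
    for name, members in gene_sets.items():
        values = [gene_profile[g] for g in members if g in gene_profile]
        if values:
            profile[name] = sum(values) / len(values)
    return profile


gene_level = {"TP53": 1.0, "MDM2": 3.0, "MYC": -2.0}
sets = {
    "p53_pathway": {"TP53", "MDM2"},
    "myc_targets": {"MYC", "MAX"},  # MAX is unmeasured and is skipped
    "unmeasured_set": {"XYZ"},      # no measured members, so it is dropped
}
print(to_gene_set_profile(gene_level, sets))
```

Because each functional value pools several genes, a profile of this kind is less sensitive to noise in any single gene, which is the intuition behind the robustness claimed for GSA over IGA.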
Project description:Small drug molecules usually bind to multiple protein targets, including unintended off-targets. Such drug promiscuity has often led to unwanted or unexplained drug reactions, resulting in side effects or drug repositioning opportunities. Identifying potential drug-target interactions (DTI) is therefore an important issue in pharmacology. However, DTI discovery by experiment remains a challenging task because of its high cost in time and resources. Many computational methods have therefore been developed to predict DTI from high-throughput biological and clinical data. Here, we demonstrate that on-target and off-target effects can be characterized by drug-induced in vitro genomic expression changes, e.g. the data in Connectivity Map (CMap). Thus, unknown ligands of a given target can be found among the compounds showing high gene-expression similarity to the known ligands. To clarify the detailed practice of CMap-based DTI prediction, we then objectively evaluate how well each target is characterized by CMap. The results suggest that (1) some targets are better characterized than others, so the prediction models specific to these well-characterized targets would be more accurate and reliable; (2) in some cases, a family of ligands for the same target tends to interact with common off-targets, which may help increase the efficiency of DTI discovery and explain the mechanisms of complicated drug actions. In the present study, CMap expression similarity is proposed as a novel indicator of drug-target interactions. Detailed strategies for improving data quality by decreasing the batch effect and for building prediction models are also established. We believe the success with CMap can be translated to other public and commercial genomic expression data, thus increasing research productivity towards valid drug repositioning and minimal side effects.
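The idea of finding unknown ligands among compounds whose expression profiles resemble those of known ligands can be sketched as a similarity ranking. Cosine similarity and averaging over the known ligands are assumptions chosen for illustration; the abstract does not specify the similarity measure or the prediction model, and the compound names and profiles are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length expression vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(profiles, known_ligands):
    """Rank non-ligand compounds by mean similarity to the known ligands."""
    missing = [k for k in known_ligands if k not in profiles]
    if not known_ligands or missing:
        raise ValueError("every known ligand needs an expression profile")
    scores = {}
    for name, vector in profiles.items():
        if name in known_ligands:
            continue  # only score compounds whose target is unknown
        sims = [cosine(vector, profiles[k]) for k in known_ligands]
        scores[name] = sum(sims) / len(sims)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical drug-induced expression changes over three genes.
expression = {
    "ligand1": [1.0, 0.0, 1.0],
    "cand_a": [2.0, 0.0, 2.0],    # same direction as the known ligand
    "cand_b": [-1.0, 0.0, -1.0],  # opposite direction
    "cand_c": [0.0, 1.0, 0.0],    # unrelated
}
for name, score in rank_candidates(expression, {"ligand1"}):
    print(name, round(score, 3))
```

Compounds at the top of the ranking would be the first candidates to test as ligands of the target; how reliable that ranking is depends on how well the target is characterized by the expression data, as the abstract notes.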