Prediction of drug indications based on chemical interactions and chemical similarities.
ABSTRACT: Discovering potential indications of novel or approved drugs is a key step in drug development. Previous computational approaches can be categorized as disease-centric or drug-centric according to their starting point, or as small-scale or large-scale according to the diversity of their datasets. Here, a classifier was constructed using a large drug indication dataset to predict the indications of a drug, based on the assumption that interactive/associated drugs or drugs with similar structures are more likely to target the same diseases. To examine the classifier, it was run five times on a dataset of 1,573 drugs retrieved from the Comprehensive Medicinal Chemistry database and evaluated by 5-fold cross-validation, yielding five 1st order prediction accuracies that were all approximately 51.48%. Meanwhile, the model yielded a 1st order prediction accuracy of 50.00% in an independent test on a dataset of 32 other drugs for which drug repositioning has been confirmed. Interestingly, some clinically repurposed drug indications that were not included in the datasets were successfully identified by our method. These results suggest that our method may become a useful tool to associate novel molecules with new indications or existing drugs with alternative indications.
Project description:Cancer, a leading cause of death worldwide, places a heavy burden on health-care systems. In this study, an order-prediction model was built to predict a series of cancer drug indications based on chemical-chemical interactions. According to the confidence scores of their interactions, the order from the most likely cancer to the least likely one was obtained for each query drug. The 1st order prediction accuracy on the training dataset was 55.93%, evaluated by a jackknife test, while it was 55.56% and 59.09% on a validation test dataset and an independent test dataset, respectively. The proposed method outperformed a popular method based on molecular descriptors. Moreover, it was verified that some drugs were effective against their 'wrong' predicted indications, indicating that some 'wrong' drug indications were actually correct. Encouraged by the promising results, we expect that the method may become a useful tool for the prediction of drug indications.
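The ordering step described above can be illustrated with a minimal sketch. The abstract does not state the exact scoring rule, so the rule used here (an indication is scored by the highest confidence of any interaction between the query drug and a training drug annotated with that indication) is an assumption, as are the function names and the data layout. The accuracy helper likewise assumes that a 1st order prediction counts as correct when the top-ranked indication is one of the drug's true indications.

```python
from collections import defaultdict


def rank_indications(query, interactions, known_indications):
    """Order candidate indications for `query` from most to least likely.

    interactions: dict mapping frozenset({drug_a, drug_b}) -> confidence score
    known_indications: dict mapping training drug -> set of indications

    Assumed scoring rule: an indication's score is the highest confidence
    of any interaction between the query and a training drug that carries
    that indication.
    """
    scores = defaultdict(float)
    for pair, confidence in interactions.items():
        if query not in pair or len(pair) != 2:
            continue
        (neighbor,) = pair - {query}
        for indication in known_indications.get(neighbor, ()):
            scores[indication] = max(scores[indication], confidence)
    # Descending score; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda ind: (-scores[ind], ind))


def first_order_accuracy(predictions, truth):
    """Fraction of drugs whose top-ranked indication is a true indication."""
    hits = sum(
        1
        for drug, ranked in predictions.items()
        if ranked and ranked[0] in truth[drug]
    )
    return hits / len(predictions)


# Hypothetical toy data, not taken from the study.
interactions = {
    frozenset({"query_drug", "drug_a"}): 0.9,
    frozenset({"query_drug", "drug_b"}): 0.4,
    frozenset({"drug_a", "drug_b"}): 0.7,
}
known_indications = {"drug_a": {"lung cancer"}, "drug_b": {"breast cancer"}}
ranking = rank_indications("query_drug", interactions, known_indications)
```

Under these assumptions, drugs with no scored interaction partner in the training set receive an empty ranking and count as misses in the accuracy calculation.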
Project description:Drug repositioning aims to find new indications for existing drugs in order to reduce drug development cost and time. Numerous successful drug repositioning stories have been reported, and many repurposed drugs are already available on the market. Although drug repositioning is often a product of serendipity, repositioning opportunities can be uncovered systematically. There are three systematic approaches to drug repositioning: disease-centric, target-centric and drug-centric. A disease-centric approach identifies a close relationship between an old and a new indication. A target-centric approach links a known target and its established drug to a new indication. Lastly, a drug-centric approach connects a known drug to a new target and its associated indication. These three approaches differ in their potential and their limitations, but above all in the required starting information and computing power. This raises the question of which approach prevails in current drug discovery and what that implies for future developments. To address this question, we systematically evaluated over 100 drugs, 200 target structures and over 300 indications from the Drug Repositioning Database. Each analyzed case was classified as one of the three repositioning approaches. The majority of cases (more than 60%) were classified as disease-centric. Almost 30% of the cases were classified as target-centric and less than 10% as drug-centric. We concluded that, despite the use of the umbrella term "drug" repositioning, disease- and target-centric approaches have dominated the field until now. We propose the use of drug-centric approaches and discuss reasons for doing so, such as structure-based repositioning techniques, to exploit the full potential of drug-target-disease connections.
Project description:A drug side effect is an undesirable effect that occurs in addition to the intended therapeutic effect of the drug. Unexpected side effects suffered by patients are a major cause of large-scale drug withdrawal. To address this problem, the pharmaceutical industry has a strong demand for computational methods that predict the side effects of drugs. In this study, a novel computational method was developed to predict the side effects of drug compounds by hybridizing chemical-chemical and protein-chemical interactions. Unlike most previous work, our method can rank the potential side effects of any query drug according to their predicted level of risk. To evaluate the method, a training dataset and test datasets were constructed from a benchmark dataset that contains 835 drug compounds. By a jackknife test on the training dataset, the 1st order prediction accuracy was 86.30%, while it was 89.16% on the test dataset. It is expected that the new method may become a useful tool for drug design, and that the findings obtained by hybridizing various interactions in a network system may provide useful insights for conducting in-depth pharmacological research as well, particularly at the level of systems biomedicine.
Project description:Toxicity is a major contributor to the high attrition rates of new chemical entities in drug discovery. In this study, an order-classifier was built to predict a series of toxic effects based on chemical-chemical interaction data, under the assumption that interactive compounds are more likely to share similar toxicity profiles. According to their interaction confidence scores, the order from the most likely toxicity to the least likely was obtained for each compound. Ten test groups, each containing one training dataset and one test dataset, were constructed from a benchmark dataset consisting of 17,233 compounds. By a jackknife test on each of these test groups, the 1st order prediction accuracies of the training dataset and the test dataset were all approximately 79.50%, substantially higher than the rate of 25.43% achieved by random guesses. Encouraged by the promising results, we expect that our method will become a useful tool for screening out drugs with high toxicity.
Project description:Repurposing of drugs to novel disease indications has promise for faster clinical translation. However, identifying the best drugs for a given pathological context is not trivial. We developed an integrated random walk-based network framework that combines functional biomolecular relationships and known drug-target interactions as a platform for contextual prioritization of drugs, genes and pathways. We show that gene-centric data (such as gene expression data) and drug-centric data (such as a phenotypic drug screen) can be used within this network platform to effectively prioritize drugs and pathways, respectively, for the studied biological context. We demonstrate that various genomic data can be used as contextual cues to effectively prioritize drugs for the studied context, while phenotypic drug screen data can similarly be used to prioritize genes and pathways for the studied phenotypic context. As a proof of principle, we showcase the use of our platform to identify known and novel drug indications against different subsets of breast cancers through contextual prioritization based on genome-wide gene expression, shRNA and drug screen, and clinical survival data. The integrated network and associated methods are incorporated into the NetWalker suite for functional genomics analysis.
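To make the idea of random walk-based prioritization concrete, here is a minimal sketch of a random walk with restart on a small weighted network. The abstract only says the framework is "random walk-based"; the restart variant, the restart probability, the function name and the toy network below are all assumptions for illustration, not the NetWalker implementation.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adjacency: dict node -> dict of neighbor -> non-negative edge weight
               (every neighbor must also appear as a key)
    seeds: set of nodes carrying the contextual signal, e.g. genes of interest
    restart: probability of jumping back to the seeds at each step (assumed value)
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    out_weight = {u: sum(adjacency[u].values()) for u in nodes}
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Isolated node: send its probability mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: drug_x targets gene_a, drug_y targets gene_b,
# and gene_a and gene_b are functionally related.
network = {
    "gene_a": {"drug_x": 1.0, "gene_b": 1.0},
    "gene_b": {"gene_a": 1.0, "drug_y": 1.0},
    "drug_x": {"gene_a": 1.0},
    "drug_y": {"gene_b": 1.0},
}
scores = random_walk_with_restart(network, seeds={"gene_a"})
```

With gene_a as the only seed, drug_x (one step away) ends up with a higher score than drug_y (two steps away), which is the sense in which contextual data can prioritize drugs in such a network.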
Project description:BACKGROUND: Drug-induced gene expression datasets (for example, the Connectivity Map, CMap) represent a valuable resource for drug repurposing, a class of methods for identifying novel indications for approved drugs. Recently, CMap-based methods have been successfully applied to identify drugs for a number of diseases. However, few gene expression-based methods are currently available for the repurposing of combined drugs, although increasing evidence shows that drug combinations may be valid for novel indications. METHOD: Here, we present a simple CMap-based scoring system to predict novel indications for the combination of two drugs. We then confirmed the effectiveness of the predicted drug combination in an animal model of type 2 diabetes. RESULTS: We applied the scoring system to type 2 diabetes and identified a candidate combination of two drugs, Trolox C and Cytisine. We then confirmed that the predicted drug combination is effective for the treatment of type 2 diabetes. CONCLUSION: The presented scoring system represents a novel method for drug repurposing that could greatly extend the space of drugs.
Project description:BACKGROUND:Drug repositioning, also known as drug repurposing, defines new indications for existing drugs and can be used as an alternative to drug development. In recent years, the accumulation of large volumes of information related to drugs and diseases has led to the development of various computational approaches for drug repositioning. Although herbal medicines have had a great impact on current drug discovery, there are still a large number of herbal compounds that have no definite indications. RESULTS:In the present study, we constructed a computational model to predict the unknown pharmacological effects of herbal compounds using machine learning techniques. Based on the assumption that similar diseases can be treated with similar drugs, we used four categories of drug-drug similarity (chemical structure, side effects, gene ontology, and targets) and three categories of disease-disease similarity (phenotypes, human phenotype ontology, and gene ontology). Associations between drugs and diseases were then predicted using these similarity features. The prediction models were constructed using classification algorithms, including logistic regression, random forest and support vector machine algorithms. Upon cross-validation, the random forest approach showed the best performance (AUC = 0.948) and also performed well in an external validation assessment using an unseen independent dataset (AUC = 0.828). Finally, the constructed model was applied to predict potential indications for existing drugs and herbal compounds. As a result, new indications for 20 existing drugs and 31 herbal compounds were predicted and validated using clinical trial data. CONCLUSIONS:The predicted results were validated manually, confirming the performance and underlying mechanisms, for example, irinotecan as a treatment for neuroblastoma. Based on the predictions, herbal compounds were identified as drug candidates for related diseases that merit further development. The proposed prediction model can contribute to drug discovery by suggesting drug candidates from herbal compounds that have potential but have rarely been studied.
Project description:Drug-induced liver injury (DILI) is a major factor in drug development and drug safety. If DILI cannot be effectively predicted during the development of a drug, the drug may later be withdrawn from the market. Therefore, predicting DILI is crucial at the early stages of drug research. This work presents a 2-class ensemble classifier model for predicting DILI, using 2D molecular descriptors and fingerprints on a dataset of 450 compounds. The purpose of our study is to investigate which key molecular fingerprints may be associated with DILI risk, and then to obtain a reliable ensemble model that predicts DILI risk from these key factors. Experimental results suggested that 8 molecular fingerprints are critical for predicting DILI, and also yielded the best ratio of molecular fingerprints to molecular descriptors. In 5-fold cross-validation, the ensemble vote classifier obtained an accuracy of 77.25%, and its accuracy on the test set was 81.67%. This model could be used for drug-induced liver injury prediction.
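As a rough illustration of how an ensemble vote classifier combines its members, the sketch below applies hard (majority) voting to per-classifier label predictions. The abstract does not say which voting scheme or base classifiers were used, so hard voting, the function name and the toy predictions are assumptions.

```python
from collections import Counter


def majority_vote(member_predictions):
    """Combine label predictions from several classifiers by hard voting.

    member_predictions: list of equal-length label lists, one per classifier.
    Returns one label per sample: the label predicted by most classifiers.
    With an odd number of classifiers and two classes, ties cannot occur.
    """
    n_samples = len(member_predictions[0])
    if any(len(preds) != n_samples for preds in member_predictions):
        raise ValueError("all classifiers must predict the same samples")
    combined = []
    for i in range(n_samples):
        votes = Counter(preds[i] for preds in member_predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined


# Hypothetical predictions from three classifiers for three compounds
# (1 = DILI risk, 0 = no DILI risk).
ensemble = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```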
Project description:BACKGROUND: Drugs that bind to common targets likely exert similar activities. In this target-centric view, the inclusion of richer target information may better represent the relationships between drugs and their activities. Under this assumption, we expanded the "common binding rule" assumption of QSAR to create a new drug-drug relationship score (DRS). METHOD: Our method uses various chemical features to encode drug target information into the drug-drug relationship information. Specifically, drug pairs were transformed into numerical vectors containing the basal drug properties and their differences. After that, machine learning techniques such as data cleaning, dimension reduction, and ensemble classification were used to prioritize drug pairs bound to a common target. In other words, the estimation of the drug-drug relationship is restated as a large-scale classification problem, which provides a framework for using state-of-the-art machine learning techniques with thousands of chemical features to newly define drug-drug relationships. CONCLUSIONS: Various aspects of the presented score were examined to determine its reliability and usefulness: the abundance of common domains for the predicted drug pairs, ca. 80% coverage of known targets, successful identification of unknown targets, and a meaningful correlation with another cutting-edge method for analyzing drug similarities. The most significant strength of our method is that the DRS can be used to describe phenotypic similarities, such as pharmacological effects.
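The pair encoding described in the METHOD section can be sketched as follows. The abstract states only that each drug pair becomes a numerical vector containing the basal drug properties and their differences; the concatenation order, the use of absolute differences, and the function name are assumptions made for this illustration.

```python
def pair_vector(props_a, props_b):
    """Encode a drug pair as one feature vector.

    props_a, props_b: equal-length lists of numeric properties of each drug.
    Assumed layout: properties of drug A, then properties of drug B,
    then the element-wise absolute differences.
    """
    if len(props_a) != len(props_b):
        raise ValueError("both drugs need the same number of properties")
    differences = [abs(a - b) for a, b in zip(props_a, props_b)]
    return list(props_a) + list(props_b) + differences


# Hypothetical property values for two drugs (two properties each).
vector = pair_vector([1.0, 2.0], [3.0, 0.5])
```

Note that this layout depends on which drug is listed first; a classifier trained on such vectors would typically see each pair in a fixed order or in both orders.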
Project description:N6-methyladenosine (m6A) is a prevalent RNA methylation modification involved in several biological processes. Hundreds or thousands of m6A sites identified from different species using high-throughput experiments provide a rich resource for constructing in-silico approaches to identify m6A sites. The existing m6A predictors are developed using conventional machine-learning (ML) algorithms and most are species-centric. In this paper, we develop a novel cross-species deep-learning classifier based on the bidirectional Gated Recurrent Unit (BGRU) for the prediction of m6A sites. In comparison with conventional ML approaches, BGRU achieves outstanding performance on the Mammalia dataset, which contains over fifty thousand m6A sites, but inferior performance on the Saccharomyces cerevisiae dataset, which covers around a thousand positives. The accuracy of BGRU is sensitive to the data size, and this sensitivity is compensated for by integrating a random forest classifier with a novel encoding of enhanced nucleic acid content. The integrated approach, dubbed the BGRU-based Ensemble RNA Methylation site Predictor (BERMP), has competitive performance in both the cross-validation test and the independent test. BERMP also outperforms existing m6A predictors for different species. Therefore, BERMP is a novel multi-species tool for identifying m6A sites with high confidence. This classifier is freely available at http://www.bioinfogo.org/bermp.