Overlap matrix completion for predicting drug-associated indications.
ABSTRACT: Identification of potential drug-associated indications is critical for either approved or novel drugs in drug repositioning. Current computational methods based on drug similarity and disease similarity have been developed to predict drug-disease associations. When more reliable drug- or disease-related information becomes available and is integrated, the prediction precision can be continuously improved. However, it is a challenging problem to effectively incorporate multiple types of prior information, representing different characteristics of drugs and diseases, to identify promising drug-disease associations. In this study, we propose an overlap matrix completion (OMC) for bilayer networks (OMC2) and tri-layer networks (OMC3) to predict potential drug-associated indications, respectively. OMC is able to efficiently exploit the underlying low-rank structures of the drug-disease association matrices. In OMC2, first of all, we construct one bilayer network from drug-side aspect and one from disease-side aspect, and then obtain their corresponding block adjacency matrices. We then propose the OMC2 algorithm to fill out the values of the missing entries in these two adjacency matrices, and predict the scores of unknown drug-disease pairs. Moreover, we further extend OMC2 to OMC3 to handle tri-layer networks. Computational experiments on various datasets indicate that our OMC methods can effectively predict the potential drug-disease associations. Compared with the other state-of-the-art approaches, our methods yield higher prediction accuracy in 10-fold cross-validation and de novo experiments. In addition, case studies also confirm the effectiveness of our methods in identifying promising indications for existing drugs in practical applications.
Project description:Adjacency matrices are useful for storing pairwise interaction data, such as correlations between gene pairs in a pathway or similarities between genes and conditions. The aMatReader app enables users to import one or multiple adjacency matrix files into Cytoscape, where each file represents an edge attribute in a network. Our goal was to import the diverse adjacency matrix formats produced by existing scripts and libraries written in R, MATLAB, and Python, and facilitate importing that data into Cytoscape. To accelerate the import process, aMatReader attempts to predict matrix import parameters by analyzing the first two lines of the file. We also exposed CyREST endpoints to allow researchers to import network matrix data directly into Cytoscape from their language of choice. Many analysis tools deal with networks in the form of an adjacency matrix, and exposing the aMatReader API to automation users enables scripts to transfer those networks directly into Cytoscape with little effort.
Project description:Current studies have shown that long non-coding RNAs (lncRNAs) play a crucial role in a variety of fundamental biological processes related to complex human diseases. The prediction of latent disease-lncRNA associations can help to understand the pathogenesis of complex human diseases at the level of lncRNA, which also contributes to the detection of disease biomarkers, and the diagnosis, treatment, prognosis and prevention of disease. Nevertheless, it is still a challenging and urgent task to accurately identify latent disease-lncRNA association. Discovering latent links on the basis of biological experiments is time-consuming and wasteful, necessitating the development of computational prediction models. In this study, a computational prediction model has been remodeled as a matrix completion framework of the recommendation system by completing the unknown items in the rating matrix. A novel method named faster randomized matrix completion for latent disease-lncRNA association prediction (FRMCLDA) has been proposed by virtue of improved randomized partial SVD (rSVD-BKI) on a heterogeneous bilayer network. First, the correlated data source and experimentally validated information of diseases and lncRNAs are integrated to construct a heterogeneous bilayer network. Next, the integrated heterogeneous bilayer network can be formalized as a comprehensive adjacency matrix which includes lncRNA similarity matrix, disease similarity matrix, and disease-lncRNA association matrix where the uncertain disease-lncRNA associations are referred to as blank items. Then, a matrix approximate to the original adjacency matrix has been designed with predicted scores to retrieve the blank items. The construction of the approximate matrix could be equivalently resolved by the nuclear norm minimization. Finally, a faster singular value thresholding algorithm with a randomized partial SVD combing a new sub-space reuse technique has been utilized to complete the adjacency matrix. The results of leave-one-out cross-validation (LOOCV) experiments and 5-fold cross-validation (5-fold CV) experiments on three different benchmark databases have confirmed the availability and adaptability of FRMCLDA in inferring latent relationships of disease-lncRNA pairs, and in inferring lncRNAs correlated with novel diseases without any prior interaction information. Additionally, case studies have shown that FRMCLDA is able to effectively predict latent lncRNAs correlated with three widespread malignancies: prostate cancer, colon cancer, and gastric cancer.
Project description:BACKGROUND:In this work, we present a new coarse grained representation of RNA dynamics. It is based on adjacency matrices and their interactions patterns obtained from molecular dynamics simulations. RNA molecules are well-suited for this representation due to their composition which is mainly modular and assessable by the secondary structure alone. These interactions can be represented as adjacency matrices of k nucleotides. Based on those, we define transitions between states as changes in the adjacency matrices which form Markovian dynamics. The intense computational demand for deriving the transition probability matrices prompted us to develop StreAM-[Formula: see text], a stream-based algorithm for generating such Markov models of k-vertex adjacency matrices representing the RNA. RESULTS:We benchmark StreAM-[Formula: see text] (a) for random and RNA unit sphere dynamic graphs (b) for the robustness of our method against different parameters. Moreover, we address a riboswitch design problem by applying StreAM-[Formula: see text] on six long term molecular dynamics simulation of a synthetic tetracycline dependent riboswitch (500 ns) in combination with five different antibiotics. CONCLUSIONS:The proposed algorithm performs well on large simulated as well as real world dynamic graphs. Additionally, StreAM-[Formula: see text] provides insights into nucleotide based RNA dynamics in comparison to conventional metrics like the root-mean square fluctuation. In the light of experimental data our results show important design opportunities for the riboswitch.
Project description:A molecular similarity measure has been developed using molecular topological graphs and atomic partial charges. Two kinds of topological graphs were used. One is the ordinary adjacency matrix and the other is a matrix which represents the minimum path length between two atoms of the molecule. The ordinary adjacency matrix is suitable to compare the local structures of molecules such as functional groups, and the other matrix is suitable to compare the global structures of molecules. The combination of these two matrices gave a similarity measure. This method was applied to in silico drug screening, and the results showed that it was effective as a similarity measure.
Project description:With the ever increasing cost and time required for drug development, new strategies for drug development are highly demanded, whereas repurposing old drugs has attracted much attention in drug discovery. In this paper, we introduce a new network pharmacology approach, namely PINA, to predict potential novel indications of old drugs based on the molecular networks affected by drugs and associated with diseases. Benchmark results on FDA approved drugs have shown the superiority of PINA over traditional computational approaches in identifying new indications of old drugs. We further extend PINA to predict the novel indications of Traditional Chinese Medicines (TCMs) with Liuwei Dihuang Wan (LDW) as a case study. The predicted indications, including immune system disorders and tumor, are validated by expert knowledge and evidences from literature, demonstrating the effectiveness of our proposed computational approach.
Project description:We investigated the ability of football teams to develop a particular playing style by looking at their passing patterns. Using the information contained in the pass sequences during matches, we constructed the pitch passing networks of teams, whose nodes are the divisions of the pitch for a given spatial scale and links account for the number of passes from region to region. We translated football passings networks into their corresponding adjacency matrices. We calculated the correlations between matrices of the same team to quantify how consistent the passing patterns of a given team are. Next, we quantified the differences with other teams' matrices and obtained an identifiability parameter that indicates how unique are the passing patterns of a given team. Consistency and identifiability rankings were calculated during a whole season, allowing to detect those teams of a league whose passing patterns are different from the rest. Furthermore, we found differences between teams playing at home or away. Finally, we used the identifiability parameter to investigate what teams imposed their passing patterns over the rivals during a given match.
Project description:Inferring potential drug indications, for either novel or approved drugs, is a key step in drug development. Previous computational methods in this domain have focused on either drug repositioning or matching drug and disease gene expression profiles. Here, we present a novel method for the large-scale prediction of drug indications (PREDICT) that can handle both approved drugs and novel molecules. Our method is based on the observation that similar drugs are indicated for similar diseases, and utilizes multiple drug-drug and disease-disease similarity measures for the prediction task. On cross-validation, it obtains high specificity and sensitivity (AUC=0.9) in predicting drug indications, surpassing existing methods. We validate our predictions by their overlap with drug indications that are currently under clinical trials, and by their agreement with tissue-specific expression information on the drug targets. We further show that disease-specific genetic signatures can be used to accurately predict drug indications for new diseases (AUC=0.92). This lays the computational foundation for future personalized drug treatments, where gene expression signatures from individual patients would replace the disease-specific signatures.
Project description:MOTIVATION:Computational drug repositioning is a cost-effective strategy to identify novel indications for existing drugs. Drug repositioning is often modeled as a recommendation system problem. Taking advantage of the known drug-disease associations, the objective of the recommendation system is to identify new treatments by filling out the unknown entries in the drug-disease association matrix, which is known as matrix completion. Underpinned by the fact that common molecular pathways contribute to many different diseases, the recommendation system assumes that the underlying latent factors determining drug-disease associations are highly correlated. In other words, the drug-disease matrix to be completed is low-rank. Accordingly, matrix completion algorithms efficiently constructing low-rank drug-disease matrix approximations consistent with known associations can be of immense help in discovering the novel drug-disease associations. RESULTS:In this article, we propose to use a bounded nuclear norm regularization (BNNR) method to complete the drug-disease matrix under the low-rank assumption. Instead of strictly fitting the known elements, BNNR is designed to tolerate the noisy drug-drug and disease-disease similarities by incorporating a regularization term to balance the approximation error and the rank properties. Moreover, additional constraints are incorporated into BNNR to ensure that all predicted matrix entry values are within the specific interval. BNNR is carried out on an adjacency matrix of a heterogeneous drug-disease network, which integrates the drug-drug, drug-disease and disease-disease networks. It not only makes full use of available drugs, diseases and their association information, but also is capable of dealing with cold start naturally. Our computational results show that BNNR yields higher drug-disease association prediction accuracy than the current state-of-the-art methods. The most significant gain is in prediction precision measured as the fraction of the positive predictions that are truly positive, which is particularly useful in drug design practice. Cases studies also confirm the accuracy and reliability of BNNR. AVAILABILITY AND IMPLEMENTATION:The code of BNNR is freely available at https://github.com/BioinformaticsCSU/BNNR. SUPPLEMENTARY INFORMATION:Supplementary data are available at Bioinformatics online.
Project description:Drug repositioning helps fully explore indications for marketed drugs and clinical candidates. Here we show that the clinical side-effects (SEs) provide a human phenotypic profile for the drug, and this profile can suggest additional disease indications. We extracted 3,175 SE-disease relationships by combining the SE-drug relationships from drug labels and the drug-disease relationships from PharmGKB. Many relationships provide explicit repositioning hypotheses, such as drugs causing hypoglycemia are potential candidates for diabetes. We built Naïve Bayes models to predict indications for 145 diseases using the SEs as features. The AUC was above 0.8 in 92% of these models. The method was extended to predict indications for clinical compounds, 36% of the models achieved AUC above 0.7. This suggests that closer attention should be paid to the SEs observed in trials not just to evaluate the harmful effects, but also to rationally explore the repositioning potential based on this "clinical phenotypic assay".