Predicting microRNA-disease associations using bipartite local models and hubness-aware regression.
ABSTRACT: The development and progression of numerous complex human diseases have been confirmed to be associated with microRNAs (miRNAs) by various experimental and clinical studies. Predicting potential miRNA-disease associations can help us understand the underlying molecular and cellular mechanisms of diseases and promote the development of disease treatment and diagnosis. Due to the high cost of conventional experimental verification, proposing a new computational method for miRNA-disease association prediction is an efficient and economical way. Since previous computational models ignored the hubness phenomenon, we presented a novel computational model of Bipartite Local models and Hubness-Aware Regression for MiRNA-Disease Association prediction (BLHARMDA). In this method, we first used known miRNA-disease associations to calculate the Jaccard similarity between miRNAs and between diseases, then utilized a modified kNNs model in the bipartite local model method. As a result, we effectively alleviated the detriments from 'bad' hubs. BLHARMDA obtained AUCs of 0.9141 and 0.8390 in the global and local leave-one-out cross validation, respectively, which outperformed most of the previous models and proved high prediction performance of BLHARMDA. Besides, the standard deviation of 0.0006 in 5-fold cross validation confirmed our model's prediction stability and the averaged prediction accuracy of 0.9120 showed the high precision of our model. In addition, to further evaluate our model's accuracy, we implemented BLHARMDA on three typical human diseases in three different types of case studies. As a result, 49 (Esophageal Neoplasms), 50 (Lung Neoplasms) and 50 (Carcinoma Hepatocellular) out of the top 50 related miRNAs were validated by recent experimental discoveries.
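The Jaccard similarity step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the known associations are stored as a binary matrix with miRNAs as rows and diseases as columns. The function names, the toy matrix, and the convention of returning 0 for two empty profiles are illustrative assumptions, not details taken from BLHARMDA.

```python
def jaccard(a, b):
    """Jaccard similarity of two binary interaction profiles (lists of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two profiles with no known associations get similarity 0.
    return both / either if either else 0.0


def similarity_matrices(assoc):
    """Return (miRNA-miRNA, disease-disease) Jaccard similarity matrices.

    assoc[i][j] == 1 means miRNA i has a known association with disease j.
    """
    mirna_profiles = assoc                                  # one row per miRNA
    disease_profiles = [list(col) for col in zip(*assoc)]   # one column per disease
    sim_m = [[jaccard(p, q) for q in mirna_profiles] for p in mirna_profiles]
    sim_d = [[jaccard(p, q) for q in disease_profiles] for p in disease_profiles]
    return sim_m, sim_d


# Hypothetical toy data: 3 miRNAs x 4 diseases.
toy = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]
sim_m, sim_d = similarity_matrices(toy)
```

In this toy matrix, the first two miRNAs share one disease out of the three that either is linked to, so their similarity is 1/3. The third miRNA has no known associations and is therefore dissimilar to everything under the stated convention.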
Project description:As increasing experimental studies have shown that microRNAs (miRNAs) are closely related to multiple biological processes and the prevention, diagnosis and treatment of human diseases, a growing number of researchers are focusing on the identification of associations between miRNAs and diseases. Identifying such associations purely via experiments is costly and demanding, which prompts researchers to develop computational methods to complement the experiments. In this paper, a novel prediction model named Ensemble of Kernel Ridge Regression based MiRNA-Disease Association prediction (EKRRMDA) was developed. EKRRMDA obtained features of miRNAs and diseases by integrating the disease semantic similarity, the miRNA functional similarity and the Gaussian interaction profile kernel similarity for diseases and miRNAs. Under the computational framework that utilized ensemble learning and feature dimensionality reduction, multiple base classifiers that combined two Kernel Ridge Regression classifiers from the miRNA side and disease side, respectively, were obtained based on random selection of features. Then average strategy for these base classifiers was adopted to obtain final association scores of miRNA-disease pairs. In the global and local leave-one-out cross validation, EKRRMDA attained the AUCs of 0.9314 and 0.8618, respectively. Moreover, the model's average AUC with standard deviation in 5-fold cross validation was 0.9275 ± 0.0008. In addition, we implemented three different types of case studies on predicting miRNAs associated with five important diseases. As a result, there were 90% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 86% (Lymphoma), 98% (Lung Neoplasms), and 96% (Breast Neoplasms) of the top 50 predicted miRNAs verified to have associations with these diseases.
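Several of the abstracts in this list rely on the Gaussian interaction profile kernel similarity. The sketch below follows the commonly used definition, in which the similarity of two entities is exp(-gamma * ||IP_i - IP_j||^2) and the bandwidth gamma is a user parameter divided by the mean squared norm of all interaction profiles. The parameter name `gamma_prime`, its default of 1.0, and the toy profiles are assumptions; the abstract does not state the exact settings used by EKRRMDA.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel for a list of binary profiles.

    profiles[i] is the interaction profile of entity i (for example, the row
    of the association matrix for miRNA i).
    """
    n = len(profiles)
    sq_norms = [sum(x * x for x in p) for p in profiles]
    mean_sq_norm = sum(sq_norms) / n
    if mean_sq_norm == 0:
        raise ValueError("no known associations; bandwidth is undefined")
    # Normalise the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    return [[math.exp(-gamma * sq_dist(p, q)) for q in profiles] for p in profiles]


# Hypothetical toy profiles: 3 entities x 4 partners.
toy_profiles = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
]
kernel = gip_kernel(toy_profiles)
```

The same function can be applied to the rows of the association matrix for miRNAs and to its columns for diseases. How the resulting kernel is combined with functional or semantic similarity is model specific and is not shown here.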
Project description:Predicting novel microRNA (miRNA)-disease associations is clinically significant due to miRNAs' potential roles of diagnostic biomarkers and therapeutic targets for various human diseases. Previous studies have demonstrated the viability of utilizing different types of biological data to computationally infer new disease-related miRNAs. Yet researchers face the challenge of how to effectively integrate diverse datasets and make reliable predictions. In this study, we presented a computational model named Laplacian Regularized Sparse Subspace Learning for MiRNA-Disease Association prediction (LRSSLMDA), which projected miRNAs/diseases' statistical feature profile and graph theoretical feature profile to a common subspace. It used Laplacian regularization to preserve the local structures of the training data and a L1-norm constraint to select important miRNA/disease features for prediction. The strength of dimensionality reduction enabled the model to be easily extended to much higher dimensional datasets than those exploited in this study. Experimental results showed that LRSSLMDA outperformed ten previous models: the AUC of 0.9178 in global leave-one-out cross validation (LOOCV) and the AUC of 0.8418 in local LOOCV indicated the model's superior prediction accuracy; and the average AUC of 0.9181+/-0.0004 in 5-fold cross validation justified its accuracy and stability. In addition, three types of case studies further demonstrated its predictive power. Potential miRNAs related to Colon Neoplasms, Lymphoma, Kidney Neoplasms, Esophageal Neoplasms and Breast Neoplasms were predicted by LRSSLMDA. Respectively, 98%, 88%, 96%, 98% and 98% out of the top 50 predictions were validated by experimental evidences. Therefore, we conclude that LRSSLMDA would be a valuable computational tool for miRNA-disease association prediction.
Project description:Recently, a growing number of biological research and scientific experiments have demonstrated that microRNA (miRNA) affects the development of human complex diseases. Discovering miRNA-disease associations plays an increasingly vital role in devising diagnostic and therapeutic tools for diseases. However, since uncovering associations via experimental methods is expensive and time-consuming, novel and effective computational methods for association prediction are in demand. In this study, we developed a computational model of Matrix Decomposition and Heterogeneous Graph Inference for miRNA-disease association prediction (MDHGI) to discover new miRNA-disease associations by integrating the predicted association probability obtained from matrix decomposition through sparse learning method, the miRNA functional similarity, the disease semantic similarity, and the Gaussian interaction profile kernel similarity for diseases and miRNAs into a heterogeneous network. Compared with previous computational models based on heterogeneous networks, our model took full advantage of matrix decomposition before the construction of heterogeneous network, thereby improving the prediction accuracy. MDHGI obtained AUCs of 0.8945 and 0.8240 in the global and the local leave-one-out cross validation, respectively. Moreover, the AUC of 0.8794+/-0.0021 in 5-fold cross validation confirmed its stability of predictive performance. In addition, to further evaluate the model's accuracy, we applied MDHGI to four important human cancers in three different kinds of case studies. In the first type, 98% (Esophageal Neoplasms) and 98% (Lymphoma) of top 50 predicted miRNAs have been confirmed by at least one of the two databases (dbDEMC and miR2Disease) or at least one experimental literature in PubMed. 
In the second type of case study, what made a difference was that we removed all known associations between the miRNAs and Lung Neoplasms before implementing MDHGI on Lung Neoplasms. As a result, 100% (Lung Neoplasms) of top 50 related miRNAs have been indexed by at least one of the three databases (dbDEMC, miR2Disease and HMDD V2.0) or at least one experimental literature in PubMed. Furthermore, we also tested our prediction method on the HMDD V1.0 database to prove the applicability of MDHGI to different datasets. The results showed that 50 out of top 50 miRNAs related with the breast neoplasms were validated by at least one of the three databases (HMDD V2.0, dbDEMC, and miR2Disease) or at least one experimental literature.
Project description:In recent years, increasing associations between microRNAs (miRNAs) and human diseases have been identified. Based on accumulating biological data, many computational models for potential miRNA-disease associations inference have been developed, which saves time and expenditure on experimental studies, making great contributions to researching molecular mechanism of human diseases and developing new drugs for disease treatment. In this paper, we proposed a novel computational method named Ensemble of Decision Tree based MiRNA-Disease Association prediction (EDTMDA), which innovatively built a computational framework integrating ensemble learning and dimensionality reduction. For each miRNA-disease pair, the feature vector was extracted by calculating the statistical measures, graph theoretical measures, and matrix factorization results for the miRNA and disease, respectively. Then multiple base learnings were built to yield many decision trees (DTs) based on random selection of negative samples and miRNA/disease features. Particularly, Principal Components Analysis was applied to each base learning to reduce feature dimensionality and hence remove the noise or redundancy. Average strategy was adopted for these DTs to get final association scores between miRNAs and diseases. In model performance evaluation, EDTMDA showed AUC of 0.9309 in global leave-one-out cross validation (LOOCV) and AUC of 0.8524 in local LOOCV. Additionally, AUC of 0.9192+/-0.0009 in 5-fold cross validation proved the model's reliability and stability. Furthermore, three types of case studies for four human diseases were implemented. As a result, 94% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 96% (Breast Neoplasms) and 88% (Carcinoma Hepatocellular) of top 50 predicted miRNAs were confirmed by experimental evidences in literature.
Project description:In recent years, microRNAs (miRNAs) are attracting an increasing amount of researchers' attention, as accumulating studies show that miRNAs play important roles in various basic biological processes and that dysregulation of miRNAs is connected with diverse human diseases, particularly cancers. However, the experimental methods to identify associations between miRNAs and diseases remain costly and laborious. In this study, we developed a computational method named Network Distance Analysis for MiRNA-Disease Association prediction (NDAMDA) which could effectively predict potential miRNA-disease associations. The highlight of this method was the use of not only the direct network distance between 2 miRNAs (diseases) but also their respective mean network distances to all other miRNAs (diseases) in the network. The model's reliable performance was certified by the AUC of 0.8920 in global leave-one-out cross-validation (LOOCV), 0.8062 in local LOOCV and the average AUCs of 0.8935 ± 0.0009 in fivefold cross-validation. Moreover, we applied NDAMDA to 3 different case studies to predict potential miRNAs related to breast neoplasms, lymphoma, oesophageal neoplasms, prostate neoplasms and hepatocellular carcinoma. Results showed that 86%, 72%, 86%, 86% and 84% of the top 50 predicted miRNAs were supported by experimental association evidence. Therefore, NDAMDA is a reliable method for predicting disease-related miRNAs.
Project description:Associations between microRNAs (miRNAs) and human diseases have been identified by increasing studies and discovering new ones is an ongoing process in medical laboratories. To improve experiment productivity, researchers computationally infer potential associations from biological data, selecting the most promising candidates for experimental verification. Predicting potential miRNA-disease association has become a research area of growing importance. This paper presents a model of Extreme Gradient Boosting Machine for MiRNA-Disease Association (EGBMMDA) prediction by integrating the miRNA functional similarity, the disease semantic similarity, and known miRNA-disease associations. The statistical measures, graph theoretical measures, and matrix factorization results for each miRNA-disease pair were calculated and used to form an informative feature vector. The vector for known associated pairs obtained from the HMDD v2.0 database was used to train a regression tree under the gradient boosting framework. EGBMMDA was the first decision tree learning-based model used for predicting miRNA-disease associations. Respectively, AUCs of 0.9123 and 0.8221 in global and local leave-one-out cross-validation proved the model's reliable performance. Moreover, the 0.9048 ± 0.0012 AUC in fivefold cross-validation confirmed its stability. We carried out three different types of case studies of predicting potential miRNAs related to Colon Neoplasms, Lymphoma, Prostate Neoplasms, Breast Neoplasms, and Esophageal Neoplasms. The results indicated that, respectively, 98%, 90%, 98%, 100%, and 98% of the top 50 predictions for the five diseases were confirmed by experiments. Therefore, EGBMMDA appears to be a useful computational resource for miRNA-disease association prediction.
Project description:A growing body of research indicates that microRNAs (miRNAs) play indispensable roles in exploring the pathogenesis of diseases. Detecting miRNA-disease associations by experimental techniques in biology is expensive and time-consuming. Hence, it is important to propose reliable and accurate computational methods for exploring potential disease-related miRNAs. In our work, we develop a novel method (BRWHNHA) to uncover potential miRNAs associated with diseases based on a hybrid recommendation algorithm and an unbalanced bi-random walk. We first integrate the Gaussian interaction profile kernel similarity into the miRNA functional similarity network and the disease semantic similarity network. Then we calculate the transition probability matrix of the bipartite network by using the hybrid recommendation algorithm. Finally, we adopt the unbalanced bi-random walk on the heterogeneous network to infer undiscovered miRNA-disease relationships. We tested BRWHNHA on 22 diseases using five-fold cross-validation, and it achieved reliable performance with an average AUC of 0.857, with the area under the ROC curve ranging from 0.807 to 0.924. As a result, BRWHNHA significantly improves the performance of inferring potential miRNA-disease associations compared with previous methods. Moreover, the case studies on lung neoplasms and prostate neoplasms also illustrate that BRWHNHA is superior to previous prediction methods and is more advantageous in exploring potential disease-related miRNAs. All source code can be downloaded from https://github.com/myl446/BRWHNHA .
Project description:Accumulating experimental studies have demonstrated that microRNAs (miRNAs) could be closely related to the development and progression of complex human diseases. Based on the assumption that functionally similar miRNAs may have a strong correlation with phenotypically similar diseases and vice versa, researchers developed various effective computational models which combine heterogeneous biological data sets including the disease similarity network, the miRNA similarity network, and the known disease-miRNA association network to identify potential relationships between miRNAs and diseases in biomedical research. Considering the limitations of previous computational studies, we introduced a novel computational method of Ranking-based KNN for miRNA-Disease Association prediction (RKNNMDA) to predict potential related miRNAs for diseases, and our method obtained an AUC of 0.8221 based on leave-one-out cross validation. In addition, RKNNMDA was applied to 3 kinds of important human cancers for further performance evaluation. The results showed that 96%, 80% and 94% of the top 50 predicted potential related miRNAs for Colon Neoplasms, Esophageal Neoplasms, and Prostate Neoplasms have been confirmed by experimental literature, respectively. Moreover, RKNNMDA could be used to predict potential miRNAs for diseases without any known miRNAs, and it is anticipated that RKNNMDA would be of great use for novel miRNA-disease association identification.
Project description:Recently, microRNAs (miRNAs) have drawn more and more attention because accumulating experimental studies have indicated that miRNAs could play critical roles in multiple biological processes as well as the development and progression of complex human diseases. Using the huge number of known heterogeneous biological datasets to predict potential associations between miRNAs and diseases is an important topic in the fields of biology, medicine, and bioinformatics. In this study, considering the limitations of the previous computational methods, we developed the computational model of Heterogeneous Graph Inference for MiRNA-Disease Association prediction (HGIMDA) to uncover potential miRNA-disease associations by integrating miRNA functional similarity, disease semantic similarity, Gaussian interaction profile kernel similarity, and experimentally verified miRNA-disease associations into a heterogeneous graph. HGIMDA obtained AUCs of 0.8781 and 0.8077 based on global and local leave-one-out cross validation, respectively. Furthermore, HGIMDA was applied to three important human cancers for performance evaluation. As a result, 90% (Colon Neoplasms), 88% (Esophageal Neoplasms) and 88% (Kidney Neoplasms) of the top 50 predicted miRNAs are confirmed by recent experimental reports. In addition, HGIMDA could be effectively applied to new diseases and new miRNAs without any known associations, which overcomes important limitations of many previous computational models.
Project description:Increasing evidence has indicated that microRNAs (miRNAs) are functionally associated with the development and progression of various complex human diseases. However, the roles of miRNAs in multiple biological processes or various diseases and their underlying molecular mechanisms are still not fully understood. Predicting potential miRNA-disease associations by integrating various heterogeneous biological datasets is of great significance to biomedical research. Computational methods could obtain potential miRNA-disease associations in a short time, which significantly reduces the experimental time and cost. Considering the limitations of previous computational methods, we developed the model of Within and Between Score for MiRNA-Disease Association prediction (WBSMDA) to predict potential miRNAs associated with various complex diseases. WBSMDA could be applied to diseases without any known related miRNAs. The AUC of 0.8031 based on leave-one-out cross validation has demonstrated its reliable performance. WBSMDA was further applied to Colon Neoplasms, Prostate Neoplasms, and Lymphoma for the identification of their potential related miRNAs. As a result, 90%, 84%, and 80% of predicted miRNA-disease pairs in the top 50 prediction list for these three diseases have been confirmed by recent experimental literature, respectively. It is anticipated that WBSMDA would be a useful resource for potential miRNA-disease association identification.
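The AUC values reported throughout these abstracts can be read as the probability that a known association receives a higher score than an unconfirmed candidate. A minimal sketch of that pairwise definition follows, with ties counted as half a win. The function name and the toy scores are hypothetical, and the way each model ranks held-out pairs during leave-one-out cross validation is not described in the abstracts and is not reproduced here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels[i] is 1 for a known (held-out) association and 0 for a candidate
    without known evidence; scores[i] is the predicted association score.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive-negative pairs ranked correctly; a tie counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores: two known associations, two unlabeled candidates.
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

This quadratic-time version is meant for clarity on small inputs. For large candidate sets, a sort-based rank formulation gives the same value more efficiently.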