Comparing neural-network scoring functions and the state of the art: applications to common library screening.
ABSTRACT: We compare established docking programs, AutoDock Vina and Schrödinger's Glide, to the recently published NNScore scoring functions. As expected, the best protocol to use in a virtual-screening project is highly dependent on the target receptor being studied. However, the mean screening performance obtained when candidate ligands are docked with Vina and rescored with NNScore 1.0 is not statistically different from the mean performance obtained when docking and scoring with Glide. We further demonstrate that the Vina and NNScore docking scores both correlate with chemical properties like small-molecule size and polarizability. Compensating for these potential biases leads to improvements in virtual-screening performance. Composite NNScore-based scoring functions suited to a specific receptor further improve performance. We are hopeful that the current study will prove useful for those interested in computer-aided drug design.
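As an illustration of what compensating for a size-related bias can look like, the following minimal sketch regresses docking scores on heavy-atom count and keeps the residuals for ranking. This is an assumed, generic correction, not necessarily the one used in the study; the function name `size_corrected_scores` and the toy data are hypothetical.

```python
from statistics import mean


def size_corrected_scores(scores, heavy_atoms):
    """Return residuals of a least-squares fit: score ~ a + b * heavy_atoms.

    Hypothetical helper: a residual below zero means the compound scores
    better than expected for a molecule of its size.
    """
    if len(scores) != len(heavy_atoms) or len(scores) < 2:
        raise ValueError("need two or more paired observations")
    x_bar, y_bar = mean(heavy_atoms), mean(scores)
    sxx = sum((x - x_bar) ** 2 for x in heavy_atoms)
    if sxx == 0:
        # All molecules have the same size, so there is no trend to remove.
        return [y - y_bar for y in scores]
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heavy_atoms, scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(heavy_atoms, scores)]


# Toy data (assumed): docking scores in kcal/mol and heavy-atom counts.
scores = [-6.0, -7.5, -9.0, -8.0]
heavy_atoms = [15, 22, 30, 20]
residuals = size_corrected_scores(scores, heavy_atoms)
```

Ranking by `residuals` instead of the raw scores favors compounds that score well relative to their size rather than compounds that are simply large.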
Project description:Identification of chemical compounds with specific biological activities is an important step in both chemical biology and drug discovery. When the structure of the intended target is available, one approach is to use molecular docking programs to assess the chemical complementarity of small molecules with the target; such calculations provide a qualitative measure of affinity that can be used in virtual screening (VS) to rank order a list of compounds according to their potential to be active. rDock is a molecular docking program developed at Vernalis for high-throughput VS (HTVS) applications. Evolved from RiboDock, the program can be used against proteins and nucleic acids, is designed to be computationally very efficient and allows the user to incorporate additional constraints and information as a bias to guide docking. This article provides an overview of the program structure and features and compares rDock to two reference programs, AutoDock Vina (open source) and Schrödinger's Glide (commercial). In terms of computational speed for VS, rDock is faster than Vina and comparable to Glide. For binding mode prediction, rDock and Vina are superior to Glide. The VS performance of rDock is significantly better than Vina, but inferior to Glide for most systems unless pharmacophore constraints are used; in that case rDock and Glide are of equal performance. The program is released under the Lesser General Public License and is freely available for download, together with the manuals, example files and the complete test sets, at http://rdock.sourceforge.net/
Project description:In this study, we developed a novel algorithm that improves the screening performance of an arbitrary docking scoring function by recalibrating the docking score of a query compound based on its structural similarity to a set of training compounds, at negligible extra computational cost. Two popular docking methods, Glide and AutoDock Vina, were adopted as the original scoring functions to be processed with our new algorithm, and similar performance improvements were achieved for both. Predicted binding affinities were compared against experimental data from the ChEMBL and DUD-E databases. Eleven representative drug receptors from diverse drug target categories were used to evaluate the hybrid scoring function. The effects of four different fingerprints (FP2, FP3, FP4, and MACCS) and four different compound similarity effect (CSE) functions were explored. Encouragingly, the screening performance was significantly improved for all 11 drug targets, especially when CSE = S^4 (where S is the Tanimoto structural similarity) and the FP2 fingerprint were applied. The average predictive index (PI) values increased from 0.34 to 0.66 and from 0.39 to 0.71 for the Glide and AutoDock Vina scoring functions, respectively. To evaluate the performance of the calibration algorithm in drug lead identification, we also imposed an upper limit on the structural similarity to mimic the real scenario of screening diverse libraries, in which query ligands are general-purpose screening compounds that are not necessarily structurally similar to reference ligands. Encouragingly, we found that our hybrid scoring function still outperformed the original docking scoring function. The hybrid scoring function was further evaluated using external datasets for two systems, and we found that the PI values increased from 0.24 to 0.46 and from 0.14 to 0.42 for the A2AR and CFX systems, respectively.
In conclusion, our calibration algorithm can significantly improve virtual screening performance in both the drug lead optimization and identification phases at negligible computational cost.
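The abstract above does not state the recalibration formula, so the following is only a hypothetical sketch of the general idea: the docking error of the most similar training compound is transferred to the query, damped by the fourth power of the Tanimoto similarity S. The nearest-neighbor form, the function names, and the assumption that docking scores and experimental values share the same units are all illustrative choices, not the authors' method.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recalibrated_score(query_fp, query_dock, training, power=4):
    """Shift a docking score using the most similar training compound.

    training: iterable of (fingerprint, docking_score, experimental_value).
    Assumed form: hybrid = dock + S**power * (experimental - dock) of the
    nearest neighbor, so the correction vanishes as similarity goes to zero.
    """
    best_s, best_residual = 0.0, 0.0
    for fp, dock, experimental in training:
        s = tanimoto(query_fp, fp)
        if s > best_s:
            best_s, best_residual = s, experimental - dock
    return query_dock + (best_s ** power) * best_residual


# Toy training set (assumed values, kcal/mol): docking underestimates binding by 2.
training = [({1, 2, 3, 4}, -7.0, -9.0)]
identical = recalibrated_score({1, 2, 3, 4}, -6.0, training)
partial = recalibrated_score({1, 2, 3, 5}, -6.0, training)
unrelated = recalibrated_score({8, 9}, -6.0, training)
```

With a high exponent such as 4, only close analogs of a training compound receive a substantial correction, and dissimilar query compounds keep essentially their original docking score.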
Project description:The failure of default scoring functions to ensure virtual screening enrichment is a persistent problem for the molecular docking algorithms used in structure-based drug discovery. To remedy this problem, elaborate rescoring and postprocessing schemes have been developed with varying degrees of success, specificity, and cost. Negative image-based rescoring (R-NiB) has been shown to improve flexible docking performance markedly with a variety of drug targets. The yield improvement is achieved by comparing the alternative docking poses against the negative image of the target protein's ligand-binding cavity. In other words, the shape and electrostatics of the binding pocket are used directly in the similarity comparison to rank the explicit docking poses. Here, the PANTHER/ShaEP-based R-NiB methodology is tested with six popular docking programs (GLIDE, PLANTS, GOLD, DOCK, AUTODOCK, and AUTODOCK VINA) using five validated benchmark sets. Overall, the results indicate that R-NiB outperforms the default docking scoring consistently and inexpensively, demonstrating that the methodology is ready for wide-scale virtual screening usage.
Project description:Target fishing often relies on reverse docking to identify potential target proteins of ligands from a protein database. Reverse docking is limited by the accuracy of the current scoring functions used to distinguish true targets from non-target proteins. Many contemporary scoring functions are designed for the virtual screening of small molecules without special optimization for reverse docking, and are therefore easily influenced by the properties of protein pockets, resulting in scoring bias toward proteins with certain properties. This bias causes many false positives in reverse docking, interfering with the identification of true targets. In this paper, we conducted a large-scale reverse docking study (5000 molecules against 100 proteins) to examine the scoring bias in reverse docking with DOCK, Glide, and AutoDock Vina. We found that there were indeed some frequent hits, namely interference proteins, in all three docking procedures. After analyzing the differences in pocket properties between these interference proteins and the others, we speculated that the interference proteins have a larger contact area with ligands (related to the size and shape of the protein pockets; for all three docking programs) or higher hydrophobicity (for Glide), which could be the causes of the scoring bias. We then applied a score normalization method to eliminate this scoring bias, which was effective in making docking scores more balanced between different proteins in the reverse docking of the benchmark dataset. Later, the Astex Diverse Set was used to validate the effect of score normalization on actual cases of reverse docking, showing that the accuracy of target prediction increased significantly, by 21.5%, in reverse docking with Glide after score normalization, though there was no obvious change in reverse docking with DOCK and AutoDock Vina.
Our results demonstrate the effectiveness of score normalization in eliminating the scoring bias and improving the accuracy of target prediction in reverse docking. Moreover, the pocket properties found here to cause scoring bias toward certain proteins can provide a theoretical basis for further optimizing the scoring functions of docking programs in future research.
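The abstract above does not specify which normalization was applied. As a hedged illustration, the sketch below assumes a per-protein z-score, in which each protein's scores are centered and scaled across all docked ligands so that a protein that scores everything favorably no longer dominates. The function names and the toy score matrix are hypothetical.

```python
from statistics import mean, pstdev


def normalize_by_protein(scores):
    """Z-score each protein's docking scores across ligands.

    scores: dict mapping protein name -> list of scores, one per ligand,
    with ligands in the same order for every protein (lower is better).
    """
    normalized = {}
    for protein, values in scores.items():
        mu, sigma = mean(values), pstdev(values)
        normalized[protein] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    return normalized


def predicted_targets(scores):
    """For each ligand index, return the protein with the lowest score."""
    proteins = list(scores)
    n_ligands = len(scores[proteins[0]])
    return [min(proteins, key=lambda p: scores[p][i]) for i in range(n_ligands)]


# Toy data (assumed): "sticky" scores every ligand well, "target" is selective.
raw = {
    "sticky": [-10.0, -10.5, -9.5],
    "target": [-6.0, -9.0, -6.0],
}
before = predicted_targets(raw)
after = predicted_targets(normalize_by_protein(raw))
```

In this toy case the raw scores assign every ligand to the frequent-hitter protein, whereas the normalized scores assign the second ligand to the selective protein for which it scores unusually well.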
Project description:The accuracy of five docking programs at reproducing crystallographic structures of complexes of 8 macrolides and 12 related macrocyclic structures, all with their corresponding receptors, was evaluated. Self-docking calculations indicated excellent performance in all cases (mean RMSD values ≤ 1.0) and confirmed the speed of AutoDock Vina. Afterwards, the lowest-energy conformer of each molecule and all the conformers lying 0-10 kcal/mol above it (as given by Macrocycle, from MacroModel 10.0) were subjected to standard docking calculations. While each docking method has its own merits, the observed speed of the programs was as follows: Glide 6.6 > AutoDock Vina 1.1.2 > DOCK 6.5 >> AutoDock 4.2.6 > AutoDock 3.0.5. For most of the complexes, the five methods predicted quite correct poses of ligands at the binding sites, but the lower RMSD values for the poses of highest affinity were in the order: Glide 6.6 ≈ AutoDock Vina ≈ DOCK 6.5 > AutoDock 4.2.6 >> AutoDock 3.0.5. By choosing the poses closest to the crystal structure the order was: AutoDock Vina > Glide 6.6 ≈ DOCK 6.5 ≥ AutoDock 4.2.6 >> AutoDock 3.0.5. Re-scoring (AutoDock 4.2.6//AutoDock Vina, Amber Score and MM-GBSA) improved the agreement between the calculated and experimental data. For all intents and purposes, these three methods are equally reliable.
Project description:AutoDock Vina is a very popular and highly cited open-source docking program. Here we present a scoring function which we call Vinardo (Vina RaDii Optimized). Vinardo is based on Vina and was trained through a novel approach on state-of-the-art datasets. We show that the traditional approach to training empirical scoring functions, using linear regression to optimize the correlation of predicted and experimental binding affinities, does not result in a function with optimal docking capabilities. On the other hand, a combination of scoring, minimization, and re-docking on carefully curated training datasets allowed us to develop a simplified scoring function with optimum docking performance. This article provides an overview of the development of the Vinardo scoring function, highlights its differences from Vina, and compares the performance of the two scoring functions in scoring, docking and virtual screening applications. Vinardo outperforms Vina in all tests performed, for all datasets analyzed. The Vinardo scoring function is available as an option within Smina, a fork of Vina, which is freely available under the GNU General Public License v2.0 from http://smina.sf.net. Precompiled binaries, source code, documentation and a tutorial for using Smina to run the Vinardo scoring function are available at the same address.
Project description:Virtual screening by molecular docking has become a widely used approach to lead discovery in the pharmaceutical industry when a high-resolution structure of the biological target of interest is available. The performance of three widely used docking programs (Glide, GOLD, and DOCK) for virtual database screening is studied when they are applied to the same protein target and ligand set. Comparisons of the docking programs and scoring functions using a large and diverse data set of pharmaceutically interesting targets and active compounds are carried out. We focus on the problem of docking and scoring flexible compounds which are sterically capable of docking into a rigid conformation of the receptor. The Glide XP methodology is shown to consistently yield enrichments superior to the two alternative methods, while GOLD outperforms DOCK on average. The study also shows that docking into multiple receptor structures can decrease the docking error in screening a diverse set of active compounds.
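Enrichment, the metric on which the programs above are compared, is commonly reported as an enrichment factor: the hit rate among the top-ranked fraction of the library divided by the hit rate of the whole library. The sketch below uses this standard definition; the function name, the convention that lower scores are better, and the toy data are assumptions and do not reproduce the study's exact protocol.

```python
def enrichment_factor(scores, is_active, fraction=0.1):
    """Enrichment factor of the top-ranked fraction of a screened library.

    scores: docking scores (lower is assumed to be better).
    is_active: parallel sequence of booleans or 0/1 activity labels.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(scores)
    total_actives = sum(1 for a in is_active if a)
    if n == 0 or n != len(is_active) or total_actives == 0:
        raise ValueError("need matching inputs containing one or more actives")
    n_top = max(1, round(n * fraction))
    # Sort compounds from best (most negative) to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_in_top / n_top) / (total_actives / n)


# Toy library (assumed): 10 compounds, 2 actives, both ranked at the top.
scores = [-9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0, 0.0]
labels = [True, True, False, False, False, False, False, False, False, False]
ef_20 = enrichment_factor(scores, labels, fraction=0.2)
```

A value of 1 corresponds to random selection, and the maximum attainable value depends on the ratio of actives to decoys in the library.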
Project description:A promising protein target for computational drug development, the human cluster of differentiation 38 (CD38), plays a crucial role in many physiological and pathological processes, primarily through the upstream regulation of factors that control cytoplasmic Ca2+ concentrations. Recently, a small-molecule inhibitor of CD38 was shown to slow down pathways relating to aging and DNA damage. We examined the performance of seven docking programs for their ability to model protein-ligand interactions with CD38. A test set of twelve CD38 crystal structures, containing crystallized biologically relevant substrates, was used to assess pose prediction. The rankings for each program based on the median RMSD between the native and predicted poses were Vina, AD4 > PLANTS, Gold, Glide, Molegro > rDock. Forty-two compounds with known affinities were docked to assess the accuracy of the programs at affinity/ranking predictions. The rankings based on scoring power were: Vina, PLANTS > Glide, Gold > Molegro >> AutoDock 4 >> rDock. Out of the top four performing programs, Glide had the only scoring function that did not appear to show bias towards overpredicting the affinity of the ligand based on its size. Factors that affect the reliability of pose prediction and scoring are discussed. General limitations and known biases of scoring functions are examined, aided in part by using molecular fingerprints and Random Forest classifiers. This machine learning approach may be used to systematically diagnose molecular features that are correlated with poor scoring accuracy.
Project description:Repurposing has gained momentum globally and has become an alternative avenue for drug discovery because, compared with the conventional drug discovery process, it offers a better success rate and reduced cost, time, and safety-related issues. Several drugs have already been successfully repurposed for other clinical conditions, including drug-resistant tuberculosis (DR-TB). Though TB can be cured completely with currently available anti-tubercular drugs, the emergence of drug-resistant strains of Mycobacterium tuberculosis and the huge global death toll together create an urgent need for newer and effective drugs for TB. Therefore, we performed virtual screening of 1554 FDA-approved drugs against murE, which is essential for peptidoglycan biosynthesis in M. tuberculosis. We used Glide and AutoDock Vina for virtual screening and applied a rigid docking algorithm followed by an induced-fit docking algorithm in order to enhance the quality of the docking prediction and to prioritize drugs for repurposing. We found 17 drugs that bind strongly to murE, and three of them, namely lymecycline, acarbose and desmopressin, were consistently present within the top 10 ranks for both Glide and AutoDock Vina in the induced-fit docking algorithm, which strongly indicates that these three drugs are potential candidates for further studies towards repurposing for TB.
Project description:Rescoring is a simple approach that could, in theory, improve the original docking results. In this study, AutoDock Vina was used as the docking engine, and three other scoring functions besides the original Vina scoring function, as well as their combinations as consensus scoring functions, were employed to explore the effect of rescoring on virtual screenings performed on diverse targets. Rescoring by DrugScore produced the largest number of cases with significant changes in screening power. Thus, the DrugScore results were used to build a simple model, based on two binding site descriptors, that could predict possible improvement by DrugScore rescoring. Furthermore, the screening power of all rescoring approaches, as well as of the original AutoDock Vina docking results, generally correlated with the Maximum Theoretical Shape Complementarity (MTSC) and the Maximum Distance from Center of Mass and all Alpha spheres (MDCMA). Therefore, it was suggested that, with a more complete set of binding site descriptors, it could be possible to find a robust relationship between binding site descriptors and the response to certain molecular docking programs and scoring functions. The results could be helpful for future research aiming to perform virtual screening using AutoDock Vina and/or rescoring using DrugScore.
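The abstract above does not say how the scoring functions were combined into consensus functions. One common and simple option is to average each compound's rank across the individual scoring functions, which is sketched below as an assumption. The function names and the scoring-function labels in the toy table are hypothetical, and every scoring function is assumed to follow a lower-is-better convention.

```python
def ranks(scores):
    """Rank compounds by score: rank 1 is the lowest (best) score.

    Ties are broken by input order, which is a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def consensus_rank(score_table):
    """Average rank of each compound across several scoring functions.

    score_table: dict mapping scoring-function name -> list of scores,
    with compounds in the same order for every scoring function.
    """
    per_function = [ranks(values) for values in score_table.values()]
    n_compounds = len(per_function[0])
    return [
        sum(r[i] for r in per_function) / len(per_function)
        for i in range(n_compounds)
    ]


# Toy data (assumed): three compounds scored by two functions on different scales.
agree = consensus_rank({"fn_a": [-9.0, -7.0, -8.0], "fn_b": [-70.0, -50.0, -60.0]})
disagree = consensus_rank({"fn_a": [-9.0, -7.0, -8.0], "fn_b": [-50.0, -70.0, -60.0]})
```

Working with ranks rather than raw scores sidesteps the fact that different scoring functions report values on different scales.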