Incorporating structural similarity into a scoring function to enhance the prediction of binding affinities.
ABSTRACT: In this study, we developed a novel algorithm that improves the screening performance of an arbitrary docking scoring function by recalibrating the docking score of a query compound according to its structural similarity to a set of training compounds, at negligible extra computational cost. Two popular docking methods, Glide and AutoDock Vina, were adopted as the original scoring functions to be processed with our new algorithm, and similar performance improvements were achieved for both. Predicted binding affinities were compared against experimental data from the ChEMBL and DUD-E databases. Eleven representative drug receptors from diverse drug target categories were used to evaluate the hybrid scoring function. The effects of four different fingerprints (FP2, FP3, FP4, and MACCS) and four different compound similarity effect (CSE) functions were explored. Encouragingly, the screening performance improved significantly for all 11 drug targets, especially when CSE = S^4 (where S is the Tanimoto structural similarity) and the FP2 fingerprint were applied. The average predictive index (PI) values increased from 0.34 to 0.66 and from 0.39 to 0.71 for the Glide and AutoDock Vina scoring functions, respectively. To evaluate the performance of the calibration algorithm in drug lead identification, we also imposed an upper limit on the structural similarity to mimic the real scenario of screening diverse libraries, in which query ligands are general-purpose screening compounds that are not necessarily structurally similar to the reference ligands. Even under this constraint, the hybrid scoring function still outperformed the original docking scoring function. The hybrid scoring function was further evaluated using external datasets for two systems, and the PI values increased from 0.24 to 0.46 and from 0.14 to 0.42 for the A2AR and CFX systems, respectively.
In conclusion, our calibration algorithm can significantly improve virtual screening performance in both the drug lead optimization and identification phases at negligible computational cost.
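The abstract does not give the recalibration formula, so the following is only a minimal sketch of the general idea, under stated assumptions: fingerprints are represented as sets of on-bit indices, the CSE weight is taken to be the fourth power of the Tanimoto similarity S, and the query score is shifted by the docking error of its most similar training compound scaled by that weight. The function names `tanimoto` and `recalibrate` and the nearest-neighbour correction rule are hypothetical and are not taken from the study.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def recalibrate(dock_score, query_fp, training, power=4):
    """Shift a docking score toward experiment using the nearest training compound.

    training: list of (fingerprint, docking_score, experimental_value) tuples.
    ASSUMPTION: the correction is CSE * (experimental - docking) of the most
    similar training compound, with CSE = S ** power. This rule is illustrative.
    """
    if not training:
        return dock_score
    # Pick the training compound most similar to the query.
    best_fp, best_dock, best_exp = max(
        training, key=lambda item: tanimoto(query_fp, item[0])
    )
    cse = tanimoto(query_fp, best_fp) ** power
    return dock_score + cse * (best_exp - best_dock)
```

With this rule an identical training compound (S = 1) passes on its full docking error, while a compound with S = 0.5 contributes only 0.5^4 = 0.0625 of it, which illustrates why a high power suppresses the influence of dissimilar compounds.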
Project description:The failure of default scoring functions to ensure virtual screening enrichment is a persistent problem for the molecular docking algorithms used in structure-based drug discovery. To remedy this problem, elaborate rescoring and postprocessing schemes have been developed with varying degrees of success, specificity, and cost. Negative image-based rescoring (R-NiB) has been shown to improve flexible docking performance markedly with a variety of drug targets. The yield improvement is achieved by comparing the alternative docking poses against the negative image of the target protein's ligand-binding cavity. In other words, the shape and electrostatics of the binding pocket are used directly in the similarity comparison to rank the explicit docking poses. Here, the PANTHER/ShaEP-based R-NiB methodology is tested with six popular docking programs (GLIDE, PLANTS, GOLD, DOCK, AUTODOCK, and AUTODOCK VINA) using five validated benchmark sets. Overall, the results indicate that R-NiB outperforms the default docking scoring consistently and inexpensively, demonstrating that the methodology is ready for wide-scale virtual screening usage.
Project description:BACKGROUND:Virtual screening is vital for contemporary drug discovery, but striking performance fluctuations are commonly encountered, thus hampering error-free use. RESULTS AND METHODOLOGY:A conceptual framework is suggested for combining screening algorithms characterized by orthogonality (docking-scoring calculations, 3D shape similarity, 2D fingerprint similarity) into a simple, efficient and expansible Python-based consensus ranking scheme. An original experimental dataset is created for comparing individual screening methods versus the novel approach. Its utilization leads to the identification and phosphoproteomic evaluation of a cell-active DYRK1α inhibitor. CONCLUSION:Consensus ranking considerably stabilizes screening performance at reasonable computational cost, whereas individual screens are heavily dependent on calculation settings. Results indicate that the novel approach, currently available as a free online tool, is highly suitable for prospective screening by nonexperts.
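The abstract does not specify how the individual rankings are combined, so the sketch below shows one simple possibility, mean-rank consensus, purely as an illustration. It assumes every method reports a score where higher is better; the function names `ranks` and `consensus_rank` are hypothetical.

```python
def ranks(scores):
    """Return 1-based ranks (1 = highest score); ties keep input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def consensus_rank(*score_lists):
    """Average each compound's rank over all screening methods.

    ASSUMPTION: plain mean of ranks; the published scheme may weight or
    combine the methods differently.
    """
    per_method = [ranks(scores) for scores in score_lists]
    return [sum(column) / len(column) for column in zip(*per_method)]
```

A call such as `consensus_rank(docking, shape_3d, fingerprint_2d)` returns one mean rank per compound, and the compound with the lowest value is the consensus top pick.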
Project description:Despite the large computational costs of molecular docking, the default scoring functions are often unable to distinguish the active hits from the inactive molecules in large-scale virtual screening experiments. Thus, even though a correct binding pose might be sampled during the docking, the active compound or its biologically relevant pose is not necessarily given a high enough score to attract attention. Various rescoring and post-processing approaches have emerged for improving the docking performance. Here, it is shown that the very early enrichment (number of actives scored higher than 1% of the highest ranked decoys) can be improved by 2.5-fold on average, and by as much as 8.7-fold, by comparing the docking-based ligand conformers directly against the target protein's cavity shape and electrostatics. The similarity comparison of the conformers is performed without geometry optimization against the negative image of the target protein's ligand-binding cavity using the negative image-based (NIB) screening protocol. The viability of NIB rescoring, or R-NiB, pioneered in this study, was tested with 11 target proteins using benchmark libraries. By focusing on the shape/electrostatics complementarity of the ligand-receptor association, R-NiB is able to improve the early enrichment of docking essentially without adding to the computing cost. By implementing consensus scoring, in which the R-NiB and the original docking scoring are weighted for optimal outcome, the early enrichment is improved to a level that facilitates effective drug discovery. Moreover, giving equal weight to the original docking scoring and the R-NiB scoring improves the yield in most cases.
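The two quantities discussed above, the very early enrichment and the weighted consensus score, can be sketched as follows. This is an illustration under assumptions not stated in the abstract: both score lists are oriented so that higher is better (docking energies would need to be negated first), and the two scores are min-max normalized before weighting so that their scales are comparable. The function names are hypothetical.

```python
def min_max(scores):
    """Rescale scores to the range [0, 1]; a constant list maps to zeros."""
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]


def consensus(dock_scores, rnib_scores, weight=0.5):
    """Weighted sum of normalized docking and R-NiB scores.

    weight=0.5 corresponds to the equal-weight case. ASSUMPTION: min-max
    normalization is used here only to put both scores on one scale.
    """
    dock, rnib = min_max(dock_scores), min_max(rnib_scores)
    return [weight * d + (1.0 - weight) * r for d, r in zip(dock, rnib)]


def early_enrichment(scores, is_active, decoy_fraction=0.01):
    """Count actives that score higher than the top `decoy_fraction` of decoys."""
    decoys = sorted((s for s, a in zip(scores, is_active) if not a), reverse=True)
    if not decoys:
        raise ValueError("at least one decoy is required")
    n_top = max(1, int(len(decoys) * decoy_fraction))
    threshold = decoys[n_top - 1]  # score of the last decoy inside the top slice
    return sum(1 for s, a in zip(scores, is_active) if a and s > threshold)
```

Dividing the returned count by the total number of actives gives the corresponding fraction, which is easier to compare across targets with different library sizes.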
Project description:We compare established docking programs, AutoDock Vina and Schrödinger's Glide, to the recently published NNScore scoring functions. As expected, the best protocol to use in a virtual-screening project is highly dependent on the target receptor being studied. However, the mean screening performance obtained when candidate ligands are docked with Vina and rescored with NNScore 1.0 is not statistically different than the mean performance obtained when docking and scoring with Glide. We further demonstrate that the Vina and NNScore docking scores both correlate with chemical properties like small-molecule size and polarizability. Compensating for these potential biases leads to improvements in virtual screen performance. Composite NNScore-based scoring functions suited to a specific receptor further improve performance. We are hopeful that the current study will prove useful for those interested in computer-aided drug design.
Project description:Accuracy and limitations of automatic scoring of sleep stages and electroencephalogram arousals from a single derivation (Fp1-Fp2) were studied in 29 healthy adults using a portable wireless polysomnographic recorder. All recordings were scored five times: twice by a referent scorer who viewed the standard polysomnographic montage and observed the American Academy of Sleep Medicine rules (referent scoring and blind rescoring); and once by the same scorer who viewed only the Fp1-Fp2 signal (alternative scoring), by another expert from the same institution, and by the algorithm. Automatic, alternative and independent expert scoring were compared with the referent scoring on an epoch-by-epoch basis. The algorithm's agreement with the reference (81.0%, Cohen's κ = 0.75) was comparable to the inter-rater agreement (83.3%, Cohen's κ = 0.78) or agreement between the referent scoring and manual scoring of the frontopolar derivation (80.7%, Cohen's κ = 0.75). Most misclassifications by the algorithm occurred during uneventful wake/sleep transitions, whereas cortical arousals, rapid eye movement and stable non-rapid eye movement sleep were detected accurately. The algorithm yielded accurate estimates of total sleep time, sleep efficiency, sleep latency, arousal indices and times spent in different stages. The findings affirm the utility of automatic scoring of stages and arousals from a single frontopolar derivation as a method for assessment of sleep architecture in healthy adults.
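The epoch-by-epoch agreement statistics reported above combine raw percent agreement with Cohen's kappa, which corrects for agreement expected by chance. A minimal sketch of the standard formula, kappa = (p_o - p_e) / (1 - p_e), is given below; the stage labels in the usage note are illustrative only and are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of epochs with identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each scorer's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both scorers used a single identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)
```

For example, comparing the scorings `["W", "N", "N", "R"]` and `["W", "N", "R", "R"]` gives 75% raw agreement but a kappa of 7/11, roughly 0.64, because part of that agreement is expected by chance.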
Project description:Computational modeling of drug binding to proteins is an integral component of direct drug design. Particularly, structure-based virtual screening is often used to perform large-scale modeling of putative associations between small organic molecules and their pharmacologically relevant protein targets. Because of a large number of drug candidates to be evaluated, an accurate and fast docking engine is a critical element of virtual screening. Consequently, highly optimized docking codes are of paramount importance for the effectiveness of virtual screening methods. In this communication, we describe the implementation, tuning and performance characteristics of GeauxDock, a recently developed molecular docking program. GeauxDock is built upon the Monte Carlo algorithm and features a novel scoring function combining physics-based energy terms with statistical and knowledge-based potentials. Developed specifically for heterogeneous computing platforms, the current version of GeauxDock can be deployed on modern, multi-core Central Processing Units (CPUs) as well as massively parallel accelerators, Intel Xeon Phi and NVIDIA Graphics Processing Unit (GPU). First, we carried out a thorough performance tuning of the high-level framework and the docking kernel to produce a fast serial code, which was then ported to shared-memory multi-core CPUs yielding a near-ideal scaling. Further, using Xeon Phi gives 1.9× performance improvement over a dual 10-core Xeon CPU, whereas the best GPU accelerator, GeForce GTX 980, achieves a speedup as high as 3.5×. On that account, GeauxDock can take advantage of modern heterogeneous architectures to considerably accelerate structure-based virtual screening applications. GeauxDock is open-sourced and publicly available at www.brylinski.org/geauxdock and https://figshare.com/articles/geauxdock_tar_gz/3205249.
Project description:Ligand docking into homology models of G-protein-coupled receptors (GPCRs) is a widely used approach in computational compound screening. The generation of "double-hypothetical" models of ligand-target complexes has intrinsic accuracy limitations that further complicate compound ranking and selection compared to those of X-ray structures. Given these uncertainties, we have explored "fuzzy 3D similarity" between hypothetical binding modes of known ligands in homology models and docking poses of database compounds as an alternative to conventional scoring schemes. Therefore, GPCR homology models at varying accuracy levels were generated and used for docking. Increases in recall performance were observed for fuzzy 3D similarity ranking using single or multiple ligand poses compared to that of conventional scoring functions and interaction fingerprints. Fuzzy similarity ranking was also successfully applied to docking into an external model of a GPCR for which no experimental structure is currently available. Taken together, our results indicate that the use of putative ligand poses, albeit approximate at best, increases the odds of identifying active compounds in docking screens of GPCR homology models.
Project description:False negative docking outcomes for highly symmetric molecules are a barrier to the accurate evaluation of docking programs, scoring functions, and protocols. This work describes an implementation of a symmetry-corrected root-mean-square deviation (RMSD) method into the program DOCK based on the Hungarian algorithm for solving the minimum assignment problem, which dynamically assigns atom correspondence in molecules with symmetry. The algorithm adds only a trivial amount of computation time to the RMSD calculations and is shown to increase the reported overall docking success rate by approximately 5% when tested over 1043 receptor-ligand systems. For some families of protein systems the results are even more dramatic, with success rate increases up to 16.7%. Several additional applications of the method are also presented including as a pairwise similarity metric to compare molecules during de novo design, as a scoring function to rank-order virtual screening results, and for the analysis of trajectories from molecular dynamics simulation. The new method, including source code, is available to registered users of DOCK6 ( http://dock.compbio.ucsf.edu ).
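The idea of a symmetry-corrected RMSD can be illustrated with a small sketch. Note the substitutions: this is not the DOCK implementation, atoms are matched by element symbol only (an assumption), and the minimum assignment is found by exhaustive search over permutations instead of the Hungarian algorithm, which keeps the example dependency-free but limits it to very small molecules. For realistic sizes, an assignment solver such as SciPy's `linear_sum_assignment` would take the place of the loop.

```python
from itertools import permutations
from math import sqrt


def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def symmetry_corrected_rmsd(elements, reference, pose):
    """RMSD minimized over all element-preserving atom correspondences.

    elements: element symbol per atom (shared by reference and pose).
    reference, pose: lists of (x, y, z) coordinates in the same frame.
    Brute force is O(n!), so this is for illustration on tiny molecules only.
    """
    n = len(elements)
    if n == 0:
        raise ValueError("at least one atom is required")
    best = None
    for perm in permutations(range(n)):
        # Only allow a reference atom to map onto a pose atom of the same element.
        if any(elements[i] != elements[j] for i, j in enumerate(perm)):
            continue
        total = sum(squared_distance(reference[i], pose[j]) for i, j in enumerate(perm))
        if best is None or total < best:
            best = total
    return sqrt(best / n)
```

For a linear O-C-O fragment whose two oxygen atoms are listed in swapped order in the pose, a fixed-order RMSD reports a large deviation, whereas this function returns 0.0 because the two oxygens are interchangeable.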
Project description:Molecular docking is the most commonly used technique in the modern drug discovery process where computational approaches involving docking algorithms are used to dock small molecules into macromolecular target structures. Over the recent years several evaluation studies have been reported by independent scientists comparing the performance of the docking programs by using default 'black box' protocols supplied by the software companies. Such studies have to be considered carefully as the docking programs can be tweaked towards optimum performance by selecting the parameters suitable for the target of interest. In this study we address the problem of selecting an appropriate docking and scoring function combination (88 docking algorithm-scoring functions) for substrate specificity predictions for feruloyl esterases, an industrially relevant enzyme family. We also propose the 'Key Interaction Score System' (KISS), a more biochemically meaningful measure for evaluation of docking programs based on pose prediction accuracy.
Project description:Evaluation of docking results is one of the most important problems for virtual screening and in silico drug design. Modern approaches for the identification of active compounds in a large data set of docked molecules use energy scoring functions. One of the general and most significant limitations of these methods relates to inaccurate binding energy estimation, which results in false scoring of docked compounds. Automatic analysis of poses using self-organizing maps (AuPosSOM) represents an alternative approach for the evaluation of docking results based on the clustering of compounds by the similarity of their contacts with the receptor. A scoring function was developed for the identification of the active compounds in the AuPosSOM clustered dataset. In addition, the efficiency of AuPosSOM in clustering compounds and in identifying the key contacts considered important for their activity was improved. Benchmark tests for several targets revealed that, together with the developed scoring function, AuPosSOM represents a good alternative to the energy-based scoring functions for the evaluation of docking results.