Fragment-based Shape Signatures: a new tool for virtual screening and drug discovery.
ABSTRACT: Since its introduction in 2003, the Shape Signatures method has been successfully applied in a number of drug design projects. Because it uses a ray-tracing approach to directly measure molecular shape and properties (as opposed to relying on chemical structure), it excels at scaffold hopping, and is extraordinarily easy to use. Despite its advantages, a significant drawback of the method has hampered its application to certain classes of problems; namely, when the chemical structures considered are large and contain heterogeneous ring-systems, the method produces descriptors that tend to merely measure the overall size of the molecule, and begin to lose selective power. To remedy this, the approach has been reformulated to automatically decompose compounds into fragments using ring systems as anchors, and to likewise partition the ray-trace in accordance with the fragment assignments. Subsequently, descriptors are generated that are fragment-based, and query and target molecules are compared by mapping query fragments onto target fragments in all ways consistent with the underlying chemical connectivity. This has proven to greatly extend the selective power of the method, while maintaining the ease of use and scaffold-hopping capabilities that characterized the original implementation. In this work, we provide a full conceptual description of the next generation Shape Signatures, and we underline the advantages of the method by discussing its practical applications to ligand-based virtual screening. The new approach can also be applied in receptor-based mode, where protein-binding sites (partitioned into subsites) can be matched against the new fragment-based Shape Signatures descriptors of library compounds.
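The fragment-mapping step described in the abstract can be illustrated with a minimal sketch. It assumes each fragment already has a histogram descriptor and that fragment connectivity is given as pairs of bonded fragment indices. The function names, the L1 score, and the brute-force enumeration are illustrative assumptions, not the published Shape Signatures implementation.

```python
from itertools import permutations
from math import inf
from typing import List, Optional, Sequence, Tuple

Histogram = Sequence[float]
Bond = Tuple[int, int]  # indices of two directly connected fragments


def l1(a: Histogram, b: Histogram) -> float:
    """L1 distance between two histograms (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_fragment_mapping(
    query_sigs: List[Histogram],
    query_bonds: List[Bond],
    target_sigs: List[Histogram],
    target_bonds: List[Bond],
) -> Tuple[Optional[Tuple[int, ...]], float]:
    """Try every injective query->target fragment assignment that keeps
    bonded query fragments bonded in the target; return the lowest-scoring one.
    Hypothetical helper: brute force, only sensible for a handful of fragments.
    """
    target_edges = {frozenset(bond) for bond in target_bonds}
    best_map, best_score = None, inf
    for perm in permutations(range(len(target_sigs)), len(query_sigs)):
        # perm[i] is the target fragment assigned to query fragment i
        consistent = all(
            frozenset((perm[i], perm[j])) in target_edges for i, j in query_bonds
        )
        if not consistent:
            continue
        score = sum(l1(q, target_sigs[t]) for q, t in zip(query_sigs, perm))
        if score < best_score:
            best_map, best_score = perm, score
    return best_map, best_score
```

A query whose fragment graph cannot be embedded in the target's fragment graph returns `(None, inf)`, so such a target is simply not a match under this sketch.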
Project description:We previously described "LeadOp," a structure-based fragment-hopping method for lead optimization that uses a pre-docked fragment database. Conceptually, it replaces "bad" fragments of a ligand with "good" fragments while leaving the core of the ligand intact, thus improving the compound's activity. LeadOp was shown to optimize the query molecules and to systematically develop improved analogs for each of our example systems. However, even when compounds are designed from common building blocks, their synthesis remains a challenge. In this work, "LeadOp+R" was developed based on 198 classical chemical reactions to take synthetic accessibility into account while optimizing leads. LeadOp+R first lets the user identify a preserved space, defined by the volume occupied by the fragment of the query molecule that is to be preserved. LeadOp+R then searches for building blocks that occupy the same preserved space to serve as initial reactants, and grows molecules toward the preferred receptor-ligand interactions according to reaction rules from its reaction database. Multiple conformers of each intermediate product are considered and evaluated at each step. The conformer with the best group efficiency score is selected as the starting conformer for adding the next building block, and this continues until the program has finished optimizing all selected receptor-ligand interactions. The LeadOp+R method was tested on two biomolecular systems: Tie-2 kinase and human 5-lipoxygenase. It was able to optimize the query molecules and systematically develop improved analogs for each example system. The synthetic routes suggested by LeadOp+R for the proposed compounds were the same as the published routes devised by synthetic organic chemists.
Project description:The discovery of novel ligand chemotypes makes it possible to explore uncharted regions of chemical space, thereby potentially improving the synthetic accessibility, potency, and drug-likeness of molecules. Here, we demonstrate the scaffold-hopping ability of the new Weighted Holistic Atom Localization and Entity Shape (WHALES) molecular descriptors compared to seven state-of-the-art molecular representations on 30,000 compounds and 182 biological targets. In a prospective application, we apply WHALES to the discovery of novel retinoid X receptor (RXR) modulators. WHALES descriptors identified four agonists with innovative molecular scaffolds, populating uncharted regions of the chemical space. One of the agonists, possessing a rare non-acidic chemotype, showed high selectivity across 12 nuclear receptors and efficacy comparable to that of bexarotene in inducing ATP-binding cassette transporter A1, angiopoietin-like protein 4 and apolipoprotein E. The outcome of this research supports WHALES as an innovative tool to explore novel regions of the chemical space and to detect novel bioactive chemotypes by straightforward similarity searching.
Project description:Tyrosinase is the key enzyme involved in the human pigmentation process, as well as in the undesired browning of fruits and vegetables. Compounds that inhibit tyrosinase catalytic activity are an important class of cosmetic and dermatological agents with high potential as depigmentation agents for skin lightening. The multi-step protocol employed for the identification of novel tyrosinase inhibitors incorporated the Shape Signatures computational algorithm for rapid screening of chemical libraries. This algorithm converts the size and shape of a molecule, as well as its surface charge distribution and other bio-relevant properties, into compact histograms (signatures) that lend themselves to rapid comparison between molecules. Shape Signatures excels at scaffold hopping across different chemical families, which enables the identification of new actives whose molecular structure is distinct from that of known actives. Using this approach, we identified a novel class of depigmentation agents that showed promise for skin-lightening product development.
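The histogram comparison mentioned above can be sketched as follows. This is a minimal illustration that assumes a signature is a normalized histogram of ray-segment lengths and that two signatures are compared with an L1 distance. The bin count, the length cutoff, and both function names are assumptions made for the example, not parameters taken from the Shape Signatures software.

```python
from typing import List, Sequence


def signature(
    segment_lengths: Sequence[float], n_bins: int = 8, max_length: float = 16.0
) -> List[float]:
    """Bin ray-segment lengths into a histogram that sums to 1.

    n_bins and max_length are illustrative defaults (assumed units: angstroms).
    """
    counts = [0] * n_bins
    width = max_length / n_bins
    for length in segment_lengths:
        index = min(int(length / width), n_bins - 1)  # overflow goes to last bin
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """0.0 for identical histograms, 2.0 for normalized histograms with no overlap."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

Because each signature is a short fixed-length vector, ranking a library against a query reduces to one such distance per compound, which is what makes this kind of comparison fast.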
Project description:We have applied simulated annealing of chemical potential (SACP) to a diverse set of ~150 very small molecules to provide insights into new interactions in the binding pocket of human renin, a historically difficult target for which to find low molecular weight (MW) inhibitors with good bioavailability. In one of its many uses in drug discovery, SACP provides an efficient, thermodynamically principled method of ranking chemotype replacements for scaffold hopping and of manipulating physicochemical characteristics for drug development. We introduce the use of Constrained Fragment Analysis (CFA) to construct and analyze ligands built by linking those fragments with predicted high affinity. This technique addresses the issue of effectively linking fragments together and provides a predictive mechanism to rank-order prospective inhibitors for synthesis. The application of these techniques to the identification of novel inhibitors of human renin is described. Synthesis of a limited set of designed compounds provided potent, low MW analogs (IC50s<100nM) with good oral bioavailability (F>20-58%).
Project description:The maximum common property similarity (MCPhd) method is presented as a new descriptor-based approach to determine the similarity between two chemical compounds or molecular graphs. This method uses the concept of maximum common property, which derives from the concept of maximum common substructure, and is based on the electrotopographic state index for atoms. A new algorithm to quantify the similarity values of chemical structures based on the presented maximum common property concept is also developed in this paper. To verify the validity of this approach, the similarity of a sample of compounds with antimalarial activity is calculated and compared with the results obtained by four different similarity methods: the small molecule subgraph detector (SMSD), molecular fingerprints (OBabel_FP2), ISIDA descriptors and shape-feature similarity (SHAFTS). The results obtained by the MCPhd method differ significantly from those obtained by the compared methods, improving the quantification of the similarity. A major advantage of the proposed method is that it helps to relate the analogy or proximity between the physicochemical properties of the compared molecular fragments or subgraphs to the biological response or biological activity. In this new approach, more than one property can potentially be used. The method can be considered a hybrid procedure because it combines the descriptor and fragment approaches.
Project description:Motivation: Using molecular similarity to discover bioactive small molecules with novel chemical scaffolds can be computationally demanding. We describe Ultra-fast Shape Recognition with Atom Types (UFSRAT), an efficient algorithm that considers both the 3D distribution (shape) and electrostatics of atoms to score and retrieve molecules capable of making similar interactions to those of the supplied query. Results: Computational optimization and pre-calculation of molecular descriptors enables a query molecule to be run against a database containing 3.8 million molecules and results returned in under 10 seconds on modest hardware. UFSRAT has been used in pipelines to identify bioactive molecules for two clinically relevant drug targets; FK506-Binding Protein 12 and 11β-hydroxysteroid dehydrogenase type 1. In the case of FK506-Binding Protein 12, UFSRAT was used as the first step in a structure-based virtual screening pipeline, yielding many actives, of which the most active shows a KD, app of 281 µM and contains a substructure present in the query compound. Success was also achieved running solely the UFSRAT technique to identify new actives for 11β-hydroxysteroid dehydrogenase type 1, for which the most active displays an IC50 of 67 nM in a cell based assay and contains a substructure radically different to the query. This demonstrates the valuable ability of the UFSRAT algorithm to perform scaffold hops. Availability and implementation: A web-based implementation of the algorithm is freely available at http://opus.bch.ed.ac.uk/ufsrat/.
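To make the idea of alignment-free shape descriptors concrete, here is a minimal sketch of the basic Ultrafast Shape Recognition (USR) descriptor on which atom-typed variants build. It covers shape only: the atom-type and electrostatic terms that UFSRAT adds are not reproduced here. The function names and the choice of third moment (signed cube root of the third central moment) are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _moments(dists: Sequence[float]) -> List[float]:
    """Mean, standard deviation, and signed cube root of the third central moment."""
    n = len(dists)
    mean = sum(dists) / n
    var = sum((d - mean) ** 2 for d in dists) / n
    third = sum((d - mean) ** 3 for d in dists) / n
    return [mean, math.sqrt(var), math.copysign(abs(third) ** (1.0 / 3.0), third)]


def usr_descriptor(coords: Sequence[Point]) -> List[float]:
    """12-number shape descriptor: 3 moments for each of 4 reference points."""
    n = len(coords)
    centroid = tuple(sum(p[k] for p in coords) / n for k in range(3))
    d_centroid = [math.dist(p, centroid) for p in coords]
    closest = coords[d_centroid.index(min(d_centroid))]   # atom closest to centroid
    farthest = coords[d_centroid.index(max(d_centroid))]  # atom farthest from centroid
    d_farthest = [math.dist(p, farthest) for p in coords]
    far_from_far = coords[d_farthest.index(max(d_farthest))]
    descriptor: List[float] = []
    for ref in (centroid, closest, farthest, far_from_far):
        descriptor.extend(_moments([math.dist(p, ref) for p in coords]))
    return descriptor


def usr_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Score in (0, 1]; 1.0 means identical descriptors."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))
```

Because the descriptor depends only on interatomic distances, it is unchanged by translation and rotation, so descriptors can be pre-calculated once per library compound and compared without any alignment step.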
Project description:The goals of the present study were to apply a generalized regression model and support vector machine (SVM) models with Shape Signatures descriptors to the domain of blood-brain barrier (BBB) modeling. The Shape Signatures method is a novel computational tool that was used to generate molecular descriptors, which were then used with the SVM classification technique on various BBB datasets. For comparison purposes, we created a generalized linear regression model with eight MOE descriptors, and these same descriptors were also used to create SVM models. The generalized regression model was tested on 100 molecules not in the model and resulted in a correlation r2 = 0.65. SVM models with MOE descriptors were superior to regression models, while Shape Signatures SVM models were comparable to or better than those with MOE descriptors. The best 2D shape signature models had 10-fold cross-validation prediction accuracy between 80-83% and leave-20%-out testing prediction accuracy between 80-82%, and correctly predicted 84% of BBB+ compounds (n = 95) in an external database of drugs. Our data indicate that Shape Signatures descriptors can be used with SVM and that these models may have utility for predicting blood-brain barrier permeation in drug discovery.
Project description:Molecular shape is an important concept in drug design and virtual screening. Shape similarity typically uses either alignment methods, which dynamically optimize molecular poses with respect to the query molecular shape, or feature vector methods, which are computationally less demanding but less accurate. The computational cost of alignment can be reduced by pre-aligning shapes, as is done with the Volumetric-Aligned Molecular Shapes (VAMS) method. Here, we introduce and evaluate fragment oriented molecular shapes (FOMS), where shapes are aligned based on molecular fragments. FOMS enables the use of shape constraints, a novel method for precisely specifying molecular shape queries that provides the ability to perform partial shape matching and supports search algorithms that function on an interactive time scale. When evaluated using the challenging Maximum Unbiased Validation dataset, shape constraints were able to extract significantly enriched subsets of compounds for the majority of targets, and FOMS matched or exceeded the performance of both VAMS and an optimizing alignment method of shape similarity search.
Project description:The wwLigCSRre web server performs ligand-based screening using a 3D molecular similarity engine. Its aim is to provide a versatile online facility to assist the exploration of the chemical similarity of families of compounds, or to propose scaffold hops from a query compound. The service allows the user to screen several chemically diversified focused banks, such as Kinase-, CNS-, GPCR-, Ion-channel-, Antibacterial-, Anticancer- and Analgesic-focused libraries. The server also provides the possibility to screen the DrugBank and DSSTOX/Carcinogenic compounds databases. User banks can also be downloaded. The 3D similarity search combines both geometrical (3D) and physicochemical information. Starting from one 3D ligand molecule as query, the screening of such databases can yield hits with previously unexplored compound scaffolds or help to optimize previously identified hit molecules in a SAR (structure-activity relationship) project. wwLigCSRre can be accessed at http://bioserv.rpbs.univ-paris-diderot.fr/wwLigCSRre.html.
Project description:Background: Mammalian target of rapamycin (mTOR) is a central controller of cell growth, proliferation, metabolism, and angiogenesis. Thus, there is a great deal of interest in developing clinical drugs that target mTOR. In this paper, in silico models based on multiple scaffolds were developed to classify compounds as mTOR inhibitors or non-inhibitors. Methods: First, 1,264 diverse compounds were collected and categorized as mTOR inhibitors and non-inhibitors. Two methods, recursive partitioning (RP) and naïve Bayesian (NB), were used to build combinatorial classification models of mTOR inhibitors versus non-inhibitors using physicochemical descriptors, fingerprints, and atom center fragments (ACFs). Results: A total of 253 models were constructed, and the overall predictive accuracies of the best models were more than 90% for both the training set of 964 and the external test set of 300 diverse compounds. The scaffold-hopping abilities of the best models were successfully evaluated by predicting 37 recently published mTOR inhibitors. Compared with the best RP and Bayesian models, the classifier based on ACFs and Bayesian shows comparable or slightly better performance and scaffold-hopping ability. A web server was developed based on the ACFs and Bayesian method (http://rcdd.sysu.edu.cn/mtor/). This web server can be used online to predict whether a compound is an mTOR inhibitor or non-inhibitor. Conclusion: In silico models were constructed to predict mTOR inhibitors using recursive partitioning and naïve Bayesian methods, and a web server (mTOR Predictor) was also developed based on the best model results. Compound prediction or virtual screening can be carried out through our web server. Moreover, the favorable and unfavorable fragments for mTOR inhibitors obtained from Bayesian classifiers will be helpful for lead optimization or the design of new mTOR inhibitors.
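How a naïve Bayesian classifier over fragments can both label compounds and expose favorable or unfavorable fragments is sketched below. This is a generic Bernoulli naïve Bayes with add-one smoothing over fragment presence, offered as an assumption-laden illustration rather than the authors' ACF model. The function names are hypothetical, and fragment labels such as `frag_A` are placeholders with no chemical meaning.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Sample = Tuple[Set[str], bool]  # (fragments present in the compound, is_inhibitor)


def train_nb(samples: List[Sample]) -> Dict:
    """Estimate smoothed class priors and per-fragment presence probabilities."""
    vocab: Set[str] = set().union(*(frags for frags, _ in samples))
    n = {True: 0, False: 0}
    counts = {True: Counter(), False: Counter()}
    for frags, label in samples:
        n[label] += 1
        counts[label].update(frags)
    model: Dict = {"vocab": vocab, "prior": {}, "p": {}}
    for label in (True, False):
        model["prior"][label] = (n[label] + 1) / (len(samples) + 2)
        model["p"][label] = {
            f: (counts[label][f] + 1) / (n[label] + 2) for f in vocab
        }
    return model


def fragment_weight(model: Dict, fragment: str) -> float:
    """Log-odds contribution of a fragment being present: >0 favorable, <0 unfavorable."""
    return math.log(model["p"][True][fragment] / model["p"][False][fragment])


def log_odds(model: Dict, frags: Set[str]) -> float:
    """Positive values favor the inhibitor class, negative the non-inhibitor class."""
    score = math.log(model["prior"][True] / model["prior"][False])
    for f in model["vocab"]:
        p_true, p_false = model["p"][True][f], model["p"][False][f]
        if f in frags:
            score += math.log(p_true / p_false)
        else:
            score += math.log((1 - p_true) / (1 - p_false))
    return score
```

Sorting the vocabulary by `fragment_weight` gives a ranked list of fragments, which is one simple way a fragment-level readout of the kind described in the conclusion could be produced.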