Automated ligand fitting by core-fragment fitting and extension into density.
ABSTRACT: A procedure is presented for fitting ligands to electron-density maps by first fitting a core fragment of the ligand to density and then extending the remainder of the ligand into density. The approach was tested by fitting 9327 ligands over a wide range of resolutions (most are in the range 0.8-4.8 Å) from the Protein Data Bank (PDB) into (Fo - Fc)exp(i phi(c)) difference density calculated using entries from the PDB without these ligands. The procedure was able to place 58% of these 9327 ligands within 2 Å (r.m.s.d.) of the coordinates of the atoms in the original PDB entry for that ligand. The success of the fitting procedure was relatively insensitive to the size of the ligand in the range 10-100 non-H atoms and was only moderately sensitive to resolution, with the percentage of ligands placed near the coordinates of the original PDB entry in the range 58-73% over all resolution ranges tested.
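The success criterion above (a fitted ligand within 2 Å r.m.s.d. of the deposited coordinates) can be illustrated with a minimal sketch. The function names `rmsd` and `is_placed` are hypothetical, and the sketch assumes that both coordinate lists use the same atom ordering and the same crystal frame, so no superposition is applied and symmetry-equivalent atoms are not considered. It is not the scoring code used in the study.

```python
import math


def rmsd(fitted, reference):
    """R.m.s.d. (in Å) between two equally ordered lists of (x, y, z) atoms."""
    if len(fitted) != len(reference) or not fitted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(fitted, reference):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(fitted))


def is_placed(fitted, reference, cutoff=2.0):
    """True if the fitted ligand lies within `cutoff` Å r.m.s.d. of the reference."""
    return rmsd(fitted, reference) <= cutoff
```

For example, a two-atom ligand displaced rigidly by 1 Å along x has an r.m.s.d. of 1 Å and would count as placed under the 2 Å cutoff.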
Project description:BACKGROUND: Many Protein Data Bank (PDB) users assume that the deposited structural models are of high quality but forget that these models are derived from the interpretation of experimental data. The accuracy of atom coordinates is not homogeneous between models or throughout the same model. To avoid basing a research project on a flawed model, we present a tool for assessing the quality of ligands and binding sites in crystallographic models from the PDB. RESULTS: The Validation HElper for LIgands and Binding Sites (VHELIBS) is software that aims to ease the validation of binding site and ligand coordinates for non-crystallographers (i.e., users with little or no crystallography knowledge). Using a convenient graphical user interface, it allows one to check how ligand and binding site coordinates fit to the electron density map. VHELIBS can use models from either the PDB or the PDB_REDO databank of re-refined and re-built crystallographic models. The user can specify threshold values for a series of properties related to the fit of coordinates to electron density (Real Space R, Real Space Correlation Coefficient and average occupancy are used by default). VHELIBS will automatically classify residues and ligands as Good, Dubious or Bad based on the specified limits. The user is also able to visually check the quality of the fit of residues and ligands to the electron density map and reclassify them if needed. CONCLUSIONS: VHELIBS allows inexperienced users to examine the binding site and the ligand coordinates in relation to the experimental data. This is an important step to evaluate models for their fitness for drug discovery purposes such as structure-based pharmacophore development and protein-ligand docking experiments.
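The threshold-based classification described above can be sketched as follows. This is only an illustration of the idea: the `Thresholds` class, the `classify` function, the numeric limits and the decision rule are all assumptions made for the example and are not taken from VHELIBS or its defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Placeholder limits chosen for illustration only (not VHELIBS defaults).
    rsr_good: float = 0.2        # Real Space R at or below this is "good"
    rsr_dubious: float = 0.4     # Real Space R above this is "bad"
    rscc_good: float = 0.9       # RSCC at or above this is "good"
    rscc_dubious: float = 0.8    # RSCC below this is "bad"
    occupancy_good: float = 1.0  # average occupancy needed for "good"


def classify(rsr, rscc, occupancy, t=Thresholds()):
    """Label a residue or ligand as Good, Dubious or Bad (assumed rule)."""
    if rsr > t.rsr_dubious or rscc < t.rscc_dubious:
        return "Bad"
    if rsr <= t.rsr_good and rscc >= t.rscc_good and occupancy >= t.occupancy_good:
        return "Good"
    return "Dubious"
```

Keeping the limits in a single immutable object mirrors the idea that the user specifies the thresholds once and the same rule is then applied to every residue and ligand.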
Project description:Crystal structures of protein-ligand complexes are often used to infer biology and inform structure-based drug discovery. Hence, it is important to build accurate, reliable models of ligands that give confidence in the interpretation of the respective protein-ligand complex. This paper discusses key stages in the ligand-fitting process, including ligand binding-site identification, ligand description and conformer generation, ligand fitting, refinement and subsequent validation. The CCP4 suite contains a number of software tools that facilitate this task: AceDRG for the creation of ligand descriptions and conformers, Lidia and JLigand for two-dimensional and three-dimensional ligand editing and visual analysis, Coot for density interpretation, ligand fitting, analysis and validation, and REFMAC5 for macromolecular refinement. In addition to recent advancements in automatic carbohydrate building in Coot (LO/Carb) and ligand-validation tools (FLEV), the release of the CCP4i2 GUI provides an integrated solution that streamlines the ligand-fitting workflow, seamlessly passing results from one program to the next. The ligand-fitting process is illustrated using instructive practical examples, including problematic cases such as post-translational modifications, highlighting the need for careful analysis and rigorous validation.
Project description:A semi-automated computational procedure to assist in the identification of bound ligands from unknown electron density has been developed. The atomic surface surrounding the density blob is compared to a library of three-dimensional ligand binding surfaces extracted from the Protein Data Bank (PDB). Ligands corresponding to surfaces that share physicochemical texture and geometric shape similarities are considered for assignment. The method is benchmarked against a set of well represented ligands from the PDB, for which we show that the correct ligand can be identified from the corresponding binding surface. Finally, we apply the method during the model building and refinement stages of structural genomics targets in which unknown density blobs were discovered. A semi-automated computational method is described which aims to assist crystallographers with assigning the identity of a ligand corresponding to unknown electron density. Using shape and physicochemical similarity assessments between the protein surface surrounding the density and a database of known ligand binding surfaces, a plausible list of candidate ligands is identified for consideration. The method is validated against frequently observed ligands from the Protein Data Bank and results are shown from its use in a high-throughput structural genomics pipeline.
Project description:Despite significant advances in resolution, the potential for cryo-electron microscopy (EM) to be used in determining the structures of protein-drug complexes remains unrealized. Determination of accurate structures and coordination of bound ligands necessitates simultaneous fitting of the models into the density envelopes, exhaustive sampling of the ligand geometries, and, most importantly, concomitant rearrangements in the side chains to optimize the binding energy changes. In this article, we present a flexible-fitting pipeline where molecular dynamics flexible fitting (MDFF) is used to refine structures of protein-ligand complexes from 3 to 5 Å electron density data. Enhanced sampling is employed to explore the binding pocket rearrangements. To provide a model that can accurately describe the conformational dynamics of the chemically diverse set of small-molecule drugs inside MDFF, we use QM/MM and neural-network potential (NNP)/MM models of protein-ligand complexes, where the ligand is represented using the QM or NNP model, and the protein is represented using established molecular mechanical force fields (e.g., CHARMM). This pipeline offers structures commensurate with or better than recently submitted high-resolution cryo-EM or X-ray models, even when given medium to low-resolution data as input. The use of the NNPs makes the algorithm more robust to the choice of search models, offering a radius of convergence of 6.5 Å for ligand structure determination. The quality of the predicted structures was also judged by density functional theory calculations of ligand strain energy. This strain potential energy is found to systematically decrease with better fitting to density and improved ligand coordination, indicating correct binding interactions. A computationally inexpensive protocol for computing strain energy is reported as part of the model analysis protocol that monitors both the ligand fit and the model quality.
Project description:A Protein Data Bank (PDB) file contains atomic data for both the protein and the ligand in a protein-ligand complex. A structure data file (SDF) contains the atoms, bonds, connectivity and coordinates of a ligand molecule. We describe PDBToSDF, a tool that separates the ligand data from a PDB file so that ligand properties such as molecular weight and the numbers of hydrogen-bond acceptors and hydrogen-bond donors can be calculated easily.
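The first step of such a tool, separating ligand atoms from the rest of a PDB file, can be sketched by reading the fixed-width HETATM records. The function name `extract_ligands` is hypothetical and this is not the PDBToSDF implementation; the sketch only groups ligand atoms, skips waters, and does not write SDF output, because bond information would still have to come from CONECT records or a chemistry toolkit.

```python
from collections import defaultdict


def extract_ligands(pdb_text, skip=("HOH",)):
    """Group HETATM atoms by (residue name, chain ID, residue number).

    Column positions follow the fixed-width PDB format:
    atom name 13-16, residue name 18-20, chain 22, residue number 23-26,
    x/y/z 31-54, element symbol 77-78 (1-based columns).
    """
    ligands = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue  # protein ATOM records and other record types are ignored
        res_name = line[17:20].strip()
        if res_name in skip:
            continue  # waters (and any other skipped residue names)
        key = (res_name, line[21], int(line[22:26]))
        # Fall back to the first letter of the atom name if the element column is absent.
        element = line[76:78].strip() or line[12:16].strip()[:1]
        ligands[key].append({
            "name": line[12:16].strip(),
            "element": element,
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return dict(ligands)
```

The returned dictionary holds one atom list per ligand instance, which is a convenient starting point for per-ligand property calculations.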
Project description:The efficiency of the ligand-building module of ARP/wARP version 6.1 has been assessed through extensive tests on a large variety of protein-ligand complexes from the PDB, as available from the Uppsala Electron Density Server. Ligand building in ARP/wARP involves two main steps: automatic identification of the location of the ligand and the actual construction of its atomic model. The first step is most successful for large ligands. The second step, ligand construction, is more powerful with X-ray data at high resolution and ligands of small to medium size. Both steps are successful for ligands with low to moderate atomic displacement parameters. The results highlight the strengths and weaknesses of both the method of ligand building and the large-scale validation procedure and help to identify means of further improvement.
Project description:The properties and reactivities of transition metal complexes are often discussed in terms of Ligand Field Theory (LFT), and with ab initio LFT a direct connection to quantum chemical wavefunctions was recently established. The Angular Overlap Model (AOM) is a widely used, ligand-specific parameterization scheme of the ligand field splitting that has, however, been restricted by the availability and resolution of experimental data. Using ab initio LFT, we present here a generalised, symmetry-independent and automated fitting procedure for AOM parameters that is even applicable to formally underdetermined or experimentally inaccessible systems. This method allows quantitative evaluations of assumptions commonly made in AOM applications, for example, transferability or the relative magnitudes of AOM parameters, and the response of the ligand field to structural or electronic changes. A two-dimensional spectrochemical series of tetrahedral halido metalates ([MᴵᴵX₄]²⁻, M = Mn-Cu) served as a case study. A previously unknown linear relationship between the halide ligands' chemical hardness and their AOM parameters was found. The impartial and automated procedure for identifying AOM parameters introduced here can be used to systematically improve our understanding of ligand-metal interactions in coordination complexes.
Project description:Cryo-electron microscopy (cryo-EM) can provide important structural information on large macromolecular assemblies in different conformational states. Recent years have seen an increase in structures deposited in the Protein Data Bank (PDB) by fitting a high-resolution structure into its low-resolution cryo-EM map. A commonly used protocol for accommodating the conformational changes between the X-ray structure and the cryo-EM map is rigid body fitting of individual domains. With the emergence of different flexible fitting approaches, there is a need to compare and revise these different fitting protocols. We have applied three diverse automated flexible fitting approaches to a protein dataset for which rigid domain fitting (RDF) models have been deposited in the PDB. In general, a consensus is observed in the conformations, which indicates a convergence of these theoretically different approaches to the most probable solution corresponding to the cryo-EM map. However, the results show that this convergence might not be observed for proteins with complex conformational changes or with missing densities in the cryo-EM map. In contrast, RDF structures deposited in the PDB can represent conformations that differ not only from the consensus obtained by flexible fitting but also from X-ray crystallography. Thus, this study emphasizes that a "consensus" achieved by the use of several automated flexible fitting approaches can provide a higher level of confidence in the modeled configurations. Following this protocol not only increases the confidence level of fitting, but also highlights protein regions with uncertain fitting. Hence, this protocol can lead to better interpretation of cryo-EM data.
Project description:A new procedure, AXES, is introduced for fitting small-angle X-ray scattering (SAXS) data to macromolecular structures and ensembles of structures. By using explicit water models to account for the effect of solvent, and by restricting the adjustable fitting parameters to those that dominate experimental uncertainties, including sample/buffer rescaling, detector dark current, and, within a narrow range, hydration layer density, superior fits between experimental high resolution structures and SAXS data are obtained. AXES results are found to be more discriminating than standard Crysol fitting of SAXS data when evaluating poorly or incorrectly modeled protein structures. AXES results for ensembles of structures previously generated for ubiquitin show improved fits over fitting of the individual members of these ensembles, indicating these ensembles capture the dynamic behavior of proteins in solution.
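Two of the adjustable parameters named above, an overall rescaling and a constant detector offset, can be illustrated with a weighted linear least-squares fit of a calculated profile to measured intensities. This is a deliberately reduced sketch and not the AXES algorithm: the function name `fit_scale_and_offset` is hypothetical, the model I_exp(q) ≈ c·I_calc(q) + b is an assumption made for the example, and hydration-layer density and explicit solvent are not treated.

```python
def fit_scale_and_offset(i_calc, i_exp, sigma):
    """Weighted least-squares fit of i_exp ≈ c * i_calc + b.

    Returns (c, b, reduced_chi2), with weights 1/sigma**2 and
    n - 2 degrees of freedom for the reduced chi-square.
    """
    n = len(i_calc)
    if n < 3 or len(i_exp) != n or len(sigma) != n:
        raise ValueError("need at least three points and equal-length inputs")
    w = [1.0 / (s * s) for s in sigma]
    sw = sum(w)
    sx = sum(wi * x for wi, x in zip(w, i_calc))
    sy = sum(wi * y for wi, y in zip(w, i_exp))
    sxx = sum(wi * x * x for wi, x in zip(w, i_calc))
    sxy = sum(wi * x * y for wi, x, y in zip(w, i_calc, i_exp))
    det = sw * sxx - sx * sx
    if det == 0.0:
        raise ValueError("calculated profile is constant; fit is degenerate")
    c = (sw * sxy - sx * sy) / det   # scale factor
    b = (sxx * sy - sx * sxy) / det  # constant offset
    chi2 = sum(wi * (y - c * x - b) ** 2 for wi, x, y in zip(w, i_calc, i_exp))
    return c, b, chi2 / (n - 2)
```

Restricting the fit to a small number of physically motivated parameters, as the abstract describes, keeps the reduced chi-square meaningful as a measure of how well a structural model explains the data.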
Project description:Advances in electron microscopy (EM) allow for structure determination of large biological assemblies at increasingly higher resolutions. A key step in this process is fitting multiple component structures into an EM-derived density map of their assembly. Here, we describe a web server for this task. The server takes as input a set of protein structures in the PDB format and an EM density map in the MRC format. The output is an ensemble of models ranked by their quality of fit to the density map. The models can be viewed online or downloaded from the website. The service is available at http://salilab.org/multifit/ and http://bioinfo3d.cs.tau.ac.il/.