Project description:The field of protein design has grown enormously in the past few decades. In this review we discuss the minimalist approach to the design of artificial enzymes, in which protein sequences are created with the minimum number of elements needed for folding and function. This method relies on identifying starting points in catalytically inert scaffolds for active site installation. The progress of the field from the original helical assemblies of the 1980s to the more complex structures of the present day is discussed, highlighting the variety of catalytic reactions that have been achieved using these methods. We outline the strengths and weaknesses of the minimalist approaches, describe representative design cases and put them in the general context of the de novo design of proteins.
Project description:The Rosetta de novo enzyme design protocol has been used to design enzyme catalysts for a variety of chemical reactions, and in principle can be applied to any arbitrary chemical reaction of interest. The process has four stages: 1) choice of a catalytic mechanism and corresponding minimal model active site, 2) identification of sites in a set of scaffold proteins where this minimal active site can be realized, 3) optimization of the identities of the surrounding residues for stabilizing interactions with the transition state and primary catalytic residues, and 4) evaluation and ranking of the resulting designed sequences. Stages two through four of this process can be carried out with the Rosetta package, while stage one needs to be done externally. Here, we demonstrate in detail how to carry out the Rosetta enzyme design protocol from start to end, using the triosephosphate isomerase reaction as an illustration.
Project description:The recent COVID-19 crisis has provided important lessons for academia and industry regarding digital reorganization. Among the fascinating lessons from these times is the huge potential of data analytics and artificial intelligence. The crisis exponentially accelerated the adoption of analytics and artificial intelligence, and this momentum is predicted to continue into the 2020s and beyond. Drug development is a costly and time-consuming business, and only a minority of approved drugs generate returns exceeding the research and development costs. As a result, there is a huge drive to make drug discovery cheaper and faster. With modern algorithms and hardware, it is not too surprising that the new technologies of artificial intelligence and other computational simulation tools can help drug developers. In only two years of COVID-19 research, many novel molecules have been designed or identified using artificial intelligence methods, with astonishing results in terms of time and effectiveness. This paper reviews the most significant research on artificial intelligence in de novo drug design for COVID-19 pharmaceutical research.
Project description:Generative artificial intelligence offers a fresh view on molecular design. We present the first-time prospective application of a deep learning model for designing new druglike compounds with desired activities. For this purpose, we trained a recurrent neural network to capture the constitution of a large set of known bioactive compounds represented as SMILES strings. By transfer learning, this general model was fine-tuned on recognizing retinoid X and peroxisome proliferator-activated receptor agonists. We synthesized five top-ranking compounds designed by the generative model. Four of the compounds revealed nanomolar to low-micromolar receptor modulatory activity in cell-based assays. Apparently, the computational model intrinsically captured relevant chemical and biological knowledge without the need for explicit rules. The results of this study advocate generative artificial intelligence for prospective de novo molecular design, and demonstrate the potential of these methods for future medicinal chemistry.
Project description:An extensive search for active therapeutic agents against SARS-CoV-2 is being conducted across the globe. While computational docking simulations remain a popular method of choice for in silico ligand design and high-throughput screening of therapeutic agents, they are severely limited in the discovery of new candidate ligands owing to the high computational cost and vast chemical space. Here, we present a de novo molecular design strategy that leverages artificial intelligence (AI) to discover new therapeutic agents against SARS-CoV-2. A Monte Carlo tree search algorithm, combined with a multitask neural network surrogate model for expensive docking simulations and recurrent neural networks for rollouts, is used in an iterative search and retrain strategy. Using Vina scores as the target objective to measure binding to either the isolated spike protein (S-protein) at its host receptor region or to the S-protein/angiotensin converting enzyme 2 receptor interface, we generate several (∼100s) new therapeutic agents that outperform Food and Drug Administration (FDA) molecules (∼1000s) and non-FDA molecules (∼1 million). Our AI strategy is broadly applicable for accelerated design and discovery of chemical molecules with any user-desired functionality.
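The search loop described in this abstract can be illustrated with a minimal, self-contained sketch of Monte Carlo tree search using the UCT selection rule. Everything concrete here is an assumption made for illustration: the three-letter alphabet, the fixed candidate length, `surrogate_score` (a toy stand-in for the neural network surrogate of docking scores) and `rollout` (a random stand-in for the recurrent-network rollouts). The retraining step of the iterative strategy is not shown, and the toy reward is higher-is-better, unlike a docking energy.

```python
import math
import random

ALPHABET = "CNO"   # hypothetical toy symbols; not a real chemical encoding
MAX_LEN = 6        # hypothetical fixed length of a finished candidate

def surrogate_score(candidate):
    """Toy stand-in for a surrogate model: fraction of N/O symbols, in [0, 1]."""
    return sum(ch in "NO" for ch in candidate) / MAX_LEN

def rollout(prefix, rng):
    """Toy stand-in for a rollout network: complete the prefix at random."""
    while len(prefix) < MAX_LEN:
        prefix += rng.choice(ALPHABET)
    return prefix

class Node:
    def __init__(self, prefix, parent=None):
        self.prefix = prefix
        self.parent = parent
        self.children = {}   # symbol -> Node
        self.visits = 0
        self.total = 0.0     # sum of rewards backed up through this node

    def is_terminal(self):
        return len(self.prefix) >= MAX_LEN

    def fully_expanded(self):
        return len(self.children) == len(ALPHABET)

def uct_child(node, c=1.4):
    """Select the child with the best mean reward plus exploration bonus."""
    return max(
        node.children.values(),
        key=lambda ch: ch.total / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )

def search(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node("")
    best = ("", -1.0)
    for _ in range(iterations):
        node = root
        # 1) Selection: descend while every child has already been tried.
        while not node.is_terminal() and node.fully_expanded():
            node = uct_child(node)
        # 2) Expansion: add one untried child, unless the node is terminal.
        if not node.is_terminal():
            symbol = rng.choice([a for a in ALPHABET if a not in node.children])
            child = Node(node.prefix + symbol, parent=node)
            node.children[symbol] = child
            node = child
        # 3) Simulation: complete the candidate and score it with the surrogate.
        candidate = rollout(node.prefix, rng)
        reward = surrogate_score(candidate)
        if reward > best[1]:
            best = (candidate, reward)
        # 4) Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best

if __name__ == "__main__":
    print(search())
```

In a realistic setting the reward would come from a trained surrogate, and the best candidates would be rescored with the expensive docking simulation before the surrogate is retrained on the new data.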
Project description:Automating the molecular design-make-test-analyze cycle accelerates hit and lead finding for drug discovery. Using deep learning for molecular design and a microfluidics platform for on-chip chemical synthesis, liver X receptor (LXR) agonists were generated from scratch. The computational pipeline was tuned to explore the chemical space of known LXRα agonists and generate novel molecular candidates. To ensure compatibility with automated on-chip synthesis, the chemical space was confined to the virtual products obtainable from 17 one-step reactions. Twenty-five de novo designs were successfully synthesized in flow. In vitro screening of the crude reaction products revealed 17 (68%) hits, with up to 60-fold LXR activation. The batch resynthesis, purification, and retesting of 14 of these compounds confirmed that 12 of them were potent LXR agonists. These results support the suitability of the proposed design-make-test-analyze framework as a blueprint for automated drug design with artificial intelligence and miniaturized bench-top synthesis.
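The success rates quoted in this abstract follow from simple ratios. The short sketch below recomputes them from the counts given in the text; the helper name `hit_rate` is introduced here only for illustration.

```python
def hit_rate(hits, tested):
    """Return the percentage of tested compounds that were hits."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * hits / tested

# Crude-product screen: 17 hits out of 25 synthesized designs.
print(f"{hit_rate(17, 25):.0f}%")  # 68%
# Resynthesis and retest: 12 confirmed agonists out of 14 compounds.
print(f"{hit_rate(12, 14):.0f}%")  # 86%
```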
Project description:Natural metalloproteins perform many functions - ranging from sensing to electron transfer and catalysis - in which the position and properties of each ligand and metal are dictated by protein structure. De novo protein design aims to define an amino acid sequence that encodes a specific structure and function, providing a critical test of the hypothetical inner workings of (metallo)proteins. To date, de novo metalloproteins have used simple, symmetric tertiary structures - uncomplicated by the large size and evolutionary marks of natural proteins - to interrogate structure-function hypotheses. In this Review, we discuss de novo design applications, such as proteins that induce complex, increasingly asymmetric ligand geometries to achieve function, as well as the use of more canonical ligand geometries to achieve stability. De novo design has been used to explore how proteins fine-tune redox potentials and catalyse both oxidative and hydrolytic reactions. With an increased understanding of structure-function relationships, functional proteins including O2-dependent oxidases, fast hydrolases, and multi-proton/multi-electron reductases have been created. In addition, proteins can now be designed using xeno-biological metals or cofactors and principles from inorganic chemistry to derive new-to-nature functions. These results and the advances in computational protein design suggest a bright future for the de novo design of diverse, functional metalloproteins.
Project description:We report the cocrystal structures of a computationally designed and experimentally optimized retro-aldol enzyme with covalently bound substrate analogs. The structure with a covalently bound mechanism-based inhibitor is similar to, but not identical with, the design model, with an RMSD of 1.4 Å over active-site residues and equivalent substrate atoms. As in the design model, the binding pocket orients the substrate through hydrophobic interactions with the naphthyl moiety such that the oxygen atoms analogous to the carbinolamine and β-hydroxyl oxygens are positioned near a network of bound waters. However, there are differences between the design model and the structure: the orientation of the naphthyl group and the conformation of the catalytic lysine are slightly different; the bound water network appears to be more extensive; and the bound substrate analog exhibits more conformational heterogeneity than typical native enzyme-inhibitor complexes. Alanine scanning of the active-site residues shows that both the catalytic lysine and the residues around the binding pocket for the substrate naphthyl group make critical contributions to catalysis. Mutating the set of water-coordinating residues also significantly reduces catalytic activity. The crystal structure of the enzyme with a smaller substrate analog that lacks the naphthyl ring shows the catalytic lysine to be more flexible than in the naphthyl-substrate complex; increased preorganization of the active site would likely improve catalysis. The covalently bound complex structures and mutagenesis data highlight the strengths and weaknesses of the de novo enzyme design strategy.
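The RMSD figure quoted in this abstract is a root-mean-square deviation over matched atoms. A minimal sketch of that calculation follows, assuming the two structures are already superimposed and their atoms are listed in the same order; the function name `rmsd` and the coordinates are hypothetical and do not come from the study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the structures are already superimposed; no alignment is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates in Å: every atom is shifted by 1 Å along z.
design = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(rmsd(design, crystal))  # 1.0
```

In practice the two coordinate sets are usually superimposed first, for example with a least-squares fit, so that the reported value reflects internal differences rather than overall placement.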
Project description:Copper-containing metalloenzymes constitute a major class of proteins that catalyze a myriad of reactions in nature. Inspired by the structural and functional characteristics of this unique class of metalloenzymes, we report the conception, design, characterization, and functional studies of a de novo artificial copper peptide (ArCuP) within a trimeric self-assembled polypeptide scaffold that activates and reduces peroxide. Using a first principles approach, the ArCuP was designed to coordinate one Cu via three His residues introduced at an a site of the peptide scaffold. X-ray crystallographic, UV-vis and EPR data demonstrate that Cu binds via the Nε atoms of His, forming a T2Cu environment. Upon reaction with hydrogen peroxide, a putative copper-hydroperoxo species is formed, and a reductive priming step accelerates the rate of its formation and reduction. Mass spectrometry was used to identify specific residues undergoing oxidative modification, which showed His oxidation only in the reduced state. The redox behavior of the ArCuP was elucidated by protein film voltammetry. Detailed characterization of the electrocatalytic behavior of the ArCuP led us to determine the catalytic parameters (KM, kcat), which established the peroxidase activity of the ArCuP. Combined spectroscopic and electrochemical data showed that the reactivity is pH-dependent, with an optimum at pH 7.5.
Project description:The Escherichia coli disulfide isomerase DsbC is a V-shaped homodimer, with each monomer comprising a dimerization region that forms part of a putative peptide-binding pocket and a thioredoxin catalytic domain. Disulfide isomerases from prokaryotes and eukaryotes exhibit little sequence homology but display very similar structural organization, with two thioredoxin domains facing each other on top of the dimerization/peptide-binding region. To aid the understanding of the mechanistic significance of thioredoxin domain dimerization and of the peptide-binding cleft of DsbC, we constructed a series of protein chimeras comprising unrelated protein dimerization domains fused to thioredoxin superfamily enzymes. Chimeras consisting of the dimerization domain and the alpha-helical linker of the bacterial proline cis/trans isomerase FkpA and the periplasmic oxidase DsbA gave rise to enzymes that catalyzed the folding of multidisulfide substrate proteins in vivo with comparable efficiency to E. coli DsbC. In addition, expression of FkpA-DsbAs conferred modest resistance to CuCl2, a phenotype that depends on disulfide bond isomerization. Selection for resistance to elevated CuCl2 concentrations led to the isolation of FkpA-DsbA mutants containing a single amino acid substitution that changed the active site of the DsbA domain from CPHC into CPYC, increasing the similarity to the DsbC active site (CGYC). Unlike DsbC, which is resistant to oxidation by DsbB-DsbA and does not normally catalyze disulfide bond formation under physiological conditions, the FkpA-DsbA chimeras functioned both as oxidases and isomerases. The engineering of these efficient artificial isomerases delineates the key features of catalysis of disulfide bond isomerization and enhances our understanding of its evolution.