Project description:A density functional theory (DFT) study of the 1H- and 13C-NMR chemical shifts of the geometric isomers of 18:2 ω-7 conjugated linoleic acid (CLA) and nine model compounds is presented, using five functionals and two basis sets. The results are compared with available experimental data from solution high resolution nuclear magnetic resonance (NMR). The experimental 1H chemical shifts exhibit highly diagnostic resonances due to the olefinic protons of the conjugated double bonds. The "inside" olefinic protons of the conjugated double bonds are more deshielded than the "outside" protons. Furthermore, in the cis/trans isomers, the signals of the cis bonds are more deshielded than those of the trans bonds. These regularities of the experimental 1H chemical shifts of the olefinic protons of the conjugated double bonds are reproduced very accurately for the lowest energy DFT optimized single conformer, for all functionals and basis sets used. The other low energy conformers have negligible effects on the computational 1H-NMR chemical shifts. We conclude that proton NMR chemical shifts are more discriminating than carbon shifts, and that DFT calculations can provide a valuable tool for (i) the accurate prediction of 1H-NMR chemical shifts even with less demanding functionals and basis sets; (ii) the unequivocal identification of geometric isomerism of CLAs that occur in nature; and (iii) the derivation of high resolution structures in solution.
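The observation that the lowest energy conformer dominates the computed shifts can be illustrated with a Boltzmann-weighted average over conformers. The following is a minimal sketch, assuming relative energies in kcal/mol and one computed shift per conformer in ppm; the function names and all numerical values are hypothetical illustrations, not data from the study.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def boltzmann_weights(rel_energies, temperature=298.15):
    """Return normalized Boltzmann populations for relative energies (kcal/mol)."""
    factors = [math.exp(-e / (R_KCAL * temperature)) for e in rel_energies]
    total = sum(factors)
    return [f / total for f in factors]

def weighted_shift(shifts, weights):
    """Return the population-weighted chemical shift (ppm)."""
    return sum(s * w for s, w in zip(shifts, weights))

# Hypothetical conformer energies (kcal/mol) and shifts (ppm) for one olefinic proton.
rel_energies = [0.0, 1.5, 2.5]
shifts = [6.30, 6.22, 6.40]

weights = boltzmann_weights(rel_energies)
averaged = weighted_shift(shifts, weights)
print("populations:", [round(w, 3) for w in weights])
print("single conformer:", shifts[0], "weighted:", round(averaged, 3))
```

With energy gaps of this size, the higher energy conformers carry small populations, so the weighted shift stays close to the single-conformer value.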
Project description:Computer prediction of NMR chemical shifts plays an increasingly important role in molecular structure assignment and elucidation for organic molecule studies. Density functional theory (DFT) and the gauge-including atomic orbital (GIAO) method have established a framework to predict NMR chemical shifts, but often at significant computational expense and with limited prediction accuracy. Recent advancements in deep learning methods, especially graph neural networks (GNNs), have shown promise in improving the accuracy of predicting experimental chemical shifts, either by using 2D molecular topological features or 3D conformational representations. This study presents CSTShift, a new 3D GNN model for predicting 1H and 13C chemical shifts that combines atomic features with DFT-calculated shielding tensor descriptors, capturing both isotropic and anisotropic shielding effects. Utilizing the NMRShiftDB2 data set and conducting DFT optimization and GIAO calculations at the B3LYP/6-31G(d) level, we prepared the NMRShiftDB2-DFT data set of high-quality 3D structures and shielding tensors with corresponding experimentally measured 1H and 13C chemical shifts. The developed CSTShift models achieve state-of-the-art prediction performance on both the NMRShiftDB2-DFT test data set and the external CHESHIRE data set. Further case studies on identifying correct structures from two groups of constitutional isomers show its capability for structure assignment and elucidation. The source code and data are accessible at https://yzhang.hpc.nyu.edu/IMA.
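The isotropic and anisotropic quantities mentioned above can be derived from a 3x3 shielding tensor. The sketch below is illustrative only: it computes the isotropic shielding as one third of the trace, uses the span of the principal components as one of several possible anisotropy measures, and converts shielding to a shift with the common approximation of subtracting from a reference shielding. The function names, the tensor, and the reference value are hypothetical and are not taken from CSTShift.

```python
def isotropic_shielding(tensor):
    """Return the isotropic shielding: one third of the trace of a 3x3 tensor."""
    return (tensor[0][0] + tensor[1][1] + tensor[2][2]) / 3.0

def span(principal_components):
    """Return the span: largest minus smallest principal component (ppm)."""
    return max(principal_components) - min(principal_components)

def shift_from_shielding(sigma_iso, sigma_ref):
    """Approximate chemical shift (ppm) as reference shielding minus shielding."""
    return sigma_ref - sigma_iso

# Hypothetical 1H shielding tensor (ppm) and hypothetical reference shielding.
tensor = [
    [30.0, 1.0, 0.0],
    [1.0, 28.0, 0.5],
    [0.0, 0.5, 26.0],
]
sigma_ref = 31.5  # assumed value for the reference compound, illustration only

sigma_iso = isotropic_shielding(tensor)
delta = shift_from_shielding(sigma_iso, sigma_ref)
print("isotropic shielding:", sigma_iso, "approximate shift:", delta)
```

The span takes principal components as input because obtaining them requires diagonalizing the symmetrized tensor, which is omitted here for brevity.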
Project description:Scaling factors are reported for use in predicting 19F NMR chemical shifts for fluorinated (hetero)aromatic compounds with relatively low levels of theory. Our recommended scaling factors were developed using a curated data set of 52 compounds, with 100 individual 19F shifts spanning a range of 153 ppm. With a maximum deviation of 6.5 ppm between experimental and computed shifts, or 4% of the range tested, these scaling factors allow for the assignment of chemical shifts to specific fluorines in multifluorinated aromatics. The utility of this approach is highlighted by several structural reassignments.
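Scaling factors of this kind are commonly obtained by linear regression of experimental shifts against computed isotropic shieldings, and then applied to new computed values. The sketch below assumes that form; the abstract does not give the fitted values, so the function names and the data are hypothetical. The last lines check the quoted ratio of maximum deviation to shift range.

```python
def fit_scaling(computed, experimental):
    """Least-squares fit of experimental = slope * computed + intercept."""
    n = len(computed)
    mean_x = sum(computed) / n
    mean_y = sum(experimental) / n
    sxx = sum((x - mean_x) ** 2 for x in computed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(computed, experimental))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def apply_scaling(sigma, slope, intercept):
    """Convert a computed shielding (ppm) to a predicted shift (ppm)."""
    return slope * sigma + intercept

def max_abs_deviation(predicted, experimental):
    """Largest absolute difference between predicted and experimental shifts."""
    return max(abs(p - e) for p, e in zip(predicted, experimental))

# Hypothetical computed 19F shieldings and experimental shifts (ppm).
computed = [300.0, 280.0, 260.0, 240.0]
experimental = [-160.0, -140.0, -120.0, -100.0]

slope, intercept = fit_scaling(computed, experimental)
predicted = [apply_scaling(s, slope, intercept) for s in computed]
print("slope:", slope, "intercept:", intercept)
print("max deviation:", max_abs_deviation(predicted, experimental))

# Figures quoted in the abstract: 6.5 ppm maximum deviation over a 153 ppm range.
percent_of_range = 100.0 * 6.5 / 153.0
print("deviation as percent of range:", round(percent_of_range, 1))
```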
Project description:Inferring molecular structure from Nuclear Magnetic Resonance (NMR) measurements requires an accurate forward model that can predict chemical shifts from 3D structure. Current forward models are limited to specific molecules like proteins, and state-of-the-art models are not differentiable. Thus they cannot be used with gradient methods like biased molecular dynamics. Here we use graph neural networks (GNNs) for NMR chemical shift prediction. Our GNN models chemical shifts accurately, captures important phenomena such as hydrogen-bonding-induced downfield shifts between multiple proteins and secondary structure effects, and predicts shifts of organic molecules. Previous empirical models of protein NMR have relied on careful feature engineering with domain expertise. These GNNs are trained from data alone with no feature engineering, yet are as accurate and can work on arbitrary molecular structures. The models are also efficient, able to compute one million chemical shifts in about 5 seconds. This work enables a new category of NMR models that have multiple interacting types of macromolecules.
Project description:NMR spectroscopy plays a major role in the determination of the structures and dynamics of proteins and other biological macromolecules. Chemical shifts are the most readily and accurately measurable NMR parameters, and they reflect with great specificity the conformations of native and nonnative states of proteins. We show, using 11 examples of proteins representative of the major structural classes and containing up to 123 residues, that it is possible to use chemical shifts as structural restraints in combination with a conventional molecular mechanics force field to determine the conformations of proteins at a resolution of 2 angstroms or better. This strategy should be widely applicable and, subject to further development, will enable quantitative structural analysis to be carried out to address a range of complex biological problems not accessible to current structural techniques.
Project description:An automated fragmentation quantum mechanics/molecular mechanics approach (AFNMR) has shown promising results in chemical shift calculations for biomolecules. Sample results for ubiquitin, and an RNA hairpin and helix are presented, and used to illustrate recent directions in quantum calculations. Trends in chemical shift are stable with regard to changes in density functional or basis set, and the use of the small "pcSseg-0" basis, which was optimized for chemical shift prediction [1], opens the way to more extensive conformational averaging, which can often be necessary, even for fairly well-defined structures.
Project description:Despite the formidable progress in Nuclear Magnetic Resonance (NMR) spectroscopy, quality assessment of NMR-derived structures remains an important problem. Thus, validation of protein structures is essential for spectroscopists, since it could enable them to detect structural flaws and potentially guide their efforts in further refinement. Moreover, availability of accurate and efficient validation tools would help molecular biologists and computational chemists to evaluate the quality of available experimental structures and to select the protein model that is most suitable for a given scientific problem. The 13Cα nuclei are ubiquitous in proteins; moreover, their shieldings are easily obtainable from NMR experiments and represent a rich source of encoded structural information, which makes 13Cα chemical shifts an attractive candidate for use in computational methods aimed at determination and validation of protein structures. In this chapter, the basis of a novel methodology for computing, at the quantum chemical level of theory, the 13Cα shielding for the amino acid residues in proteins is described. We also identify and examine the main factors affecting the 13Cα-shielding computation. Finally, we illustrate how the information encoded in the 13C chemical shifts can be used for a number of applications, viz., from protein structure prediction of both α-helical and β-sheet conformations, to determination of the fraction of the tautomeric forms of the imidazole ring of histidine in proteins as a function of pH, or to accurate detection of structural flaws, at a residue level, in NMR-determined protein models.
Project description:In this investigation, semiempirical NMR chemical shift prediction methods are used to evaluate the dynamically averaged values of backbone chemical shifts obtained from unbiased molecular dynamics (MD) simulations of proteins. MD-averaged chemical shift predictions generally improve agreement with experimental values when compared to predictions made from static X-ray structures. Improved chemical shift predictions result from population-weighted sampling of multiple conformational states and from sampling smaller fluctuations within conformational basins. Improved chemical shift predictions also result from discrete changes to conformations observed in X-ray structures, which may result from crystal contacts, and are not always reflective of conformational dynamics in solution. Chemical shifts are sensitive reporters of fluctuations in backbone and side chain torsional angles, and averaged 1H chemical shifts are particularly sensitive reporters of fluctuations in aromatic ring positions and geometries of hydrogen bonds. In addition, poor predictions of MD-averaged chemical shifts can identify spurious conformations and motions observed in MD simulations that may result from force field deficiencies or insufficient sampling and can also suggest subsets of conformational space that are more consistent with experimental data. These results suggest that the analysis of dynamically averaged NMR chemical shifts from MD simulations can serve as a powerful approach for characterizing protein motions in atomistic detail.
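The comparison between static and MD-averaged predictions can be sketched as follows: per-frame predicted shifts are averaged for each nucleus, and the root-mean-square deviation (RMSD) from experiment is computed for both the static and the averaged predictions. This is a minimal sketch under stated assumptions: the per-frame shifts would come from a separate shift predictor that is not shown, and the function names, nucleus labels, and all numbers are made up for illustration.

```python
import math

def average_over_frames(frames):
    """Average predicted shifts per nucleus over MD frames.

    Assumes every frame contains the same set of nucleus labels.
    """
    totals = {}
    for frame in frames:
        for nucleus, shift in frame.items():
            totals[nucleus] = totals.get(nucleus, 0.0) + shift
    return {nucleus: total / len(frames) for nucleus, total in totals.items()}

def rmsd(predicted, observed):
    """Root-mean-square deviation (ppm) over the nuclei present in `observed`."""
    squared = [(predicted[n] - observed[n]) ** 2 for n in observed]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical backbone amide 1H shifts (ppm); labels and values are invented.
static_prediction = {"A5:HN": 8.9, "G6:HN": 7.2}
md_frames = [
    {"A5:HN": 8.6, "G6:HN": 7.6},
    {"A5:HN": 8.2, "G6:HN": 7.8},
    {"A5:HN": 8.4, "G6:HN": 7.7},
]
experimental = {"A5:HN": 8.4, "G6:HN": 7.7}

md_average = average_over_frames(md_frames)
print("static RMSD:", round(rmsd(static_prediction, experimental), 3))
print("MD-averaged RMSD:", round(rmsd(md_average, experimental), 3))
```

A uniform average over frames corresponds to population-weighted sampling only if the frames are drawn from an equilibrium trajectory; a reweighted ensemble would need explicit per-frame weights.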
Project description:A recently determined set of 20 NMR-derived conformations of a 48-residue all-α-helical protein (PDB ID code 2JVD) is validated here by comparing the observed 13Cα chemical shifts with those computed at the density functional level of theory. In addition, a recently introduced physics-based method, aimed at determining protein structures by using NOE-derived distance constraints together with observed and computed 13Cα chemical shifts, was applied to determine a new set of 10 conformations (Set-bt) as a blind test for the same protein. A cross-validation of these two sets of conformations in terms of the agreement between computed and observed 13Cα chemical shifts, several stereochemical quality factors, and some NMR quality assessment scores reveals the good quality of both sets of structures. We also carried out an analysis of the agreement between the observed and computed 13Cα chemical shifts for a slightly longer construct of the protein solved by x-ray crystallography at 2.0-Å resolution (PDB ID code 3BHP), with an amino acid residue sequence identical to that of the 2JVD structure for the first 46 residues. Our results reveal that both of the NMR-derived sets, namely 2JVD and Set-bt, are somewhat better representations of the observed 13Cα chemical shifts in solution than the 3BHP crystal structure. In addition, the 13Cα-based validation analysis appears to be more sensitive to subtle structural differences across the three sets of structures than any of the other NMR quality-assessment scores used here, and, although it is computationally intensive, this analysis has potential value as a standard procedure to determine, refine, and validate protein structures.