Project description: X-ray crystallography provides the most accurate models of protein-ligand structures. These models serve as the foundation of many computational methods, including structure prediction, molecular modelling, and structure-based drug design. The success of these computational methods ultimately depends on the quality of the underlying protein-ligand models. X-ray crystallography offers the unparalleled advantage of a clear mathematical formalism relating the experimental data to the protein-ligand model. In X-ray crystallography, the primary experimental evidence is the electron density of the molecules forming the crystal. The first step in generating an accurate and precise crystallographic model is the interpretation of the electron density of the crystal, typically carried out by constructing an atomic model. The atomic model must then be validated both for fit to the experimental electron density and for agreement with prior expectations of stereochemistry. Stringent validation of protein-ligand models has become possible as a result of the mandatory deposition of primary diffraction data, and many computational tools are now available to aid in the validation process. Validation of protein-ligand complexes has revealed instances of overenthusiastic interpretation of ligand density. We discuss fundamental concepts and metrics of protein-ligand model validation and highlight software tools that assist in this process. It is essential that end users select high-quality protein-ligand models for their computational and biological studies, and we provide an overview of how this can be achieved.
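Fit between an atomic model and the experimental electron density is commonly summarized by a real-space correlation coefficient (RSCC), a Pearson correlation between observed and model-calculated density values over grid points covering the ligand. A minimal sketch, with hypothetical density samples (the function and values are illustrative, not taken from any particular validation package):

```python
from math import sqrt

def rscc(obs, calc):
    """Real-space correlation coefficient: Pearson correlation between
    observed and model-calculated electron-density samples."""
    n = len(obs)
    mo = sum(obs) / n
    mc = sum(calc) / n
    cov = sum((o - mo) * (c - mc) for o, c in zip(obs, calc))
    so = sqrt(sum((o - mo) ** 2 for o in obs))
    sc = sqrt(sum((c - mc) ** 2 for c in calc))
    return cov / (so * sc)

# Hypothetical density values at grid points around a ligand:
observed   = [0.10, 0.42, 0.88, 1.30, 0.95, 0.41]
calculated = [0.12, 0.40, 0.90, 1.25, 0.97, 0.39]
print(round(rscc(observed, calculated), 3))  # well-fit ligand -> close to 1
```

In practice, validation tools flag ligands whose RSCC falls below a threshold (often around 0.8) as candidates for re-inspection.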
Project description: There is growing public concern about the lack of reproducibility of experimental data published in the peer-reviewed scientific literature. Herein, we review the most recent alerts regarding experimental data quality and discuss initiatives taken thus far to address this problem, especially in the area of chemical genomics. Going beyond merely acknowledging the issue, we propose a chemical and biological data curation workflow that relies on existing cheminformatics approaches to flag, and when appropriate, correct possibly erroneous entries in large chemogenomics data sets. We posit that adherence to best practices for data curation is important both for experimental scientists who generate primary data and deposit them in chemical genomics databases and for computational researchers who rely on these data for model development.
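One common curation step of the kind such a workflow would include is flagging duplicate compound entries whose reported activities disagree. A minimal sketch, assuming records keyed by a canonical structure identifier (e.g. an InChIKey) with pIC50 values; the function name, tolerance, and data are illustrative:

```python
from collections import defaultdict

def flag_discordant(records, tol=1.0):
    """Group assay records by a canonical compound key and flag compounds
    whose replicate pIC50 values disagree by more than `tol` log units --
    candidates for manual inspection or removal."""
    by_cmpd = defaultdict(list)
    for key, pic50 in records:
        by_cmpd[key].append(pic50)
    flagged = {}
    for key, vals in by_cmpd.items():
        if len(vals) > 1 and max(vals) - min(vals) > tol:
            flagged[key] = vals
    return flagged

# Hypothetical records: (compound key, reported pIC50)
records = [("AAA", 6.1), ("AAA", 6.3), ("BBB", 5.0), ("BBB", 8.2)]
print(flag_discordant(records))  # {'BBB': [5.0, 8.2]}
```

Concordant replicates ("AAA") pass through silently; only the discordant pair is surfaced, keeping the curator's attention on likely transcription or unit errors.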
Project description: Proteomics plays a vital role in biomedical research in the post-genomic era. With the technological revolution and emerging computational and statistical models, proteomic methodology has evolved rapidly over the past decade and has shed light on complicated biomedical problems. Here, we summarize scientific research and clinical practice for existing and emerging high-throughput proteomics approaches, including mass spectrometry, protein pathway arrays, next-generation tissue microarrays, single-cell proteomics, single-molecule proteomics, Luminex, Simoa, and Olink Proteomics. We also discuss important computational methods and statistical algorithms that can maximize the mining of proteomic data together with clinical and/or other 'omics data. Various principles and precautions are provided for better utilization of these tools. In summary, the advances in high-throughput proteomics will not only help us better understand the molecular mechanisms of pathogenesis but also help identify the signature signaling networks of specific diseases. Thus, modern proteomics has a range of potential applications in basic research, prognostic oncology, precision medicine, and drug discovery.
Project description: The rapid spread of SARS-CoV-2 and its continuing impact on human health has prompted the need for effective and rapid development of monoclonal antibody therapeutics. In this study, we investigate polyclonal antibodies in serum and B cells from the whole blood of three donors with SARS-CoV-2 immunity to find high-affinity anti-SARS-CoV-2 antibodies against escape variants. Serum IgG antibodies were selected from each donor by their affinity to the receptor-binding domain (RBD) and non-RBD sites on the spike protein of Omicron subvariant B.1.1.529. Antibodies were analyzed by bottom-up mass spectrometry and matched to single- and bulk-cell sequenced repertoires for each donor. The antibodies observed in serum were recombinantly expressed and characterized to assess domain binding, cross-reactivity between different variants, and capacity to inhibit RBD binding to host protein. Donors infected with early Omicron subvariants had serum antibodies with subnanomolar affinity to RBD that also showed binding activity to the newer Omicron subvariant BQ.1.1. The donors also showed a convergent immune response. Serum antibodies and other single- and bulk-cell sequences were similar to publicly reported anti-SARS-CoV-2 antibodies, and the characterized serum antibodies had the same variant-binding and neutralization profiles as their reported public sequences. The serum antibodies analyzed were a subset of anti-SARS-CoV-2 antibodies in the B cell repertoire, which demonstrates significant dynamics between the B cells and circulating antibodies in peripheral blood.
Project description: Background: Resistance to chemotherapy and molecularly targeted therapies is a major factor limiting the effectiveness of cancer treatment. In many cases, resistance can be linked to genetic changes in target proteins, either pre-existing or evolutionarily selected during treatment. Key to overcoming this challenge is an understanding of the molecular determinants of drug binding. Using multi-stage pipelines of molecular simulations, we can gain insights into the binding free energy and the residence time of a ligand, which can inform both stratified and personal treatment regimes and drug development. To support the scalable, adaptive, and automated calculation of the binding free energy on high-performance computing resources, we introduce the High-throughput Binding Affinity Calculator (HTBAC). HTBAC uses a building-block approach in order to attain both workflow flexibility and performance. Results: We demonstrate close to perfect weak scaling to hundreds of concurrent multi-stage binding affinity calculation pipelines. This permits a rapid time-to-solution that is essentially invariant of the calculation protocol, size of candidate ligands, and number of ensemble simulations. Conclusions: As such, HTBAC advances the state of the art of binding affinity calculations and protocols. HTBAC provides a platform that enables scientists to study a wide range of cancer drugs and candidate ligands in order to support personalized clinical decision making based on genome sequencing and drug discovery.
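HTBAC's actual building blocks are not reproduced here, but the pattern the abstract describes, many independent multi-stage pipelines (one per candidate ligand) running concurrently, can be sketched with the Python standard library. The stage names and `run_pipeline` function are placeholders, not HTBAC's API:

```python
from concurrent.futures import ThreadPoolExecutor

def run_pipeline(ligand_id):
    """One multi-stage binding-affinity pipeline for a single ligand.
    Each stage is a placeholder where real simulation work would run."""
    stages = ["minimize", "equilibrate", "produce", "analyze"]
    result = None
    for stage in stages:
        result = f"{ligand_id}:{stage}"  # stand-in for the stage's output
    return result

# Launch one pipeline per candidate ligand; pipelines are independent,
# so throughput scales with the number of concurrent workers (weak scaling).
ligands = [f"lig{i}" for i in range(8)]
with ThreadPoolExecutor(max_workers=8) as pool:
    finals = list(pool.map(run_pipeline, ligands))
print(finals[0])  # lig0:analyze
```

Because the pipelines share no state, adding more workers alongside more ligands keeps the time-to-solution roughly constant, which is the weak-scaling behavior the abstract reports.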
Project description: Laser capture microdissection (LCM) has become an indispensable tool for mass spectrometry-based proteomic analysis of specific regions obtained from formalin-fixed paraffin-embedded (FFPE) tissue samples in both clinical and research settings. Low protein yields from LCM samples and laborious sample processing steps make it challenging to perform proteomic analysis without sacrificing protein and peptide recovery. Automation of sample preparation workflows is still under development, especially for samples such as laser-capture microdissected tissues. Here, we present a simplified and rapid sample processing workflow using adaptive focused acoustics (AFA) technology for high-throughput FFPE-based proteomics. We evaluated three different workflows: standard extraction followed by overnight trypsin digestion, AFA-assisted extraction followed by overnight trypsin digestion, and AFA-assisted extraction performed simultaneously with trypsin digestion. The use of AFA-based ultrasonication enables automated sample processing for high-throughput proteomic analysis of LCM-FFPE tissues in 96-well and 384-well formats. Further, accelerated trypsin digestion combined with AFA dramatically reduced the overall processing times. LC-MS/MS analysis revealed a slightly higher number of protein and peptide identifications in the AFA-accelerated workflow compared to the standard and AFA overnight workflows. Further, we did not observe any difference in the proportion of peptides identified with missed cleavages or deamidated peptides across the three workflows. Overall, our results demonstrate that the workflow described in this study enables rapid and high-throughput sample processing with greatly reduced sample handling, which is amenable to automation.
Project description: Mass spectrometry (MS)-based proteomics provides unprecedented opportunities for understanding the structure and function of proteins in complex biological systems; however, protein solubility and sample preparation before MS remain a bottleneck preventing high-throughput proteomics. Herein, we report a high-throughput bottom-up proteomic method enabled by a newly developed MS-compatible photocleavable surfactant, 4-hexylphenylazosulfonate (Azo), that facilitates robust protein extraction, rapid enzymatic digestion (30 min compared to overnight), and subsequent MS analysis following UV degradation. Moreover, we developed an Azo-aided bottom-up method for analysis of integral membrane proteins, which are key drug targets and are generally underrepresented in global proteomic studies. Furthermore, we demonstrated the ability of Azo to serve as an "all-in-one" MS-compatible surfactant for both top-down and bottom-up proteomics, with streamlined workflows for high-throughput proteomics amenable to clinical applications.
Project description: Major advances have been made to improve the sensitivity of mass analyzers, spectral quality, and speed of data processing, enabling more comprehensive proteome discovery and quantitation. While focus has recently begun shifting toward robust proteomics sample preparation, a high-throughput proteomics sample preparation platform is still lacking. We report the development of a highly automated universal 384-well plate sample preparation platform with high reproducibility and adaptability for extraction of proteins from cells within a culture plate. Digestion efficiency was excellent in comparison to a commercial peptide digest standard, with minimal sample loss, while improving sample preparation throughput by 20- to 40-fold (the entire process from plated cells to clean peptides is complete in ∼300 min). Analysis of six human cell types, including two primary cell samples, identified and quantified ∼4,000 proteins per sample in a single high-performance liquid chromatography (HPLC)-tandem mass spectrometry injection from only 100-10,000 cells, demonstrating the universality of the platform. A selected protein was further quantified using an HPLC-multiple reaction monitoring method developed for HeLa digests, with two heavy-labeled internal standard peptides spiked in. Excellent linearity was achieved across different cell numbers, indicating potential for target protein quantitation in clinical research.
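The internal-standard quantitation described above follows the standard stable-isotope-dilution arithmetic: the target amount is the light/heavy peak-area ratio multiplied by the known spiked amount of the heavy-labeled peptide. A minimal sketch with hypothetical MRM peak areas and spike level:

```python
def quantify_by_internal_standard(light_area, heavy_area, spiked_fmol):
    """Stable-isotope-dilution quantitation: the endogenous (light) peptide
    amount equals the light/heavy peak-area ratio times the known amount
    of spiked heavy-labeled internal standard."""
    return light_area / heavy_area * spiked_fmol

# Hypothetical MRM peak areas and a 50 fmol heavy-peptide spike:
print(quantify_by_internal_standard(3.0e6, 1.5e6, 50.0))  # 100.0 (fmol)
```

Because the heavy standard co-elutes and ionizes identically to the light peptide, the ratio cancels run-to-run variation, which is what makes the linearity across cell numbers reported above achievable.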
Project description: Phosphotyrosine (pY) enrichment is critical for expanding fundamental and clinical understanding of cellular signaling by mass spectrometry-based proteomics. However, current pY enrichment methods exhibit a high cost per sample and limited reproducibility due to expensive affinity reagents and manual processing. We present rapid-robotic phosphotyrosine proteomics (R2-pY), which uses a magnetic particle processor and pY superbinders or antibodies. R2-pY handles 96 samples in parallel, requires 2 days to go from cell lysate to mass spectrometry injections, and yields global proteomic, phosphoproteomic, and tyrosine-specific phosphoproteomic samples. We benchmark the method on HeLa cells stimulated with pervanadate and serum and report over 4,000 unique pY sites from 1 mg of peptide input, strong reproducibility between replicates, and phosphopeptide enrichment efficiencies above 99%. R2-pY extends our previously reported R2-P2 proteomic and global phosphoproteomic sample preparation framework, opening the door to large-scale studies of pY signaling in concert with global proteome and phosphoproteome profiling.
Project description: BACKGROUND: High-resolution tandem mass spectra can now be readily acquired with hybrid instruments, such as the LTQ-Orbitrap and LTQ-FT, in high-throughput shotgun proteomics workflows. The improved spectral quality enables more accurate de novo sequencing for identification of post-translational modifications and amino acid polymorphisms. RESULTS: In this study, a new de novo sequencing algorithm, called Vonode, has been developed specifically for the analysis of such high-resolution tandem mass spectra. To fully exploit the high mass accuracy of these spectra, a unique scoring system is proposed that evaluates sequence tags based primarily on the mass accuracy of fragment ions. Consensus sequence tags were inferred for 11,422 spectra, with an average peptide length of 5.5 residues, from a total of 40,297 input spectra acquired in a 24-hour proteomics measurement of Rhodopseudomonas palustris. The accuracy of the inferred consensus sequence tags was 84%. In our comparison, Vonode outperformed the PepNovo v2.0 algorithm in both the number of de novo sequenced spectra and the sequencing accuracy. CONCLUSIONS: Here, we improved de novo sequencing performance by developing a new algorithm specifically for high-resolution tandem mass spectral data. The Vonode algorithm is freely available for download at http://compbio.ornl.gov/Vonode.
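The core idea behind mass-accuracy-based tag inference can be illustrated simply: walk an ascending fragment-ion ladder and match each consecutive mass difference to an amino-acid residue mass within a ppm tolerance. This is a generic sketch of the technique, not Vonode's actual scoring system; the peak list is hypothetical and only a few residue masses are included:

```python
# Monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "P": 97.05276, "V": 99.06841, "L": 113.08406,
}

def infer_tag(peaks, tol_ppm=10.0):
    """Read a sequence tag from an ascending fragment-ion ladder by
    matching consecutive mass differences to residue masses within a
    ppm tolerance; unexplained gaps are marked '?'."""
    tag = []
    for lo, hi in zip(peaks, peaks[1:]):
        diff = hi - lo
        for aa, mass in RESIDUE_MASS.items():
            if abs(diff - mass) / mass * 1e6 <= tol_ppm:
                tag.append(aa)
                break
        else:
            tag.append("?")  # no residue mass explains this difference
    return "".join(tag)

# Hypothetical high-accuracy ladder whose gaps spell G, A, S:
peaks = [200.000, 257.0214, 328.0585, 415.0906]
print(infer_tag(peaks))  # GAS
```

With low-resolution spectra a much wider tolerance would be needed, causing many ambiguous matches; tight ppm tolerances are what make this kind of tag inference reliable, which is the motivation for an algorithm targeting high-resolution data.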