Project description:There is a continuous interest in increasing protein stability to enhance their usability in numerous biomedical and biotechnological applications. A number of in silico tools for the prediction of the effect of mutations on protein stability have been developed recently. However, only single-point mutations with a small effect on protein stability are typically predicted with the existing tools and have to be followed by laborious protein expression, purification, and characterization. Here, we present FireProt, a web server for the automated design of multiple-point thermostable mutant proteins that combines structural and evolutionary information in its calculation core. FireProt utilizes sixteen tools and three protein engineering strategies for making reliable protein designs. The server is complemented with an interactive, easy-to-use interface that allows users to directly analyze and optionally modify designed thermostable mutants. FireProt is freely available at http://loschmidt.chemi.muni.cz/fireprot.
Project description:Background: Disulfide engineering is an important biotechnological tool that has advanced a wide range of research. The introduction of novel disulfide bonds into proteins has been used extensively to improve protein stability, modify functional characteristics, and assist in the study of protein dynamics. Successful use of this technology is greatly enhanced by software that can predict pairs of residues that will likely form a disulfide bond if mutated to cysteines. Results: We had previously developed and distributed software for this purpose: Disulfide by Design (DbD). The original DbD program has been widely used; however, it has a number of limitations including a Windows platform dependency. Here, we introduce Disulfide by Design 2.0 (DbD2), a web-based, platform-independent application that significantly extends functionality, visualization, and analysis capabilities beyond the original program. Among the enhancements to the software is the ability to analyze the B-factor of protein regions involved in predicted disulfide bonds. Importantly, this feature facilitates the identification of potential disulfides that are not only likely to form but are also expected to provide improved thermal stability to the protein. Conclusions: DbD2 provides platform-independent access and significantly extends the original functionality of DbD. A web server hosting DbD2 is provided at http://cptweb.cpt.wayne.edu/DbD2/.
Project description:Large-scale data acquisition and analysis are often required in the successful implementation of the design, build, test, and learn (DBTL) cycle in biosystems design. However, it has long been hindered by experimental cost, variability, biases, and missed insights from traditional analysis methods. Here, we report the application of an integrated robotic system coupled with machine learning algorithms to fully automate the DBTL process for biosystems design. As proof of concept, we have demonstrated its capacity by optimizing the lycopene biosynthetic pathway. This fully-automated robotic platform, BioAutomata, evaluates less than 1% of possible variants while outperforming random screening by 77%. A paired predictive model and Bayesian algorithm select experiments which are performed by Illinois Biological Foundry for Advanced Biomanufacturing (iBioFAB). BioAutomata excels with black-box optimization problems, where experiments are expensive and noisy and the success of the experiment is not dependent on extensive prior knowledge of biological mechanisms.
Project description:With the advent of computer-aided drug design (CADD), traditional physical testing of thousands of molecules has now been replaced by target-focused drug discovery, where potentially bioactive molecules are predicted by computer software before their physical synthesis. However, despite being a significant breakthrough, CADD still faces various limitations and challenges. The increasing availability of data on small molecules has created a need to streamline the sourcing of data from different databases and automate the processing and cleaning of data into a form that can be used by multiple CADD software applications. Several standalone software packages are available to aid the drug designer, each with its own specific application, requiring specialized knowledge and expertise for optimal use. These applications require their own input and output files, making it a challenge for nonexpert users or multidisciplinary discovery teams. Here, we have developed a new software platform called DataPype, which wraps around these different software packages. It provides a unified automated workflow to search for hit compounds using specialist software. Additionally, multiple virtual screening packages can be used in the one workflow, and if different ways of looking at potential hit compounds all predict the same set of molecules, we have higher confidence that we should make or purchase and test the molecules. Importantly, DataPype can run on computer servers, speeding up the virtual screening for new compounds. Combining access to multiple CADD tools within one interface will enhance the early stage of drug discovery, increase usability, and enable the use of parallel computing.
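The consensus idea in the abstract above (higher confidence when several screening approaches predict the same molecules) can be illustrated with a small vote count. This is only a sketch under stated assumptions: the function `consensus_hits`, the tool names, and the compound identifiers are hypothetical and are not part of DataPype.

```python
from collections import Counter


def consensus_hits(hit_lists, min_votes=None):
    """Return compounds predicted by at least `min_votes` screening tools.

    `hit_lists` maps a tool name to the compounds that tool flagged as hits.
    By default every tool must agree (a strict intersection).
    """
    if min_votes is None:
        min_votes = len(hit_lists)
    votes = Counter()
    for hits in hit_lists.values():
        votes.update(set(hits))  # each tool votes at most once per compound
    return sorted(c for c, n in votes.items() if n >= min_votes)


# Hypothetical tool names and compound IDs, for illustration only.
example = {
    "docking_tool": ["cmpd_1", "cmpd_2", "cmpd_3"],
    "pharmacophore_tool": ["cmpd_2", "cmpd_3", "cmpd_4"],
    "similarity_tool": ["cmpd_3", "cmpd_4", "cmpd_5"],
}

unanimous = consensus_hits(example)               # all three tools agree
majority = consensus_hits(example, min_votes=2)   # at least two tools agree
print(unanimous)
print(majority)
```

Relaxing `min_votes` trades confidence for a larger candidate list; which threshold is appropriate depends on how many compounds can realistically be made or purchased and tested.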
Project description:A number of anti-retroviral drugs are being used for treating Human Immunodeficiency Virus (HIV) infection. Due to the emergence of drug-resistant strains, there is a constant quest to discover more effective anti-HIV compounds. In this endeavor, computational tools have proven useful in accelerating drug discovery. Although methods have been published for designing a class of compounds against a specific HIV protein, an integrated web server for this purpose is lacking. Therefore, we have developed support vector machine based regression models using experimentally validated data from the ChEMBL repository. Quantitative structure activity relationship based features were selected for predicting the inhibition activity of a compound against the HIV proteins protease (PR), reverse transcriptase (RT) and integrase (IN). The models presented a maximum Pearson correlation coefficient of 0.78, 0.76, 0.74 and 0.76, 0.68, 0.72 during tenfold cross-validation on the IC50 and percent inhibition datasets of PR, RT, IN respectively. These models performed equally well on the independent datasets. Chemical space mapping, applicability domain analyses and other statistical tests further support the robustness of the predictive models. Currently, we have identified a number of chemical descriptors that are imperative in predicting the compound inhibition potential. The HIVprotI platform ( http://bioinfo.imtech.res.in/manojk/hivproti ) would be useful in virtual screening of inhibitors as well as in designing new molecules against the important HIV proteins for therapeutics development.
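As a rough illustration of how a Pearson correlation can be obtained under tenfold cross-validation, the sketch below pools out-of-fold predictions and correlates them with the observed values. Several things here are assumptions rather than the authors' method: a one-descriptor least-squares line stands in for the support vector regression model, the data are synthetic, the folds are assigned by index rather than at random, and pooling predictions across folds is only one of several ways to summarize cross-validated performance.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a stand-in for an SVM regressor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def cross_validated_r(xs, ys, k=10):
    """Pool out-of-fold predictions from k folds, then correlate with ys."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        test_idx = [i for i in range(len(xs)) if i % k == fold]
        train_idx = [i for i in range(len(xs)) if i % k != fold]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        for i in test_idx:
            preds[i] = a * xs[i] + b  # prediction from a model that never saw i
    return pearson_r(preds, ys)


# Synthetic data: one hypothetical descriptor and a noisy activity value.
descriptor = [float(i) for i in range(30)]
activity = [2.0 * x + 1.0 + ((7 * i) % 5 - 2) * 0.5 for i, x in enumerate(descriptor)]

r_cv = cross_validated_r(descriptor, activity, k=10)
print(round(r_cv, 3))
```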
Project description:Researchers are increasingly seeking to interpret molecular data within a multi-omics context to gain a more comprehensive picture of their study system. OmicsNet (www.omicsnet.ca) is a web-based tool developed to allow users to easily build, visualize, and analyze multi-omics networks to study rich relationships among lists of 'omics features of interest. Three major improvements have been introduced in OmicsNet 2.0, which include: (i) enhanced network visual analytics with eleven 2D graph layout options and a novel 3D module layout; (ii) support for three new 'omics types: single nucleotide polymorphism (SNP) list from genetic variation studies; taxon list from microbiome profiling studies, as well as liquid chromatography-mass spectrometry (LC-MS) peaks from untargeted metabolomics; and (iii) measures to improve research reproducibility by coupling R command history with the release of the companion OmicsNetR package, and generation of persistent links to share interactive network views. We performed a case study using the multi-omics data obtained from a recent large-scale investigation on inflammatory bowel disease (IBD) and demonstrated that OmicsNet was able to quickly create meaningful multi-omics context to facilitate hypothesis generation and mechanistic insights.
Project description:Background: Language and the ability to communicate effectively are key factors in mental health and well-being. Despite this critical importance, research on language is limited by the lack of a scalable phenotyping toolkit. Methods: Here, we describe and showcase Lingo - a flexible online battery of language and nonverbal reasoning skills based on seven widely used tasks (COWAT, picture narration, vocal rhythm entrainment, rapid automatized naming, following directions, sentence repetition, and nonverbal reasoning). The current version of Lingo takes approximately 30 minutes to complete, is entirely open source, and allows for a wide variety of performance metrics to be extracted. We asked > 1,300 individuals from multiple samples to complete Lingo, then investigated the validity and utility of the resulting data. Results: We conducted an exploratory factor analysis across 14 features derived from the seven assessments, identifying five factors. Four of the five factors showed acceptable test-retest reliability (Pearson's R > 0.7). Factor 2 showed the highest reliability (Pearson's R = 0.95) and loaded primarily on sentence repetition task performance. We validated Lingo with objective measures of language ability by comparing performance to gold-standard assessments: CELF-5 and the VABS-3. Factor 2 was significantly associated with the CELF-5 "core language ability" scale (Pearson's R = 0.77, p-value < 0.05) and the VABS-3 "communication" scale (Pearson's R = 0.74, p-value < 0.05). Factor 2 was positively associated with phenotypic and genetic measures of socioeconomic status. Interestingly, we found the parents of children with language impairments had lower Factor 2 scores (p-value < 0.01). Finally, we found Lingo factor scores were significantly predictive of numerous psychiatric and neurodevelopmental conditions. Conclusions: Together, these analyses support Lingo as a powerful platform for scalable deep phenotyping of language and other cognitive abilities. Additionally, exploratory analyses provide supporting evidence for the heritability of language ability and the complex relationship between mental health and language.
Project description:Objective: To compare the accuracy of cephalometric analyses made with fully automated tracings, computerized tracing, and app-aided tracings with equivalent hand-traced measurements, and to evaluate the tracing time for each cephalometric analysis method. Methods: Pre-treatment lateral cephalometric radiographs of 40 patients were randomly selected. Eight angular and 4 linear parameters were measured by 1 operator using 3 methods: computerized tracing with software Dolphin Imaging 13.01 (Dolphin Imaging and Management Solutions, Chatsworth, Calif, USA), app-aided tracing using the CephNinja 3.51 app (Cyncronus LLC, WA, USA), and web-based fully automated tracing with CephX (ORCA Dental AI, Las Vegas, NV). Correction of CephX landmarks was also made. Manual tracings were performed by 3 operators. Remeasurement of 15 radiographs was carried out to determine the intra-examiner and inter-examiner (manual tracings) correlation coefficient (ICC). Inter-group comparisons were made with one-way analysis of variance. The Tukey test was used for post hoc testing. Results: Overall, greater variability was found with CephX compared with the other methods. Differences in GoGn-SN (°), I-NA (°), I-NB (°), I-NA (mm), and I-NB (mm) were statistically (p<0.05) and clinically significant using CephX, whereas CephNinja and Dolphin were comparable to manual tracings. Correction of CephX landmarks gave similar results to CephNinja and Dolphin. All the ICCs exceeded 0.85, except for I-NA (°), I-NB (°), and I-NB (mm), which were traced with CephX. The shortest analyzing time was obtained with CephX. Conclusion: Fully automatic analysis with CephX needs to be more reliable. However, CephX analysis with manual correction is promising for use in clinical practice because it is comparable to CephNinja and Dolphin, and the analyzing time is significantly shorter.
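The inter-group comparison named in the abstract above, one-way analysis of variance, reduces to a ratio of between-group to within-group mean squares. The sketch below computes only that F statistic and its degrees of freedom; it does not compute a p-value or the Tukey post hoc test, and the measurements are made-up numbers, not data from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of measurement groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical angle measurements (degrees) from three tracing methods.
method_a = [1.0, 2.0, 3.0]
method_b = [2.0, 3.0, 4.0]
method_c = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([method_a, method_b, method_c])
print(f_stat, df_b, df_w)
```

In practice the F statistic would be compared against the F distribution with `df_b` and `df_w` degrees of freedom, typically through a statistics package.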
Project description:Objectives: The research evaluated participant satisfaction with the content and format of the "Web 2.0 101: Introduction to Second Generation Web Tools" course and measured the impact of the course on participants' self-evaluated knowledge of Web 2.0 tools. Methods: The "Web 2.0 101" online course was based loosely on the Learning 2.0 model. Content was provided through a course blog and covered a wide range of Web 2.0 tools. All Medical Library Association members were invited to participate. Participants were asked to complete a post-course survey. Respondents who completed the entire course or who completed part of the course self-evaluated their knowledge of nine social software tools and concepts prior to and after the course using a Likert scale. Additional qualitative information about course strengths and weaknesses was also gathered. Results: Respondents' self-ratings showed a significant change in perceived knowledge for each tool, using a matched pair Wilcoxon signed rank analysis (P<0.0001 for each tool/concept). Overall satisfaction with the course appeared high. Hands-on exercises were the most frequently identified strength of the course; the length and time-consuming nature of the course were considered weaknesses by some. Conclusion: Learning 2.0-style courses, though demanding time and self-motivation from participants, can increase knowledge of Web 2.0 tools.
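For readers unfamiliar with the matched-pair Wilcoxon signed rank analysis mentioned above, the sketch below shows how its rank sums are formed from paired before/after ratings. It is a minimal illustration under assumptions: the Likert ratings are invented, zero differences are dropped (one common convention), tied absolute differences share an average rank, and no p-value is computed.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus, n) for paired ratings.

    Zero differences are discarded; tied absolute differences receive the
    average of the ranks they span.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Invented 5-point Likert self-ratings for one tool, before and after a course.
before = [2, 1, 3, 2, 2, 1, 3, 2]
after = [4, 3, 4, 2, 5, 3, 3, 1]

w_plus, w_minus, n = wilcoxon_signed_rank(before, after)
print(w_plus, w_minus, n)
```

A large imbalance between the two rank sums suggests a systematic shift in ratings; the smaller of the two is usually the test statistic looked up against the signed-rank distribution for `n` pairs.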
Project description:There is a great interest in increasing proteins' stability to widen their usability in numerous biomedical and biotechnological applications. However, native proteins cannot usually withstand the harsh industrial environment, since they have evolved to function under mild conditions. Ancestral sequence reconstruction is a well-established method for deducing the evolutionary history of genes. Besides its applicability to discover the most probable evolutionary ancestors of the modern proteins, ancestral sequence reconstruction has proven to be a useful approach for the design of highly stable proteins. Recently, several computational tools were developed, which make the ancestral reconstruction algorithms accessible to the community, while leaving the most crucial steps of the preparation of the input data to the user. FireProtASR aims to overcome this obstacle by constructing a fully automated workflow, allowing even inexperienced users to obtain ancestral sequences based on a sequence query as the only input. FireProtASR is complemented with an interactive, easy-to-use web interface and is freely available at https://loschmidt.chemi.muni.cz/fireprotasr/.