Project description:In the growing field of proteomics, tools for the in silico analysis of proteins and even of whole proteomes are of crucial importance to make best use of the accumulating amount of data. To utilise this data for healthcare and drug development, the characteristics of the proteomes of entire species, mainly the human, must first be understood before differences between individuals can be surveyed. Specialised databases about nucleic acid sequences, protein sequences, protein tertiary structure, genome analysis, and proteome analysis represent useful resources for analysis, characterisation, and classification of protein sequences. Unlike most proteomics tools, which focus on similarity searches, structure analysis and prediction, detection of specific regions, alignments, data mining, 2D PAGE analysis, or protein modelling, comprehensive databases like the proteome analysis database draw on the information stored in different databases and use different protein analysis tools to provide computational analysis of whole proteomes.
Project description:The COVID-19 disease caused by the SARS-CoV-2 virus is a worldwide health crisis. While developing novel drugs and vaccines takes a long time, repurposing existing drugs against COVID-19 can yield treatments with known preclinical, pharmacokinetic, pharmacodynamic, and toxicity profiles, which can rapidly enter clinical trials. In this study, we present a novel network-based drug repurposing platform to identify candidates for the treatment of COVID-19. At the time of the initial outbreak, knowledge about SARS-CoV-2 was lacking, but based on its similarity with other viruses, we sought to identify repurposing candidates to be tested rapidly at the clinical or preclinical levels. We first analyzed the genome sequence of SARS-CoV-2 and confirmed SARS as the closest virus by genome similarity, followed by MERS and other human coronaviruses. Using text mining and database searches, we obtained 34 COVID-19-related genes to seed the construction of a molecular network where our module detection and drug prioritization algorithms identified 24 disease-related human pathways, five modules, and 78 drugs to repurpose. Based on clinical knowledge, we re-prioritized 30 potentially repurposable drugs against COVID-19 (including pseudoephedrine, andrographolide, chloroquine, abacavir, and thalidomide). Our work shows how in silico repurposing analyses can yield testable candidates to accelerate the response to novel disease outbreaks.
Project description:Databases and computational tools for mimotopes have been an important part of phage display studies. This paper reviews five special databases and eighteen algorithms, programs and web servers, together with their applications. Although these bioinformatics resources have been widely used to exclude target-unrelated peptides, characterize small molecule-protein interactions and map protein-protein interactions, many problems remain unsolved. As these tools improve, they are expected to serve the phage display community better.
Project description:Interest in quadruplexes (G4s) is growing with the number of identified G4 structures and with knowledge about their biomedical potential. These unique motifs form in many organisms, including humans, where their appearance correlates with various diseases. Scientists store and analyze quadruplexes using recently developed bioinformatic tools, many of them focused on DNA structures. With an expanding collection of G4 RNAs, we check how existing tools deal with them. We review all available bioinformatics resources dedicated to quadruplexes and examine their usefulness in G4 RNA analysis. We distinguish the following subsets of resources: databases, tools to predict putative quadruplex sequences, tools to predict secondary structure with quadruplexes and tools to analyze and visualize quadruplex structures. We share the results obtained from processing specially created RNA datasets with these tools. Contact: mszachniuk@cs.put.poznan.pl Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
Project description:Next-generation sequencing (NGS) technology has revolutionized our understanding of plant genomes, owing to innate advantages such as high throughput and deep sequencing depth. In addition to the protein-coding gene loci, massive transcription signals have been detected within intergenic or intragenic regions. Most of these signals are considered non-coding because of their weak protein-coding potential. Generally, these transcripts can be divided into long non-coding RNAs and small non-coding RNAs (sRNAs) based on their sequence length. In addition to the well-known microRNAs (miRNAs), many plant endogenous sRNAs have been collectively referred to as small interfering RNAs. However, an increasing number of unclassified sRNA species are being discovered by NGS. The high heterogeneity of these novel sRNAs has greatly hampered mechanistic studies, especially the clear description of their biogenesis and action pathways. Fortunately, public databases, bioinformatics software and NGS datasets are increasingly available for plant sRNA research. Here, by summarizing these valuable resources, we propose a general workflow to decipher the RDR (RNA-dependent RNA polymerase)- and DCL (Dicer-like)-dependent biogenesis pathways, and the Argonaute-mediated action modes (such as target cleavage and chromatin modification) for specific sRNA species in plants. We hope that this summary of public resources will help plant biologists classify and functionally characterize sRNA species of interest.
Project description:Haem d(1) is a unique tetrapyrrole molecule that serves as a prosthetic group of cytochrome cd(1), which reduces nitrite to nitric oxide during the process of denitrification. Very little information is available regarding the biosynthesis of haem d(1). The extreme difficulty in studying the haem d(1) biosynthetic pathway can be partly attributed to the lack of a theoretical basis for experimental investigation. We report here a gene cluster encoding enzymes involved in the biosynthesis of haem d(1) in two heliobacterial species, Heliobacillus mobilis and Heliophilum fasciatum. The gene organization of the cluster is conserved between the two species, and contains a complete set of genes that lead to the biosynthesis of uroporphyrinogen III and genes thought to be involved in the late steps of haem d(1) biosynthesis. Detailed bioinformatics analysis of some of the proteins encoded in the gene cluster revealed important clues to the precise biochemical roles of the proteins in the biosynthesis of haem d(1), as well as the membrane transport and insertion of haem d(1) into an apocytochrome during the maturation of cytochrome cd(1).
Project description:NEIBank is an integrated resource for genomics and bioinformatics in vision research. It includes expressed sequence tag (EST) data and sequence-verified cDNA clones for multiple eye tissues of several species, web-based access to human eye-specific SAGE data through EyeSAGE, and comprehensive, annotated databases of known human eye disease genes and candidate disease gene loci. All expression- and disease-related data are integrated in EyeBrowse, an eye-centric genome browser. NEIBank provides a comprehensive overview of current knowledge of the transcriptional repertoires of eye tissues and their relation to pathology.
Project description:Background: Severe influenza is a serious global health issue that leads to prolonged hospitalization and mortality on a significant scale. The pathogenesis of this infectious disease is poorly understood. Therefore, this study aimed to identify the key genes associated with severe influenza patients necessitating invasive mechanical ventilation. Methods: The current study utilized two publicly accessible gene expression profiles (GSE111368 and GSE21802) from the Gene Expression Omnibus database. The research focused on identifying the genes exhibiting differential expression between severe and non-severe influenza patients. We employed three machine learning algorithms, namely the Least Absolute Shrinkage and Selection Operator regression model, Random Forest, and Support Vector Machine-Recursive Feature Elimination, to detect potential key genes. The key gene was further selected based on the diagnostic performance of the target genes substantiated in the dataset GSE101702. A single-sample gene set enrichment analysis algorithm was applied to evaluate the participation of immune cell infiltration and their associations with key genes. Results: A total of 44 differentially expressed genes were recognized; among them, we focused on 10 common genes, namely PCOLCE2, HLA_DPA1, LOC653061, TDRD9, MPO, HLA_DQA1, MAOA, S100P, RAP1GAP, and CA1. To ensure the robustness of our findings, we employed overlapping LASSO regression, Random Forest, and SVM-RFE algorithms. By utilizing these algorithms, we were able to pinpoint the aforementioned 10 genes as potential biomarkers for distinguishing between both cases of influenza (severe and non-severe). However, the gene HLA_DPA1 has been recognized as a crucial factor in the pathological condition of severe influenza. Notably, the validation dataset revealed that this gene exhibited the highest area under the receiver operating characteristic curve, with a value of 0.891. The use of single-sample gene set enrichment analysis has provided valuable insights into the immune responses of patients afflicted with severe influenza that have further revealed a categorical correlation between the expression of HLA_DPA1 and lymphocytes. Conclusion: The findings indicated that the HLA_DPA1 gene may play a crucial role in the immune-pathological condition of severe influenza and could serve as a promising therapeutic target for patients infected with severe influenza.
Project description:Outbreaks of COVID-19 caused by the novel coronavirus SARS-CoV-2 are still a threat to global human health. To understand the biology of SARS-CoV-2 and develop drugs against COVID-19, a vast amount of genomic, proteomic, interactomic, and clinical data is being generated, and bioinformatics researchers have produced databases, webservers and tools to gather these publicly available data and make them available for analysis. However, these bioinformatics resources are scattered, and researchers need to locate them individually across different sources. To help researchers find the resources in one place, we have developed an integrated web portal called OverCOVID (http://bis.zju.edu.cn/overcovid/). The publicly available webservers, databases and tools associated with SARS-CoV-2 have been incorporated in the resource page. In addition, a network view of the resources is provided to display the scope of the research. Other information, such as SARS-CoV-2 strains, is visualized, and various layers of interaction resources are listed in distinct pages of the web portal. As an integrative web portal, OverCOVID will help scientists search the resources and accelerate clinical research on SARS-CoV-2.