Prediction of host-pathogen protein interactions between Mycobacterium tuberculosis and Homo sapiens using sequence motifs.
ABSTRACT: BACKGROUND: The emergence of multidrug-resistant strains of M. tuberculosis (MDR-TB) threatens to derail global efforts aimed at reining in the pathogen. Co-infections of M. tuberculosis with HIV are difficult to treat. To counter these new challenges, it is essential to study the interactions between M. tuberculosis and the host to learn how these bacteria cause disease. RESULTS: We report a systematic workflow to predict the host-pathogen interactions (HPIs) between M. tuberculosis and Homo sapiens based on sequence motifs. First, protein sequences were used as initial input for identifying the HPIs by the 'interolog' method. HPIs were further filtered by prediction of domain-domain interactions (DDIs). Functional annotations of proteins and publicly available experimental results were applied to filter the remaining HPIs. Using this strategy, 118 pairs of HPIs were identified, which involve 43 proteins from M. tuberculosis and 48 proteins from Homo sapiens. A biological interaction network between M. tuberculosis and Homo sapiens was then constructed using the predicted inter- and intra-species interactions based on the 118 pairs of HPIs. Finally, a web-accessible database named PATH (Protein interactions of M. tuberculosis and Human) was constructed to store these predicted interactions and proteins. CONCLUSIONS: This interaction network will facilitate research on host-pathogen protein-protein interactions, and may shed light on how M. tuberculosis interacts with its host.
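The first two filtering steps described in this abstract (interolog transfer followed by a DDI filter) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the homolog and domain maps, and every identifier in the toy data are hypothetical. A real run would derive homologs from sequence searches and DDIs from a curated resource.

```python
from itertools import product


def predict_interologs(template_ppis, pathogen_homologs, host_homologs):
    """Transfer known template interactions to pathogen-host pairs.

    template_ppis: iterable of (protein_a, protein_b) known interactions.
    pathogen_homologs / host_homologs: dict mapping a template protein to
    the set of its homologs in the pathogen / host proteome (assumed input).
    """
    predicted = set()
    for a, b in template_ppis:
        # A template pair can map onto pathogen-host in either orientation.
        for x, y in ((a, b), (b, a)):
            for p, h in product(pathogen_homologs.get(x, ()),
                                host_homologs.get(y, ())):
                predicted.add((p, h))
    return predicted


def filter_by_ddi(pairs, domains, known_ddis):
    """Keep pairs in which at least one domain pair is a known DDI."""
    kept = set()
    for p, h in pairs:
        for dp, dh in product(domains.get(p, ()), domains.get(h, ())):
            if (dp, dh) in known_ddis or (dh, dp) in known_ddis:
                kept.add((p, h))
                break
    return kept


# Toy data: all identifiers are made up for illustration.
template_ppis = [("T1", "T2"), ("T3", "T4")]
pathogen_homologs = {"T1": {"Mtb_A"}, "T3": {"Mtb_B"}}
host_homologs = {"T2": {"Hs_X"}, "T4": {"Hs_Y"}}
domains = {"Mtb_A": {"D1"}, "Hs_X": {"D2"}, "Mtb_B": {"D3"}, "Hs_Y": {"D4"}}
known_ddis = {("D2", "D1")}

candidates = predict_interologs(template_ppis, pathogen_homologs, host_homologs)
supported = filter_by_ddi(candidates, domains, known_ddis)
```

In this toy case both template pairs yield a candidate HPI, but only the pair whose domains appear in the DDI set survives the filter. The later steps in the abstract (functional annotation and experimental evidence) would be further filters of the same shape.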
Project description:BACKGROUND: Mycobacterium tuberculosis infection in humans is often associated with an extended period of latency. To adapt to the hostile hypoxic environment inside a macrophage, M. tuberculosis cells undergo several physiological and metabolic changes. Previous studies have mostly focused on inspecting individual facets of this complex process. In order to gain deeper insights into the infection process and to understand the coordination among different regulatory/metabolic pathways in the pathogen, the current in silico study investigates three aspects, namely, (i) host-pathogen interactions (HPIs) between human and M. tuberculosis proteins, (ii) the gene regulatory network pertaining to adaptation of M. tuberculosis to hypoxia and (iii) alterations in M. tuberculosis metabolism under hypoxic conditions. Subsequently, cross-talk between these components has been probed to evaluate possible gene-regulatory events as well as HPIs which are likely to drive metabolic changes during the pathogen's adaptation to the intra-host hypoxic environment. RESULTS: The newly identified HPIs suggest the pathogen's ability to subvert host-mediated reactive oxygen intermediate/reactive nitrogen intermediate (ROI/RNI) stress as well as their potential role in modulating host cell cycle and cytoskeleton structure. The results also indicate a significantly pronounced effect of HPIs on hypoxic metabolism of M. tuberculosis. Findings from the current study underscore the necessity of investigating the infection process from a systems-level perspective incorporating different facets of intra-cellular survival of the pathogen. CONCLUSIONS: The comprehensive host-pathogen interaction network, a Boolean model of M. tuberculosis H37Rv (Mtb) hypoxic gene-regulation, as well as a genome-scale metabolic model of Mtb, built for this study are expected to be useful resources for future studies on tuberculosis infection.
Project description:BACKGROUND: Bacterial invasive infection and the host immune response are fundamental to understanding pathogenesis and to discovering effective therapeutic drugs. However, there are very few experimental studies on the signaling cross-talk between bacteria and the human host to date. METHODS: In this work, taking M. tuberculosis H37Rv (MTB), which is co-evolving with its human host, as an example, we propose a general computational framework that exploits the known bacterial pathogen protein interaction networks in the STRING database to predict pathogen-host protein interactions and their signaling cross-talk. In this framework, significant interologs are derived from the known pathogen protein interaction networks to train a predictive L2-regularized logistic regression model. RESULTS: The computational results show that the proposed method achieves excellent cross-validation performance as well as low predicted positive rates on the less significant interologs and non-interologs, indicating a low risk of false discovery. We further conduct gene ontology (GO) and pathway enrichment analyses of the predicted pathogen-host protein interaction networks, which potentially provide insights into the mechanisms by which M. tuberculosis H37Rv targets human genes and signaling pathways. In addition, we analyse the pathogen-host protein interactions related to drug resistance, inhibition of which potentially provides an alternative solution to M. tuberculosis H37Rv drug resistance. CONCLUSIONS: The proposed machine learning framework has been shown to be effective for predicting bacteria-host protein interactions via known bacterial protein interaction networks. For the vast majority of bacterial pathogens, which lack experimental studies of bacteria-host protein interactions, this framework is expected to be generally applicable. The predicted protein interaction networks between M. tuberculosis H37Rv and Homo sapiens, provided in the Additional files, are expected to find application in two areas: (1) providing an alternative solution to drug resistance; (2) revealing the patterns by which M. tuberculosis H37Rv genes target human immune signaling pathways.
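To make the model class named in this abstract concrete, here is a small pure-Python sketch of L2-regularized logistic regression trained by batch gradient descent. It is illustrative only: the two-dimensional feature vectors, the labels, and the hyperparameters (`lam`, `lr`, `epochs`) are assumptions chosen for the toy example, and the abstract does not say how the authors encode protein pairs or which solver they use.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_l2_logreg(X, y, lam=0.1, lr=0.5, epochs=2000):
    """Minimise mean log-loss + (lam / 2) * ||w||^2 by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The L2 penalty shrinks the weights; the bias is left unpenalised.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that feature vector x belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy data: hypothetical features for protein pairs (1 = interolog-supported).
X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
w, b = train_l2_logreg(X, y)
```

After training, `predict_proba(w, b, [0.9, 0.9])` falls above 0.5 and `predict_proba(w, b, [0.1, 0.1])` below it. A larger `lam` pulls the weights toward zero and the predicted probabilities toward 0.5, which is the usual trade-off when using the penalty to limit overfitting.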
Project description:BACKGROUND: H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homology-based prediction is frequently used in predicting both intra-species and inter-species PPIs. However, some limitations are not properly resolved in several published works that predict eukaryote-prokaryote inter-species PPIs using intra-species template PPIs. RESULTS: We develop a stringent homology-based prediction approach by taking into account (i) differences between eukaryotic and prokaryotic proteins and (ii) differences between inter-species and intra-species PPI interfaces. We compare our stringent homology-based approach to a conventional homology-based approach for predicting host-pathogen PPIs, based on cellular compartment distribution analysis, disease gene list enrichment analysis, pathway enrichment analysis and functional category enrichment analysis. These analyses support the validity of our prediction result, and clearly show that our approach has better performance in predicting H. sapiens-M. tuberculosis H37Rv PPIs. Using our stringent homology-based approach, we have predicted a set of highly plausible H. sapiens-M. tuberculosis H37Rv PPIs which might be useful for many related studies. Based on our analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent homology-based approach, we have discovered several interesting properties which are reported here for the first time. We find that both host proteins and pathogen proteins involved in the host-pathogen PPIs tend to be hubs in their own intra-species PPI network. Both host and pathogen proteins involved in host-pathogen PPIs also tend to have longer primary sequences, more domains, and greater hydrophilicity, among other properties. In addition, the protein domains from both host and pathogen proteins involved in host-pathogen PPIs tend to have lower charge and to be more hydrophilic. CONCLUSIONS: Our stringent homology-based prediction approach provides a better strategy for predicting PPIs between eukaryotic hosts and prokaryotic pathogens than a conventional homology-based approach. The properties we have observed from the predicted H. sapiens-M. tuberculosis H37Rv PPI network are useful for understanding inter-species host-pathogen PPI networks and provide novel insights for host-pathogen interaction studies.
Project description:BACKGROUND: H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for elucidating the infection mechanism of M. tuberculosis H37Rv, but such data are currently very scarce. This seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill the gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited. RESULTS: We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some important properties of domains involved in host-pathogen PPIs. We find that both host and pathogen proteins involved in host-pathogen PPIs tend to have more domains than proteins involved in intra-species PPIs, and these domains have more interaction partners than domains on proteins involved in intra-species PPIs. CONCLUSIONS: The stringent DDI-based prediction approach reported in this work provides a stringent strategy for predicting host-pathogen PPIs. It also performs better than a conventional DDI-based approach in predicting PPIs. We have predicted a small set of accurate H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies.
Project description:Infectious diseases are the leading cause of death, particularly in developing countries. Although many drugs are available for treating the most common infectious diseases, in many cases the mechanism of action of these drugs or even their targets in the pathogen remain unknown. In addition, the key factors or processes in pathogens that facilitate infection and disease progression are often not well understood. Since proteins do not work in isolation, understanding biological systems requires a better understanding of the interconnectivity between proteins in different pathways and processes, which includes both physical and other functional interactions. Such biological networks can be generated within organisms or between organisms sharing a common environment using experimental data and computational predictions. Though different data sources provide different levels of accuracy, confidence in interactions can be measured using interaction scores. Interacting proteins in biological networks can be represented as the nodes of a graph and their connections as its edges, and thus studied using existing algorithms and tools from graph theory. There are many different applications of biological networks, and here we discuss three such applications, specifically applied to the infectious disease tuberculosis, with its causative agent Mycobacterium tuberculosis and host, Homo sapiens. The applications include the use of the networks for function prediction, comparison of networks for evolutionary studies, and the generation and use of host-pathogen interaction networks.
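As a small illustration of the graph representation and score-based confidence described above, the following Python sketch builds an undirected interaction graph from scored pairs and ranks proteins by degree. The function names, the `min_score` threshold, and the toy protein identifiers and scores are all assumptions made for the example. They are not taken from the reviewed work.

```python
from collections import defaultdict


def build_graph(edges, min_score=0.0):
    """Build an undirected adjacency map from (protein_a, protein_b, score) triples.

    Interactions scoring below min_score are dropped, as are self-loops.
    """
    adj = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def hubs(adj, top_k=1):
    """Return the top_k proteins by degree (ties broken alphabetically)."""
    ranked = sorted(adj, key=lambda node: (-len(adj[node]), node))
    return ranked[:top_k]


# Toy scored interactions: identifiers and scores are hypothetical.
edges = [
    ("P1", "P2", 0.9),
    ("P1", "P3", 0.8),
    ("P1", "P4", 0.4),
    ("P2", "P3", 0.7),
]
graph_all = build_graph(edges)
graph_confident = build_graph(edges, min_score=0.5)
```

With no threshold, `P1` has three neighbours and is the top hub. Raising `min_score` to 0.5 removes the low-confidence `P1`-`P4` edge, so `P4` drops out of the graph entirely. The same adjacency map can hold intra-species and host-pathogen edges together if node names are kept distinct per organism.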
Project description:Serotype O157:H7, an enterohemorrhagic Escherichia coli (EHEC), is known to cause gastrointestinal and systemic illnesses ranging from diarrhea and hemorrhagic colitis to potentially fatal hemolytic uremic syndrome. Specific genetic factors like ompA, nsrR, and LEE genes are known to play roles in EHEC pathogenesis. However, these factors are not specific to EHEC and their presence in several non-pathogenic strains indicates that additional factors are involved in pathogenicity. We propose a comprehensive effort to screen for such potential genetic elements, through investigation of biomolecular interactions between E. coli and their host. In this work, an in silico investigation of the protein-protein interactions (PPIs) between human cells and four EHEC strains (viz., EDL933, Sakai, EC4115, and TW14359) was performed in order to understand the virulence and host-colonization strategies of these strains. Potential host-pathogen interactions (HPIs) between human cells and the "non-pathogenic" E. coli strain MG1655 were also probed to evaluate whether and how the variations in the genomes could translate into altered virulence and host-colonization capabilities of the studied bacterial strains. Results indicate that a small subset of HPIs are unique to the studied pathogens and can be implicated in virulence. This subset of interactions involved E. coli proteins like YhdW, ChuT, EivG, and HlyA. These proteins have previously been reported to be involved in bacterial virulence. In addition, clear differences in lineage and clade-specific HPI profiles could be identified. Furthermore, available gene expression profiles of the HPI-proteins were utilized to estimate the proportion of proteins which may be involved in interactions. We hypothesized that a cumulative score of the ratios of bound:unbound proteins (involved in HPIs) would indicate the extent of colonization. 
Thus, we designed the Host Colonization Index (HCI) measure to determine the host colonization potential of the E. coli strains. Pathogenic strains of E. coli were observed to have higher HCIs as compared to a non-pathogenic laboratory strain. However, no significant differences among the HCIs of the two pathogenic groups were observed. Overall, our findings are expected to provide additional insights into EHEC pathogenesis and are likely to aid in designing alternate preventive and therapeutic strategies.
Project description:MPT63 is a small, major secreted protein of unknown function from Mycobacterium tuberculosis that has been shown to have immunogenic properties and has been implicated in virulence. A BLAST search identified that MPT63 has homologs only in other mycobacteria, and is therefore mycobacteria specific. As MPT63 is a secreted protein, mycobacteria specific, and implicated in virulence, MPT63 is an attractive drug target against the deadliest infectious disease, tuberculosis (TB). As part of the TB Structural Genomics Consortium, the X-ray crystal structure of MPT63 was determined to 1.5-Angstrom resolution with the hope of yielding functional information about MPT63. The structure of MPT63 is an antiparallel beta-sandwich immunoglobulin-like fold, with the unusual feature of the first beta-strand of the protein forming a parallel addition to the small antiparallel beta-sheet. MPT63 has weak structural similarity to many proteins with immunoglobulin folds, in particular, Homo sapiens beta2-adaptin, bovine arrestin, and Yersinia pseudotuberculosis invasin. Although the structure of MPT63 gives no conclusive evidence to its function, structural similarity suggests that MPT63 could be involved in cell-host interactions to facilitate endocytosis/phagocytosis.
Project description:Parasites can be used as unique markers to investigate host evolutionary history, independent of host data. Here we show that modern human head lice, Pediculus humanus, are composed of two ancient lineages, whose origin predates modern Homo sapiens by an order of magnitude (ca. 1.18 million years). One of the two louse lineages has a worldwide distribution and appears to have undergone a population bottleneck ca. 100,000 years ago along with its modern H. sapiens host. Phylogenetic and population genetic data suggest that the other lineage, found only in the New World, has remained isolated from the worldwide lineage for the last 1.18 million years. The ancient divergence between these two lice is contemporaneous with splits among early species of Homo, and cospeciation analyses suggest that the two louse lineages codiverged with a now extinct species of Homo and the lineage leading to modern H. sapiens. If these lice indeed codiverged with their hosts ca. 1.18 million years ago, then a recent host switch from an archaic species of Homo to modern H. sapiens is required to explain the occurrence of both lineages on modern H. sapiens. Such a host switch would require direct physical contact between modern and archaic forms of Homo.
Project description:Microbe-induced cardiovascular diseases (CVDs) are currently understudied. Host-pathogen interactions (HPIs) between human proteins and microbial proteins associated with CVD can be found dispersed in existing molecular interaction databases. The MorCVD database is a curated resource that combines 23,377 protein interactions between the human host and 432 unique pathogens involved in CVDs in a single intuitive web application. It covers endocarditis, myocarditis, pericarditis and 16 other microbe-induced CVDs. The HPI information has been compiled, curated, and presented in a freely accessible web interface (http://morcvd.sblab-nsit.net/About). In addition to organizing the HPI data, the database enriches it with hyperlinked protein ID, PubMed, and Gene Ontology records. For each protein in the database, drug target and interactor information (from the same as well as different species) is provided. The database can be searched by disease, protein ID, pathogen name or interaction detection method. Interactions detected by more than one method can also be listed. The information can be presented in tabular form or downloaded. A comprehensive help file has been developed to explain the various options available. Hence, MorCVD acts as a unified resource for retrieval of HPI data for researchers in CVD and microbiology.
Project description:BACKGROUND: In 2017, the World Health Organization (WHO) published a catalogue of 12 families of antibiotic-resistant "priority pathogens" that pose the greatest threats to human health. Six of these dreaded pathogens are known to infect the human gastrointestinal system. In addition to causing gastrointestinal and systemic infections, these pathogens can also affect the composition of other microbes constituting the healthy gut microbiome. Such aberrations in the gut microbiome can significantly affect human physiology and immunity. Identifying the virulence mechanisms of these enteric pathogens is likely to help in developing newer therapeutic strategies to counter them. RESULTS: Using our previously published in silico approach, we have evaluated (and compared) Host-Pathogen Protein-Protein Interaction (HPI) profiles of four groups of enteric pathogens, namely, different species of Escherichia, Shigella, Salmonella and Vibrio. Results indicate that in spite of genus/species-specific variations, most enteric pathogens possess a common repertoire of HPIs. This core set of HPIs is probably responsible for the survival of these pathogens in the harsh nutrient-limiting environment within the gut. Certain genus/species-specific HPIs were also observed. CONCLUSIONS: The identified bacterial proteins involved in the core set of HPIs are expected to be helpful in understanding the pathogenesis of these dreaded gut pathogens in greater detail. The possible role of genus/species-specific variations in the HPI profiles in the virulence of these pathogens is also discussed. The obtained results are likely to provide an opportunity for development of novel therapeutic strategies against the most dreaded gut pathogens.
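The distinction drawn above between a core set of HPIs and genus/species-specific HPIs can be expressed with plain set operations. The Python sketch below is one way to do it, under the assumption that pathogen proteins from different genera have already been mapped to shared identifiers (here, made-up family labels) so that pairs are comparable across groups. The function name and all toy data are hypothetical.

```python
def core_and_specific(hpi_profiles):
    """Split HPI profiles into a shared core and per-group specific sets.

    hpi_profiles: dict mapping a group name (for example a genus) to a set
    of (pathogen_protein_family, host_protein) pairs.
    Returns (core, specific), where core holds pairs present in every group
    and specific[g] holds pairs found in group g and in no other group.
    """
    groups = list(hpi_profiles)
    if not groups:
        return set(), {}
    core = set.intersection(*(set(hpi_profiles[g]) for g in groups))
    specific = {}
    for g in groups:
        # Union of every other group's pairs (empty when g is the only group).
        others = set().union(*(hpi_profiles[o] for o in groups if o != g))
        specific[g] = set(hpi_profiles[g]) - others
    return core, specific


# Toy profiles: family labels and host proteins are hypothetical.
profiles = {
    "Escherichia": {("famA", "H1"), ("famB", "H2")},
    "Shigella": {("famA", "H1"), ("famC", "H3")},
    "Vibrio": {("famA", "H1")},
}
core, specific = core_and_specific(profiles)
```

Here the pair `("famA", "H1")` is shared by all three groups and forms the core, while `Escherichia` and `Shigella` each keep one specific pair and `Vibrio` has none. Pairs shared by some but not all groups land in neither set, which may or may not be what a given analysis wants.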