Project description:Motivation: Conformational B-cell epitopes are the specific sites on antigens that have immune functions. Identifying conformational B-cell epitopes is of great importance to immunologists because it facilitates the design of peptide-based vaccines. To narrow the search for experimental validation, various computational models have been developed that predict epitopes from antigen structures. However, the application of these models is undermined by the limited number of available antigen structures. In contrast to most available structure-based methods, we attempt here to accurately predict conformational B-cell epitopes from antigen sequences. Methods: In this paper, we explore various sequence-derived features that have been observed to be associated with the location of epitopes or have previously been used in similar tasks. These features are evaluated and ranked by their discriminative performance on the benchmark datasets. From the perspective of information science, combining various features usually leads to better results than using individual features. To build a robust model, we adopt an ensemble learning approach to incorporate the various features, and develop an ensemble model to predict conformational epitopes from antigen sequences. Results: Evaluated by leave-one-out cross-validation, the proposed method achieves mean AUC scores of 0.687 and 0.651 on two datasets compiled from bound structures and unbound structures, respectively. When compared with publicly available servers on an independent dataset, our method yields better or comparable performance. The results demonstrate that the proposed method is useful for sequence-based conformational epitope prediction. Availability: The web server and datasets are freely available at http://bcell.whu.edu.cn.
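As a reading aid for the AUC scores quoted in the abstract above, the following is a minimal sketch of how an AUC can be computed from per-residue labels and prediction scores, and then averaged. The abstract does not say how the mean is aggregated; averaging one AUC per held-out antigen is an assumption here, and the function names `auc` and `mean_auc` are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(per_antigen):
    """Average the AUC over (labels, scores) pairs, one pair per antigen.

    Assumption: each held-out antigen contributes one AUC to the mean.
    """
    return sum(auc(labels, scores) for labels, scores in per_antigen) / len(per_antigen)
```

An AUC of 0.5 corresponds to random ranking of epitope and non-epitope residues, and 1.0 to perfect ranking, which is the scale on which the reported values should be read.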
Project description:Immune responses can make protein therapeutics ineffective or even dangerous. We describe a general computational protein design method for reducing immunogenicity by eliminating known and predicted T-cell epitopes and maximizing the content of human peptide sequences without disrupting protein structure and function. We show that the method recapitulates previous experimental results on immunogenicity reduction, and we use it to disrupt T-cell epitopes in GFP and Pseudomonas exotoxin A without disrupting function.
Project description:Background: The incomplete ground truth of B-cell epitope training data is a demanding issue in computational epitope prediction. The challenge is that only a small fraction of the surface residues of an antigen are confirmed as antigenic residues (positive training data); the remaining residues are unlabeled. Because some of these uncertain residues may group together to form novel but currently unknown epitopes, it is misguided to uniformly classify all unlabeled residues as negative training data, as the traditional supervised learning scheme does. Results: We propose a positive-unlabeled learning algorithm to address this problem. The key idea is to distinguish between epitope-likely residues and reliable negative residues in the unlabeled data. The method has two steps: (1) identify reliable negative residues using a weighted SVM with a high recall; and (2) construct a classification model on the positive residues and the reliable negative residues. Complex-based 10-fold cross-validation shows that this method outperforms the commonly used predictors DiscoTope 2.0, ElliPro and SEPPA 2.0 in every aspect. We conducted four case studies, in which the approach was tested on antigens of West Nile virus, dihydrofolate reductase, beta-lactamase, and two Ebola antigens whose epitopes are currently unknown. All results were assessed on a newly established data set of antigen structures not bound by antibodies, rather than on antibody-bound antigen structures. Bound structures may contain binding information, such as bound-state B-factors and protrusion index, that could unfairly exaggerate epitope prediction performance. Source code is available on request.
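The two-step scheme described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a nearest-centroid scorer stands in for the weighted SVM, the `keep_fraction` parameter is a hypothetical stand-in for the high-recall setting, and all function names are invented for this example.

```python
from statistics import mean


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reliable_negatives(positives, unlabeled, keep_fraction=0.5):
    """Step 1: keep only the unlabeled points least similar to the positives.

    Assumption: distance to the positive centroid replaces the weighted
    SVM score used in the paper.
    """
    c = centroid(positives)
    ranked = sorted(unlabeled, key=lambda r: sq_dist(r, c), reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


def fit_two_step(positives, unlabeled, keep_fraction=0.5):
    """Step 2: train on positives versus reliable negatives only."""
    negatives = reliable_negatives(positives, unlabeled, keep_fraction)
    pos_c, neg_c = centroid(positives), centroid(negatives)

    def predict(row):
        # 1 = epitope-likely, 0 = negative
        return 1 if sq_dist(row, pos_c) < sq_dist(row, neg_c) else 0

    return predict
```

The point of the design is that unlabeled residues resembling known epitope residues are left out of the negative class instead of being forced into it, so a currently unknown epitope does not penalize the final classifier.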
Project description:Directed cell conversion (or transdifferentiation) of one somatic cell-type to another can be achieved by ectopic expression of a set of transcription factors. Since the experimental identification of transcription factors for transdifferentiation is extremely time-consuming and expensive, relatively few transdifferentiations have been achieved in comparison to the number of human cell-types. However, the growing volume of available transcriptional data and the recent introduction of data-driven algorithmic approaches that predict factors for transdifferentiation hold great promise for accelerating this field. Here we review those computational methods whose in-silico predictions have been experimentally validated, highlighting differences and similarities. Our analysis reveals that the factors predicted by each method tend to differ because of variation in the source cells used, in gene expression quantification and in algorithmic steps. We show that these differences have an impact on the regulatory influences downstream, with some methods favoring transcription factors regulating developmental progression and others favoring factors regulating mature cell processes. These computational approaches offer a starting point to predict and test novel factors for transdifferentiation. We argue that collecting high-quality gene expression data from single cells or pure cell populations across a broader set of cell-types is necessary to improve the quality and consistency of the in-silico predictions.
Project description:Antibodies have become an indispensable tool for many biotechnological and clinical applications. They bind their molecular target (antigen) by recognizing a portion of its structure (epitope) in a highly specific manner. The ability to predict epitopes from antigen sequences alone is a complex task. Despite substantial effort, limited advancement has been achieved over the last decade in the accuracy of epitope prediction methods, especially for those that rely on the sequence of the antigen only. Here, we present BepiPred-2.0 (http://www.cbs.dtu.dk/services/BepiPred/), a web server for predicting B-cell epitopes from antigen sequences. BepiPred-2.0 is based on a random forest algorithm trained on epitopes annotated from antibody-antigen protein structures. This new method was found to outperform other available tools for sequence-based epitope prediction both on epitope data derived from solved 3D structures, and on a large collection of linear epitopes downloaded from the IEDB database. The method displays results in a user-friendly and informative way, both for computer-savvy and non-expert users. We believe that BepiPred-2.0 will be a valuable tool for the bioinformatics and immunology community.
Project description:Background: The broad heterogeneity of antigen-antibody interactions brings tremendous challenges to the design of a widely applicable learning algorithm to identify conformational B-cell epitopes. Besides the intrinsic heterogeneity introduced by diverse species, extra heterogeneity can also be introduced by various data sources, adding another layer of complexity and further confounding the research. Results: This work proposes a staged heterogeneity learning method, which learns both the characteristics and the heterogeneity of data in a phased manner. The method was applied to identify antigenic residues of heterogeneous conformational B-cell epitopes based on antigen sequences. In the first stage, the model learns the general epitope patterns of each kind of propensity from a large data set containing computationally defined epitopes. In the second stage, the model learns the heterogeneous complementarity of these propensities from a relatively small guided data set containing experimentally determined epitopes. Moreover, we designed an algorithm to cluster the predicted individual antigenic residues into conformational B-cell epitopes so as to provide strong potential for real-world applications, such as vaccine development. With heterogeneity well learnt, the transferability of the prediction model was remarkably improved to handle new data with a high level of heterogeneity. The model has been tested on two data sets with experimentally determined epitopes, and on a data set with computationally defined epitopes. The proposed sequence-based method achieved outstanding performance - about twice that of existing methods, including the sequence-based predictor CBTOPE and three other structure-based predictors. Conclusions: The proposed method uses only antigen sequence information, and thus has much broader applications.
Project description:Motivation: A B-cell epitope is a small area on the surface of an antigen that binds to an antibody. Accurately locating epitopes is of critical importance for vaccine development. Compared with wet-lab methods, computational methods have strong potential for efficient and large-scale epitope prediction for antigen candidates at much lower cost. However, it is still not clear which features are good determinants for accurate epitope prediction, which leads to the unsatisfactory performance of existing prediction methods. Method and results: We propose a much more accurate B-cell epitope prediction method. Our method uses a new feature, the B factor (obtained from X-ray crystallography), combined with other basic physicochemical, statistical, evolutionary and structural features of each residue. These basic features are extended by a sequence window and a structure window. All these features are then learned by a two-stage random forest model to identify clusters of antigenic residues and to remove isolated outliers. Tested on a dataset of 55 epitopes from 45 tertiary structures, our method significantly outperforms all three existing structure-based epitope predictors. A comprehensive analysis finds that features such as B factor, relative accessible surface area and protrusion index play an important role in characterizing B-cell epitopes. Our detailed case studies on an HIV antigen and an influenza antigen confirm that the second-stage learning is effective for clustering true antigenic residues and for eliminating prediction errors introduced by the first-stage learning. Availability and implementation: Source code is available on request.
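To illustrate what extending a per-residue feature by a sequence window can look like, here is a minimal sketch. The abstract above does not specify how values inside the window are combined; taking the mean over the residue and its sequence neighbours is an assumption, and `window_features` and `half_width` are hypothetical names.

```python
def window_features(values, half_width):
    """Smooth a per-residue feature over a sliding sequence window.

    For residue i, average the feature over residues i - half_width to
    i + half_width, truncating the window at the ends of the sequence.
    Assumption: the mean is used to aggregate the window.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

A structure window would follow the same pattern with neighbours chosen by spatial distance in the 3D structure instead of by sequence position.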
Project description:Background: The nature of epitopes on Bet v 1 recognized by natural IgG antibodies of birch pollen allergic patients and birch pollen-exposed but non-sensitized subjects has not been studied in detail. Objective: To investigate IgE and IgG recognition of Bet v 1 and to study the effects of natural Bet v 1-specific IgG antibodies on IgE recognition of Bet v 1 and Bet v 1-induced basophil activation. Methods: Sera from birch pollen allergic patients (BPA, n = 76), allergic patients without birch pollen allergy (NBPA, n = 40) and non-allergic individuals (NA, n = 48) were tested by ELISA or micro-array analysis for IgE, IgG, IgG1 and IgG4 reactivity to folded recombinant Bet v 1, to two unfolded recombinant Bet v 1 fragments comprising the N-terminal half (F1) and the C-terminal half (F2) of Bet v 1, and to unfolded peptides spanning the corresponding sequences of Bet v 1 and the apple allergen Mal d 1. The ability of Bet v 1-specific serum antibodies from non-allergic subjects to inhibit allergic patients' IgE or IgG binding to rBet v 1 or to unfolded Bet v 1 derivatives was assessed by competition ELISAs. Furthermore, the ability of serum antibodies from allergic and non-allergic subjects to modulate Bet v 1-induced basophil activation was investigated using rat basophilic leukaemia cells expressing the human FcεRI, which had been loaded with IgE from BPA patients. Results: IgE antibodies from BPA patients react almost exclusively with conformational epitopes, whereas IgG, IgG1 and IgG4 antibodies from BPA, NBPA and NA subjects recognize mainly unfolded and sequential epitopes. IgG competition studies show that IgG specific for unfolded/sequential Bet v 1 epitopes is not inhibited by folded Bet v 1, and hence these epitopes seem to represent cryptic epitopes. IgG reactivity to Bet v 1 peptides did not correlate with IgG reactivity to the corresponding Mal d 1 peptides and therefore does not seem to be a result of primary sensitization to PR10 allergen-containing food.
Natural Bet v 1-specific IgG antibodies inhibited IgE binding to Bet v 1 only poorly and could even enhance Bet v 1-specific basophil activation. Conclusion: IgE and IgG antibodies from BPA patients and birch pollen-exposed non-sensitized subjects recognize different epitopes. These findings explain why natural allergen-specific IgG antibodies do not protect against allergic symptoms and suggest that allergen-specific IgE and IgG have different clonal origins.