Dataset Information

CSM-Toxin: A Web-Server for Predicting Protein Toxicity.


ABSTRACT: Biologics are one of the most rapidly expanding classes of therapeutics, but can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity led to a significant reduction in clinical trial failures; however, we currently lack robust qualitative rules or predictive tools for peptide- and protein-based biologics. To address this, we have manually curated the largest set of high-quality experimental data on peptide and protein toxicities, and developed CSM-Toxin, a novel in silico protein toxicity classifier that relies solely on the protein primary sequence. Our approach encodes the protein sequence information using a deep learning natural language model to understand "biological" language, where residues are treated as words and protein sequences as sentences. CSM-Toxin was able to accurately identify peptides and proteins with potential toxicity, achieving an MCC of up to 0.66 across both cross-validation and multiple non-redundant blind tests, outperforming other methods and highlighting the robust and generalisable performance of our model. We strongly believe that CSM-Toxin will serve as a valuable platform to minimise potential toxicity in the biologic development pipeline. Our method is freely available as an easy-to-use webserver.

SUBMITTER: Morozov V 

PROVIDER: S-EPMC9966851 | biostudies-literature | 2023 Jan

REPOSITORIES: biostudies-literature

Publications

CSM-Toxin: A Web-Server for Predicting Protein Toxicity.

Morozov Vladimir, Rodrigues Carlos H M, Ascher David B

Pharmaceutics, 2023-01-28, issue 2



Similar Datasets

| S-EPMC4987933 | biostudies-literature
| S-EPMC7285591 | biostudies-literature
| S-EPMC9252832 | biostudies-literature
| S-EPMC1538894 | biostudies-literature
| S-EPMC4574079 | biostudies-literature
| S-EPMC3146400 | biostudies-literature
| S-EPMC2636624 | biostudies-literature
| S-EPMC4370743 | biostudies-literature
| S-EPMC2803855 | biostudies-other
| S-EPMC3597141 | biostudies-literature