
Dataset Information


MulinforCPI: enhancing precision of compound-protein interaction prediction through novel perspectives on multi-level information integration.


ABSTRACT: Forecasting the interaction between compounds and proteins is crucial for discovering new drugs. However, previous sequence-based studies have not utilized three-dimensional (3D) information on compounds and proteins, such as atom coordinates and distance matrices, to predict binding affinity. Furthermore, numerous widely adopted computational techniques have relied on sequences of amino acid characters for protein representations. This approach may constrain the model's ability to capture meaningful biochemical features, impeding a more comprehensive understanding of the underlying proteins. Here, we propose a two-step deep learning strategy named MulinforCPI that incorporates transfer learning techniques with multi-level resolution features to overcome these limitations. Our approach leverages 3D information from both proteins and compounds and learns the atomic-level features of proteins. In addition, our research highlights the divide between first-principles and data-driven methods, offering new research prospects for compound-protein interaction tasks. To evaluate the effectiveness of our approach, we applied the proposed method to six datasets: Davis, Metz, KIBA, CASF-2016, DUD-E and BindingDB.

SUBMITTER: Nguyen NQ 

PROVIDER: S-EPMC10768804 | biostudies-literature | 2023 Nov

REPOSITORIES: biostudies-literature


Publications

MulinforCPI: enhancing precision of compound-protein interaction prediction through novel perspectives on multi-level information integration.

Nguyen Ngoc-Quang, Park Sejeong, Gim Mogan, Kang Jaewoo

Briefings in Bioinformatics, 2023-11-01, issue 1



Similar Datasets

S-EPMC10933411 | biostudies-literature
S-EPMC11190375 | biostudies-literature
S-EPMC8443569 | biostudies-literature
S-EPMC4004462 | biostudies-literature
S-EPMC5603535 | biostudies-literature
S-EPMC6993057 | biostudies-literature
S-EPMC11361855 | biostudies-literature
S-EPMC8696111 | biostudies-literature
S-EPMC10423023 | biostudies-literature
S-EPMC11014792 | biostudies-literature