
Dataset Information


Molecular function recognition by supervised projection pursuit machine learning.


ABSTRACT: Identifying mechanisms that control molecular function is a significant challenge in pharmaceutical science and molecular engineering. Here, we present a novel projection pursuit recurrent neural network to identify functional mechanisms in the context of iterative supervised machine learning for discovery-based design optimization. Molecular function recognition is achieved by pairing experiments that categorize systems with digital twin molecular dynamics simulations to generate working hypotheses. Feature extraction decomposes emergent properties of a system into a complete set of basis vectors. Feature selection requires signal-to-noise, statistical significance, and clustering quality to concurrently surpass acceptance levels. Formulated as a multivariate description of differences and similarities between systems, the data-driven working hypothesis is refined by analyzing new systems prioritized by a discovery-likelihood. Utility and generality are demonstrated on several benchmarks, including the elucidation of antibiotic resistance in TEM-52 beta-lactamase. The software is freely available, enabling turnkey analysis of massive data streams found in computational biology and material science.

SUBMITTER: Grear T 

PROVIDER: S-EPMC7895977 | biostudies-literature | 2021 Feb

REPOSITORIES: biostudies-literature


Publications

Molecular function recognition by supervised projection pursuit machine learning.

Tyler Grear, Chris Avery, John Patterson, Donald J. Jacobs

Scientific Reports, 2021-02-19, issue 1



Similar Datasets

| S-EPMC8725656 | biostudies-literature
| S-EPMC2459243 | biostudies-literature
| S-EPMC6620704 | biostudies-literature
| S-EPMC9200117 | biostudies-literature
2021-06-02 | GSE175942 | GEO
| S-EPMC6140545 | biostudies-literature
| S-EPMC8589823 | biostudies-literature