ABSTRACT:
Motivation: The urgent need for efficient and sustainable biological production of fuels and high-value chemicals has elicited a wave of in silico techniques for identifying promising novel pathways to these compounds in large putative metabolic networks. To date, these approaches have primarily used general graph search algorithms, which are prohibitively slow as putative metabolic networks may exceed 1 million compounds. To alleviate this limitation, we report two methods--SimIndex (SI) and SimZyme--which use chemical similarity of 2D chemical fingerprints to efficiently navigate large metabolic networks and propose enzymatic connections between the constituent nodes. We also report a Byers-Waterman type pathway search algorithm for further paring down pertinent networks.
Results: Benchmarking tests run with SI show it can reduce the number of nodes visited in searching a putative network by 100-fold with a computational time improvement of up to 10^5-fold. Subsequent Byers-Waterman search application further reduces the number of nodes searched by up to 100-fold, while SimZyme demonstrates ~90% accuracy in matching query substrates with enzymes. Using these modules, we have designed and annotated an alternative to the methylerythritol phosphate pathway to produce isopentenyl pyrophosphate with more favorable thermodynamics than the native pathway. These algorithms will have a significant impact on our ability to use large metabolic networks that lack annotation of promiscuous reactions.
Availability and implementation: Python files will be available for download at http://tyolab.northwestern.edu/tools/.
Supplementary information: Supplementary data are available at Bioinformatics online.
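The abstract describes using fingerprint similarity to steer a search through a large metabolic network. The following is a minimal sketch of that general idea, not the authors' SimIndex implementation. It assumes fingerprints are sets of on-bit indices and scores them with the Tanimoto coefficient, a common choice that the abstract does not itself name. It then runs a greedy best-first search that always expands the compound most similar to the target. The function names, the toy network, and the toy fingerprints are all hypothetical.

```python
import heapq


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_guided_search(network, fingerprints, source, target):
    """Greedy best-first search: expand the compound most similar to the target.

    Returns (path, visited_count); path is None if the target is unreachable.
    """
    target_fp = fingerprints[target]
    # Heap entries are (negative similarity, compound, path so far), so the
    # most similar compound is popped first.
    frontier = [(-tanimoto(fingerprints[source], target_fp), source, [source])]
    visited = set()
    while frontier:
        _, compound, path = heapq.heappop(frontier)
        if compound in visited:
            continue
        visited.add(compound)
        if compound == target:
            return path, len(visited)
        for neighbor in network.get(compound, ()):
            if neighbor not in visited:
                score = tanimoto(fingerprints[neighbor], target_fp)
                heapq.heappush(frontier, (-score, neighbor, path + [neighbor]))
    return None, len(visited)


# Hypothetical toy data: compound -> fingerprint bits, compound -> products.
toy_fingerprints = {
    "A": {1, 2},
    "B": {1, 2, 3},
    "C": {7, 8},
    "D": {1, 2, 3, 4},
    "E": {9},
    "T": {1, 2, 3, 4, 5},
}
toy_network = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["T"], "E": []}

path, visited_count = similarity_guided_search(
    toy_network, toy_fingerprints, "A", "T"
)
```

In this toy case the search never expands the dissimilar branch through C and E, which is the kind of pruning a similarity heuristic is meant to provide. A greedy heuristic like this one does not guarantee the shortest pathway.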
SUBMITTER: Pertusi DA
PROVIDER: S-EPMC4382900 | biostudies-literature | 2015 Apr
REPOSITORIES: biostudies-literature
Pertusi Dante A, Stine Andrew E, Broadbelt Linda J, Tyo Keith E J
Bioinformatics (Oxford, England), 2014-11-21, issue 7