Identifying the preferred subset of enzymatic profiles in nonlinear kinetic metabolic models via multiobjective global optimization and Pareto filters.
ABSTRACT: Optimization models in metabolic engineering and systems biology typically focus on optimizing a single criterion, usually the synthesis rate of a metabolite of interest or the rate of growth. Connectivity and non-linear regulatory effects, however, make it necessary to consider multiple objectives in order to identify useful strategies that balance out different metabolic issues. This is a fundamental aspect, as optimization of maximum yield in a given condition may involve unrealistic values in other key processes. Due to the difficulties associated with detailed non-linear models, analyses using stoichiometric descriptions and linear optimization methods have become rather popular in systems biology. However, despite being useful, these approaches fail to capture the intrinsic nonlinear nature of the underlying metabolic systems and the regulatory signals involved. Targeting more complex biological systems requires the application of global optimization methods to non-linear representations. In this work we address the multi-objective global optimization of metabolic networks that are described by a special class of models based on the power-law formalism: the generalized mass action (GMA) representation. Our goal is to develop global optimization methods capable of efficiently dealing with several biological criteria simultaneously. In order to overcome the numerical difficulties of dealing with multiple criteria in the optimization, we propose a heuristic approach based on the epsilon constraint method that reduces the computational burden of generating a set of Pareto optimal alternatives, each achieving a unique combination of objective values. To facilitate the post-optimal analysis of these solutions and narrow down their number prior to being tested in the laboratory, we explore the use of Pareto filters that identify the preferred subset of enzymatic profiles.
We demonstrate the usefulness of our approach by means of a case study that optimizes the ethanol production in the fermentation of Saccharomyces cerevisiae.
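As a rough illustration of the epsilon constraint idea named in the abstract, the sketch below keeps one objective as the optimization target and turns the other into a lower bound that is tightened step by step. It is not the authors' heuristic or a GMA model: the objectives `f1` and `f2`, the candidate grid, and the epsilon levels are all toy assumptions, and a plain search over a finite candidate set stands in for a global optimizer.

```python
def f1(x):
    """Primary objective to maximize (toy stand-in for a production rate)."""
    return x


def f2(x):
    """Competing objective to maximize (toy stand-in for a second criterion)."""
    return 1.0 - x * x


def epsilon_constraint(candidates, eps_values):
    """For each eps, maximize f1 subject to f2 >= eps over a finite candidate set."""
    solutions = []
    for eps in eps_values:
        feasible = [x for x in candidates if f2(x) >= eps]
        if not feasible:
            continue  # this epsilon level cannot be satisfied
        best = max(feasible, key=f1)
        point = (f1(best), f2(best))
        if point not in solutions:  # different eps levels may select the same point
            solutions.append(point)
    return solutions


# Hypothetical discretization of the decision variable and of the epsilon levels.
candidates = [i / 100 for i in range(101)]
eps_values = [i / 10 for i in range(11)]
front = epsilon_constraint(candidates, eps_values)

for point in front:
    print(point)
```

Each epsilon level costs one single-objective solve, so the number of levels sets the resolution of the resulting trade-off curve.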
Project description:Comprehensive learning particle swarm optimization (CLPSO) is a powerful state-of-the-art single-objective metaheuristic. Extending CLPSO, this paper proposes multiswarm CLPSO (MSCLPSO) for multiobjective optimization. MSCLPSO involves multiple swarms, with each swarm associated with a separate original objective. Each particle's personal best position is determined solely by the corresponding single objective. Elitists are stored externally. MSCLPSO differs from existing multiobjective particle swarm optimizers in three aspects. First, each swarm focuses on optimizing the associated objective using CLPSO, without learning from the elitists or any other swarm. Second, mutation is applied to the elitists and the mutation strategy appropriately exploits the personal best positions and elitists. Third, a modified differential evolution (DE) strategy is applied to some extreme and least crowded elitists. The DE strategy updates an elitist based on the differences of the elitists. The personal best positions carry useful information about the Pareto set, and the mutation and DE strategies help MSCLPSO discover the true Pareto front. Experiments conducted on various benchmark problems demonstrate that MSCLPSO can find nondominated solutions distributed reasonably over the true Pareto front in a single run.
Project description:The current state of the art for linear optimization in Flux Balance Analysis has been limited to single objective functions. Since mammalian systems perform various functions, a multiobjective approach is needed when seeking optimal flux distributions in these systems. In most of the available multiobjective optimization methods, there is a lack of understanding of when to use a particular objective, and how to combine and/or prioritize mutually competing objectives to achieve a truly optimal solution. To address these limitations we developed a soft-constraint, linear physical programming-based flux balance analysis (LPPFBA) framework to obtain multiobjective optimal solutions. The developed framework was first applied to compute a set of multiobjective optimal solutions for various pairs of objectives relevant to hepatocyte function (urea secretion, albumin, NADPH, and glutathione syntheses) in bioartificial liver systems. Next, simultaneous analysis of the optimal solutions for three objectives was carried out. Further, this framework was utilized to obtain true optimal conditions to improve the hepatic functions in a simulated bioartificial liver system. The combined quantitative and visualization framework of LPPFBA is applicable to any large-scale metabolic network system, including those derived by genomic analyses.
Project description:The current inverse planning methods for intensity modulated radiation therapy (IMRT) are limited because they are not designed to explore the trade-offs between the competing objectives of tumor and normal tissues. The goal was to develop an efficient multiobjective optimization algorithm that was flexible enough to handle any form of objective function and that resulted in a set of Pareto optimal plans. A hierarchical evolutionary multiobjective algorithm designed to quickly generate a small diverse Pareto optimal set of IMRT plans that meet all clinical constraints and reflect the optimal trade-offs in any radiation therapy plan was developed. The top level of the hierarchical algorithm is a multiobjective evolutionary algorithm (MOEA). The genes of the individuals generated in the MOEA are the parameters that define the penalty function minimized during an accelerated deterministic IMRT optimization that represents the bottom level of the hierarchy. The MOEA incorporates clinical criteria to restrict the search space through protocol objectives and then uses Pareto optimality among the fitness objectives to select individuals. The population size is not fixed, but a specialized niche effect, domination advantage, is used to control the population and plan diversity. The number of fitness objectives is kept to a minimum for greater selective pressure, but the number of genes is expanded for flexibility that allows a better approximation of the Pareto front. The MOEA improvements were evaluated for two example prostate cases with one target and two organs at risk (OARs). The population of plans generated by the modified MOEA was closer to the Pareto front than populations of plans generated using a standard genetic algorithm package. Statistical significance of the method was established by compiling the results of 25 multiobjective optimizations using each method.
From these sets of 12-15 plans, any random plan selected from a MOEA population had an 11.3% +/- 0.7% chance of dominating any random plan selected by a standard genetic algorithm package, with a 0.04% +/- 0.02% chance of domination in reverse. By implementing domination advantage and protocol objectives, small and diverse populations of clinically acceptable plans that approximated the Pareto front could be generated in a fraction of 1 h. Acceleration techniques implemented on both levels of the hierarchical algorithm resulted in short, practical runtimes for multiobjective optimizations. The MOEA produces a diverse Pareto optimal set of plans that meet all dosimetric protocol criteria in a feasible amount of time. The final goal is to improve practical aspects of the algorithm and integrate it with a decision analysis tool or human interface for selection of the IMRT plan with the best possible balance of successful treatment of the target with low OAR dose and low risk of complication for any specific patient situation.
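The pairwise domination chance quoted above can be computed as the fraction of cross-population plan pairs in which one plan dominates the other. The following is a minimal sketch under assumptions: every objective is minimized, and the plan scores in `plans_a` and `plans_b` are invented for illustration, not taken from the study.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def domination_rate(pop_a, pop_b):
    """Fraction of (a, b) pairs, a from pop_a and b from pop_b, in which a dominates b."""
    pairs = [(a, b) for a in pop_a for b in pop_b]
    return sum(dominates(a, b) for a, b in pairs) / len(pairs)


# Hypothetical plans scored as (target dose variance, mean OAR dose).
plans_a = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
plans_b = [(2.0, 5.0), (3.0, 3.0), (5.0, 0.5)]

rate_ab = domination_rate(plans_a, plans_b)  # chance a plan from A dominates one from B
rate_ba = domination_rate(plans_b, plans_a)  # chance of domination in reverse
print(rate_ab, rate_ba)
```

The two rates need not sum to one, because most pairs are typically mutually nondominated.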
Project description:The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input-output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria.
Project description:To investigate how using different sets of decision criteria impacts the quality of intensity modulated radiation therapy (IMRT) plans obtained by multiobjective optimization. A multiobjective optimization evolutionary algorithm (MOEA) was used to produce sets of IMRT plans. The MOEA consisted of two interacting algorithms: (i) a deterministic inverse planning optimization of beamlet intensities that minimizes a weighted sum of quadratic penalty objectives to generate IMRT plans and (ii) an evolutionary algorithm that selects the superior IMRT plans using decision criteria and uses those plans to determine the new weights and penalty objectives of each new plan. Plans resulting from the deterministic algorithm were evaluated by the evolutionary algorithm using a set of decision criteria for both targets and organs at risk (OARs). Decision criteria used included variation in the target dose distribution, mean dose, maximum dose, generalized equivalent uniform dose (gEUD), an equivalent uniform dose (EUD(alpha,beta)) formula derived from the linear-quadratic survival model, and points on dose volume histograms (DVHs). In order to quantitatively compare results from trials using different decision criteria, a neutral set of comparison metrics was used. For each set of decision criteria investigated, IMRT plans were calculated for four different cases: two simple prostate cases, one complex prostate case, and one complex head and neck case. When smaller numbers of decision criteria, more descriptive decision criteria, or less anti-correlated decision criteria were used to characterize plan quality during multiobjective optimization, dose to OARs and target dose variation were reduced in the final population of plans. Mean OAR dose and gEUD (a = 4) decision criteria were comparable. Using maximum dose decision criteria for OARs near targets resulted in inferior populations that focused solely on low target variance at the expense of high OAR dose.
Target dose range (D(max) - D(min)) decision criteria were found to be most effective for keeping targets uniform. Using target gEUD decision criteria resulted in much lower OAR doses but much higher target dose variation. EUD(alpha,beta) based decision criteria focused on a region of plan space that was a compromise between target and OAR objectives. None of these target decision criteria produced plans that dominated plans using other criteria; each only focused on approaching a different area of the Pareto front. The choice of decision criteria implemented in the MOEA had a significant impact on the region explored and the rate of convergence toward the Pareto front. When more decision criteria, anticorrelated decision criteria, or decision criteria with insufficient information were implemented, inferior populations resulted. When more informative decision criteria were used, such as gEUD, EUD(alpha,beta), target dose range, and mean dose, MOEA optimizations focused on approaching different regions of the Pareto front, but did not dominate each other. Using simple OAR decision criteria and target EUD(alpha,beta) decision criteria demonstrated the potential to generate IMRT plans that significantly reduce dose to OARs while achieving the same or better tumor control when clinical requirements on target dose variance can be met or relaxed.
Project description:Parameter optimization of a hydrological model is intrinsically a high dimensional, nonlinear, multivariable, combinatorial optimization problem which involves a set of different objectives. Currently, the assessment of optimization results for the hydrological model is usually made through calculations and comparisons of objective function values of simulated and observed variables. Thus, the proper selection of the combination of objective functions for model parameter optimization has an important impact on the hydrological forecasting. There exist various objective functions, and how to analyze and evaluate the objective function combinations for selecting the optimal parameters has not been studied in depth. Therefore, to select the proper objective function combination which can balance the trade-off among various design objectives and achieve the overall best benefit, a simple and convenient framework for the comparison of the influence of different objective function combinations on the optimization results is urgently needed. In this paper, various objective functions related to parameter optimization of hydrological models were collected from the literature and assembled into nine combinations. Then, a selection and evaluation framework of objective functions is proposed for hydrological model parameter optimization, in which a multiobjective artificial bee colony algorithm named RMOABC is employed to optimize the hydrological model and obtain the Pareto optimal solutions. The parameter optimization problem of the Xinanjiang hydrological model was taken as the application case for long-term runoff prediction in the Heihe River basin. Finally, the technique for order preference by similarity to ideal solution (TOPSIS) based on the entropy theory is adapted to sort the Pareto optimal solutions to compare these combinations of objective functions and obtain the comprehensive optimal combination of objective functions.
The experimental results demonstrate that combination 2 of the objective functions can provide more comprehensive and reliable dominant options (i.e., parameter sets) for practical hydrological forecasting in the study area. The entropy-based method proved effective for analyzing and evaluating the performance of different combinations of objective functions and can provide more comprehensive and impartial decision support for hydrological forecasting.
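One common way to combine entropy theory with TOPSIS is to derive criterion weights from the Shannon entropy of each criterion column and then rank alternatives by closeness to the ideal solution. The sketch below assumes that variant; the abstract does not specify the exact formulation, and the criteria (a Nash-Sutcliffe-style score to maximize, an error measure to minimize) and their values are invented. It also assumes strictly positive entries and at least one non-constant column.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: criteria whose values vary more receive larger weights."""
    n, m = len(matrix), len(matrix[0])
    diversities = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [value / total for value in column]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversities.append(1.0 - entropy)
    norm = sum(diversities)
    return [d / norm for d in diversities]


def topsis(matrix, weights, benefit):
    """Closeness of each alternative to the ideal solution (1 is best, 0 is worst)."""
    m = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    columns = [[row[j] for row in weighted] for j in range(m)]
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_ideal = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, ideal)))
        d_anti = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, anti)))
        scores.append(d_anti / (d_ideal + d_anti))
    return scores


# Hypothetical Pareto optimal parameter sets scored as (efficiency score, error).
pareto_solutions = [[0.90, 14.0], [0.85, 11.0], [0.70, 10.0]]
benefit = [True, False]  # maximize the first criterion, minimize the second

weights = entropy_weights(pareto_solutions)
scores = topsis(pareto_solutions, weights, benefit)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(weights, scores, ranking)
```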
Project description:A hyperspectral image (HSI) consists of hundreds of narrow spectral band components with rich spectral and spatial information. Extreme Learning Machine (ELM) has been widely used for HSI analysis. However, the classical ELM is difficult to use for sparse feature learning due to its randomly generated hidden layer. In this paper, we propose a novel unsupervised sparse feature learning approach, called Evolutionary Multiobjective-based ELM (EMO-ELM), and apply it to HSI feature extraction. Specifically, we represent the task of constructing the ELM Autoencoder (ELM-AE) as a multiobjective optimization problem that takes the sparsity of hidden layer outputs and the reconstruction error as two conflicting objectives. Then, we adopt an Evolutionary Multiobjective Optimization (EMO) method to optimize the two objectives simultaneously. To find the best solution from the Pareto solution set and construct the best trade-off feature extractor, a curvature-based method is proposed to focus on the knee area of the Pareto solutions. Benefiting from EMO, the proposed EMO-ELM is less prone to fall into a local minimum and has fewer trainable parameters than gradient-based AEs. Experiments on two real HSIs demonstrate that the features learned by EMO-ELM both preserve sparsity better and achieve better separability than many existing feature learning methods.
Project description:The accelerated discovery of materials for real world applications requires the achievement of multiple design objectives. The multidimensional nature of the search necessitates exploration of multimillion compound libraries over which even density functional theory (DFT) screening is intractable. Machine learning (e.g., artificial neural network, ANN, or Gaussian process, GP) models for this task are limited by training data availability and predictive uncertainty quantification (UQ). We overcome such limitations by using efficient global optimization (EGO) with the multidimensional expected improvement (EI) criterion. EGO balances exploitation of a trained model with acquisition of new DFT data at the Pareto front, the region of chemical space that contains the optimal trade-off between multiple design criteria. We demonstrate this approach for the simultaneous optimization of redox potential and solubility in candidate M(II)/M(III) redox couples for redox flow batteries from a space of 2.8 M transition metal complexes designed for stability in practical redox flow battery (RFB) applications. We show that a multitask ANN with latent-distance-based UQ surpasses the generalization performance of a GP in this space. With this approach, ANN prediction and EI scoring of the full space are achieved in minutes. Starting from ca. 100 representative points, EGO improves both properties by over 3 standard deviations in only five generations. Analysis of lookahead errors confirms rapid ANN model improvement during the EGO process, achieving suitable accuracy for predictive design in the space of transition metal complexes. The ANN-driven EI approach achieves at least 500-fold acceleration over random search, identifying a Pareto-optimal design in around 5 weeks instead of 50 years.
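To make the exploitation versus exploration balance concrete, the sketch below evaluates the standard closed-form expected improvement for a single objective under a Gaussian prediction. This is a simplification: the abstract uses a multidimensional EI criterion over two properties, which is not reproduced here. The predicted means, uncertainties, and incumbent value are hypothetical.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization, given a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)  # no uncertainty: improvement is deterministic
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


# Hypothetical surrogate predictions as (mean, uncertainty) for three candidates.
predictions = [(0.9, 0.05), (0.7, 0.5), (1.0, 0.0)]
best_so_far = 1.0

ei_values = [expected_improvement(mu, sigma, best_so_far) for mu, sigma in predictions]
chosen = max(range(len(predictions)), key=lambda i: ei_values[i])
print(ei_values, chosen)
```

In this toy case the candidate with the lowest predicted mean but the largest uncertainty is selected, which is the exploration behavior that EI is meant to provide.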
Project description:In developing improved protein variants by site-directed mutagenesis or recombination, there are often competing objectives that must be considered in designing an experiment (selecting mutations or breakpoints): stability versus novelty, affinity versus specificity, activity versus immunogenicity, and so forth. Pareto optimal experimental designs make the best trade-offs between competing objectives. Such designs are not "dominated"; that is, no other design is better than a Pareto optimal design for one objective without being worse for another objective. Our goal is to produce all the Pareto optimal designs (the Pareto frontier), to characterize the trade-offs and suggest designs most worth considering, but to avoid explicitly considering the large number of dominated designs. To do so, we develop a divide-and-conquer algorithm, Protein Engineering Pareto FRontier (PEPFR), that hierarchically subdivides the objective space, using appropriate dynamic programming or integer programming methods to optimize designs in different regions. This divide-and-conquer approach is efficient in that the number of divisions (and thus calls to the optimizer) is directly proportional to the number of Pareto optimal designs. We demonstrate PEPFR with three protein engineering case studies: site-directed recombination for stability and diversity via dynamic programming, site-directed mutagenesis of interacting proteins for affinity and specificity via integer programming, and site-directed mutagenesis of a therapeutic protein for activity and immunogenicity via integer programming. We show that PEPFR is able to effectively produce all the Pareto optimal designs, discovering many more designs than previous methods. The characterization of the Pareto frontier provides additional insights into the local stability of design choices as well as global trends leading to trade-offs between competing criteria.
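The dominance definition above translates directly into a filter that keeps only nondominated designs. The sketch below is a brute-force quadratic filter over an explicit list, shown only to pin down the definition; it is not the PEPFR divide-and-conquer algorithm, which is designed precisely to avoid enumerating dominated designs. The design scores are invented, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if design a is at least as good as b everywhere and strictly better somewhere.

    All objectives are assumed to be maximized.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_frontier(designs):
    """Keep the designs that no other design dominates (O(n^2) comparisons)."""
    return [d for d in designs if not any(dominates(other, d) for other in designs)]


# Hypothetical designs scored as (stability, diversity).
designs = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
frontier = pareto_frontier(designs)
print(frontier)
```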
Project description:Mathematical modeling of complex gene expression programs is an emerging tool for understanding disease mechanisms. However, identification of large models sometimes requires training using qualitative, conflicting or even contradictory data sets. One strategy to address this challenge is to estimate experimentally constrained model ensembles using multiobjective optimization. In this study, we used Pareto Optimal Ensemble Techniques (POETs) to identify a family of proof-of-concept signal transduction models. POETs integrate Simulated Annealing (SA) with Pareto optimality to identify models near the optimal tradeoff surface between competing training objectives. We modeled a prototypical-signaling network using mass-action kinetics within an ordinary differential equation (ODE) framework (64 ODEs in total). The true model was used to generate synthetic immunoblots from which the POET algorithm identified the 117 unknown model parameters. POET generated an ensemble of signaling models, which collectively exhibited population-like behavior. For example, scaled gene expression levels were approximately normally distributed over the ensemble following the addition of extracellular ligand. Also, the ensemble recovered robust and fragile features of the true model, despite significant parameter uncertainty. Taken together, these results suggest that experimentally constrained model ensembles could capture qualitatively important network features without exact parameter information.