Profiling the human response to physical exercise: a computational strategy for the identification and kinetic analysis of metabolic biomarkers.
ABSTRACT: BACKGROUND: In metabolomics, biomarker discovery is a highly data-driven process and requires sophisticated computational methods for the search and prioritization of novel and unforeseen biomarkers in data, typically gathered in preclinical or clinical studies. In particular, the discovery of biomarker candidates from longitudinal cohort studies is crucial for kinetic analysis to better understand complex metabolic processes in the organism during physical activity. FINDINGS: In this work we introduce a novel computational strategy for identifying and studying kinetic changes of putative biomarkers using targeted MS/MS profiling data from time series cohort studies or other cross-over designs. We propose a prioritization model with the objective of classifying biomarker candidates according to their discriminatory ability and couple this discovery step with a novel network-based approach to visualize, review and interpret key metabolites and their dynamic interactions within the network. Applying our method to longitudinal stress test data revealed a panel of metabolic signatures, i.e., lactate, alanine, glycine and the short-chain fatty acids C2 and C3, in trained and physically fit persons during bicycle exercise. CONCLUSIONS: We propose a new computational method for the discovery of new signatures in dynamic metabolic profiling data which revealed known and unexpected candidate biomarkers in physical activity. Many of them could be verified and confirmed by literature. Our computational approach is freely available as an R package termed BiomarkeR under LGPL via CRAN http://cran.r-project.org/web/packages/BiomarkeR/.
Project description:Metabolic biomarkers may play an important role in the diagnosis, prognostication and assessment of response to pharmacological therapy in complex diseases. The process of discovering new metabolic biomarkers is a non-trivial task which involves a number of bioanalytical processing steps coupled with a computational approach for the search, prioritization and verification of new biomarker candidates. Kinetic analysis provides an additional dimension of complexity in time-series data, allowing for a more precise interpretation of biomarker dynamics in terms of molecular interaction and pathway modulation. A novel network-based computational strategy for the discovery of putative dynamic biomarker candidates is presented, enabling the identification and verification of unexpected metabolic signatures in complex diseases such as myocardial infarction. The novelty of the proposed method lies in combining metabolic time-series data into a superimposed graph representation, highlighting the strength of the underlying kinetic interaction of preselected analytes. Using this approach, we were able to confirm known metabolic signatures and also identify new candidates such as carnosine and glycocholic acid, and pathways that have been previously associated with cardiovascular or related diseases. This computational strategy may serve as a complementary tool for the discovery of dynamic metabolic or proteomic biomarkers in the field of clinical medicine.
Project description:A key barrier to the realization of personalized medicine for cancer is the identification of biomarkers. Here we describe a two-stage strategy for the discovery of serum biomarker signatures corresponding to specific cancer-causing mutations and its application to prostate cancer (PCa) in the context of the commonly occurring phosphatase and tensin homolog (PTEN) tumor-suppressor gene inactivation. In the first stage of our approach, we identified 775 N-linked glycoproteins from sera and prostate tissue of wild-type and Pten-null mice. Using label-free quantitative proteomics, we showed that Pten inactivation leads to measurable perturbations in the murine prostate and serum glycoproteome. Following bioinformatic prioritization, in a second stage we applied targeted proteomics to detect and quantify 39 human ortholog candidate biomarkers in the sera of PCa patients and control individuals. The resulting proteomic profiles were analyzed by machine learning to build predictive regression models for tissue PTEN status and diagnosis and grading of PCa. Our approach suggests a general path to rational cancer biomarker discovery and initial validation guided by cancer genetics and based on the integration of experimental mouse models, proteomics-based technologies, and computational modeling.
Project description:The biological fingerprint of environmental adversity may be key to understanding health and disease, as it encompasses the damage induced as well as the compensatory reactions of the organism. Metabolic and hormonal changes may be an informative but incomplete window into the underlying biology. We endeavored to identify objective blood gene expression biomarkers for psychological stress, a subjective sensation with biological roots. To quantify the stress perception at a particular moment in time, we used a simple visual analog scale for life stress in psychiatric patients, a high-risk group. Then, using a stepwise discovery, prioritization, validation, and testing in independent cohort design, we were successful in identifying gene expression biomarkers that were predictive of high-stress states and of future psychiatric hospitalizations related to stress, more so when personalized by gender and diagnosis. One of the top biomarkers that survived discovery, prioritization, validation, and testing was FKBP5, a well-known gene involved in stress response, which serves as a de facto reassuring positive control. We also compared our biomarker findings with telomere length (TL), another well-established biological marker of psychological stress and show that newly identified predictive biomarkers such as NUB1, APOL3, MAD1L1, or NKTR are comparable or better state or trait predictors of stress than TL or FKBP5. Over half of the top predictive biomarkers for stress also had prior evidence of involvement in suicide, and the majority of them had evidence in other psychiatric disorders, providing a molecular underpinning for the effects of stress in those disorders. Some of the biomarkers are targets of existing drugs, of potential utility in patient stratification, and pharmacogenomics approaches. 
Based on our studies and analyses, the biomarkers with the best overall convergent functional evidence (CFE) for involvement in stress were FKBP5, DDX6, B2M, LAIR1, RTN4, and NUB1. Moreover, the biomarker gene expression signatures yielded leads for possible new drug candidates and natural compounds upon bioinformatics drug repurposing analyses, such as calcium folinate and betulin. Our work may lead to improved diagnosis and treatment for stress disorders such as PTSD, that result in decreased quality of life and adverse outcomes, including addictions, violence, and suicide.
Project description:Optimal health is maintained by interaction of multiple intrinsic and environmental factors at different levels of complexity, from molecular to physiological to social. Understanding and quantification of these interactions will aid design of successful health interventions. We introduce the reference network concept as a platform for multi-level exploration of biological relations relevant for metabolic health, by integration and mining of biological interactions derived from public resources and context-specific experimental data. A White Adipose Tissue Health Reference Network (WATRefNet) was constructed as a resource for discovery and prioritization of mechanism-based biomarkers for white adipose tissue (WAT) health status and the effect of food and drug compounds on WAT health status. The WATRefNet (6,797 nodes and 32,171 edges) is based on (1) experimental data obtained from 10 studies addressing different adiposity states, (2) seven public knowledge bases of molecular interactions, (3) expert definitions of five physiologically relevant processes key to WAT health, namely WAT expandability, Oxidative capacity, Metabolic state, Oxidative stress and Tissue inflammation, and (4) a collection of relevant biomarkers of these processes identified by BIOCLAIMS ( http://bioclaims.uib.es ). The WATRefNet encompasses multiple layers of biological complexity as it contains various types of nodes and edges that represent different biological levels and interactions. We have validated the reference network by showing overrepresentation of anti-obesity drug targets, pathology-associated genes and differentially expressed genes from an external disease model dataset. The resulting network has been used to extract subnetworks specific to the above-mentioned expert-defined physiological processes.
Each of these process-specific signatures represents a mechanistically supported composite biomarker for assessing and quantifying the effect of interventions on a physiological aspect that determines WAT health status. Following this principle, five anti-diabetic drug interventions and one diet intervention were scored for the match of their expression signature to the five biomarker signatures derived from the WATRefNet. This confirmed previous observations of successful intervention by dietary lifestyle and revealed WAT-specific effects of drug interventions. The WATRefNet represents a sustainable knowledge resource for extraction of relevant relationships such as mechanisms of action, nutrient intervention targets and biomarkers and for assessment of health effects for support of health claims made on food products.
Project description:Early diagnosis of inborn errors of metabolism is commonly performed through biofluid metabolomics, which detects specific metabolic biomarkers whose concentration is altered due to genomic mutations. The identification of new biomarkers is of major importance to biomedical research and is usually performed through data mining of metabolomic data. After the recent publication of the genome-scale network model of human metabolism, we present a novel computational approach for systematically predicting metabolic biomarkers in stoichiometric metabolic models. Applying the method to predict biomarkers for disruptions of red-blood cell metabolism demonstrates a marked correlation with altered metabolic concentrations inferred through kinetic model simulations. Applying the method to the genome-scale human model reveals a set of 233 metabolites whose concentration is predicted to be either elevated or reduced as a result of 176 possible dysfunctional enzymes. The method's predictions are shown to significantly correlate with known disease biomarkers and to predict many novel potential biomarkers. Using this method to prioritize metabolite measurement experiments to identify new biomarkers can provide an increase on the order of 10-fold in biomarker detection performance.
Project description:PURPOSE: The aim of this study is to identify differential metabolomic signatures in plasma samples of distinct subtypes of breast cancer patients that could be used in clinical practice as diagnostic biomarkers for these molecular phenotypes and to provide a more individualized and accurate therapeutic procedure. METHODS: An untargeted LC-HRMS metabolomics approach in positive and negative electrospray ionization mode was used to analyze plasma samples from LA, LB, HER2+ and TN breast cancer patients and healthy controls in order to determine specific metabolomic profiles through univariate and multivariate statistical data analysis. RESULTS: We tentatively identified altered metabolites displaying concentration variations among the four breast cancer molecular subtypes. We found a biomarker panel of 5 candidates in LA, 7 in LB, 5 in HER2 and 3 in TN that were able to discriminate each breast cancer subtype with a false discovery rate-corrected p-value < 0.05 and a fold-change cutoff value > 1.3. The clinical value of the model was evaluated with the AUROC, providing diagnostic capacities above 0.85. CONCLUSION: Our study identifies metabolic profiling differences in molecular phenotypes of breast cancer. This may represent a key step towards therapy improvement in personalized medicine and prioritization of tailored therapeutic intervention strategies.
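The selection rule quoted in the abstract above (a corrected p-value below 0.05 combined with a fold change above 1.3) can be illustrated with a minimal Python sketch. It assumes the Benjamini-Hochberg procedure as the false discovery rate correction and treats fold changes symmetrically, so that 0.5 counts as a 2-fold change; the abstract specifies neither, and the function names and inputs are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_candidates(names, p_values, fold_changes, alpha=0.05, min_fc=1.3):
    """Keep features with adjusted p < alpha and |fold change| > min_fc.

    Assumption: fold changes are positive ratios, and a ratio below 1 is
    inverted so that down-regulation is judged on the same scale.
    """
    adjusted = benjamini_hochberg(p_values)
    selected = []
    for name, q, fc in zip(names, adjusted, fold_changes):
        magnitude = max(fc, 1.0 / fc)
        if q < alpha and magnitude > min_fc:
            selected.append(name)
    return selected
```

With more than a handful of features, the adjustment matters: a raw p-value of 0.04 among four tests is adjusted to about 0.053 and no longer passes the 0.05 cutoff.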
Project description:The discovery of new and unexpected biomarkers in cardiovascular disease is a highly data-driven process that requires the complementary power of modern metabolite profiling technologies, bioinformatics and biostatistics. Clinical biomarkers of early myocardial injury are lacking. A prospective biomarker cohort study was carried out to identify, categorize and profile kinetic patterns of early metabolic biomarkers of planned (PMI) and spontaneous (SMI) myocardial infarction. We applied a targeted mass spectrometry (MS)-based metabolite profiling platform to serial blood samples drawn from carefully phenotyped patients undergoing alcohol septal ablation for hypertrophic obstructive cardiomyopathy serving as a human model of PMI. Patients with SMI and patients undergoing catheterization without induction of myocardial infarction served as positive and negative controls to assess generalizability of markers identified in PMI. To identify metabolites of high predictive value in tandem mass spectrometry data, we introduced a new feature selection method for the categorization of metabolic signatures into three classes of weak, moderate and strong predictors, which can be easily applied to both paired and unpaired samples. Our paradigm outperformed standard null-hypothesis significance testing and other popular methods for feature selection in terms of the area under the receiver operating characteristic curve and the product of sensitivity and specificity. Our results emphasize that this new method was able to identify, classify and validate alterations of levels in multiple metabolites participating in pathways associated with myocardial injury as early as 10 min after PMI. The algorithm as well as supplementary material is available for download at: www.umit.at/page.cfm?vpath=departments/technik/iebe/tools/bi
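The two evaluation criteria named in the abstract above, the area under the ROC curve and the product of sensitivity and specificity, can be computed as in the following Python sketch. This is not the authors' feature selection method; it only illustrates the metrics, and the function names, score lists and threshold are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one; ties count as one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sens_times_spec(scores_pos, scores_neg, threshold):
    """Product of sensitivity and specificity at a given cutoff.

    Assumption: a sample is called positive when its score >= threshold.
    """
    sensitivity = sum(s >= threshold for s in scores_pos) / len(scores_pos)
    specificity = sum(s < threshold for s in scores_neg) / len(scores_neg)
    return sensitivity * specificity
```

The product penalizes a marker that achieves high sensitivity only by sacrificing specificity (or the reverse), which a single one of the two rates would not reveal.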
Project description:We endeavored to identify objective blood biomarkers for pain, a subjective sensation with a biological basis, using a stepwise discovery, prioritization, validation, and testing in independent cohorts design. We studied psychiatric patients, a high risk group for co-morbid pain disorders and increased perception of pain. For discovery, we used a powerful within-subject longitudinal design. We were successful in identifying blood gene expression biomarkers that were predictive of pain state, and of future emergency department (ED) visits for pain, more so when personalized by gender and diagnosis. MFAP3, which had no prior evidence in the literature for involvement in pain, had the most robust empirical evidence from our discovery and validation steps, and was a strong predictor for pain in the independent cohorts, particularly in females and males with PTSD. Other biomarkers with best overall convergent functional evidence for involvement in pain were GNG7, CNTN1, LY9, CCDC144B, and GBP1. Some of the individual biomarkers identified are targets of existing drugs. Moreover, the biomarker gene expression signatures were used for bioinformatic drug repurposing analyses, yielding leads for possible new drug candidates such as SC-560 (an NSAID), and amoxapine (an antidepressant), as well as natural compounds such as pyridoxine (vitamin B6), cyanocobalamin (vitamin B12), and apigenin (a plant flavonoid). Our work may help mitigate the diagnostic and treatment dilemmas that have contributed to the current opioid epidemic.
Project description:Multivariate biomarkers that can predict the effectiveness of targeted therapy in individual patients are highly desired. Previous biomarker discovery studies have largely focused on the identification of single biomarker signatures, aimed at maximizing prediction accuracy. Here, we present a different approach that identifies multiple biomarkers by simultaneously optimizing their predictive power, number of features, and proximity to the drug target in a protein-protein interaction network. To this end, we incorporated NSGA-II, a fast and elitist multi-objective optimization algorithm that is based on the principle of Pareto optimality, into the biomarker discovery workflow. The method was applied to quantitative phosphoproteome data of 19 non-small cell lung cancer (NSCLC) cell lines from a previous biomarker study. The algorithm successfully identified a total of 77 candidate biomarker signatures predicting response to treatment with dasatinib. Through filtering and similarity clustering, this set was trimmed to four final biomarker signatures, which then were validated on an independent set of breast cancer cell lines. All four candidates reached the same good prediction accuracy (83%) as the originally published biomarker. Although the newly discovered signatures were diverse in their composition and in their size, the central protein of the originally published signature - integrin β4 (ITGB4) - was also present in all four Pareto signatures, confirming its pivotal role in predicting dasatinib response in NSCLC cell lines. In summary, the method presented here allows for a robust and simultaneous identification of multiple multivariate biomarkers that are optimized for prediction performance, size, and relevance.
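The principle of Pareto optimality that underlies NSGA-II in the abstract above can be sketched in a few lines of Python. This is only a non-dominated filter, not NSGA-II itself (there is no population, crossover or crowding distance), and it assumes each candidate signature is summarized by a hypothetical tuple of accuracy (to maximize), number of features (to minimize) and network distance to the drug target (to minimize).

```python
def dominates(a, b):
    """True if candidate a is no worse than b on every objective and
    strictly better on at least one.

    Each candidate is (accuracy, n_features, distance_to_target).
    Accuracy is negated so that all three objectives are minimized.
    """
    a_cost = (-a[0], a[1], a[2])
    b_cost = (-b[0], b[1], b[2])
    no_worse = all(x <= y for x, y in zip(a_cost, b_cost))
    strictly_better = any(x < y for x, y in zip(a_cost, b_cost))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
```

Because no single candidate is best on all objectives at once, the result is a set of trade-offs rather than one winner, which is why such an approach yields multiple signatures.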
Project description:BACKGROUND: Modern medicine is rapidly moving towards a data-driven paradigm based on comprehensive multimodal health assessments. Integrated analysis of data from different modalities has the potential of uncovering novel biomarkers and disease signatures. METHODS: We collected 1385 data features from diverse modalities, including metabolome, microbiome, genetics, and advanced imaging, from 1253 individuals and from a longitudinal validation cohort of 1083 individuals. We utilized a combination of unsupervised machine learning methods to identify multimodal biomarker signatures of health and disease risk. RESULTS: Our method identified a set of cardiometabolic biomarkers that goes beyond standard clinical biomarkers. Stratification of individuals based on the signatures of these biomarkers identified distinct subsets of individuals with similar health statuses. Subset membership was a better predictor for diabetes than established clinical biomarkers such as glucose, insulin resistance, and body mass index. The novel biomarkers in the diabetes signature included 1-stearoyl-2-dihomo-linolenoyl-GPC and 1-(1-enyl-palmitoyl)-2-oleoyl-GPC. Another metabolite, cinnamoylglycine, was identified as a potential biomarker for both gut microbiome health and lean mass percentage. We identified potential early signatures for hypertension and a poor metabolic health outcome. Additionally, we found novel associations between a uremic toxin, p-cresol sulfate, and the abundance of the microbiome genera Intestinimonas and an unclassified genus in the Erysipelotrichaceae family. CONCLUSIONS: Our methodology and results demonstrate the potential of multimodal data integration, from the identification of novel biomarker signatures to a data-driven stratification of individuals into disease subtypes and stages, an essential step towards personalized, preventative health risk assessment.