biostudies-arrayexpress
Metabolomics, Unknown, Transcriptomics, Genomics, Proteomics
Taisha Victoria Joseph-Risch
methylation profiling by array
Homo sapiens
https://www.ebi.ac.uk/biostudies/studies/E-MTAB-16584
Unraveling the complex associations between human phenotypes and molecular pathways can pave the way to improved health and performance, but faces a fundamental challenge: the measurable genes, proteins, and metabolites vastly outnumber the participants in even the largest studies, yielding spurious correlations. To address this, we developed PhenoMol, a bioinformatic framework that integrates comprehensive phenotypic data predictive of outcomes and reduces multi-omic dimensionality using graph theory constrained by prior biological knowledge. This approach generates biologically informed "expression circuits" to identify causal patterns. Applied to a deeply characterized healthy cohort, PhenoMol successfully predicted elite physical performance and outperformed regression models lacking network-based dimensionality reduction.
Scanning - After completing the array QC steps using the built-in controls, the dataset was imported into the Bioconductor R package Chip Analysis Methylation Pipeline (ChAMP) for further QC and bioinformatic analysis. The additional QC steps evaluate each sample's proportion of failed probes, distance clustering between samples (using all probes), methylation distribution plots, and multidimensional scaling. The primary steps in the pipeline include filtering, normalization with BMIQ, singular value decomposition analysis, and batch effect correction with the ComBat algorithm. After filtering and processing, the final dataset consisted of 741,605 probes with methylation values.
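One of the additional QC steps described above, each sample's proportion of failed probes, can be sketched in a few lines. The Python below is a minimal illustration and not the ChAMP implementation (ChAMP itself is an R package). The detection p-value cutoff of 0.01, the 10% per-sample limit, the function names, and the toy values are all assumptions made for this example, since the record does not state which thresholds were used.

```python
from typing import Dict, List

# Both thresholds are assumptions for illustration; the record does not state them.
DET_P_CUTOFF = 0.01        # a probe "fails" when its detection p-value exceeds this
SAMPLE_FAIL_CUTOFF = 0.1   # a sample is flagged when more than 10% of its probes fail


def failed_probe_fraction(det_p: List[float], cutoff: float = DET_P_CUTOFF) -> float:
    """Return the fraction of probes whose detection p-value exceeds the cutoff."""
    if not det_p:
        raise ValueError("no probes supplied")
    failed = sum(1 for p in det_p if p > cutoff)
    return failed / len(det_p)


def flag_samples(det_p_by_sample: Dict[str, List[float]],
                 sample_cutoff: float = SAMPLE_FAIL_CUTOFF) -> Dict[str, bool]:
    """Map each sample ID to True when too many of its probes failed."""
    return {
        sample: failed_probe_fraction(values) > sample_cutoff
        for sample, values in det_p_by_sample.items()
    }


# Toy detection p-values for two hypothetical samples, five probes each.
toy = {
    "sample_A": [0.001, 0.002, 0.0005, 0.003, 0.2],    # 1 of 5 probes fails
    "sample_B": [0.001, 0.002, 0.0005, 0.003, 0.004],  # no probe fails
}
flags = flag_samples(toy)
```

In this toy input, `sample_A` is flagged because one in five probes fails, while `sample_B` passes. A real EPIC array has several hundred thousand probes per sample, so the same calculation would normally run on a matrix rather than on Python lists.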
The final data matrix was exported into a CSV file containing the methylation values (beta values) for all samples.
Sample Collection - Venous blood was collected via venipuncture in the antecubital fossa (inner elbow) from participants at baseline (before exercise) as well as approximately 2-5 min, 10-20 min, and 25-40 min post-exercise. Specimens were immediately processed after collection. For separation of serum, blood was collected in a gold top 5 mL SST vacutainer™ (Cat# 367989, Becton Dickinson), inverted 5x to mix, and allowed to clot at room temperature for 15 min. Vacutainers™ were then spun at 1500 × g for 15 min at room temperature. The upper serum layer was pipetted off, aliquoted and immediately frozen on dry ice. For plasma separation, blood was collected in a purple top 4 mL K2EDTA vacutainer™ (Cat# 368047, Becton Dickinson) and inverted 8x to mix. At the baseline blood draw, an aliquot of whole blood was taken from the K2EDTA vacutainers™ and stored in 1.5 mL Eppendorf tubes at 4°C for 4 to 16 hours prior to hematology analysis. For plasma collection, the K2EDTA vacutainers™ were spun at 1500 × g for 15 min at room temperature. The top plasma layer was pipetted off, aliquoted and immediately frozen on dry ice. Blood for iSTAT analysis was collected in a green-top 2 mL Lithium Heparin vacutainer™ (Cat# 366664, Becton Dickinson) and inverted 4x to mix. Blood was collected in 8 mL CPT vacutainers™ (Cat# 362753, Becton Dickinson) for separation of peripheral blood mononuclear cells (PBMCs) from whole blood. A 15 min centrifugation was performed (1500 × g) to separate the mononuclear cells from the erythrocytes and granulocytes. The PBMCs were washed in PBS (5 min, 250 × g) at room temperature to deplete residual contaminants such as platelets and erythrocytes and to dilute the plasma. PBMCs were then resuspended in 1 mL PBS and counted on a Countess™ 3 automated cell counter. Aliquots of PBMCs were flash-frozen as dry pellets for DNA methylation analysis.
For immunophenotyping, PBMCs were resuspended in 1 mL Bambanker cryopreservation reagent (Cat# BB01, Bulldog Bio) and immediately stored in a Mr. Frosty freezing container on dry ice, then transferred to liquid nitrogen. Whole blood was stabilized in 2.5 mL PAXgene RNA vacutainers™ (Cat# 762165, Becton Dickinson) for 4-16 hours at room temperature, then stored at -20°C prior to RNA transcriptome analysis.
Nucleic Acid Extraction - In brief, the PBMCs were thawed on ice and extracted in a randomized order using the AllPrep DNA/RNA Mini 96 kit (Cat# 80311, Qiagen) across four plates. The plates were extracted according to a protocol adapted from the manufacturer's to increase DNA recovery. The approximate elution volume of the extracted gDNA was 70 µL. After extraction, each sample's gDNA was quantified using the Lumiprobe PicoGreen 488 (Cat# 4210, Lumiprobe) on a Varioskan Lux Multimode Microplate Reader Instrument (Cat# VL0000D0, Thermo Fisher Scientific). Samples that had a concentration above 10 ng/µL went into the bisulfite conversion step with a gDNA input of 500 ng. Samples that had concentrations below 10 ng/µL went through a concentration step using the Eppendorf Concentrator Plus (Cat# EP5305000100, Eppendorf) and entered the bisulfite conversion with a gDNA input of up to 500 ng.
Labeling - Samples that had a concentration above 10 ng/µL went into the bisulfite conversion step with a gDNA input of 500 ng. Samples that had concentrations below 10 ng/µL went through a concentration step using the Eppendorf Concentrator Plus (Cat# EP5305000100, Eppendorf) and entered the bisulfite conversion with a gDNA input of up to 500 ng.
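The input rule above amounts to simple arithmetic, which the following Python sketch spells out. The 10 ng/µL threshold, the 500 ng target, and the approximate 70 µL elution volume come from this record, as does the 250 ng exclusion limit stated in the Hybridization protocol. Everything else is an assumption for illustration: the names `InputPlan` and `plan_input`, the idea that the whole eluate is available as input, and the handling of a sample at exactly 10 ng/µL, which the record does not specify.

```python
from dataclasses import dataclass

TARGET_INPUT_NG = 500.0     # target gDNA input for bisulfite conversion (from the record)
MIN_CONC_NG_PER_UL = 10.0   # concentration threshold (from the record)
MIN_INPUT_NG = 250.0        # inputs below this were excluded downstream (from the record)
ELUTION_VOLUME_UL = 70.0    # approximate elution volume (from the record)


@dataclass
class InputPlan:
    needs_concentration: bool  # True when the sample goes through the concentrator first
    input_ng: float            # gDNA mass entering bisulfite conversion
    excluded: bool             # True when the input falls below the 250 ng limit


def plan_input(conc_ng_per_ul: float, volume_ul: float = ELUTION_VOLUME_UL) -> InputPlan:
    """Route one sample using its measured concentration.

    Assumes the full eluate can be used; exactly 10 ng/uL is treated as
    not needing concentration (an assumption, the record is silent here).
    """
    if conc_ng_per_ul < 0 or volume_ul <= 0:
        raise ValueError("concentration must be >= 0 and volume must be > 0")
    total_ng = conc_ng_per_ul * volume_ul
    needs_concentration = conc_ng_per_ul < MIN_CONC_NG_PER_UL
    input_ng = min(TARGET_INPUT_NG, total_ng)  # "up to 500 ng"
    return InputPlan(needs_concentration, input_ng, input_ng < MIN_INPUT_NG)


high = plan_input(20.0)  # 1400 ng available, so the full 500 ng is used
mid = plan_input(5.0)    # 350 ng available, concentrated and kept
low = plan_input(2.0)    # 140 ng available, below the 250 ng limit
```

Under these assumptions, any sample above 10 ng/µL in roughly 70 µL holds at least 700 ng, so the 500 ng target is always reachable; only the low-concentration samples can end up below the 250 ng limit.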
During the bisulfite conversion step, the gDNA was deaminated using the EZ DNA Methylation kit (Cat# D5003, Zymo Research) in accordance with Illumina’s preferred deamination protocol.
Hybridization - The samples were loaded on the Infinium MethylationEPIC v1.0 Kit (Cat# WG-317-1003, Illumina) and inserted into an iScan System (Cat# SY-101-1001, Illumina) for an array scan as per manufacturer protocol. Samples that entered the bisulfite conversion workflow with gDNA inputs below 250 ng were excluded from downstream analysis. The GenomeStudio software was then used to examine the built-in Infinium controls and assess data quality. Controls were included for staining, extension, hybridization, target removal, bisulfite conversion efficiency, and non-specific primer extension, as well as negative controls lacking CpG dinucleotides to define the background signal intensity. Non-polymorphic controls were used to assess the sample quality and overall assay performance.
Data Transformation - The primary steps in the pipeline include filtering, normalization with BMIQ, singular value decomposition analysis, and batch effect correction with the ComBat algorithm. After filtering and processing, the final dataset consisted of 741,605 probes with methylation values.
The final data matrix was exported into a CSV file containing the methylation values for all samples across the 741,605 probes and was integrated into the omics integrator pipeline.
Integration of Multiomic and Multi-phenotypic Data Identifies Biological Pathways Associated with Physical Fitness
2026-01-27T00:00:00Z
2026-05-26T23:02:57.318Z
2026-01-26T13:58:14.536Z
E-MTAB-16584
EFO_0002944, EFO_0003814, EFO_0003813, EFO_0002759, EFO_0005518, EFO_0003816, EFO_0003815