<HashMap><database>biostudies-literature</database><scores/><additional><omics_type>Unknown</omics_type><submitter>Xu Z</submitter><funding>NHLBI NIH HHS</funding><pubmed_abstract>Mediation analysis is a useful tool in investigating how molecular phenotypes such as gene expression mediate the effect of exposure on health outcomes. However, commonly used mean-based total mediation effect measures may suffer from cancellation of component-wise mediation effects in opposite directions in the presence of high-dimensional omics mediators. To overcome this limitation, we recently proposed a variance-based R-squared total mediation effect measure that relies on the computationally intensive nonparametric bootstrap for confidence interval estimation. In the work described herein, we formulated a more efficient two-stage, cross-fitted estimation procedure for the R2 measure. To avoid potential bias, we performed iterative Sure Independence Screening (iSIS) in two subsamples to exclude the non-mediators, followed by ordinary least squares regressions for the variance estimation. We then constructed confidence intervals based on the newly derived closed-form asymptotic distribution of the R2 measure. Extensive simulation studies demonstrated that this proposed procedure is much more computationally efficient than the resampling-based method, with comparable coverage probability. Furthermore, when applied to the Framingham Heart Study, the proposed method replicated the established finding of gene expression mediating age-related variation in systolic blood pressure and identified the role of gene expression profiles in the relationship between sex and high-density lipoprotein cholesterol level. 
The proposed estimation procedure is implemented in R package CFR2M.</pubmed_abstract><journal>bioRxiv : the preprint server for biology</journal><pagination>2023.02.06.527391</pagination><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/S-EPMC9934518</full_dataset_link><repository>biostudies-literature</repository><pubmed_title>Speeding up interval estimation for R2-based mediation effect of high-dimensional mediators via cross-fitting.</pubmed_title><pmcid>PMC9934518</pmcid><funding_grant_id>N01 HC025195</funding_grant_id><funding_grant_id>HHSN268201500001C</funding_grant_id><funding_grant_id>75N92019D00031</funding_grant_id><funding_grant_id>R01 HL116720</funding_grant_id><funding_grant_id>HHSN268201500001I</funding_grant_id><pubmed_authors>Wei P</pubmed_authors><pubmed_authors>Li C</pubmed_authors><pubmed_authors>Chi S</pubmed_authors><pubmed_authors>Yang T</pubmed_authors><pubmed_authors>Xu Z</pubmed_authors></additional><is_claimable>false</is_claimable><name>Speeding up interval estimation for R2-based mediation effect of high-dimensional mediators via cross-fitting.</name><description>Mediation analysis is a useful tool in investigating how molecular phenotypes such as gene expression mediate the effect of exposure on health outcomes. However, commonly used mean-based total mediation effect measures may suffer from cancellation of component-wise mediation effects in opposite directions in the presence of high-dimensional omics mediators. To overcome this limitation, we recently proposed a variance-based R-squared total mediation effect measure that relies on the computationally intensive nonparametric bootstrap for confidence interval estimation. In the work described herein, we formulated a more efficient two-stage, cross-fitted estimation procedure for the R2 measure. 
To avoid potential bias, we performed iterative Sure Independence Screening (iSIS) in two subsamples to exclude the non-mediators, followed by ordinary least squares regressions for the variance estimation. We then constructed confidence intervals based on the newly derived closed-form asymptotic distribution of the R2 measure. Extensive simulation studies demonstrated that this proposed procedure is much more computationally efficient than the resampling-based method, with comparable coverage probability. Furthermore, when applied to the Framingham Heart Study, the proposed method replicated the established finding of gene expression mediating age-related variation in systolic blood pressure and identified the role of gene expression profiles in the relationship between sex and high-density lipoprotein cholesterol level. The proposed estimation procedure is implemented in R package CFR2M.</description><dates><release>2024-01-01T00:00:00Z</release><publication>2024 Sep</publication><modification>2026-04-08T13:22:40.389Z</modification><creation>2025-02-19T04:54:22.694Z</creation></dates><accession>S-EPMC9934518</accession><cross_references><pubmed>36798366</pubmed><doi>10.1101/2023.02.06.527391</doi></cross_references></HashMap>