
Dataset Information


DrFARM: identification of pleiotropic genetic variants in genome-wide association studies.


ABSTRACT: In a standard analysis, pleiotropic variants are identified by running separate genome-wide association studies (GWAS) and combining results across traits. However, such a statistical approach, based on marginal summary statistics, may lead to spurious results. We propose a new statistical approach, the Debiased-regularized Factor Analysis Regression Model (DrFARM), which uses a joint regression model for simultaneous analysis of high-dimensional genetic variants and multilevel dependencies. This joint modeling strategy controls overall error to permit universal false discovery rate (FDR) control. DrFARM uses the strengths of the debiasing technique and the Cauchy combination test, both theoretically justified, to establish valid post-selection inference on pleiotropic variants. Through extensive simulations, we show that DrFARM appropriately controls overall FDR. Applying DrFARM to data on 1031 metabolites measured on 6135 men from the Metabolic Syndrome in Men (METSIM) study, we identify five first-time reported putative causal genes, none of which had been implicated in any prior metabolite GWAS (including the prior METSIM analysis).
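
The abstract names the Cauchy combination test as one ingredient of the method. As a rough illustration only, the generic form of that test can be sketched as below: each p-value is mapped to a standard Cauchy variate, the variates are averaged with weights, and the average is mapped back to a p-value. This is not DrFARM's implementation; the function name `cauchy_combination`, the equal default weights, and the example p-values are assumptions made for this sketch.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the generic Cauchy combination test.

    Hypothetical helper for illustration; equal weights are assumed
    when none are supplied.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")

    # Normalize the weights so they sum to one.
    total = sum(weights)
    weights = [w / total for w in weights]

    # Map each p-value to a standard Cauchy variate and take the weighted sum.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, pvalues))

    # Upper-tail probability of the standard Cauchy distribution at `stat`.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Made-up per-trait p-values for a single variant.
    print(cauchy_combination([1e-6, 0.5, 0.8]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the transformation. One very small p-value dominates the combined result, which is the behavior that makes this kind of test attractive for combining evidence across many traits.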

SUBMITTER: Chan LS 

PROVIDER: S-EPMC12216935 | biostudies-literature | 2025 Jul

REPOSITORIES: biostudies-literature


Publications

DrFARM: identification of pleiotropic genetic variants in genome-wide association studies.

Chan Lap Sum, Li Gen, Fauman Eric B, Yin Xianyong, Laakso Markku, Boehnke Michael, Song Peter X K

Nature Communications, 2025-07-01, 1



Similar Datasets

| S-EPMC3245026 | biostudies-literature
| S-EPMC10582758 | biostudies-literature
| S-EPMC9219007 | biostudies-literature
| S-EPMC3052933 | biostudies-literature
| S-EPMC3287479 | biostudies-literature