Dataset Information

The effect of sample size on polygenic hazard models for prostate cancer.


ABSTRACT: We determined the effect of sample size on the performance of polygenic hazard score (PHS) models in prostate cancer. Age and genotypes were obtained for 40,861 men from the PRACTICAL consortium. The dataset included 201,590 SNPs per subject and was split into training and testing sets. Established-SNP models considered 65 SNPs that had previously been associated with prostate cancer. Discovery-SNP models used stepwise selection to identify new SNPs. The performance of each PHS model was calculated for random sizes of the training set. The performance of a representative Established-SNP model was estimated for random sizes of the testing set. Mean HR98/50 (hazard ratio of the top 2% to the average in the test set) of the Established-SNP model increased from 1.73 [95% CI: 1.69-1.77] to 2.41 [2.40-2.43] when the number of training samples was increased from 1,000 to 30,000. The corresponding HR98/50 of the Discovery-SNP model increased from 1.05 [0.93-1.18] to 2.19 [2.16-2.23]. HR98/50 values of a representative Established-SNP model using testing-set sample sizes of 600 and 6,000 observations were 1.78 [1.70-1.85] and 1.73 [1.71-1.76], respectively. We estimate that a study population of 20,000 men is required to develop Discovery-SNP PHS models, while 10,000 men should be sufficient for Established-SNP models.
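
The HR98/50 metric and the repeated random subsampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a proportional-hazards model in which the log hazard is a hypothetical coefficient `beta` times the PHS, and it assumes the "average" group means men between the 30th and 70th percentiles of the score; the abstract states neither detail. The function names and the fixed `beta` are likewise illustrative.

```python
import math
import random
import statistics


def hr98_50(scores, beta=1.0, top_pct=98.0, mid_lo=30.0, mid_hi=70.0):
    """Hazard ratio between the top-2% group and an 'average' group.

    Assumption: the log hazard is beta * score, so the ratio of hazards
    between two groups is exp(beta * (mean of group A - mean of group B)).
    Assumption: 'average' means the 30th-70th percentile band.
    """
    ordered = sorted(scores)
    n = len(ordered)
    # Rank-based cut-offs for the percentile bands.
    top_start = int(n * top_pct / 100)
    lo = int(n * mid_lo / 100)
    hi = int(n * mid_hi / 100)
    top = ordered[top_start:]
    middle = ordered[lo:hi]
    if not top or not middle:
        raise ValueError("too few samples to form both groups")
    return math.exp(beta * (statistics.fmean(top) - statistics.fmean(middle)))


def hr_by_sample_size(scores, sizes, repeats=50, seed=0):
    """Mean HR98/50 over random subsamples of each requested size."""
    rng = random.Random(seed)
    return {
        size: statistics.fmean(
            hr98_50(rng.sample(scores, size)) for _ in range(repeats)
        )
        for size in sizes
    }
```

Calling `hr_by_sample_size(scores, [600, 6000])` on a list of test-set scores mirrors the testing-set comparison in the abstract only loosely, because the coefficient is held fixed here, whereas a real analysis would refit the Cox model and report confidence intervals.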

SUBMITTER: Karunamuni RA 

PROVIDER: S-EPMC7608255 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

The effect of sample size on polygenic hazard models for prostate cancer.

Karunamuni Roshan A, Huynh-Le Minh-Phuong, Fan Chun C, Eeles Rosalind A, Easton Douglas F, Kote-Jarai Zsofia, Amin Al Olama Ali, Benlloch Garcia Sara, Muir Kenneth, Gronberg Henrik, Wiklund Fredrik, Aly Markus, Schleutker Johanna, Sipeky Csilla, Tammela Teuvo L J, Nordestgaard Børge G, Key Tim J, Travis Ruth C, Neal David E, Donovan Jenny L, Hamdy Freddie C, Pharoah Paul, Pashayan Nora, Khaw Kay-Tee, Thibodeau Stephen N, McDonnell Shannon K, Schaid Daniel J, Maier Christiane, Vogel Walther, Luedeke Manuel, Herkommer Kathleen, Kibel Adam S, Cybulski Cezary, Wokolorczyk Dominika, Kluzniak Wojciech, Cannon-Albright Lisa, Brenner Hermann, Schöttker Ben, Holleczek Bernd, Park Jong Y, Sellers Thomas A, Lin Hui-Yi, Slavov Chavdar, Kaneva Radka, Mitev Vanio, Batra Jyotsna, Clements Judith A, Spurdle Amanda, Teixeira Manuel R, Paulo Paula, Maia Sofia, Pandha Hardev, Michael Agnieszka, Mills Ian G, Andreassen Ole A, Dale Anders M, Seibert Tyler M

European Journal of Human Genetics (EJHG), 2020-06-08, issue 10


Similar Datasets

S-EPMC7902617 | biostudies-literature
S-EPMC8157993 | biostudies-literature
S-EPMC8135907 | biostudies-literature
S-EPMC5759091 | biostudies-literature
S-EPMC8756152 | biostudies-literature
S-EPMC5758043 | biostudies-literature
S-EPMC7868330 | biostudies-literature
S-EPMC6874355 | biostudies-literature
S-EPMC8993880 | biostudies-literature
S-EPMC10560131 | biostudies-literature