Dataset Information

Identification of individuals by trait prediction using whole-genome sequencing data.


ABSTRACT: Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited. However, we have developed a maximum entropy algorithm that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person. Using this algorithm, we have reidentified an average of >8 of 10 held-out individuals in an ethnically mixed cohort and an average of 5 of either 10 African Americans or 10 Europeans. This work challenges current conceptions of personal privacy and may have far-reaching ethical and legal implications.

SUBMITTER: Lippert C 

PROVIDER: S-EPMC5617305 | biostudies-literature | 2017 Sep

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC7863413 | biostudies-literature
S-EPMC4143683 | biostudies-literature
S-EPMC5347377 | biostudies-literature
S-EPMC7555865 | biostudies-literature
S-EPMC6137601 | biostudies-literature
S-EPMC5868770 | biostudies-other
S-EPMC7997805 | biostudies-literature
S-EPMC4059462 | biostudies-literature