Dataset Information

The landscape of tolerated genetic variation in humans and primates.


ABSTRACT: Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole genome sequencing data for 809 individuals from 233 primate species, and identified 4.3 million common protein-altering variants with orthologs in human. We show that these variants can be inferred to have non-deleterious effects in human based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.

One sentence summary

Deep learning classifier trained on 4.3 million common primate missense variants predicts variant pathogenicity in humans.

SUBMITTER: Gao H 

PROVIDER: S-EPMC10187174 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature

Publications

The landscape of tolerated genetic variation in humans and primates.

Gao Hong H   Hamp Tobias T   Ede Jeffrey J   Schraiber Joshua G JG   McRae Jeremy J   Singer-Berk Moriel M   Yang Yanshen Y   Dietrich Anastasia A   Fiziev Petko P   Kuderna Lukas L   Sundaram Laksshman L   Wu Yibing Y   Adhikari Aashish A   Field Yair Y   Chen Chen C   Batzoglou Serafim S   Aguet Francois F   Lemire Gabrielle G   Reimers Rebecca R   Balick Daniel D   Janiak Mareike C MC   Kuhlwilm Martin M   Orkin Joseph D JD   Manu Shivakumara S   Valenzuela Alejandro A   Bergman Juraj J   Rouselle Marjolaine M   Silva Felipe Ennes FE   Agueda Lidia L   Blanc Julie J   Gut Marta M   de Vries Dorien D   Goodhead Ian I   Harris R Alan RA   Raveendran Muthuswamy M   Jensen Axel A   Chuma Idriss S IS   Horvath Julie J   Hvilsom Christina C   Juan David D   Frandsen Peter P   de Melo Fabiano R FR   Bertuol Fabricio F   Byrne Hazel H   Sampaio Iracilda I   Farias Izeni I   do Amaral João Valsecchi JV   Messias Mariluce M   da Silva Maria N F MNF   Trivedi Mihir M   Rossi Rogerio R   Hrbek Tomas T   Andriaholinirina Nicole N   Rabarivola Clément J CJ   Zaramody Alphonse A   Jolly Clifford J CJ   Phillips-Conroy Jane J   Wilkerson Gregory G   Abee Christian C   Simmons Joe H JH   Fernandez-Duque Eduardo E   Kanthaswamy Sree S   Shiferaw Fekadu F   Wu Dongdong D   Zhou Long L   Shao Yong Y   Zhang Guojie G   Keyyu Julius D JD   Knauf Sascha S   Le Minh D MD   Lizano Esther E   Merker Stefan S   Navarro Arcadi A   Batallion Thomas T   Nadler Tilo T   Khor Chiea Chuen CC   Lee Jessica J   Tan Patrick P   Lim Weng Khong WK   Kitchener Andrew C AC   Zinner Dietmar D   Gut Ivo I   Melin Amanda A   Guschanski Katerina K   Schierup Mikkel Heide MH   Beck Robin M D RMD   Umapathy Govindhaswamy G   Roos Christian C   Boubli Jean P JP   Lek Monkol M   Sunyaev Shamil S   O'Donnell Anne A   Rehm Heidi H   Xu Jinbo J   Rogers Jeffrey J   Marques-Bonet Tomas T   Kai-How Farh Kyle K  

bioRxiv : the preprint server for biology, 2023-05-02


Similar Datasets

| S-EPMC10713091 | biostudies-literature
| S-EPMC3317143 | biostudies-literature
| S-EPMC2651622 | biostudies-literature
| S-EPMC4874600 | biostudies-literature
| S-EPMC6905649 | biostudies-literature
| S-EPMC3789121 | biostudies-literature
| S-EPMC5018207 | biostudies-literature
| S-EPMC1448813 | biostudies-literature
| S-EPMC4486519 | biostudies-literature
2016-05-27 | E-MTAB-3656 | biostudies-arrayexpress