Dataset Information

A draft human pangenome reference.


ABSTRACT: Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals (ref. 1). These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

SUBMITTER: Liao WW 

PROVIDER: S-EPMC10172123 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature

Publications

A draft human pangenome reference.

Wen-Wei Liao, Mobin Asri, Jana Ebler, Daniel Doerr, Marina Haukness, Glenn Hickey, Shuangjia Lu, Julian K Lucas, Jean Monlong, Haley J Abel, Silvia Buonaiuto, Xian H Chang, Haoyu Cheng, Justin Chu, Vincenza Colonna, Jordan M Eizenga, Xiaowen Feng, Christian Fischer, Robert S Fulton, Shilpa Garg, Cristian Groza, Andrea Guarracino, William T Harvey, Simon Heumos, Kerstin Howe, Miten Jain, Tsung-Yu Lu, Charles Markello, Fergal J Martin, Matthew W Mitchell, Katherine M Munson, Moses Njagi Mwaniki, Adam M Novak, Hugh E Olsen, Trevor Pesout, David Porubsky, Pjotr Prins, Jonas A Sibbesen, Jouni Sirén, Chad Tomlinson, Flavia Villani, Mitchell R Vollger, Lucinda L Antonacci-Fulton, Gunjan Baid, Carl A Baker, Anastasiya Belyaeva, Konstantinos Billis, Andrew Carroll, Pi-Chuan Chang, Sarah Cody, Daniel E Cook, Robert M Cook-Deegan, Omar E Cornejo, Mark Diekhans, Peter Ebert, Susan Fairley, Olivier Fedrigo, Adam L Felsenfeld, Giulio Formenti, Adam Frankish, Yan Gao, Nanibaa' A Garrison, Carlos Garcia Giron, Richard E Green, Leanne Haggerty, Kendra Hoekzema, Thibaut Hourlier, Hanlee P Ji, Eimear E Kenny, Barbara A Koenig, Alexey Kolesnikov, Jan O Korbel, Jennifer Kordosky, Sergey Koren, HoJoon Lee, Alexandra P Lewis, Hugo Magalhães, Santiago Marco-Sola, Pierre Marijon, Ann McCartney, Jennifer McDaniel, Jacquelyn Mountcastle, Maria Nattestad, Sergey Nurk, Nathan D Olson, Alice B Popejoy, Daniela Puiu, Mikko Rautiainen, Allison A Regier, Arang Rhie, Samuel Sacco, Ashley D Sanders, Valerie A Schneider, Baergen I Schultz, Kishwar Shafin, Michael W Smith, Heidi J Sofia, Ahmad N Abou Tayoun, Françoise Thibaud-Nissen, Francesca Floriana Tricomi, Justin Wagner, Brian Walenz, Jonathan M D Wood, Aleksey V Zimin, Guillaume Bourque, Mark J P Chaisson, Paul Flicek, Adam M Phillippy, Justin M Zook, Evan E Eichler, David Haussler, Ting Wang, Erich D Jarvis, Karen H Miga, Erik Garrison, Tobias Marschall, Ira M Hall, Heng Li, Benedict Paten

Nature, 10 May 2023, issue 7960



Similar Datasets

S-EPMC8350888 | biostudies-literature
S-EPMC10363913 | biostudies-literature
S-EPMC10322713 | biostudies-literature
PRJEB54791 | ENA
S-EPMC10603592 | biostudies-literature
S-EPMC10882747 | biostudies-literature
S-EPMC7568353 | biostudies-literature
S-EPMC11514843 | biostudies-literature
S-EPMC10954445 | biostudies-literature
S-EPMC11804092 | biostudies-literature