ABSTRACT:
SUBMITTER: Rosenman ETR
PROVIDER: S-EPMC10199086 | biostudies-literature | 2023 May
REPOSITORIES: biostudies-literature
Evan T. R. Rosenman, Santiago Olivella, Kosuke Imai
Scientific Data 2023-05-19 1
We provide the largest compiled publicly available dictionaries of first names, middle names, and surnames for the purpose of imputing race and ethnicity using, for example, Bayesian Improved Surname Geocoding (BISG). The dictionaries are based on the voter files of six Southern U.S. states that collect self-reported racial data upon voter registration. Our data cover the racial make-up of a larger set of names than any comparable dataset, containing 136 thousand first names, 125 thousand middle names, and ...
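For readers unfamiliar with BISG, the following is a minimal sketch of the naive-Bayes update it is usually described as: the probability of a race category given a surname is combined with the probability of living in a location given that race, then renormalized. It assumes surname and location are conditionally independent given race. The function name `bisg_posterior`, the category labels, and all probabilities are hypothetical illustrations, not values from the dictionaries described above.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and location evidence into P(race | surname, geo).

    Both arguments are dicts keyed by the same race categories.
    Assumes surname and location are independent given race.
    """
    # Unnormalized posterior: P(race | surname) * P(geo | race)
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all categories have zero probability")
    # Renormalize so the probabilities sum to 1
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical inputs for one surname and one geographic unit (made-up numbers).
p_race_given_surname = {"white": 0.6, "black": 0.3, "hispanic": 0.1}
p_geo_given_race = {"white": 0.001, "black": 0.004, "hispanic": 0.002}

posterior = bisg_posterior(p_race_given_surname, p_geo_given_race)
print(posterior)
```

In this toy case the location evidence shifts most of the probability mass toward the second category, even though the surname alone favored the first. Name dictionaries like the ones described here would supply the name-conditional probabilities in such a calculation.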