
Dataset Information


Data Representativeness in Accessibility Datasets: A Meta-Analysis.


ABSTRACT: As data-driven systems are increasingly deployed at scale, ethical concerns have arisen around unfair and discriminatory outcomes for historically marginalized groups that are underrepresented in training data. In response, work on AI fairness and inclusion has called for datasets that are representative of various demographic groups. In this paper, we contribute an analysis of the representativeness of age, gender, and race & ethnicity in accessibility datasets (datasets sourced from people with disabilities and older adults) that can potentially play an important role in mitigating bias for inclusive AI-infused applications. We examine the current state of representation within these datasets by reviewing publicly available information on 190 of them. We find that accessibility datasets represent diverse ages, but have gender and race representation gaps. Additionally, we investigate how the sensitive and complex nature of demographic variables (e.g., gender, race & ethnicity) makes classification difficult and inconsistent, with the source of labeling often unknown. By reflecting on the current challenges and opportunities for representation of disabled data contributors, we hope our effort expands the space of possibility for greater inclusion of marginalized communities in AI-infused systems.

SUBMITTER: Kamikubo R 

PROVIDER: S-EPMC10024595 | biostudies-literature | 2022 Oct

REPOSITORIES: biostudies-literature


Publications

Data Representativeness in Accessibility Datasets: A Meta-Analysis.

Kamikubo Rie, Wang Lining, Marte Crystal, Mahmood Amnah, Kacorri Hernisa

ASSETS. Annual ACM Conference on Assistive Technologies, 2022-10-22



Similar Datasets

| S-EPMC6267811 | biostudies-literature
| S-EPMC8375514 | biostudies-literature
2019-12-18 | GSE122517 | GEO
| S-EPMC10961983 | biostudies-literature
| S-EPMC9229142 | biostudies-literature
| PRJEB30378 | ENA
| S-EPMC8855358 | biostudies-literature
2023-01-05 | GSE214252 | GEO
| S-EPMC6618769 | biostudies-literature
| S-EPMC4724289 | biostudies-literature