Dataset Information

A dataset for voice-based human identity recognition.


ABSTRACT: This paper introduces a new English speech dataset suitable for training and evaluating speaker recognition systems. Samples were obtained from non-native English speakers from the Arab region over the course of two months. The dataset was divided into two sub-datasets. Ten samples were collected from each speaker for each sub-dataset. The first sub-dataset contains samples of speakers repeating the phrase "Machine learning 1, 2, 3, 4, 5, 6, 7, 8, 9, 10". The second sub-dataset contains samples for the same speakers speaking randomly for five to ten seconds for each sample. The dataset consists of 150 speakers with a total of 3,000 data samples and about six hours of speech.
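The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the six-hour total is treated as approximate because the abstract says "about six hours".

```python
# Figures quoted in the abstract (names here are hypothetical, for illustration).
SPEAKERS = 150
SUB_DATASETS = 2
SAMPLES_PER_SPEAKER_PER_SUBSET = 10
TOTAL_HOURS = 6  # "about six hours", so this is an approximation


def total_samples(speakers: int, sub_datasets: int, samples_each: int) -> int:
    """Number of recordings implied by the per-speaker collection scheme."""
    return speakers * sub_datasets * samples_each


def mean_sample_seconds(total_hours: float, n_samples: int) -> float:
    """Average recording length in seconds implied by the stated totals."""
    return total_hours * 3600 / n_samples


n_samples = total_samples(SPEAKERS, SUB_DATASETS, SAMPLES_PER_SPEAKER_PER_SUBSET)
avg_seconds = mean_sample_seconds(TOTAL_HOURS, n_samples)

print(n_samples)    # 150 * 2 * 10 = 3000, matching the stated total
print(avg_seconds)  # roughly 7.2 s per sample
```

The implied average of roughly seven seconds per sample falls within the five-to-ten-second range stated for the second sub-dataset.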

SUBMITTER: Alsaify BA 

PROVIDER: S-EPMC8958529 | biostudies-literature | 2022 Jun

REPOSITORIES: biostudies-literature

Publications

A dataset for voice-based human identity recognition.

Baha' A. Alsaify, Hadeel S. Abu Arja, Baskal Y. Maayah, Masa M. Al-Taweel

Data in Brief, 2022 Mar 18


Similar Datasets

S-EPMC5837691 | biostudies-literature
S-EPMC3690478 | biostudies-literature
S-EPMC5091681 | biostudies-literature
S-EPMC8288083 | biostudies-literature
S-EPMC4242590 | biostudies-literature
S-EPMC7240209 | biostudies-literature
S-EPMC11449991 | biostudies-literature
S-EPMC9334438 | biostudies-literature
S-EPMC7341857 | biostudies-literature
S-EPMC4766561 | biostudies-literature