Dataset Information

Supervised Classes, Unsupervised Mixing Proportions: Detection of Bots in a Likert-Type Questionnaire.


ABSTRACT: Administering Likert-type questionnaires to online samples risks contamination of the data by malicious computer-generated random responses, also known as bots. Although nonresponsivity indices (NRIs) such as person-total correlations or Mahalanobis distance have shown great promise for detecting bots, universal cutoff values are elusive. An initial calibration sample constructed via stratified sampling of bots and humans (real or simulated under a measurement model) has been used to empirically choose cutoffs with a high nominal specificity. However, a high-specificity cutoff is less accurate when the target sample has a high contamination rate. In the present article, we propose the supervised classes, unsupervised mixing proportions (SCUMP) algorithm, which chooses a cutoff to maximize accuracy. SCUMP uses a Gaussian mixture model to estimate, without supervision, the contamination rate in the sample of interest. A simulation study found that, in the absence of model misspecification for the bots, our cutoffs maintained accuracy across varying contamination rates.
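
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single NRI per respondent, Gaussian class distributions whose means and standard deviations are fitted on a labeled calibration sample, and bots that tend to score higher than humans. All function names and the simulated parameters (human scores around 0, bot scores around 3) are hypothetical. Only the mixing proportion is estimated from the unlabeled target sample, via EM with the class parameters held fixed, and the cutoff is then chosen by grid search to maximize expected accuracy at that estimated contamination rate.

```python
import math
import random
from statistics import mean, stdev


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_class(scores):
    """Supervised step: (mean, sd) of one labeled class in the calibration sample."""
    return mean(scores), stdev(scores)


def estimate_contamination(scores, human, bot, n_iter=500, tol=1e-9):
    """Unsupervised step: EM over the bot proportion only; class parameters stay fixed."""
    pi_bot = 0.5
    for _ in range(n_iter):
        total = 0.0
        for x in scores:
            b = pi_bot * normal_pdf(x, *bot)
            h = (1.0 - pi_bot) * normal_pdf(x, *human)
            # Posterior probability that this respondent is a bot.
            total += b / (b + h) if (b + h) > 0.0 else 0.5
        new_pi = total / len(scores)
        converged = abs(new_pi - pi_bot) < tol
        pi_bot = new_pi
        if converged:
            break
    return pi_bot


def accuracy_maximizing_cutoff(pi_bot, human, bot, grid_size=2001):
    """Grid search for the cutoff c (flag as bot when score > c) with the
    highest expected accuracy under the two fitted Gaussians."""
    spread = 4.0 * max(human[1], bot[1])
    lo = min(human[0], bot[0]) - spread
    hi = max(human[0], bot[0]) + spread
    best_c, best_acc = lo, -1.0
    for i in range(grid_size):
        c = lo + (hi - lo) * i / (grid_size - 1)
        sensitivity = 1.0 - normal_cdf(c, *bot)   # bots correctly flagged
        specificity = normal_cdf(c, *human)       # humans correctly kept
        acc = pi_bot * sensitivity + (1.0 - pi_bot) * specificity
        if acc > best_acc:
            best_c, best_acc = c, acc
    return best_c, best_acc


def run_demo(seed=0, true_rate=0.3, n_target=2000):
    """Simulated illustration; every distribution here is an assumption."""
    rng = random.Random(seed)
    human = fit_class([rng.gauss(0.0, 1.0) for _ in range(500)])
    bot = fit_class([rng.gauss(3.0, 1.0) for _ in range(500)])
    n_bots = round(true_rate * n_target)
    target = [rng.gauss(3.0, 1.0) for _ in range(n_bots)]
    target += [rng.gauss(0.0, 1.0) for _ in range(n_target - n_bots)]
    pi_hat = estimate_contamination(target, human, bot)
    cutoff, _ = accuracy_maximizing_cutoff(pi_hat, human, bot)
    return pi_hat, cutoff
```

Calling `run_demo()` returns the estimated contamination rate and the chosen cutoff for one simulated target sample. Under these assumptions the cutoff moves lower as the estimated bot proportion rises, which is the behavior a fixed high-specificity cutoff cannot provide.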

SUBMITTER: Ilagan MJ 

PROVIDER: S-EPMC9972131 | biostudies-literature | 2023 Apr

REPOSITORIES: biostudies-literature

Publications

Supervised Classes, Unsupervised Mixing Proportions: Detection of Bots in a Likert-Type Questionnaire.

Michael John Ilagan, Carl F. Falk

Educational and Psychological Measurement, 2022-07-30, issue 2


Similar Datasets

| S-EPMC3359096 | biostudies-literature
| S-EPMC9019745 | biostudies-literature
| S-EPMC3956069 | biostudies-literature
| S-EPMC5532435 | biostudies-other
| S-EPMC9890099 | biostudies-literature
| S-EPMC8496329 | biostudies-literature
| S-EPMC8042883 | biostudies-literature
| S-EPMC5435816 | biostudies-literature
| S-EPMC9601423 | biostudies-literature
| S-EPMC4072515 | biostudies-literature