
Dataset Information


A Benford's law based method for fraud detection using R Library.


ABSTRACT: Benford's Law (BL) states that the significant digits in many data sets drawn from natural and human phenomena are not uniformly distributed, as one might naively expect, but follow a logarithmic distribution. Here, we present a method that applies BL analysis to the first digit and to the first two digits, using three statistical conformity tests (Z-statistics, Mean Absolute Deviation (MAD) and Chi-square (χ2)) together with the summation test, which looks for excessively large numbers. Fraud detection is one of its applications. We developed the method for fraud detection in the Brazilian Bolsa Familia welfare program, applying it to four periods of program payments in a dataset of 13,442,529 records. We provide a practical implementation of the method based on an open-source R library, released in a public repository. The code and the datasets are freely available. The advantages of the algorithm are:
• The method was developed with open-source libraries.
• The technique is simple, rapid and easy to use.
• It is easily applicable to the auditing of other social welfare programs.
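The tests named in the abstract can be illustrated with a minimal sketch for the first-digit case. It is written in Python for illustration only and is not the authors' R implementation. All function names are hypothetical. The Z-statistic uses the common continuity-corrected form, and the summation test is assumed to compare each digit's share of the summed values against an equal share of 1/9; both are assumptions, since the record does not give the formulas.

```python
import math
from collections import Counter

DIGITS = range(1, 10)
# Benford's expected first-digit proportions: P(d) = log10(1 + 1/d)
EXPECTED = {d: math.log10(1 + 1 / d) for d in DIGITS}

def first_digit(x):
    """Leading non-zero digit of a non-zero number, read from scientific notation."""
    if x == 0:
        raise ValueError("zero has no leading significant digit")
    return int(f"{abs(x):.15e}"[0])

def observed_proportions(values):
    """Observed first-digit proportions and the record count; zeros are skipped."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    n = sum(counts.values())
    return {d: counts.get(d, 0) / n for d in DIGITS}, n

def mad(observed):
    """Mean Absolute Deviation between observed and expected proportions."""
    return sum(abs(observed[d] - EXPECTED[d]) for d in DIGITS) / len(DIGITS)

def chi_square(observed, n):
    """Chi-square statistic (8 degrees of freedom for first digits)."""
    return sum(n * (observed[d] - EXPECTED[d]) ** 2 / EXPECTED[d] for d in DIGITS)

def z_statistics(observed, n):
    """Per-digit Z-statistic with a continuity correction of 1/(2n) (assumed form)."""
    z = {}
    for d in DIGITS:
        diff = abs(observed[d] - EXPECTED[d])
        correction = 1 / (2 * n)
        if correction < diff:  # apply the correction only when it is smaller than the gap
            diff -= correction
        z[d] = diff / math.sqrt(EXPECTED[d] * (1 - EXPECTED[d]) / n)
    return z

def summation_shares(values):
    """Share of the total absolute amount carried by each first digit."""
    totals = Counter()
    for v in values:
        if v != 0:
            totals[first_digit(v)] += abs(v)
    grand_total = sum(totals.values())
    return {d: totals.get(d, 0) / grand_total for d in DIGITS}
```

A short usage example on synthetic data (powers of two, which are known to follow Benford's Law closely):

```python
values = [2 ** k for k in range(100)]
obs, n = observed_proportions(values)
print(mad(obs), chi_square(obs, n))
print(z_statistics(obs, n))
```

Extending the sketch to the first-two-digits test would mean replacing the digit range 1..9 with 10..99 and the expected proportion with log10(1 + 1/d) for each two-digit prefix d.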

SUBMITTER: Azevedo CDS 

PROVIDER: S-EPMC8720889 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature


Publications

A Benford's law based method for fraud detection using R Library.

Azevedo, Caio da Silva; Gonçalves, Rodrigo Franco; Gava, Vagner Luiz; Spinola, Mauro de Mesquita

MethodsX, 2021-11-11



Similar Datasets

| S-EPMC10088595 | biostudies-literature
| S-EPMC8967852 | biostudies-literature
| S-EPMC4496784 | biostudies-other
| S-EPMC4452586 | biostudies-literature
| S-EPMC4331362 | biostudies-literature
| S-EPMC9289319 | biostudies-literature
| S-EPMC8356649 | biostudies-literature
| S-EPMC11215073 | biostudies-literature
| S-EPMC6361435 | biostudies-literature
| S-EPMC9584388 | biostudies-literature