Dataset Information

A globally synthesised and flagged bee occurrence dataset and cleaning workflow.


ABSTRACT: Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats: "cleaned" and "flagged-but-uncleaned". The BeeBDC package, with its online documentation, lets end users modify filtering parameters to suit their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.
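
The difference between the "flagged-but-uncleaned" and "cleaned" formats can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the BeeBDC API; the records, field names, flag names, and the `flag_records` and `clean_records` helpers are all hypothetical. The idea shown is only that flags are added per record without deleting anything, and that a "cleaned" set is then derived by choosing which flags a record must pass.

```python
from datetime import date

# Hypothetical toy records; field names are illustrative, not the BeeBDC schema.
records = [
    {"id": "a1", "species": "Apis mellifera", "country": "Australia",
     "lat": -34.9, "lon": 138.6, "date": date(2019, 3, 2)},
    # Same species, place, and date as a1, so it should be flagged as a duplicate.
    {"id": "a2", "species": "Apis mellifera", "country": "Australia",
     "lat": -34.9, "lon": 138.6, "date": date(2019, 3, 2)},
    # Missing country name.
    {"id": "b1", "species": "Bombus terrestris", "country": "",
     "lat": 51.5, "lon": -0.1, "date": date(2005, 6, 14)},
    # Impossible latitude and no collection date.
    {"id": "c1", "species": "Osmia bicornis", "country": "Germany",
     "lat": 123.0, "lon": 11.6, "date": None},
]

ALL_FLAGS = ["flag_coords_valid", "flag_has_country",
             "flag_has_date", "flag_not_duplicate"]


def flag_records(rows):
    """Return copies of rows with boolean flag fields (True means the check passed).

    No record is dropped here: this is the "flagged-but-uncleaned" view.
    """
    seen = set()
    flagged = []
    for row in rows:
        key = (row["species"], row["lat"], row["lon"], row["date"])
        out = dict(row)
        out["flag_coords_valid"] = (-90 <= row["lat"] <= 90
                                    and -180 <= row["lon"] <= 180)
        out["flag_has_country"] = bool(row["country"])
        out["flag_has_date"] = row["date"] is not None
        # The first record with a given key is kept; later ones are flagged.
        out["flag_not_duplicate"] = key not in seen
        seen.add(key)
        flagged.append(out)
    return flagged


def clean_records(flagged, required_flags):
    """Keep only records that pass every flag the user chooses to enforce."""
    return [r for r in flagged if all(r[f] for f in required_flags)]


flagged = flag_records(records)                      # all four records, with flags
cleaned_strict = clean_records(flagged, ALL_FLAGS)   # enforce every check
cleaned_lenient = clean_records(                     # tolerate missing country/date
    flagged, ["flag_coords_valid", "flag_not_duplicate"])

print([r["id"] for r in cleaned_strict])
print([r["id"] for r in cleaned_lenient])
```

Keeping the flags separate from the filtering step is what allows the same flagged data to yield different cleaned subsets for different research questions, as in the strict and lenient selections above.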

SUBMITTER: Dorey JB 

PROVIDER: S-EPMC10622554 | biostudies-literature | 2023 Nov

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4267104 | biostudies-other
| S-EPMC6366754 | biostudies-literature
| S-EPMC9259668 | biostudies-literature
| S-EPMC8897392 | biostudies-literature
| S-EPMC10557005 | biostudies-literature
| S-EPMC4700860 | biostudies-literature
| S-EPMC11770145 | biostudies-literature
| PRJNA930680 | ENA
| S-EPMC8384828 | biostudies-literature
| S-EPMC7019919 | biostudies-literature