Dataset Information

A large dataset of annotated incident reports on medication errors.


ABSTRACT: Incident reports of medication errors are valuable learning resources for improving patient safety. However, pertinent information is often contained within unstructured free text, which prevents automated analysis and limits the usefulness of these data. Natural language processing can structure this free text automatically and retrieve relevant past incidents and learning materials, but to be able to do so requires a large, fully annotated and validated corpus of incident reports. We present a corpus of 58,658 machine-annotated incident reports of medication errors that can be used to advance the development of information extraction models and subsequent incident learning. We report the best F1-scores for the annotated dataset: 0.97 and 0.76 for named entity recognition and intention/factuality analysis, respectively, for the cross-validation exercise. Our dataset contains 478,175 named entities and differentiates between incident types by recognising discrepancies between what was intended and what actually occurred. We explain our annotation workflow and technical validation and provide access to the validation datasets and machine annotator for labelling future incident reports of medication errors.

SUBMITTER: Wong ZSY 

PROVIDER: S-EPMC10904777 | biostudies-literature | 2024 Feb

REPOSITORIES: biostudies-literature

Publications

A large dataset of annotated incident reports on medication errors.

Zoie S Y Wong, Neil Waters, Jiaxing Liu, Shin Ushiro

Scientific Data, 2024-02-29 (issue 1)


Similar Datasets

| S-EPMC7451761 | biostudies-literature
| S-EPMC10280557 | biostudies-literature
| S-EPMC3608041 | biostudies-other
| S-EPMC7651942 | biostudies-literature
| S-EPMC10491224 | biostudies-literature
| S-EPMC10764906 | biostudies-literature
| S-EPMC11503142 | biostudies-literature
| S-EPMC10984512 | biostudies-literature
| S-EPMC7424444 | biostudies-literature
| S-EPMC9710490 | biostudies-literature