Dataset Information

A large dataset of annotated incident reports on medication errors.

ABSTRACT: Incident reports of medication errors are valuable learning resources for improving patient safety. However, pertinent information is often contained within unstructured free text, which prevents automated analysis and limits the usefulness of these data. Natural language processing can structure this free text automatically and retrieve relevant past incidents and learning materials, but doing so requires a large, fully annotated and validated corpus of incident reports. We present a corpus of 58,658 machine-annotated incident reports of medication errors that can be used to advance the development of information extraction models and subsequent incident learning. In the cross-validation exercise, the best F1-scores for the annotated dataset were 0.97 for named entity recognition and 0.76 for intention/factuality analysis. Our dataset contains 478,175 named entities and differentiates between incident types by recognising discrepancies between what was intended and what actually occurred. We explain our annotation workflow and technical validation and provide access to the validation datasets and machine annotator for labelling future incident reports of medication errors.

SUBMITTER: Wong ZSY

PROVIDER: S-EPMC10904777 | biostudies-literature | 2024 Feb

REPOSITORIES: biostudies-literature

Publications

A large dataset of annotated incident reports on medication errors.

Wong ZSY, Waters N, Liu J, Ushiro S

Scientific Data, 2024 Feb 29, issue 1

PMID: 38424103

Similar Datasets

Project description: Our sense of hearing is mediated by cochlear hair cells, localized within the sensory epithelium called the organ of Corti. There are two types of hair cells in the cochlea, organized in one row of inner hair cells and three rows of outer hair cells. Each cochlea contains a few thousand hair cells, and their survival is essential for our perception of sound because they are terminally differentiated and do not regenerate after insult. It is often desirable in hearing research to quantify the number of hair cells within cochlear samples, both in pathological conditions and in response to treatment. However, the sheer number of cells along the cochlea makes manual quantification impractical. Machine learning can overcome this challenge by automating the quantification process, but it requires a vast and diverse dataset for effective training. In this study, we present a large collection of annotated cochlear hair-cell datasets, labeled with commonly used hair-cell markers and imaged using various fluorescence microscopy techniques. The collection includes samples from mouse, human, pig and guinea pig cochlear tissue, from normal conditions and following in-vivo and in-vitro ototoxic drug application. The dataset includes over 90,000 hair cells, all of which have been manually identified and annotated as one of two cell types: inner hair cells and outer hair cells. This dataset is the result of a collaborative effort from multiple laboratories and has been carefully curated to represent a variety of imaging techniques. With suggested usage parameters and a well-described annotation procedure, this collection can facilitate the development of generalizable cochlear hair cell detection models or serve as a starting point for fine-tuning models for other analysis tasks. By providing this dataset, we aim to give other groups within the hearing research community the opportunity to develop their own tools for analyzing cochlear imaging data more fully, accurately, and with greater ease.

Project description: Objectives: To compare medication errors identified at audit and via direct observation with medication errors reported to an incident reporting system at paediatric hospitals and to investigate differences in types and severity of errors detected and reported by staff. Methods: This is a comparison study at two tertiary referral paediatric hospitals between 2016 and 2020 in Australia. Prescribing errors were identified from a medication chart audit of 7785 patient records. Medication administration errors were identified from a prospective direct observational study of 5137 medication administration doses to 1530 patients. Medication errors reported to the hospitals' incident reporting system were identified and matched with errors identified at audit and observation. Results: Of 11 302 clinical prescribing errors identified at audit, 3.2 per 1000 errors (95% CI 2.3 to 4.4, n=36) had an incident report. Of 2224 potentially serious prescribing errors from audit, 26.1% (95% CI 24.3 to 27.9, n=580) were detected by staff and 11.2 per 1000 errors (95% CI 7.6 to 16.5, n=25) were reported to the incident system. Although the prescribing error detection rates varied between the two hospitals, there was no difference in incident reporting rates regardless of error severity. Of 40 errors associated with actual patient harm, only 7 (17.5%; 95% CI 8.7% to 31.9%) were detected by staff and 4 (10.0%; 95% CI 4.0% to 23.1%) had an incident report. None of the 2883 clinical medication administration errors observed, including 903 potentially serious errors and 144 errors associated with actual patient harm, had incident reports. Conclusion: Incident reporting data do not provide an accurate reflection of medication errors and related harm to children in hospitals. Failure to detect medication errors is likely to be a significant contributor to low error reporting rates. In an era of electronic health records, new automated approaches to monitor medication safety should be pursued to provide real-time monitoring.
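The rates and 95% confidence intervals quoted in the paediatric medication-error description above can be re-derived from the counts it reports. The sketch below is illustrative only: the description does not say which interval method the authors used, so the Wilson score interval is an assumption here, and the function name `wilson_interval` is hypothetical. Under that assumption, the computed bounds agree with the quoted figures to within rounding.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    Assumption: the source does not state its CI method.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the description above.
cases = {
    "prescribing errors with an incident report": (36, 11302),
    "serious prescribing errors detected by staff": (580, 2224),
    "serious prescribing errors reported": (25, 2224),
    "harm-associated errors detected by staff": (7, 40),
    "harm-associated errors with an incident report": (4, 40),
}

results = {}
for label, (k, n) in cases.items():
    low, high = wilson_interval(k, n)
    results[label] = (k / n, low, high)
    print(f"{label}: {k}/{n} = {k / n:.4f} "
          f"(95% CI {low:.4f} to {high:.4f})")
```

Multiplying a proportion by 1000 gives the "per 1000 errors" figures, and by 100 the percentages.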

Project description: We present Audiovisual Moments in Time (AVMIT), a large-scale dataset of audiovisual action events. In an extensive annotation task, 11 participants labelled a subset of 3-second audiovisual videos from the Moments in Time dataset (MIT). For each trial, participants assessed whether the labelled audiovisual action event was present and whether it was the most prominent feature of the video. The dataset includes the annotation of 57,177 audiovisual videos, each independently evaluated by 3 of 11 trained participants. From this initial collection, we created a curated test set of 16 distinct action classes, with 60 videos each (960 videos). We also offer 2 sets of pre-computed audiovisual feature embeddings, using VGGish/YamNet for audio data and VGG16/EfficientNetB0 for visual data, thereby lowering the barrier to entry for audiovisual DNN research. We explored the advantages of AVMIT annotations and feature embeddings to improve performance on audiovisual event recognition. A series of 6 Recurrent Neural Networks (RNNs) were trained on either AVMIT-filtered audiovisual events or modality-agnostic events from MIT, and then tested on our audiovisual test set. In all RNNs, top-1 accuracy was increased by 2.71-5.94% by training exclusively on audiovisual events, even outweighing a three-fold increase in training data. Additionally, we introduce the Supervised Audiovisual Correspondence (SAVC) task, whereby a classifier must discern whether audio and visual streams correspond to the same action label. We trained 6 RNNs on the SAVC task, with or without AVMIT-filtering, to explore whether AVMIT is helpful for cross-modal learning. In all RNNs, accuracy improved by 2.09-19.16% with AVMIT-filtered data. We anticipate that the newly annotated AVMIT dataset will serve as a valuable resource for research and comparative experiments involving computational models and human participants, specifically when addressing research questions where audiovisual correspondence is of critical importance.
