Dataset Information

Towards understanding policy design through text-as-data approaches: The policy design annotations (POLIANNA) dataset.


ABSTRACT: Despite the importance of ambitious policy action for addressing climate change, large and systematic assessments of public policies and their design are lacking as analysing text manually is labour-intensive and costly. POLIANNA is a dataset of policy texts from the European Union (EU) that are annotated based on theoretical concepts of policy design, which can be used to develop supervised machine learning approaches for scaling policy analysis. The dataset consists of 20,577 annotated spans, drawn from 18 EU climate change mitigation and renewable energy policies. We developed a novel coding scheme translating existing taxonomies of policy design elements to a method for annotating text spans that consist of one or several words. Here, we provide the coding scheme, a description of the annotated corpus, and an analysis of inter-annotator agreement, and discuss potential applications. As understanding policy texts is still difficult for current text-processing algorithms, we envision this database to be used for building tools that help with manual coding of policy texts by automatically proposing paragraphs containing relevant information.

SUBMITTER: Sewerin S 

PROVIDER: S-EPMC10719256 | biostudies-literature | 2023 Dec

REPOSITORIES: biostudies-literature

Publications

Towards understanding policy design through text-as-data approaches: The policy design annotations (POLIANNA) dataset.

Sebastian Sewerin, Lynn H Kaack, Joel Küttel, Fride Sigurdsson, Onerva Martikainen, Alisha Esshaki, Fabian Hafner

Scientific Data, 2023-12-13, 1


Similar Datasets

| S-EPMC10636146 | biostudies-literature
| S-EPMC10749236 | biostudies-literature
| S-EPMC2335285 | biostudies-literature
| S-EPMC11871503 | biostudies-literature
| S-EPMC1769513 | biostudies-literature
| S-EPMC6584868 | biostudies-literature
| S-EPMC11840378 | biostudies-literature
| S-EPMC7680666 | biostudies-literature
| S-EPMC4009763 | biostudies-other
| S-EPMC3698532 | biostudies-literature