Dataset Information

Dataset of controversial news posts in Spanish from the reader's perspective.

ABSTRACT: This paper presents a corpus of Spanish news posts obtained from X, annotated for controversy via crowdsourcing. A total of 60 tweets were collected from 8 different newspapers. For the annotation task, a survey was developed and sent to 31 participants, who rated the level of controversy they perceived from the news summary and headline shown in each post. The most frequently selected option was assigned as the initial controversy level of the post. The final annotation of the corpus was derived from an analysis of the raw data that computed the inter-annotator agreement (IAA). The analysis showed that binarizing the data was the most suitable way to annotate it. A potential use for this dataset is detailed in further sections.
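The labeling steps described in the abstract (most frequent option, binarization, agreement analysis) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the rating scale, the tie-breaking rule, the binarization threshold, or the agreement statistic, so the integer levels, the lowest-level tie-break, the `threshold` parameter, and the use of Fleiss' kappa are all hypothetical choices, and the ratings in the usage lines are toy values, not data from the corpus.

```python
from collections import Counter


def majority_label(ratings):
    """Return the most frequently selected option for one post.

    Assumption: ties are broken toward the lowest level.
    """
    counts = Counter(ratings)
    top = max(counts.values())
    return min(level for level, n in counts.items() if n == top)


def binarize(level, threshold):
    """Map a controversy level to 1 (controversial) or 0 (not).

    Assumption: levels at or above `threshold` count as controversial.
    """
    return 1 if level >= threshold else 0


def fleiss_kappa(rows, categories):
    """Fleiss' kappa for items that were each rated by the same number of raters.

    rows: one list of ratings per post.
    categories: every option a rater could select.
    Returns NaN when all ratings fall in a single category (kappa undefined).
    """
    n_items = len(rows)
    n_raters = len(rows[0])
    # counts[i][j] = number of raters who assigned category j to item i
    counts = [[Counter(r)[c] for c in categories] for r in rows]
    # Observed agreement, averaged over items
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items
    # Expected agreement from the overall category proportions
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1:
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


# Toy ratings (hypothetical): 3 posts, 4 raters each, levels 1 to 3
toy = [[1, 1, 2, 3], [3, 3, 3, 2], [1, 2, 2, 2]]
initial = [majority_label(r) for r in toy]
binary = [binarize(level, threshold=2) for level in initial]
kappa = fleiss_kappa(toy, categories=[1, 2, 3])
```

Comparing the agreement score before and after binarizing the raw ratings is one way such an analysis could motivate a two-class annotation, though the paper itself should be consulted for the procedure actually used.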

SUBMITTER: Macias C

PROVIDER: S-EPMC10912592 | biostudies-literature | 2024 Apr

REPOSITORIES: biostudies-literature

Publications

Dataset of controversial news posts in Spanish from the reader's perspective.

Macias Cesar C, Calvo Hiram H, Gambino Omar Juárez OJ

Data in Brief, 2024-02-23

PMID: 38445194

Similar Datasets

Spanish-Language Tobacco-Related Posts on Twitter: Content Analysis. | S-EPMC11519019 | biostudies-literature

Dataset of network simulator related-question posts in stack overflow. | S-EPMC8857582 | biostudies-literature

Breaking news: Unveiling a new dataset for Portuguese news classification and comparative analysis of approaches. | S-EPMC10817201 | biostudies-literature

Dataset of depressive posts in Russian language collected from social media. | S-EPMC7016367 | biostudies-literature

A Bengali news and public opinion dataset from YouTube. | S-EPMC10762352 | biostudies-literature

A social and news media benchmark dataset for topic modeling. | S-EPMC9289850 | biostudies-literature

Radon Risk Communication through News Stories: A Multi-Perspective Approach. | S-EPMC11506878 | biostudies-literature

FibVID: Comprehensive fake news diffusion dataset during the COVID-19 period. | S-EPMC9759652 | biostudies-literature

FakeNewsPerception: An eye movement dataset on the perceived believability of news stories. | S-EPMC7966981 | biostudies-literature

Examining Exposure to Messaging, Content, and Hate Speech from Partisan News Social Media Posts on Racial and Ethnic Health Disparities. | S-EPMC9960309 | biostudies-literature