Classification of Twitter Users Who Tweet About E-Cigarettes.
ABSTRACT: Despite concerns about their health risks, e-cigarettes have gained popularity in recent years. Concurrent with the recent increase in e-cigarette use, social media sites such as Twitter have become a common platform for sharing information about e-cigarettes and for marketing them. Monitoring the trends in e-cigarette-related social media activity requires timely assessment of the content of posts and the types of users generating the content. However, little is known about the diversity of the types of users responsible for generating e-cigarette-related content on Twitter. The aim of this study was to demonstrate a novel methodology for automatically classifying Twitter users who tweet about e-cigarette-related topics into distinct categories. We collected approximately 11.5 million e-cigarette-related tweets posted between November 2014 and October 2016 and obtained a random sample of Twitter users who tweeted about e-cigarettes. Trained human coders examined the handles' profiles and manually categorized each as one of the following user types: individual (n=2168), vaper enthusiast (n=334), informed agency (n=622), marketer (n=752), and spammer (n=1021). Next, the Twitter metadata as well as a sample of tweets for each labeled user were gathered, and features that reflect users' metadata and tweeting behavior were analyzed. Finally, multiple machine learning algorithms were tested to identify the model with the best performance in classifying user types. Using a classification model that included metadata and features associated with tweeting behavior, we were able to predict five different types of Twitter users who tweet about e-cigarettes with relatively high accuracy (average F1 score=83.3%). Accuracy varied by user type, with F1 scores of 91.1% for individuals, 84.4% for informed agencies, 81.2% for marketers, 79.5% for spammers, and 47.1% for vaper enthusiasts.
Vaper enthusiasts were the most challenging user type to predict accurately and were commonly misclassified as marketers. The inclusion of additional tweet-derived features that capture tweeting behavior significantly improved model performance beyond metadata features alone, with an overall F1 score gain of 10.6%. This study provides a method for classifying five different types of users who tweet about e-cigarettes. Our model achieved high levels of classification performance for most groups, and examining the tweeting behavior was critical in improving the model performance. Results can help identify groups engaged in conversations about e-cigarettes online to help inform public health surveillance, education, and regulatory efforts.
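The abstract reports an "average F1 score" without saying how the per-class scores were averaged. The following is a minimal sketch, not the authors' evaluation code, that compares an unweighted (macro) mean with a class-size-weighted mean. It assumes that the labeled-user counts given above can stand in for the class supports, which the abstract does not confirm.

```python
# Per-class F1 scores (%) and labeled-user counts as reported in the abstract.
# Assumption: the labeled counts stand in for class supports; the abstract
# does not state the averaging scheme or the evaluation split.
REPORTED = {
    "individual": (91.1, 2168),
    "informed agency": (84.4, 622),
    "marketer": (81.2, 752),
    "spammer": (79.5, 1021),
    "vaper enthusiast": (47.1, 334),
}


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(scores):
    """Mean of the per-class F1 scores, weighted by class size."""
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


if __name__ == "__main__":
    print(f"macro F1:    {macro_f1(REPORTED):.1f}%")
    print(f"weighted F1: {weighted_f1(REPORTED):.1f}%")
```

Under these assumptions the weighted mean comes to about 83.3% and the macro mean to about 76.7%, which is consistent with (but does not prove) the reported average being weighted by class size. The gap between the two reflects how strongly the small vaper enthusiast class pulls down an unweighted average.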
Project description: OBJECTIVE: To determine the relative correlations of Twitter and Google Search user trends concerning smell loss with daily coronavirus disease 2019 (COVID-19) incidence in the United States, compared to other severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) symptoms. To describe the effect of mass media communications on Twitter and Google Search user trends. STUDY DESIGN: Retrospective observational study. SETTING: United States. SUBJECTS AND METHODS: Google Search and "tweet" frequency concerning COVID-19, smell, and nonsmell symptoms of COVID-19 generated between January 1 and April 8, 2020, were collected using Google Trends and Crimson Hexagon, respectively. Spearman coefficients linking each of these user trends to COVID-19 incidence were compared. For comparative analysis, correlations were also computed after excluding a short timeframe (March 22 to March 24) corresponding to the publication of a widely read lay media article reporting anosmia as a symptom of infection. RESULTS: Google searches and tweets concerning all nonsmell symptoms (0.744 and 0.761, respectively) and COVID-19 (0.899 and 0.848) are more strongly correlated with disease incidence than those concerning smell loss (0.564 and 0.539). Twitter users tweeting about smell loss during the study period were more likely to be female (52%) than users tweeting about COVID-19 more generally (47%). Tweet and Google Search frequency pertaining to smell loss increased significantly (>2.5 standard deviations) following a widely read media publication linking smell loss and SARS-CoV-2 infection. CONCLUSIONS: Google Search and tweet frequency regarding fever and shortness of breath are more robust indicators of COVID-19 incidence than those regarding anosmia. Mass media communications represent important confounders that should be considered in future analyses.
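To make the Spearman comparison concrete, here is a small standard-library sketch of the coefficient: rank both series (averaging tied ranks), then take the Pearson correlation of the ranks. The two series in the demo are invented for illustration and are not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical daily values, for illustration only.
    daily_cases = [1, 3, 7, 20, 55, 140]
    search_volume = [2, 1, 5, 9, 30, 80]
    print(f"rho = {spearman(daily_cases, search_volume):.3f}")
```

Because the coefficient depends only on rank order, a short media-driven spike can shift it noticeably, which is one reason to recompute it with the affected days removed, as the study did for March 22 to March 24.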
Project description: Social media data are increasingly used by researchers to gain insights on individuals' behaviors and opinions. Platforms like Twitter provide access to individuals' postings, networks of friends and followers, and the content to which they are exposed. This article presents the methods and results of an exploratory study to supplement survey data with respondents' Twitter postings, networks of Twitter friends and followers, and information to which they were exposed about e-cigarettes. Twitter use is important to consider in e-cigarette research and other topics influenced by online information sharing and exposure. Further, Twitter metadata provide direct measures of users' friends and followers as opposed to survey self-reports. We find that Twitter metadata provide similar information to survey questions on Twitter network size without inducing recall error or other measurement issues. Using sentiment coding and machine learning methods, we find that Twitter can shed light on topics that are difficult to measure via surveys, such as opinions expressed online and network composition. We present and discuss models predicting whether respondents tweet positively about e-cigarettes using survey and Twitter data, finding that the combined data provide broader measures than either source alone.
Project description:<h4>Background</h4>Menthol cigarettes are used disproportionately by African American, female, and adolescent smokers. Twitter is also used disproportionately by minority and younger populations, providing a unique window into conversations reflecting social norms, behavioral intentions, and sentiment toward menthol cigarettes.<h4>Objective</h4>Our purpose was to identify the content and frequency of conversations about menthol cigarettes, including themes, populations, user smoking status, other tobacco or substances, tweet characteristics, and sentiment. We also examined differences in menthol cigarette sentiment by prevalent categories, which allowed us to assess potential perceptions, misperceptions, and social norms about menthol cigarettes on Twitter. This approach can inform communication about these products, particularly to subgroups who are at risk for menthol cigarette use.<h4>Methods</h4>Through a combination of human and machine classification, we identified 94,627 menthol cigarette-relevant tweets from February 1, 2012 to January 31, 2013 (1 year) from over 47 million tobacco-related messages gathered prospectively from the Twitter Firehose of all public tweets and metadata. Then, 4 human coders evaluated a random sample of 7000 tweets for categories, including sentiment toward menthol cigarettes.<h4>Results</h4>We found that 47.98% (3194/6657) of tweets expressed positive sentiment, while 40.26% (2680/6657) were negative toward menthol cigarettes. The majority of tweets by likely smokers (2653/4038, 65.70%) expressed positive sentiment, while 91.2% (320/351) of nonsmokers and 71.7% (91/127) of former smokers indicated negative views. Positive views toward menthol cigarettes were predominant in tweets that discussed addiction or craving, marijuana, smoking, taste or sensation, song lyrics, and tobacco industry or marketing or tweets that were commercial in nature. 
Negative views toward menthol were more common in tweets about smoking cessation, health, African Americans, women, and children and adolescents, largely due to expression of negative stereotypes associated with these groups' use of menthol cigarettes.<h4>Conclusions</h4>Examinations of public opinions toward menthol cigarettes through social media can help to inform the framing of public communication about menthol cigarettes, particularly in light of potential regulation by the European Union, US Food and Drug Administration, other jurisdictions, and localities.
Project description:INTRODUCTION:It is unclear whether warnings on electronic cigarette (e-cigarette) advertisements required by the US Food and Drug Administration (FDA) will apply to social media. Given the key role of social media in marketing e-cigarettes, we seek to inform FDA decision making by exploring how warnings on various tweet content influence perceived healthiness, nicotine harm, likelihood to try e-cigarettes, and warning recall. METHODS:In this 2 × 4 between-subjects experiment participants viewed a tweet from a fictitious e-cigarette brand. Four tweet content versions (e-cigarette product, e-cigarette use, e-cigarette in social context, unrelated content) were crossed with two warning versions (absent, present). Adult e-cigarette users (N = 994) were recruited via social media ads to complete a survey and randomized to view one of eight tweets. Multivariable regressions explored effects of tweet content and warning on perceived healthiness, perceived harm, and likelihood to try e-cigarettes, and tweet content on warning recall. Covariates were tobacco and social media use and demographics. RESULTS:Tweets with warnings elicited more negative health perceptions of the e-cigarette brand than tweets without warnings (p < .05). Tweets featuring e-cigarette products (p < .05) or use (p < .001) elicited higher warning recall than tweets featuring unrelated content. CONCLUSIONS:This is the first study to examine warning effects on perceptions of e-cigarette social media marketing. Warnings led to more negative e-cigarette health perceptions, but no effect on perceived nicotine harm or likelihood to try e-cigarettes. There were differences in warning recall by tweet content. Research should explore how varying warning content (text, size, placement) on tweets from e-cigarette brands influences health risk perceptions. 
IMPLICATIONS:FDA's 2016 ruling requires warnings on advertisements for nicotine-containing e-cigarettes, but does not specify whether this applies to social media. This study is the first to examine how e-cigarette warnings in tweets influence perceived healthiness and harm of e-cigarettes, which is important because e-cigarette brands are voluntarily including warnings on Twitter and Instagram. Warnings influenced perceived healthiness of the e-cigarette brand, but not perceived nicotine harm or likelihood to try e-cigarettes. We also saw higher recall of warning statements for tweets featuring e-cigarettes. Findings suggest that expanding warning requirements to e-cigarette social media marketing warrants further exploration and FDA consideration.
Project description: As e-cigarette use rapidly increases in popularity, data from online social systems (Twitter, Instagram, Google Web Search) can be used to capture and describe the social and environmental context in which individuals use, perceive, and are marketed this tobacco product. Social media data may serve as a massive focus group where people organically discuss e-cigarettes unprimed by a researcher, without instrument bias, captured in near real time and at low cost. This study documents e-cigarette-related discussions on Twitter, describing themes of conversations and locations where Twitter users often discuss e-cigarettes, to identify priority areas for e-cigarette education campaigns. Additionally, this study demonstrates the importance of distinguishing between social bots and human users when attempting to understand public health-related behaviors and attitudes. E-cigarette-related posts on Twitter (N=6,185,153) were collected from December 24, 2016, to April 21, 2017. Techniques drawn from network science were used to characterize discussions of e-cigarettes by describing which hashtags co-occur (concept clusters) in a Twitter network. Posts and metadata were used to describe where geographically e-cigarette-related discussions in the United States occurred. Machine learning models were used to distinguish Twitter posts reflecting attitudes and behaviors of genuine human users from those of social bots. Odds ratios were computed from 2x2 contingency tables to detect whether hashtags varied by source (social bot vs human user), using the Fisher exact test to determine statistical significance. Clusters found in the corpus of hashtags from human users included behaviors (eg, #vaping), vaping identity (eg, #vapelife), and vaping community (eg, #vapenation). Additional clusters included products (eg, #eliquids), dual tobacco use (eg, #hookah), and polysubstance use (eg, #marijuana).
Clusters found in the corpus of hashtags from social bots included health (eg, #health), smoking cessation (eg, #quitsmoking), and new products (eg, #ismog). Social bots were significantly more likely to post hashtags that referenced smoking cessation and new products compared to human users. The volume of tweets was highest in the Mid-Atlantic (eg, Pennsylvania, New Jersey, Maryland, and New York), followed by the West Coast and Southwest (eg, California, Arizona, and Nevada). Social media data may be used to complement and extend the surveillance of health behaviors including tobacco product use. Public health researchers could harness these data and methods to identify new products or devices. Furthermore, findings from this study demonstrate the importance of distinguishing between Twitter posts from social bots and humans when attempting to understand attitudes and behaviors. Social bots may be used to perpetuate the idea that e-cigarettes are helpful in cessation and to promote new products as they enter the marketplace.
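The bot-versus-human hashtag comparison described above rests on an odds ratio from a 2x2 table and a Fisher exact test. The sketch below shows one common way to compute both with the standard library only. The two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is an assumption, since the abstract does not say which variant was used, and the counts in the demo are invented.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the table [[a, b], [c, d]].

    Raises ZeroDivisionError if b or c is zero.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1e-12  # guards against floating-point near-ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + tolerance)
    )


if __name__ == "__main__":
    # Hypothetical counts: rows = (bot, human), columns = (uses tag, does not).
    table = (3, 1, 1, 3)
    print(f"OR = {odds_ratio(*table):.2f}")
    print(f"p  = {fisher_exact_two_sided(*table):.4f}")
```

With real hashtag counts in the thousands, exact integer arithmetic via `comb` stays accurate, although it becomes slower than a library routine.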
Project description:BACKGROUND:As the majority of Twitter content is publicly available, the platform has become a rich data source for public health surveillance, providing insights into emergent phenomena, such as vaping. Although there is a growing body of literature that has examined the content of vaping-related tweets, less is known about the people who generate and disseminate these messages and the role of e-cigarette advocates in the promotion of these devices. OBJECTIVE:This study aimed to identify key conversation trends and patterns over time, and discern the core voices, message frames, and sentiment surrounding e-cigarette discussions on Twitter. METHODS:A random sample of data were collected from Australian Twitter users who referenced at least one of 15 identified e-cigarette related keywords during 2012, 2014, 2016, or 2018. Data collection was facilitated by TrISMA (Tracking Infrastructure for Social Media Analysis) and analyzed by content analysis. RESULTS:A sample of 4432 vaping-related tweets posted and retweeted by Australian users was analyzed. Positive sentiment (3754/4432, 84.70%) dominated the discourse surrounding e-cigarettes, and vape retailers and manufacturers (1161/4432, 26.20%), the general public (1079/4432, 24.35%), and e-cigarette advocates (1038/4432, 23.42%) were the most prominent posters. Several tactics were used by e-cigarette advocates to communicate their beliefs, including attempts to frame e-cigarettes as safer than traditional cigarettes, imply that federal government agencies lack sufficient competence or evidence for the policies they endorse about vaping, and denounce as propaganda "gateway" claims of youth progressing from e-cigarettes to combustible tobacco. 
Some of the most common themes presented in tweets were advertising or promoting e-cigarette products (2040/4432, 46.03%), promoting e-cigarette use or intent to use (970/4432, 21.89%), and discussing the potential of e-cigarettes to be used as a smoking cessation aid or tobacco alternative (716/4432, 16.16%), as well as the perceived health and safety benefits and consequences of e-cigarette use (681/4432, 15.37%). CONCLUSIONS:Australian Twitter content does not reflect the country's current regulatory approach to e-cigarettes. Rather, the conversation on Twitter generally encourages e-cigarette use, promotes vaping as a socially acceptable practice, discredits scientific evidence of health risks, and rallies around the idea that e-cigarettes should largely be outside the bounds of health policy. The one-sided nature of the discussion is concerning, as is the lack of disclosure and transparency, especially among vaping enthusiasts who dominate the majority of e-cigarette discussions on Twitter, where it is unclear if comments are endorsed, sanctioned, or even supported by the industry.
Project description:<h4>Background</h4>Information and misinformation on the internet about e-cigarette harms may increase smokers' misperceptions of e-cigarettes. There is limited research on smokers' engagement with information and misinformation about e-cigarettes on social media.<h4>Objective</h4>This study assessed smokers' likelihood to engage with (defined as replying to, retweeting, liking, and sharing) tweets that contain information, misinformation, and uncertainty about the harms of e-cigarettes.<h4>Methods</h4>We conducted a web-based randomized controlled trial among 2400 UK and US adult smokers who did not vape in the past 30 days. Participants were randomly assigned to view four tweets in one of four conditions: (1) e-cigarettes are as harmful or more harmful than smoking, (2) e-cigarettes are completely harmless, (3) uncertainty about e-cigarette harms, or (4) control (physical activity). The outcome measure was participants' likelihood of engaging with tweets, which comprised the sum of whether they would reply, retweet, like, and share each tweet. We fitted Poisson regression models to predict the likelihood of engagement with tweets among 974 Twitter users and 1287 non-Twitter social media users, adjusting for covariates and stratified by UK and US participants.<h4>Results</h4>Among Twitter users, participants were more likely to engage with tweets in condition 1 (e-cigarettes are as harmful or more harmful than smoking) than in condition 2 (e-cigarettes are completely harmless).
Among other social media users, participants were more likely to engage with tweets in condition 1 than in conditions 2 and 3 (e-cigarettes are completely harmless and uncertainty about e-cigarette harms).<h4>Conclusions</h4>Tweets stating information and misinformation that e-cigarettes were as harmful or more harmful than smoking regular cigarettes may receive higher engagement than tweets indicating e-cigarettes were completely harmless.<h4>Trial registration</h4>International Standard Randomized Controlled Trial Number (ISRCTN) 16082420; https://doi.org/10.1186/ISRCTN16082420.
Project description: Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and the handle's metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was fitted for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen's d effect sizes were calculated to examine the relative importance of significant features. Models containing both tweet language features and metadata features performed the best (74% precision, 74% recall, 74% F1), while the model containing only Twitter metadata features was the least accurate (58% precision, 60% recall, and 57% F1). Top predictive features included use of terms such as "school" for youth and "college" for young adults. Overall, it was more challenging to predict older adults accurately.
These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may be helpful for informing public health surveillance and evaluation research.
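Cohen's d, which the abstract above uses to rank features, is the difference between two group means divided by a pooled standard deviation. The sketch below uses the common pooled-variance form. The abstract does not specify the exact variant or which groups were compared, so both the formula choice and the demo counts are assumptions.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled sample standard deviation.

    Each group needs at least two observations, and the pooled
    variance must be non-zero.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_variance ** 0.5


if __name__ == "__main__":
    # Hypothetical counts of one term across users' tweet samples,
    # for one age group versus everyone else.
    in_group = [2, 4, 6]
    out_group = [1, 3, 5]
    print(f"d = {cohens_d(in_group, out_group):.2f}")
```

A positive d means the feature is more frequent in the first group. By the usual rule of thumb, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects.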
Project description: Introduction: Given increasing efforts to regulate e-cigarettes, it is important to understand factors associated with support for tobacco regulatory policies. We investigate such factors found in social media and hypothesize that greater online engagement with tobacco content would be associated with less support for e-cigarette regulatory policies. Methods: We constructed social networks of Twitter users who tweet about tobacco and categorized them using a combination of social network and Twitter metrics. Twitter users were identified as representing leaders, followers, or general users in online discussions of tobacco products, and invited to complete an online survey. Participants responded to questions about their engagement with tobacco-related content online, degree of support for e-cigarette regulations, exposure to tobacco marketing, e-cigarette use, and other demographic information. We examined links between their reported engagement with tobacco-related content and support for e-cigarette regulatory policies using structural equation modelling. Results: The analytic sample consisted of 470 participants. The conceptualized structural equation model had a good fit (χ2(32) = 24.85, p = 0.09, CFI = 0.99, RMSEA = 0.03). Findings support our hypothesis: engagement with online tobacco content was negatively associated with support for e-cigarette policies, while controlling for e-cigarette use, tobacco marketing exposure, social media use frequency, and demographic factors. Conclusions: Findings suggest that our hypothesis was supported. Twitter users engaging with tobacco-related content and harboring negative attitudes toward e-cigarette regulatory policies could be an important audience segment to reach with tailored e-cigarette policy education messages.
Project description: We evaluated a novel Twitter-delivered intervention for smoking cessation, Tweet2Quit, which sends daily, automated communications to small, private, self-help groups to encourage high-quality, online, peer-to-peer discussions. A 2-group randomised controlled trial assessed the net benefit of adding a Tweet2Quit support group to a usual care control condition of nicotine patches and a cessation website. Participants were 160 smokers (4 cohorts of 40/cohort), aged 18-59 years, who intended to quit smoking, used Facebook daily, texted weekly, and had mobile phones with unlimited texting. All participants received 56 days of nicotine patches, emails with links to the smokefree.gov cessation website, and instructions to set a quit date within 7 days. Additionally, Tweet2Quit participants were enrolled in 20-person, 100-day Twitter groups, and received daily discussion topics via Twitter and daily engagement feedback via text. The primary outcome was sustained abstinence at 7, 30, and 60 days post-quit date. Participants (mean age 35.7 years, 26.3% male, 31.2% college degree, 88.7% Caucasian) averaged 18.0 (SD=8.2) cigarettes per day and 16.8 (SD=9.8) years of smoking. Participants randomised to Tweet2Quit averaged 58.8 tweets/participant, and the average tweeting duration was 47.4 days/participant. Tweet2Quit doubled sustained abstinence out to 60 days follow-up (40.0%, 26/65) versus control (20.0%, 14/70), OR=2.67, CI 1.19 to 5.99, p=0.017. Tweeting via phone predicted tweet volume, and tweet volume predicted sustained abstinence (p<0.001). The daily autocommunications caused tweeting spikes accounting for 24.0% of tweets. Tweet2Quit was engaging and doubled sustained abstinence. Its low cost and scalability make it viable as a global cessation treatment. Trial registration: NCT01602536.
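The abstinence counts reported above (26/65 for Tweet2Quit, 14/70 for control) are enough to recompute an unadjusted odds ratio. The sketch below does this and adds a Wald (log-scale) 95% confidence interval. The Wald method is an assumption chosen for simplicity; the abstract does not say how its interval was derived or whether the estimate was adjusted.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Returns (odds_ratio, lower, upper). All four cells of the implied
    2x2 table must be non-zero.
    """
    a, b = events_t, n_t - events_t  # treatment: events, non-events
    c, d = events_c, n_c - events_c  # control: events, non-events
    ratio = (a * d) / (b * c)
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = exp(log(ratio) - z * standard_error)
    upper = exp(log(ratio) + z * standard_error)
    return ratio, lower, upper


if __name__ == "__main__":
    ratio, lower, upper = odds_ratio_wald_ci(26, 65, 14, 70)
    print(f"OR = {ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The point estimate reproduces the reported OR of 2.67. The Wald interval from this sketch (roughly 1.24 to 5.75) is somewhat narrower than the reported 1.19 to 5.99, so the published interval presumably comes from a different method or an adjusted model.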