Article

Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods

Publisher

NATL ACAD SCIENCES
DOI: 10.1073/pnas.1906364117

Keywords

Twitter; subjective well-being; language analysis; big data; machine learning

Funding

  1. Nanyang Presidential Postdoctoral Award
  2. Adobe Research Award
  3. Robert Wood Johnson Foundation Pioneer Award
  4. Templeton Religion Trust [TRT0048]

Abstract

Researchers and policy makers worldwide are interested in measuring the subjective well-being of populations. When users post on social media, they leave behind digital traces that reflect their thoughts and feelings. Aggregation of such digital traces may make it possible to monitor well-being at large scale. However, social media-based methods need to be robust to regional effects if they are to produce reliable estimates. Using a sample of 1.53 billion geotagged English tweets, we provide a systematic evaluation of word-level and data-driven methods for text analysis for generating well-being estimates for 1,208 US counties. We compared Twitter-based county-level estimates with well-being measurements provided by the Gallup-Sharecare Well-Being Index survey through 1.73 million phone surveys. We find that word-level methods (e.g., Linguistic Inquiry and Word Count [LIWC] 2015 and Language Assessment by Mechanical Turk [LabMT]) yielded inconsistent county-level well-being measurements due to regional, cultural, and socioeconomic differences in language use. However, removing as few as three of the most frequent words led to notable improvements in well-being prediction. Data-driven methods provided robust estimates, approximating the Gallup data at up to r = 0.64. We show that the findings generalized to county socioeconomic and health outcomes and were robust when poststratifying the samples to be more representative of the general US population. Regional well-being estimation from social media data seems to be robust when supervised data-driven methods are used.
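
As a rough illustration of the word-level approach described in the abstract, the sketch below scores each county by the frequency-weighted mean valence of lexicon words, optionally excludes the k most frequent lexicon words, and correlates the county scores with survey values using Pearson's r. It is a minimal sketch under stated assumptions, not the authors' pipeline: the lexicon `TOY_LEXICON`, its weights, and the function names are all hypothetical and do not come from LIWC, LabMT, or the paper.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy lexicon (word -> valence); NOT LIWC or LabMT values.
TOY_LEXICON = {"love": 1.0, "good": 0.8, "lol": 0.6,
               "happy": 0.9, "hate": -1.0, "sad": -0.8}


def county_score(tokens, lexicon, excluded=frozenset()):
    """Frequency-weighted mean valence of the lexicon words in one county."""
    counts = Counter(t for t in tokens if t in lexicon and t not in excluded)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no scorable words in this county
    return sum(lexicon[w] * n for w, n in counts.items()) / total


def most_frequent_lexicon_words(county_tokens, lexicon, k):
    """The k lexicon words used most often across all counties."""
    counts = Counter(t for tokens in county_tokens.values()
                     for t in tokens if t in lexicon)
    return frozenset(w for w, _ in counts.most_common(k))


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / (sx * sy)


def evaluate(county_tokens, survey, lexicon, k=0):
    """Correlate dictionary scores with survey values after dropping k words."""
    excluded = most_frequent_lexicon_words(county_tokens, lexicon, k)
    counties = sorted(county_tokens)
    scores = [county_score(county_tokens[c], lexicon, excluded) for c in counties]
    return pearson_r(scores, [survey[c] for c in counties])
```

Calling `evaluate(county_tokens, survey, TOY_LEXICON, k=0)` and again with `k=3` on the same data would compare the correlation with and without the most frequent lexicon words, which mirrors the kind of comparison the abstract reports. Here `county_tokens` is assumed to map a county identifier to a list of lowercase tokens and `survey` to map the same identifiers to survey well-being values.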

Recommended

Article Psychology, Clinical

World Trade Center responders in their own words: predicting PTSD symptom trajectories with AI-based language analyses of interviews

Youngseo Son, Sean A. P. Clouston, Roman Kotov, Johannes C. Eichstaedt, Evelyn J. Bromet, Benjamin J. Luft, H. Andrew Schwartz

Summary: This study demonstrates the value of AI in understanding PTSD in a vulnerable population. Future studies should extend this application to other trauma exposures and to other demographic groups, especially under-represented minorities.

PSYCHOLOGICAL MEDICINE (2023)

Article Psychology, Multidisciplinary

Beyond Beliefs: Multidimensional Aspects of Religion and Spirituality in Language

David B. Yaden, Salvatore Giorgi, Margaret L. Kern, Alejandro Adler, Lyle H. Ungar, Martin E. P. Seligman, Johannes C. Eichstaedt

Summary: Religion and spirituality are multidimensional constructs involving practices, rituals, and experiences. This study explores the language associations of different dimensions and finds divergent profiles. Additionally, it reveals that non-believers focus more on emotions such as inspiration and gratitude rather than religious doctrine.

PSYCHOLOGY OF RELIGION AND SPIRITUALITY (2023)

Article Multidisciplinary Sciences

Cross-platform- and subgroup-differences in the well-being effects of Twitter, Instagram, and Facebook in the United States

Kokil Jaidka

Summary: Spatial aggregates of survey and web search data reveal the heterogeneous well-being effects of social media platforms. The study finds that frequent visits to Facebook have consistently positive well-being effects, while visits to Instagram have negative effects. Furthermore, these effects vary across different population groups, with white and high-income individuals experiencing more positive effects and younger and Black populations experiencing adverse effects.

SCIENTIFIC REPORTS (2022)

Article Psychology, Social

Incivility Is Rising Among American Politicians on Twitter

Jeremy A. Frimer, Harinder Aujla, Matthew Feinberg, Linda J. Skitka, Karl Aquino, Johannes C. Eichstaedt, Robb Willer

Summary: This study provides the first systematic investigation of the incivility trends of American politicians on Twitter. The research reveals a 23% increase in incivility over the past decade on Twitter, partly driven by politicians engaging in greater incivility following positive feedback. Uncivil tweets tend to receive more approval and attention, leading to more uncivil tweets thereafter.

SOCIAL PSYCHOLOGICAL AND PERSONALITY SCIENCE (2023)

Article Communication

The Political Landscape of the US Twitterverse

Subhayan Mukerjee, Kokil Jaidka, Yphtach Lelkes

Summary: Prior research suggests that Twitter users in the United States are more politically engaged and partisan than the general American population. However, this study finds little evidence to support this claim. The study analyzes the most popular Twitter accounts in the United States and concludes that the platform is not as political as previously thought. Ordinary Americans are more likely to follow nonpolitical opinion leaders on Twitter, and there is no significant polarization among these opinion leaders.

POLITICAL COMMUNICATION (2022)

Letter Psychology, Biological

Reply to: Local news in Google News

Sean Fischer, Kokil Jaidka, Yphtach Lelkes

NATURE HUMAN BEHAVIOUR (2022)

Review Psychology, Developmental

Effectiveness of peer support programmes for improving well-being and quality of life in parents/carers of children with disability or chronic illness: A systematic review

Katharine Lancaster, Anoo Bhopti, Margaret L. Kern, Rachel Taylor, Annick Janson, Katherine Harding

Summary: This systematic review examines the quantitative evidence from the past decade on the effectiveness of peer support programmes in improving the well-being and quality of life for parents/carers of children with disability/chronic illnesses. The results suggest that peer support is effective in reducing distress and improving well-being and quality of life for parents, but the included studies have limitations in terms of bias.

CHILD CARE HEALTH AND DEVELOPMENT (2023)

Article Communication

Silenced on social media: the gatekeeping functions of shadowbans in the American Twitterverse

Kokil Jaidka, Subhayan Mukerjee, Yphtach Lelkes

Summary: Algorithms play a critical role in directing online attention on social media, which has raised concerns about perpetuating bias. This study examined shadowbanning on Twitter, where users or their content are temporarily hidden. By testing a stratified random sample of American Twitter accounts, the study identified the factors predicting shadowbans. The findings showed that accounts with bot-like behavior were more likely to be shadowbanned, while verified accounts were less likely. Offensive and politically-focused tweets also faced potential downgrading. These findings have implications for algorithmic accountability and future audits of social media platforms.

JOURNAL OF COMMUNICATION (2023)

Article Computer Science, Interdisciplinary Applications

Confirmation Bias in Seeking Climate Information: Employing Relative Search Volume to Predict Partisan Climate Opinions

Yifei Wang, Kokil Jaidka

Summary: In an increasingly digitalized world, online information-seeking behaviors play a critical role in synthesizing public opinion. This study investigates whether search strategies align with the expected confirmation biases of regions with different partisan beliefs. The results show significant differences in search keywords adopted by Democrat-majority and Republican-majority regions in the United States, and suggest that the preferential use of certain search keywords can predict climate opinions.

SOCIAL SCIENCE COMPUTER REVIEW (2023)

Article Public, Environmental & Occupational Health

Measuring disadvantage: A systematic comparison of United States small-area disadvantage indices

Sophia Lou, Salvatore Giorgi, Tingting Liu, Johannes C. Eichstaedt, Brenda Curtis

Summary: Extensive evidence shows that area-based disadvantage has negative effects on various life outcomes, including increased mortality and low economic mobility. However, the measurement of disadvantage using composite indices is inconsistent across studies. To address this issue, we compared 5 U.S. disadvantage indices at the county level and their relationships with 24 diverse life outcomes. The Area Deprivation Index (ADI) and Child Opportunity Index 2.0 (COI) were found to be most related to a wide range of life outcomes, particularly physical health.

HEALTH & PLACE (2023)

Article Multidisciplinary Sciences

Predicting US county opioid poisoning mortality from multi-modal social media and psychological self-report data

Salvatore Giorgi, David B. Yaden, Johannes C. Eichstaedt, Lyle H. Ungar, H. Andrew Schwartz, Amy Kwarteng, Brenda Curtis

Summary: Opioid poisoning is a major public health crisis in the United States, responsible for 75% of drug-related deaths. A lack of measurement tools for social and psychological factors hinders research in this area. This study uses a multi-modal data set, including Twitter language, psychometric self-reports, and area-based measures, to predict and understand opioid poisoning. The results show that Twitter language predicted opioid poisoning mortality better than socio-demographics, healthcare access, physical pain, and psychological well-being factors.

SCIENTIFIC REPORTS (2023)

Article Psychology, Experimental

Characterizing Empathy and Compassion Using Computational Linguistic Analysis

David B. Yaden, Salvatore Giorgi, Matthew Jordan, Anneke Buffone, Johannes C. Eichstaedt, H. Andrew Schwartz, Lyle Ungar, Paul Bloom

Summary: Many scholars argue that empathy is crucial for other-regarding sentiments and plays a significant role in our moral lives, while compassion is also seen as a relevant force for prosocial motivation and action. This study, using computational linguistics, explores the relationship between empathy and compassion. Analysis of Facebook posts reveals that individuals high in empathy use different language than those high in compassion, and high empathy without compassion is associated with negative health outcomes, while high compassion without empathy is related to positive health outcomes, lifestyle choices, and charitable giving. These findings support a moral motivation grounded in compassion rather than empathy.

EMOTION (2023)

Review Psychology, Multidisciplinary

The value of social media language for the assessment of wellbeing: a systematic review and meta-analysis

S. Sametoglu, D. H. M. Pelt, J. C. Eichstaedt, L. H. Ungar, M. Bartels

Summary: This article presents a systematic review and meta-analysis of studies on the effectiveness of social media text mining in measuring well-being. The results show that there is a correlation between social media text mining and survey-based assessments of well-being, making it a valuable tool for evaluating individual and regional well-being. The article also provides recommendations for future research, including considering language diversity and careful selection of data collection methods.

JOURNAL OF POSITIVE PSYCHOLOGY (2023)

Article Psychology, Applied

Comparison of wellbeing structures based on survey responses and social media language: A network analysis

Selim Sametoglu, Dirk H. M. Pelt, Johannes C. Eichstaedt, Lyle H. Ungar, Meike Bartels

Summary: Wellbeing can be measured through surveys and social media text mining (SMTM). Comparing survey data and social media language features, the networks derived from both methods showed similar structures, consisting of five wellbeing dimensions. This suggests that survey and SMTM methods can complement each other to understand differences in human wellbeing.

APPLIED PSYCHOLOGY-HEALTH AND WELL BEING (2023)

Article Psychology, Clinical

Depression and Anxiety Have Distinct and Overlapping Language Patterns: Results From a Clinical Interview

Elizabeth C. Stade, Lyle Ungar, Johannes C. Eichstaedt, Garrick Sherman, Ayelet Meron Ruscio

Summary: Depression is associated with increased use of first-person pronouns and negative emotion words. However, previous studies have not differentiated between depression and anxiety. This study interviewed individuals with varying levels of depression and anxiety and found that certain language features are specific to depression, while others are specific to anxiety.

JOURNAL OF PSYCHOPATHOLOGY AND CLINICAL SCIENCE (2023)
