Article

A linguistic/game-theoretic approach to detection/explanation of propaganda

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 189

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.116069

Keywords

Propaganda; Linguistic analysis; Shapley values; Prediction explanation; Gradient boosting

Online propaganda poses threats to domestic security, democratic institutions, and public health. This study contributes to countering propaganda in American cyberspace by constructing a rich dataset and using linguistic features to score news for its propagandistic content. The research also addresses the challenge of the accuracy-interpretability trade-off in machine learning and provides insights into the linguistic profile of propaganda in today's American news ecosystem.
Online propaganda is a growing menace to domestic security, democratic institutions, and public health. The two most notable examples are the election-fraud propaganda that eventually led to the 2021 insurrection at the US Capitol, and the anti-vaccination propaganda, which is undermining the scientific efforts to turn the tide of the COVID-19 pandemic. The present study, while appreciating the inherent social, legal, and political dilemmas, contributes to the technological developments in countering propaganda in today's American cyberspace. Using a (political) news repository, we first construct a dataset that contains nearly 205,000 articles from 39 propagandistic and 30 trustworthy news outlets, making it one of the richest and least biased datasets used for propaganda research thus far. We subsequently construct a set of models that use linguistic features to score news for its propagandistic content. The superior model, constructed with LightGBM, a state-of-the-art gradient-boosting tree-based algorithm, both outperforms (C-index = 0.9 and F1-score = 0.84) and outruns the baseline models. Motivated by the accuracy-interpretability trade-off (and tension) in machine learning, which is also underscored as a challenge in the propaganda detection research agenda, we then draw on ideas from coalitional game theory to compute the contribution of each linguistic aspect of an article to its propaganda score, thereby explaining the model's decision to the user. Finally, we aggregate the contributions of each linguistic feature across all predictions to provide new insights into the linguistic profile of propaganda in today's American news ecosystem. Future research can apply the linguistic/game-theoretic approach in this study to detecting and explaining anti-vaccination propaganda as well as other forms of information pollution (e.g., fake news) in general.
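
The coalitional game theory idea mentioned in the abstract can be illustrated with a minimal sketch of exact Shapley values. This is not the authors' pipeline: the scoring function `toy_model`, the three feature names, the all-zero baseline, and the two example articles are hypothetical stand-ins for a trained LightGBM model and real linguistic features. The sketch assumes that a feature "absent" from a coalition is replaced by its baseline value, and it enumerates every coalition, which is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating all coalitions of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Weight of a coalition of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of f when it joins coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def make_value_fn(model, article, baseline):
    """Score the article with features outside the coalition set to the baseline."""
    def value_fn(coalition):
        x = {name: (article[name] if name in coalition else baseline[name])
             for name in article}
        return model(x)
    return value_fn


def toy_model(x):
    # Hypothetical propaganda scorer with one interaction term (assumption).
    return (0.5 * x["negative_emotion"]
            + 0.3 * x["second_person"]
            + 0.2 * x["negative_emotion"] * x["certainty"])


# Hypothetical baseline and articles (assumptions, not data from the study).
baseline = {"negative_emotion": 0.0, "second_person": 0.0, "certainty": 0.0}
articles = [
    {"negative_emotion": 1.0, "second_person": 1.0, "certainty": 1.0},
    {"negative_emotion": 0.0, "second_person": 1.0, "certainty": 1.0},
]

# Per-article explanation: contribution of each feature to that article's score.
explanations = [
    shapley_values(make_value_fn(toy_model, a, baseline), list(a))
    for a in articles
]

# Global profile: mean absolute contribution of each feature across articles.
global_importance = {
    name: sum(abs(e[name]) for e in explanations) / len(explanations)
    for name in baseline
}

for article, e in zip(articles, explanations):
    print(toy_model(article), e)
print(global_importance)
```

In this toy setting the contributions for each article sum to the difference between the article's score and the baseline score, which is the property that lets a per-feature breakdown be read as an explanation of a single prediction. Averaging absolute contributions across articles mirrors, in a simplified way, the aggregation step described in the abstract. For models with many features, exact enumeration is replaced in practice by approximations or tree-specific algorithms.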
