Article

An Assessment of Deep Learning Models and Word Embeddings for Toxicity Detection within Online Textual Comments

Journal

ELECTRONICS
Volume 10, Issue 7

Publisher

MDPI
DOI: 10.3390/electronics10070779

Keywords

deep learning; word embeddings; toxicity detection; binary classification

Funding

  1. NVIDIA Corporation


Abstract

Today, increasing numbers of people interact online, and the explosion of online communication has produced a large volume of textual comments. A paramount inconvenience of online environments, however, is that comments shared on digital platforms can hide hazards such as fake news, insults, harassment, and, more generally, content that may hurt someone's feelings. In this scenario, detecting this kind of toxicity plays an important role in moderating online communication. Deep learning technologies have recently delivered impressive performance in Natural Language Processing applications, including Sentiment Analysis and emotion detection across numerous datasets. Such models need no pre-defined, hand-picked features; they learn sophisticated features from the input datasets by themselves. In this domain, word embeddings have been widely used to represent words in Sentiment Analysis tasks, proving very effective. Therefore, in this paper, we investigate the use of deep learning and word embeddings to detect six different types of toxicity in online comments. In doing so, we evaluate the most suitable deep learning layers and state-of-the-art word embeddings for identifying toxicity. The results suggest that Long Short-Term Memory layers combined with mimicked word embeddings are a good choice for this task.
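The pipeline the abstract describes (word embeddings fed into an LSTM, whose final hidden state drives six independent sigmoid outputs, one per toxicity type) can be sketched in NumPy. This is a minimal illustration with random toy weights, not the authors' implementation; all dimensions and function names (`lstm_forward`, `toxicity_scores`) are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(embeddings, Wx, Wh, b):
    """Run a single LSTM layer over a sequence of word embeddings.

    Wx: (emb_dim, 4*hidden), Wh: (hidden, 4*hidden), b: (4*hidden,).
    Gate order in the stacked weights: input, forget, cell candidate, output.
    """
    hidden = Wh.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in embeddings:
        z = x @ Wx + h @ Wh + b          # all four gates in one affine map
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                # update the cell state
        h = o * np.tanh(c)               # new hidden state
    return h                             # final state summarises the comment

def toxicity_scores(comment_ids, emb_table, Wx, Wh, b, Wout, bout):
    """Map word ids to six independent toxicity probabilities."""
    embs = emb_table[comment_ids]        # look up word embeddings
    h = lstm_forward(embs, Wx, Wh, b)
    return sigmoid(h @ Wout + bout)      # one sigmoid per toxicity type

# Toy dimensions: vocabulary of 50 words, 8-dim embeddings, 16 hidden units.
rng = np.random.default_rng(0)
vocab, emb_dim, hidden, n_labels = 50, 8, 16, 6
emb_table = rng.normal(size=(vocab, emb_dim))
Wx = rng.normal(scale=0.1, size=(emb_dim, 4 * hidden))
Wh = rng.normal(scale=0.1, size=(hidden, 4 * hidden))
b = np.zeros(4 * hidden)
Wout = rng.normal(scale=0.1, size=(hidden, n_labels))
bout = np.zeros(n_labels)

scores = toxicity_scores(np.array([3, 17, 42]), emb_table, Wx, Wh, b, Wout, bout)
print(scores.shape)  # one probability per toxicity type
```

Because the output layer uses six independent sigmoids rather than a softmax, a single comment can receive several toxicity labels at once, which matches the multi-label framing of the task.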

