New study finds Reddit users with toxic usernames are more likely to generate toxic content

Psych News Daily
4 min read · Jul 1, 2022


A new study has found that Reddit users with toxic usernames (for example, “IluvHitler”) are more likely to display toxic online behavior, such as personal attacks and sexual harassment.

The study also found that users with toxic usernames are about 2.2 times more likely to have their accounts suspended by moderators.

The study, conducted by researchers from universities in Poland and Japan, was published on July 1, 2022, in the journal Computers in Human Behavior.

What’s in a toxic username?

To determine the toxicity of usernames, the researchers relied on a system developed by a company called Samurai Labs.

The system’s four main categories of “username toxicity” are offensive (which includes racist, homophobic, or violent language), profane (which covers typical swear words), sexual, and inappropriate (which includes drugs or human physiology, but without explicitly vulgar language).

It also detects common tricks used to conceal potentially offensive usernames, such as spelling the name backwards, swapping letters, leetspeak, and intentional misspellings.
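The paper does not publish the detection system's internals, but the basic idea of unmasking such tricks can be sketched as follows. This is a hypothetical illustration only: the `LEET_MAP` substitutions, the `FLAGGED_WORDS` lexicon, and the function names are all assumptions, not the Samurai Labs implementation.

```python
# Hypothetical sketch: normalize a username before matching it against
# an offensive-term lexicon. Not the actual Samurai Labs system.

# Undo common leetspeak substitutions (assumed mapping, not exhaustive)
LEET_MAP = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                          "5": "s", "7": "t", "@": "a", "$": "s"})

# Stand-in for a real offensive-term lexicon
FLAGGED_WORDS = {"hitler"}

def normalize(username: str) -> str:
    """Lowercase the name and reverse leetspeak substitutions."""
    return username.lower().translate(LEET_MAP)

def looks_toxic(username: str) -> bool:
    """Flag the username if any lexicon entry appears in its
    normalized form or in the reversal of that form (to catch
    backwards spellings)."""
    norm = normalize(username)
    return any(word in candidate
               for word in FLAGGED_WORDS
               for candidate in (norm, norm[::-1]))

print(looks_toxic("Iluv_H1tl3r"))   # leetspeak variant
print(looks_toxic("reltiHvuli"))    # reversed spelling
print(looks_toxic("FriendlyUser"))  # neutral name
```

A production system would also need to handle the letter swaps and intentional misspellings mentioned above, which simple substring matching like this cannot catch.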

Measuring toxicity on Reddit: the datasets

The researchers used two datasets. The first consisted of 122k Reddit users with toxic usernames (as determined by the Samurai system described above), and 122k random Reddit users with neutral usernames. These 244k users were active on Reddit between February 12 and February 22, 2020.

Likewise, the second dataset consisted of 207k Reddit users with toxic names and 207k with non-toxic names, active at some point between June 20 and July 9, 2020.

The researchers also used a similar methodology to classify these users’ comments into non-toxic or toxic, with toxic categories including personal attack, sexual harassment, profanity, etc.

To determine the correlation between toxic usernames and account suspensions, they randomly chose 50k users with toxic usernames and 50k users with non-toxic usernames from each dataset.

Overall findings

The researchers’ analysis suggests that overall, “users with toxic usernames produce more toxic content and, in turn, are more likely to be suspended by the moderators.”

A “moderately active user with a toxic username is expected to produce 38% more toxic comments in a year” than a neutrally-named counterpart.

Moreover, they found that about 2.7% of users have toxic usernames.

Users with sexual or profane language in their usernames generated on average around 50% more toxic content than similarly active users with neutral usernames.

Perhaps unsurprisingly, users whose usernames included profanities generated the most personal attacks (45% more than average), and users with sexual language in their usernames generated the most sexual harassment and sexual remarks (250% more than average).

Why do some users create so much toxic content?

The researchers say that their study did not address users’ motivations to produce toxic content, though they do have several hypotheses.

It might be the case, for example, that “being a high-conflict person or suffering from personality disorder” might compel some users to attack other users more often.

Age might also be a factor, whereby younger users are more likely to choose deliberately provocative usernames and behave in a disruptive manner online.

Likewise, the researchers theorize, perhaps users with controversial usernames “attract more attacks from others, and thus produce more attacks in retaliation.”

Or it may be that poor performers on Reddit, i.e. those whose comments are more often downvoted, are more inclined to engage in disruptive behavior.

The toxic content is produced by a minority of users

The researchers point out that even among users with toxic usernames, most (between 58% and 65%) do not produce toxic content; this figure is about 70% for users with neutral, non-toxic usernames.

They also point out that there are likely “other types of toxicity that do not involve aggressive or harmful speech” that were therefore not included in this study.

“Concern trolling” is one example, whereby a user disingenuously expresses concern for another user with the actual goal of shaming or otherwise upsetting them, in which case the language used is ostensibly good-willed but is in fact malicious.

“Taking this into consideration,” they write, “it is possible that some of the accounts categorized as non-toxic in fact engaged in other types of toxicity.”

Another issue is how to determine whether certain behavior is actually toxic. Reddit is divided into “millions of diverse communities with their own, unique cultural codes,” they write, so what may be seen as toxic in one community could be considered acceptable elsewhere.

Some Reddit communities, for example, “are created solely for the purpose of mocking and humiliating each other.”

Future directions

As mentioned above, most users with toxic usernames do not display toxic behavior.

Nonetheless, the sheer size of online communities such as Reddit means that they cannot realistically be moderated purely by hand, which means AI likely has to play some role.

Indeed, AI is perhaps even “necessary to preserve the mental health and quality of life of moderators,” they write, especially considering how platforms like Twitch or Discord “allow for multiple display name changes, which makes it even more problematic for moderators to keep up and respond accordingly.”

The research team also presents several future directions for new studies, for example on how to help, rather than merely ban, problematic users.

And finding these solutions matters, especially as more and more aspects of daily life move online.

“Studying and preventing online toxicity is especially important,” they write, “as exposure to its various forms can negatively affect mental health, increase the probability of anxiety, stress, depression, and suicidal thinking, and contribute to substance use and aggression.”

Study: Namespotting: Username toxicity and actual toxic behavior on Reddit
Authors: Rafal Urbaniak, Patrycja Tempska, et al.
Published in: Computers in Human Behavior
Publication date: July 1, 2022
DOI: https://doi.org/10.1016/j.chb.2022.107371
Photo: via Canva

Originally published at https://www.psychnewsdaily.com on July 1, 2022.
