This is why it’s difficult to take Elon Musk’s assessment of Twitter’s performance as an accurate measure of what’s actually happening at the app.
Earlier this month, at the Morgan Stanley TMT Conference, Elon provided an overview of how Twitter is performing, with a range of stats and notes, among them this chart.
That aligns with Musk’s public statements about tackling problematic elements of the app, including both hate speech and CSAM content - yet experts have raised questions about the accuracy of these claims, and about how Musk and Co. are actually measuring them, given that external analysis suggests this is not the case.
This has come under question once again, with the Institute for Strategic Dialogue (ISD) recently finding that incidents of anti-Semitic speech in the app have risen sharply since Musk took the helm.
As reported by The Washington Post:
“The study, which used machine-learning tools to identify likely antisemitic tweets, found that the average weekly number of such posts ‘more than doubled after Musk’s acquisition’ - a trend that has held in the months after Musk took over. The analysis found an average of over 6,200 posts per week appearing to contain antisemitic language between June 1 and Oct. 27, the day Musk completed his $44 billion deal to buy Twitter. But that figure rose to over 12,700 through early February - a 105 percent increase”
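The reported percentages line up with the raw figures in the quote. As a quick sanity check (using the approximate weekly averages The Washington Post cites):

```python
# Sanity check of the reported increase, using the approximate weekly
# averages quoted from the ISD study via The Washington Post.
before = 6200   # avg weekly antisemitic posts, June 1 - Oct 27
after = 12700   # avg weekly posts through early February

increase = (after - before) / before * 100
print(f"{increase:.0f}%")  # ~105%, matching the reported figure
```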
Indeed, according to ISD’s analysis, the volume of antisemitic hate speech on Twitter has remained at this elevated level since the Musk takeover, indicating that, contrary to Twitter’s own data, hate speech is actually up significantly in the app.
Which makes some sense. Musk has overseen the reinstatement of over 60,000 Twitter accounts that had previously been banned, many for hate speech violations, and analysis also suggests that many of these accounts have resumed their previous tweeting habits, sharing hate speech, misinformation, etc.
Within that context, it seems logical that hate speech could only increase – yet, Elon and Co. have repeatedly claimed that they’re doing more to combat negative elements than the company has ever done before, which has won them many supporters in the process, particularly in the case of CSAM material.
Late last year, various CSAM campaigners praised Musk for ‘ridding Twitter of child pornography and child trafficking hashtags’, which, at least on the surface, appears to have had some impact on limiting CSAM material in the app.
Twitter itself has also claimed victory on this, sharing this chart on its CSAM performance.
But experts have also refuted these figures, noting that CSAM content is still prevalent in the app, while Musk has also eliminated 15% of the platform’s trust and safety staff, the team that’s responsible for detecting and actioning such cases.
Again, on balance, it doesn’t seem possible that Twitter could be improving its performance in these critical areas at the rates that it’s claiming, given reduced resources and funding. But it’s hard to know exactly what is happening, because Elon and Co. are saying one thing, while external analysis suggests that the real outcome is very different.
It does, however, seem safe to say that Twitter’s performance is likely not as impressive as the Twitter 2.0 team wants to claim, and that these remain significant, pressing issues that will continue to require focus, even if certain aspects are being addressed as part of the new team’s updated push.
Ideally, Twitter will continue to refine and improve its processes, and find new ways to tackle hate speech. But it’s an ongoing, and ever-evolving problem, for which there is no true victory that can be claimed.
Hopefully, Twitter’s charts reflect ongoing effort, not a singular push.
UPDATE: A day after the release of this report, Twitter published its own update on the state of hate speech in the app, noting a variance in methodology for its tracking of this element.
As per Twitter:
“Sprinklr [whom Twitter has worked with on its assessment] defines hate speech by evaluating slurs in the nuanced context of their use. Twitter has, to this point, taken a broader view of the potential toxicity of slur usage. To quantify hate speech, Twitter & Sprinklr start with 300 of the most common English-language slurs. We count not only how often they’re tweeted but how often they’re seen (impressions). Our models score slur Tweets on 'toxicity', the likelihood that they constitute hate speech.”
According to Twitter, most slur usage is not hate speech, but when it is, Twitter’s system is proving increasingly effective in reducing its reach.
"Sprinklr’s analysis found that hate speech receives 67% fewer impressions per Tweet than non-toxic slur Tweets. No model is ever perfect, and this work is never done. We’ll continue to combat hate speech by incorporating other languages, new terms, and more precise methodologies - all while increasing transparency."
Essentially, Twitter’s saying that counting mentions of potential slurs, as identified in the ISD study, is not an effective means to measure the impact of such, because it’s not the mentions themselves, but the reach they get that’s important.
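The two methodologies can produce very different pictures from the same set of tweets. A minimal sketch of the distinction, using entirely invented numbers (the tweet counts and impression figures below are hypothetical, not from either study):

```python
# Hypothetical illustration of the two measurement approaches: counting
# toxic tweets (mentions) vs. comparing their per-tweet reach (impressions).
# All figures are invented for illustration.
tweets = [
    {"toxic": True,  "impressions": 50},    # hate speech, suppressed reach
    {"toxic": True,  "impressions": 40},
    {"toxic": False, "impressions": 1000},  # non-toxic slur use, normal reach
    {"toxic": False, "impressions": 900},
]

toxic = [t for t in tweets if t["toxic"]]
non_toxic = [t for t in tweets if not t["toxic"]]

# Mentions-based view (ISD-style): toxic tweets are half of all slur tweets
print(len(toxic) / len(tweets))  # 0.5

# Reach-based view (Twitter/Sprinklr-style): average impressions per tweet
toxic_avg = sum(t["impressions"] for t in toxic) / len(toxic)
clean_avg = sum(t["impressions"] for t in non_toxic) / len(non_toxic)
reduction = 1 - toxic_avg / clean_avg
print(f"{reduction:.0%} fewer impressions per toxic Tweet")  # 95% fewer
```

By the first measure, hate speech looks prevalent; by the second, its reach looks heavily curtailed. That is essentially the gap between the ISD findings and Twitter’s response.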
I’m not sure that I fully understand how ‘most slur usage is not considered hate speech’, as Twitter states (Sprinklr does provide some examples here), but under the platform’s new ‘freedom of speech, not freedom of reach’ approach, it does make some sense that Twitter would be looking to shift the thinking around this measurement.
It’ll be interesting to see the full data, and how it defines non-toxic slurs, if Twitter or Sprinklr makes that available.