To Stop Online Hate, Big Tech Must Let Those Being Targeted Lead the Way
Billions of people log onto social media platforms every day. As we spend an increasing portion of our lives online, our exposure to hate-based content becomes routine. The Anti-Defamation League’s 2021 survey of hate and harassment on social media found that 41 percent of Americans experienced online harassment, while 27 percent experienced severe harassment, which includes sexual harassment, stalking, physical threats, swatting, doxing, and sustained harassment. We are inundated with conspiracy theories, scams, misinformation, and racist speech that frustrate users or, worse, threaten our safety.
One way technology companies can create safer and more equitable online spaces is to moderate content more consistently and comprehensively. Tech companies have been frequently criticized for inconsistently enforcing their stated policies at the scale of billions of users, causing seismic levels of harm. It is unclear how content moderation teams are trained to recognize and address various forms of hate, such as antisemitism. Neither their training materials nor their operational definitions have been made public or shared privately with civil society. Additionally, as tech companies increasingly rely on artificial intelligence to remove offensive posts on social media platforms, we have no idea whether the perspective of targets of online hate is used to create these technologies.
For example, evidence from the leaked Facebook documents submitted by whistleblower Frances Haugen to the SEC in 2021 suggests that automated content moderation technologies are developed haphazardly, leaving them both overbroad in their understanding of hate and ineffective. The documents state that current automated methods remove “less than 5% of all of the hate speech posted to Facebook.” Furthermore, studies have shown that algorithms that detect hate speech online can often be racially biased.
It is crucial to find ways for communities that are often the targets of hate to contribute to the creation of technology tools that automate and augment content moderation. To model what this process might look like, the ADL Center for Technology and Society is building the Online Hate Index (OHI), a set of machine learning classifiers that detect hate targeting marginalized groups on online platforms. The first of this set, the OHI antisemitism classifier, draws upon the knowledge of both ADL’s antisemitism experts and Jewish community volunteers who may have experienced antisemitism. Together, these groups are the best positioned to understand and operationalize a definition of antisemitism.
To better grasp how machine learning classifiers work, imagine a child taking in information that, through practice, helps them discern and understand their world. Machine learning works similarly. In the case of OHI, our machine learning antisemitism classifier takes in pieces of information (here, text) that have been determined to be antisemitic, or not, by ADL experts and Jewish volunteers.
Through practice, the algorithm learns to recognize antisemitic content and starts to generalize language patterns when given numerous examples of both antisemitic and non-antisemitic content. In the same way, a child might take in specific information about a situation (“This cup is orange”) and start to generalize to their broader experience of the world (“This is what the color orange looks like”). Over time, the model gets better at predicting the likelihood that a piece of content it has never seen before — a tweet, comment, or post — is or is not antisemitic.
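To make the learning process above concrete, here is a minimal sketch of how a text classifier learns from labeled examples and then scores unseen text. This is not the OHI’s actual architecture; it uses a simple naive Bayes model with bag-of-words features, and the training data consists of synthetic placeholder tokens rather than real labels from ADL experts or volunteers.

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label) pairs, label in {"hateful", "benign"}."""
    word_counts = {"hateful": Counter(), "benign": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        # Bag-of-words: count how often each token appears under each label.
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["hateful"]) | set(word_counts["benign"])
    return {"words": word_counts, "docs": doc_counts, "vocab": vocab}

def prob_hateful(model, text):
    """Return P(hateful | text) for unseen text, with Laplace smoothing."""
    scores = {}
    total_docs = sum(model["docs"].values())
    v = len(model["vocab"])
    for label in ("hateful", "benign"):
        # Start from the prior: how common this label is in the training set.
        score = math.log(model["docs"][label] / total_docs)
        n = sum(model["words"][label].values())
        for tok in text.lower().split():
            # Add-one smoothing so unseen tokens never zero out a label.
            score += math.log((model["words"][label][tok] + 1) / (n + v))
        scores[label] = score
    # Convert log scores back to a normalized probability.
    m = max(scores.values())
    exp = {k: math.exp(s - m) for k, s in scores.items()}
    return exp["hateful"] / sum(exp.values())

# Synthetic placeholder training data (stand-ins for labeled content).
examples = [
    ("placeholder slur attack", "hateful"),
    ("group slur threat", "hateful"),
    ("lovely weather today", "benign"),
    ("great game last night", "benign"),
]
model = train(examples)
print(prob_hateful(model, "slur threat attack"))  # close to 1
print(prob_hateful(model, "lovely game today"))   # close to 0
```

Like the child in the analogy, the model has never seen the exact sentence it is scoring; it generalizes from token patterns in the labeled examples. Production systems replace the bag-of-words features with learned language representations, but the train-then-predict loop is the same.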
In August 2021, ADL conducted what we believe to be the first independent, AI-assisted, community-rooted measurement of identity-based hate across Reddit and Twitter. We found that the rate of antisemitic content on Twitter during the week we investigated was 25% higher than on Reddit. The potential reach of the antisemitic content we found on Twitter in that one week alone was 130 million people. If this is the case on some of the most responsible tech platforms, it stands to reason that these issues are much more dire on platforms run by other less forward-thinking tech companies, such as Facebook.
If all platforms were as open to sharing data as Twitter and Reddit, the future might be brighter. Groups like ADL would be able to employ tools like the OHI to audit all social platforms, rooted in the perspective of targeted communities, and ascertain the prevalence of hate against those groups on those platforms. We would then be able to evaluate whether efforts by the tech company were sufficient in decreasing hate on their platforms. We would be able to compare rates between platforms using the same measurements and determine what methods of mitigating hate have been most successful.
Unfortunately, as we described in our data accessibility scorecard, platforms other than Reddit and Twitter are not providing the data necessary to make this a reality. They should, and if they do not, governments should find thoughtful means to require it.
ADL hopes that the way the OHI combines machine learning and human expertise, and centers targeted communities in technology development, offers a practical path to holding platforms accountable. The potential exists for other civil society organizations to develop similar tools using volunteers to label homophobia, transphobia, misogyny, and racism.
We need more technology that detects identity-based hate. If social media platforms are to effectively fight hate, they must allow the people most affected by it to lead the way.
Daniel Kelley is the Director of Strategy and Operations for the ADL Center for Technology and Society.