AI still sucks at moderating hate speech
For all the recent advances in language AI, the technology still struggles with one of its most basic applications: detecting hate speech, writes Karen Hao in MIT Technology Review (the study was also covered in Italian by Luigi Mischitelli in Agenda Digitale).
In a new study, scientists tested four of the best AI systems for detecting hate speech and found that all of them struggled, in different ways, to tell toxic sentences apart from innocuous ones.
The results are not surprising: creating AI that understands the nuances of natural language is hard. But the way the researchers diagnosed the problem is important. They developed 29 different tests targeting different aspects of hate speech, making it possible to pinpoint where each system fails. That makes it easier to see how to overcome a system's weaknesses, and it is already helping one commercial service improve its AI.
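To make the idea concrete, here is a minimal sketch of how such functional tests work: hand-written sentences are grouped into categories (such as negation or counter-speech) and accuracy is reported per category, so failures show up instead of disappearing into one aggregate score. This is not the researchers' actual test suite; the category names, example sentences, and the keyword-based classifier below are invented for illustration.

```python
# Minimal sketch of functional testing for a hate-speech classifier.
# Per-category accuracy localises where a model fails.
# The classifier is a hypothetical stand-in, not any real system.

from typing import Callable

# Each functional test case: (category, sentence, expected_label)
TEST_CASES = [
    ("plain hate",     "I hate [GROUP].",                          "hateful"),
    ("negated hate",   "I don't hate [GROUP] at all.",             "innocuous"),
    ("counter-speech", "Saying 'I hate [GROUP]' is unacceptable.", "innocuous"),
]

def naive_classifier(sentence: str) -> str:
    """Hypothetical keyword classifier that flags any sentence containing
    the word 'hate' - exactly the shortcut such tests are built to expose."""
    return "hateful" if "hate" in sentence.lower() else "innocuous"

def evaluate(classifier: Callable[[str], str]) -> dict[str, float]:
    """Return accuracy per test category for the given classifier."""
    results: dict[str, list[bool]] = {}
    for category, sentence, expected in TEST_CASES:
        results.setdefault(category, []).append(classifier(sentence) == expected)
    return {category: sum(oks) / len(oks) for category, oks in results.items()}

if __name__ == "__main__":
    for category, accuracy in evaluate(naive_classifier).items():
        print(f"{category:15s} {accuracy:.0%}")
```

Run on the toy cases above, the keyword classifier scores 100% on plain hate but 0% on negation and counter-speech, which is the kind of category-level breakdown that tells a developer what to fix.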