Talk in Workshop: Disinformation Countermeasures and Machine Learning (DisCoML)
Multilingual Disinformation Detection for Digital Advertising
Maryline CHEN
In today's world, disinformation and propaganda spread online more widely than ever. Most of a publisher's revenue comes from advertising, so placing ads on these web pages directly funds the publishers behind them, a practice that has come under scrutiny from various media outlets. The question of how to remove such publishers from advertising inventory has long been ignored, despite the negative consequences for the open internet. In this work, we take a first step toward quickly detecting and red-flagging publishers that potentially manipulate the public with disinformation or falsehoods. We build a machine learning model based on multilingual text embeddings that first detects the topic of interest and then estimates the likelihood that the page is malicious. Our systems empower internal teams to blacklist unsafe content proactively, rather than defensively.
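
The two-stage flow described above (topic detection first, then a maliciousness estimate) can be illustrated with a minimal sketch. The abstract does not say which encoder or classifiers are used, so everything here is an assumption for illustration: the page embedding is taken as already computed by some multilingual text encoder, both stages are modeled as simple logistic scorers, and the names `score_page`, `topic_model`, `risk_model`, and `topic_threshold` are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

# A "model" here is just (weights, bias) for a logistic scorer.
# This is an illustrative stand-in, not the classifier used in the talk.
LinearModel = Tuple[Sequence[float], float]


def sigmoid(x: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear_score(embedding: Sequence[float], model: LinearModel) -> float:
    """Logistic score of an embedding under a (weights, bias) model."""
    weights, bias = model
    return sigmoid(sum(e * w for e, w in zip(embedding, weights)) + bias)


def score_page(
    embedding: Sequence[float],
    topic_model: LinearModel,
    risk_model: LinearModel,
    topic_threshold: float = 0.5,  # hypothetical cut-off
) -> Optional[float]:
    """Two-stage scoring of one page embedding.

    Stage 1: decide whether the page is about a topic of interest.
    Stage 2: only for on-topic pages, estimate the likelihood of
    the page being malicious. Off-topic pages return None.
    """
    if linear_score(embedding, topic_model) < topic_threshold:
        return None
    return linear_score(embedding, risk_model)


# Toy 2-dimensional "embeddings"; a real multilingual encoder would
# produce vectors with hundreds of dimensions.
topic_model = ([2.0, 0.0], 0.0)
risk_model = ([0.0, 1.0], 0.0)

on_topic = score_page([1.0, 0.0], topic_model, risk_model)
off_topic = score_page([-1.0, 0.0], topic_model, risk_model)
```

Gating the second stage on the first means pages outside the topics of interest are never assigned a risk score, so a team reviewing flagged publishers would only see candidates that passed the topic filter.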